Fiscal Policy Asymmetries: A Threshold Vector Autoregression Approach

Steven Fazzari*, James Morley†, Irina Panovska††

Abstract

This paper estimates the effects of government spending shocks and tax shocks on U.S. economic activity using a threshold vector autoregression (TVAR) model. We find evidence of asymmetry in the effects of fiscal policy across regimes, defined by the state of the business cycle and the nature of the shock. The effects of government spending on output in the low regime are large and persistent, while they are small in the high regime. Consumption increases in both regimes, but the increase is smaller and less persistent in the high regime. Investment increases across regimes, but the increase is negligible in the high regime. Tax cuts have larger short-run effects but smaller long-run effects when the economy is in the low regime. In the high regime, tax cuts always have larger effects than spending shocks.

JEL Codes: C32, E32, E62

Keywords: Fiscal Policy, Threshold, Vector Autoregression, Non-linear Models

* Washington University in St. Louis; † University of New South Wales; †† Washington University in St. Louis. Corresponding author: [email protected]

1 Introduction

The "Great Recession" and the fiscal stimulus package of more than $700 billion have reignited interest in multiplier effects and in the efficiency of fiscal policy in general. A plethora of working papers written in 2009 and 2010 attempt to answer the question of whether government spending has permanent effects on output and whether it has any effects on consumption and on real wages. The literature, both old and new, is very divided on this issue. There are four main strands in the related empirical literature: papers based on traditional Keynesian models, VAR-based models, DSGE models, and models based on the narrative approach introduced by Ramey and Shapiro (1998). In traditional Keynesian models the interest rate is usually held fixed close to 0 over the whole forecasting horizon, and the multipliers¹ obtained from those models are typically very large (always greater than 1, sometimes as big as 4). In particular, the fiscal stimulus package was designed based on a study by Romer and Bernstein (2009) that estimated a fiscal multiplier approximately equal to 1.6. Studies based on the VAR methodology, where government spending is assumed to be predetermined, typically find that output, consumption, and real wages increase. In the models used by Blanchard and Perotti (2002) and Perotti (2008), the response of output and consumption to government spending is positive and persistent, and the response of investment is negative. The magnitude of the response depends on the identification of the model. Blanchard and Perotti (2002) and Perotti (2008) use institutional information to identify the shocks, and they obtain multipliers of about 1.3.
Mountford and Uhlig (2005) use an alternative approach that utilizes sign restrictions, and they obtain a small positive government spending multiplier (0.5) and a consumption multiplier that is very close to 0. These VAR-based studies are often criticized because they do not explicitly allow for non-linear responses. Some DSGE models allow for asymmetric responses that depend on the interest rate by holding the interest rate pegged at zero for a certain period and comparing the impulse responses when the interest rate is pegged and when it is not. The significant drawback of these models is that they automatically allow the interest rate to adjust and to revert to its "usual" dynamics after a certain (exogenously fixed) number of periods. DSGE simulations yield multipliers that are above 1 only when the interest rate is fixed exogenously. When the interest rate is allowed to increase, the multiplier is well below 1. Almost all of the DSGE models are based on a New Keynesian model with Calvo pricing frictions, and the model allows the interest rate to revert to the "natural" interest rate determined by standard neoclassical first-order conditions in the long run. Papa (2005) uses an RBC model with Calvo pricing and fixed interest rates, and she finds that output, real wages, and consumption rise and investment falls, but the magnitude of the responses is very sensitive to the parametrization of the model. Cogan et al. (2009) employ the same model with interest rates that are fixed for 4 periods and find a multiplier of only 0.4.

Another strand of the literature uses the narrative approach to identify exogenous government spending shocks. Ramey and Shapiro (1998), Eichenbaum et al. (1999) and Ramey (2009) show that when using the Ramey-Shapiro or the Ramey military spending variable as a proxy for government spending, the increase in output is small, and the increase in consumption is very temporary. Ramey (2009) argues that the VAR-based multipliers are large only because they fail to capture the importance of the timing of government shocks and because the combined narrative data Granger-cause the government spending shocks. However, Perotti (2007) shows that lagged government spending, tax, and GDP shocks also predict the Ramey-Shapiro narrative dates.

Van Brusselen (2009) provides a very extensive survey of the empirical literature, and he shows that the estimated government multipliers are very sensitive to the choice of the parameters and to the selection of the model. He points out that within the same class of models (DSGE with Calvo pricing), the government multiplier can vary between -3.7 and 3.7, depending on how the increase in spending is financed, how long the interest rate is pegged, and whether the economy is closed or open.

¹ Different authors use different definitions for multipliers. In this paper, the multiplier is defined as the maximum cumulative response to a one-time spending shock that is equal to 1% of GDP.
In this paper we adopt a VAR-based approach that allows the timing of shocks to affect the multiplier, but we do not make any assumptions that interest rates are pegged or that there is any price or wage stickiness. In particular, we employ a threshold vector autoregression model, in which the system's dynamics change back and forth between two regimes (a high-growth regime and a low-growth regime). Our main goal is to examine whether the effects of government spending and tax cuts differ across the business cycle, and whether there is a difference between the effects of small and large shocks. This paper is similar in spirit to a recent paper by Mittnik and Semmler (2009), but we do not fix the threshold and the switching variable exogenously, and we explicitly test for non-linearity. Both the threshold and the threshold variable are estimated from the data using maximum likelihood estimation. By using generalized impulse response functions (GIRF) we attempt to isolate the relative effects of fiscal policy and address the question of whether fiscal policy has different effects on output during expansions and recessions.

The results demonstrate that there is strong evidence in favor of nonlinearity, and that government spending has larger effects on output during recessions. The increase in consumption is larger and more persistent when the system starts in a low regime. Investment increases during the low regime, and the increase does not die out even after 20 quarters. There is no empirical evidence of crowding out when the system starts in the high regime, but the positive effects of spending shocks on investment die out after 5 quarters. In the low regime, tax shocks have a larger impact on consumption and investment in the short run, but lower cumulative effects. In the high regime, tax shocks have larger short-run and long-run effects.

The rest of the paper is organized as follows. Section 2 introduces the baseline empirical model, the estimation method, and the test for linearity. Section 3 presents the results obtained using the estimated model, and extends the baseline model to models that include consumption, components of consumption, investment, components of investment, and interest rates. Section 4 concludes. Appendix 1 explains the method for obtaining the generalized impulse response function, and the estimated generalized impulse response functions are included in Appendix 2. The results of the linearity tests are included in Appendix 3.

2 Empirical Model, Estimation, and Testing

Conventional VAR models are not capable of capturing non-linear dynamics such as regime switching and asymmetric responses to shocks. To capture state dependencies and asymmetric responses to shocks, we specify a nonlinear threshold vector autoregression (TVAR) model that is a multivariate extension of the threshold autoregression model proposed by Tong (1978, 1983). A TVAR model is a relatively simple way to capture nonlinearities such as regime switching and asymmetric responses to shocks. Threshold models work by splitting the time series endogenously into different regimes. Within each regime the time series is described by a linear model. The TVAR model used in this paper is specified as:

$$Y_t = \Phi_0^1 + \Phi_1^1(L) Y_{t-1} + \left( \Phi_0^2 + \Phi_1^2(L) Y_{t-1} \right) I[q_{t-d} > \gamma] + \varepsilon_t \tag{1}$$

In the baseline model, $Y_t$ is a vector containing the first difference of the logarithm of real government spending, the first difference of the logarithm of net taxes, and the first difference of the logarithm of real GDP. In the extended model, $Y_t$ is a vector containing the first difference of the logarithm of real government spending, the first difference of the logarithm of net taxes, the first difference of the logarithm of real GDP net of government spending, and one of the following measures of slack: capacity utilization, demeaned capacity utilization (allowing for structural breaks), output gap, unemployment, unemployment growth, demeaned unemployment, and growth in the demeaned unemployment series. We demeaned the capacity utilization and the unemployment series allowing for exogenous structural breaks in the mean of each series. $\Phi_1^1(L)$ and $\Phi_1^2(L)$ are lag polynomial matrices; $\varepsilon_t$ is the vector of disturbances with mean zero and covariance matrix $\Sigma$. The covariance matrix does not change over time or over regimes. $q_{t-d}$ is the threshold variable, which determines the prevailing regime; $\gamma$ is the threshold parameter at which the regime switching occurs. $I[\cdot]$ is an indicator function that is equal to 1 when the threshold variable is above $\gamma$ and equal to 0 when the threshold variable is below $\gamma$. The integer $d$ is the delay lag. We assume that $\varepsilon_t$ is iid Gaussian.

Government spending and net taxes are defined as in Blanchard and Perotti (2002). The period considered is 1967Q1-2009Q4. All output components were measured in real terms and were seasonally adjusted by the source. The capacity utilization series was converted to quarterly frequency using simple arithmetic means, and we used the last observation for the quarter to convert interest rates to quarterly frequency. The data on output, output components, and government spending were obtained from NIPA-BEA, and the capacity utilization data were obtained from the Federal Reserve Statistical Releases website. The interest rate series and the CPI data were obtained from the Federal Reserve Bank of St. Louis website, and the expected inflation series was obtained from Macroeconomic Advisers.
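To make the role of the indicator function in Equation 1 concrete, the following is a minimal simulation sketch, not the estimation code used in the paper. It assumes a single lag (the paper uses four), takes the threshold variable to be one element of $Y$ itself, and uses hypothetical argument names (`phi0`, `phi1`, `phi0_shift`, `phi1_shift`, `q_index`) that do not come from the original text.

```python
import numpy as np

def simulate_tvar(phi0, phi1, phi0_shift, phi1_shift, gamma, d,
                  q_index, sigma, n_periods, seed=0):
    """Simulate a one-lag version of Equation 1.

    phi0, phi1             : intercept and lag matrix that always apply
    phi0_shift, phi1_shift : extra terms that apply only when q_{t-d} > gamma
    q_index                : column of Y used as threshold variable (assumption)
    sigma                  : residual covariance, identical in both regimes
    """
    rng = np.random.default_rng(seed)
    k = len(phi0)
    chol = np.linalg.cholesky(sigma)
    y = np.zeros((n_periods + d, k))          # first d rows are zero start values
    regime = np.zeros(n_periods + d, dtype=int)
    for t in range(d, n_periods + d):
        high = y[t - d, q_index] > gamma      # indicator I[q_{t-d} > gamma]
        mean = phi0 + phi1 @ y[t - 1]
        if high:
            mean = mean + phi0_shift + phi1_shift @ y[t - 1]
        y[t] = mean + chol @ rng.standard_normal(k)
        regime[t] = int(high)
    return y[d:], regime[d:]
```

Setting `phi0_shift` and `phi1_shift` to zero collapses the sketch to a linear VAR, which is the null hypothesis of the linearity test discussed later in this section.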
We used differences because the logarithms of real GDP and output components exhibit non-stationarity. Johansen cointegration tests suggest the absence of a cointegrating relationship between government spending and taxes, and between output and output components (consumption, government spending, taxes, and investment). We obtained very similar results when we allowed spending and taxes to be cointegrated, and when we allowed output components to be cointegrated with output.

We only considered a model with two regimes. It is certainly possible to extend the model given by Equation 1 to accommodate more than two regimes, but that would make the computation using numerical methods very burdensome because of the large number of parameters that need to be estimated. Government spending was ordered first and taxes were ordered second in all models, i.e., government spending is assumed to respond to economic conditions only with a lag, but economic conditions are allowed to respond immediately to government spending. Changing the order of spending and taxes does not affect our results significantly. The number of lags was based on the AIC (for the linear VAR model). Both the AIC and the SIC selected 4 lags. Unlike Mittnik and Semmler (2009), who allow the number of lags to vary across regimes, we assumed that the number of lags was equal in each regime. Since we estimated the threshold from the data, and we selected the threshold variable using maximum likelihood, computing the AIC for each threshold and for each threshold variable would have been too computationally burdensome.

Economic theory implies several possible choices for the threshold variable. Ricardian equivalence suggests that when the deficit is high, the effects of government spending are negative because agents expect a future increase in taxes. The DSGE models used in the literature suggest that the effects of government spending depend on interest rates, and Neo-Keynesian models suggest that the dynamics may depend on the state of the economy. Since we did not want to fix the switching variable a priori, we chose a large set of possible threshold variables, and selected the optimal threshold variable using maximum likelihood estimation. The threshold variables we considered were:

1. output: lagged output growth, long output differences, moving averages of output differences²
2. lagged output gap
3. lagged capacity utilization: levels, demeaned levels, first differences, and first differences of the demeaned series
4. lagged unemployment: level, difference, demeaned level, differences in the demeaned series
5. deficits and debt-to-GDP ratio
6. interest rates and changes in interest rates

² Since we were already estimating a large number of parameters, the weights for the moving averages were fixed exogenously. We considered an arithmetic mean of the past 4 differences, $q_{l,t-d} = \frac{1}{l-d+1} \sum_{j=d}^{l} \text{threshold\_var}_{t-j}$ for $d = 1$, $l = 4$.

There is evidence that both capacity utilization and unemployment have structural breaks in the mean, which would make the series nonstationary and therefore unsuitable for use in the TVAR. Standard tests rejected the null of no structural break in capacity utilization and in unemployment. Since there is no consensus on whether unemployment has a unit root or just exogenous structural breaks in the mean, we used the level, first differences, the demeaned levels, and the differences of the demeaned levels as possible switching variables.

The threshold, the coefficients, the threshold variable, and the delay parameter are estimated using concentrated maximum likelihood. By definition, the estimator $(\hat\Phi^1, \hat\Phi^2, \hat\gamma)$ jointly maximizes the log likelihood, where $\Phi^1$ and $\Phi^2$ collect the coefficients $(\Phi_0^1, \Phi_1^1)$ and $(\Phi_0^2, \Phi_1^2)$ of Equation 1. Since $\varepsilon_t$ is assumed to be Gaussian, the ML estimators can be obtained by using least squares estimation. For $\gamma$, this maximization is restricted to a bounded set $\Gamma = [\underline{\gamma}, \overline{\gamma}]$ that covers the sample range of the threshold variable. The estimator is obtained using the following 3-step procedure:

1. Conditional on $\gamma$ and the threshold variable, the model is linear in $\Phi^1$ and $\Phi^2$. Estimating the linear model by splitting the sample into two subsamples yields the conditional estimators $\hat\Phi^1(\gamma)$ and $\hat\Phi^2(\gamma)$. The estimated threshold value (conditional on the threshold variable and the delay lag) can be identified uniquely as

$$\hat\gamma = \arg\max_{\gamma \in \Gamma_n} \mathrm{llik}_n(\gamma \mid q, d) \tag{2}$$

where $\Gamma$ is approximated by a grid search on $\Gamma_n = \Gamma \cap \{q_1, q_2, \ldots, q_n\}$. To ensure identification, the bottom and top 15% quantiles of the threshold variable are trimmed.

2. The optimal delay lag for a fixed threshold variable is chosen as the value that maximizes the log likelihood:

$$\hat d = \arg\max_{d \in \{1,2,3,4\}} \mathrm{llik}_n(\hat\gamma, d \mid q) \tag{3}$$

where $\hat\gamma$ is obtained using Equation 2 for each $d$.

3. Finally, the optimal threshold variable is selected as the variable that maximizes the log likelihood:

$$\hat q = \arg\max_{q \in Q} \mathrm{llik}_n(\hat\gamma, \hat d, q) \tag{4}$$
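Steps 1 and 2 of the procedure can be sketched as a nested grid search. The code below is an illustration under stated assumptions rather than the authors' implementation: the function names (`lagged_design`, `gaussian_loglik`, `fit_threshold`) are hypothetical, the residual covariance is taken to be common to both regimes as stated in the text, and the number of lags `p` is assumed to be at least as large as the largest delay so that the samples line up. Step 3 would repeat `fit_threshold` for each candidate threshold variable and keep the one with the highest log likelihood.

```python
import numpy as np

def lagged_design(y, p):
    """Return (Y, X): Y holds rows p..T-1 of y, X a constant plus p lags."""
    T = y.shape[0]
    lags = [y[p - j:T - j] for j in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lags)
    return y[p:], X

def gaussian_loglik(Y, X):
    """Concentrated Gaussian log likelihood of a multivariate regression."""
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    n, k = Y.shape
    sigma = resid.T @ resid / n               # ML estimate of the covariance
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * n * (k * np.log(2 * np.pi) + logdet + k)

def fit_threshold(y, q, p=4, delays=(1, 2, 3, 4), trim=0.15):
    """Grid search over delay d and threshold gamma; returns (llik, gamma, d)."""
    Y, X = lagged_design(y, p)
    T = y.shape[0]
    best = (-np.inf, None, None)
    for d in delays:
        q_lag = q[p - d:T - d]                # q_{t-d}, aligned with rows of Y
        lo, hi = np.quantile(q_lag, [trim, 1 - trim])
        grid = np.unique(q_lag[(q_lag >= lo) & (q_lag <= hi)])
        for gamma in grid:
            high = (q_lag > gamma).astype(float)[:, None]
            # Interacting every regressor with the indicator is equivalent
            # to fitting separate coefficients in each regime.
            llik = gaussian_loglik(Y, np.hstack([X, X * high]))
            if llik > best[0]:
                best = (llik, gamma, d)
    return best
```

Because the candidate thresholds are the observed values of the threshold variable, the search is exact on the sample: any threshold between two adjacent observations produces the same split.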

In Equation 4, $\hat\gamma$ is obtained using Equation 2 for each $q$ and $d$, $\hat d$ is obtained using Equation 3, and $Q$ is the set that contains all the possible variables listed above. We considered up to 4 lags for the differences and the long differences.

To test for the presence of nonlinear effects, we extend the procedure introduced by Hansen (1996, 1997) to a multivariate setting. In particular, we want to test the null hypothesis $H_0: \Phi_0^2 = \Phi_1^2(L) = 0$ that the coefficients are equal across subsamples against the alternative that at least one of the elements of the matrices $\Phi_0^2$, $\Phi_1^2(L)$ is not 0. This testing problem is complicated by the fact that the threshold is not identified under the null. However, if the errors are iid, a test with near-optimal power against alternatives distant from the null hypothesis is the sup LR test:

$$LR = \sup_{\gamma \in \Gamma} \{ LR_n(\gamma) \} \tag{5}$$

where

$$LR_n(\gamma) = 2 \{ \ln \hat L_1(\gamma) - \ln \hat L_0 \} \tag{6}$$

is the likelihood ratio statistic against $H_1$ when $\gamma$ is known. $\ln \hat L_0$ and $\ln \hat L_1(\gamma)$ are the estimated values of the log likelihood function under the null and under the alternative hypothesis for each $\gamma$. Since $\gamma$ is not identified under the null, the asymptotic distribution of the test statistic is not chi-square. Hansen (1996, 1997) shows that the asymptotic distribution can be approximated by a bootstrap procedure. Following Hansen's procedure, we test the null allowing for heteroscedasticity in the error term³. The test for linearity was constructed as follows:

1. Estimate the model under the null hypothesis of linearity and obtain the log likelihood $\ln \hat L_0$.
2. Estimate the model under the alternative for each possible threshold value and obtain the log likelihood $\ln \hat L_1(\gamma)$.
3. Form the LR statistic $LR = \sup_{\gamma \in \Gamma} \{ 2 (\ln \hat L_1(\gamma) - \ln \hat L_0) \}$.
4. Obtain the bootstrap distribution:
(a) Generate a new vector $Y_t^*$ under the null hypothesis.
(b) Estimate the model under the null using the generated data, and obtain the log likelihood $\ln L_0^*$.
(c) Estimate the model under the alternative and obtain the log likelihood $\ln L_1^*(\gamma)$.
(d) Form the LR statistic $LR^* = \sup_{\gamma \in \Gamma} \{ 2 (\ln L_1^*(\gamma) - \ln L_0^*) \}$.
(e) Obtain the bootstrapped p-value as the percentage of bootstrap samples for which $LR^* > LR$.

³ Monte Carlo simulations (not included) show that Hansen's test with $\alpha = 0.05$ is true to size in finite samples, but the power is low. Following Hansen's suggestions, we also used a bootstrap LM test (constructed following the same procedure as the LR test, except we only simulate under the null). Hansen shows that the LM test has better power in finite samples, without compromising the size of the test. The results from the LM and the LR test were almost identical for all models.

When testing the trivariate model for nonlinearity, treating the switching variable as exogenous, the null is rejected at the 5% level for all of the switching variables that we consider, except for the interest rate, the deficit, and the debt-to-GDP ratio. Furthermore, it is very likely that the switching variables that are not included in the TVAR are not exogenous, so it is not completely clear whether the null is rejected because there is strong evidence of non-linearity or because of omitted variable bias. To remedy this problem, we extend the model to include a measure of slack as a fourth variable. Section 3 summarizes the results we obtained for each measure of slack.

To estimate the effects of shocks to spending and taxes, we used the estimated linear and non-linear models to construct impulse response functions for output and output components. For the non-linear model, we constructed two sets of impulse responses: impulse responses when the economy is assumed to remain in one state forever, and impulse responses when the economy is allowed to evolve because the switching variable is allowed to respond to shocks in government spending and taxes. When the economy is assumed to remain in one state forever (or for a given time horizon), the system is linear within a state, so the impulse response functions can be obtained by using the estimated VAR coefficients for the given regime. Since the system is linear within regimes, there are no sign or size asymmetries, and the IRF does not depend on the initial state. If we allow the system to evolve and move between regimes, the impulse response function depends on the initial state, and possibly on the size and the sign of the shock. Following Koop, Pesaran and Potter (1996), to obtain the responses when the system evolves, we use generalized impulse response functions.
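For the fixed-regime case, where the system is linear within a state, the responses follow directly from the regime's lag matrices. The sketch below is one common way to compute them, assuming a recursive (Cholesky) identification that follows the variable ordering and a shock of one standard deviation; the paper instead scales shocks to 1% of GDP, so the numbers would need rescaling. The function name `regime_irf` and its arguments are illustrative, not taken from the original.

```python
import numpy as np

def regime_irf(lag_matrices, sigma, shock_index, horizon):
    """Responses of a linear VAR(p) to a one-s.d. orthogonalized shock.

    lag_matrices : list of p (k x k) coefficient matrices for one regime
    sigma        : (k x k) residual covariance; the Cholesky factor imposes
                   the recursive ordering of the variables (assumption)
    """
    p = len(lag_matrices)
    k = sigma.shape[0]
    impact = np.linalg.cholesky(sigma)[:, shock_index]
    # Companion form stacks the p lags so one matrix product advances a period.
    companion = np.zeros((k * p, k * p))
    companion[:k, :] = np.hstack(lag_matrices)
    companion[k:, :-k] = np.eye(k * (p - 1))
    state = np.zeros(k * p)
    state[:k] = impact
    responses = np.empty((horizon + 1, k))
    for h in range(horizon + 1):
        responses[h] = state[:k]
        state = companion @ state
    return responses
```

Because the map from shock to response is linear here, doubling the shock doubles every response and flipping its sign flips every response, which is why no size or sign asymmetry can appear within a regime.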
We define the IRF to be the change in the conditional expectation of $Y_{t+k}$ as a result of an exogenous shock $\varepsilon_t$:

$$E[Y_{t+k} \mid \Omega_{t-1}, \varepsilon_t] - E[Y_{t+k} \mid \Omega_{t-1}] \tag{7}$$

where $\Omega_{t-1}$ is the information set at time $t-1$. Calculating the GIRF requires specifying the nature of the shock $\varepsilon_t$ and the initial conditions $\Omega_{t-1}$. The conditional expectations $E[Y_{t+k} \mid \Omega_{t-1}, \varepsilon_t]$ and $E[Y_{t+k} \mid \Omega_{t-1}]$ are computed by simulating the model. The impulse response functions are computed using the following algorithm (a detailed description is provided in the Appendix). First, shocks for periods 0 to q are drawn from the residuals of the estimated TVAR model and, for given initial values of the variables, fed through the estimated model to produce a simulated data series. The result is a forecast of the variables conditional on initial values and a particular sequence of shocks. Next, the same procedure is repeated with the same initial values and residuals, except that the shock to government spending or taxes in period 0 is fixed at 1%, -1% or 2% of GDP (for that particular starting value of GDP). The shocks are fed through the model and a forecast is produced just as above. The difference between this forecast and the baseline forecast is the impulse response function for a particular sequence of shocks and initial values. Impulse response functions are computed for five hundred draws from the residuals and averaged to produce impulse response functions conditional only on a particular history. These impulse response functions are then averaged over a particular subset of initial values (in the rest of the paper, "recessions" denote the periods when the switching variable was below the threshold, and "expansions" denote the periods when the switching variable was above the threshold).

A potential problem with this approach is that taking random draws from the estimated residuals does not take into account heteroscedasticity, which may affect the impulse response function. To check the robustness of the results, we adopted the approach used by Weise (1999) and used two permutations of the simulation model. In the first permutation, we set all shocks over the forecasting horizon to be equal to zero. In the second, the residuals were regressed on a fourth-order polynomial of the threshold variable, draws were made from the residuals from this regression, and the "true" residuals were reconstructed before being fed through the estimated model.
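The simulation algorithm described above can be sketched for a single history as follows. This is a simplified illustration, not the procedure in the paper's appendix: it assumes one lag, takes the threshold variable to be an element of $Y$, and fixes the period-0 shock on the reduced-form residual without any orthogonalization. The names `tvar_step`, `girf`, `shock_index` and `shock_size` are hypothetical.

```python
import numpy as np

def tvar_step(y_prev, q_lagged, params, eps):
    """One period of a one-lag TVAR (Equation 1 with a single lag)."""
    phi0, phi1, phi0_shift, phi1_shift, gamma = params
    y = phi0 + phi1 @ y_prev + eps
    if q_lagged > gamma:                       # high regime adds the shift terms
        y = y + phi0_shift + phi1_shift @ y_prev
    return y

def girf(history, params, residuals, shock_index, shock_size, q_index, d,
         horizon, n_draws=500, seed=0):
    """Average shocked-minus-baseline path, conditional on one history.

    history   : array with at least d rows of initial values, latest row last
    residuals : (n x k) array of estimated residuals to resample from
    """
    rng = np.random.default_rng(seed)
    history = np.asarray(history, dtype=float)
    total = np.zeros((horizon + 1, history.shape[1]))
    for _ in range(n_draws):
        # The same resampled residuals feed both paths; only the chosen
        # element of the period-0 shock is overwritten in the shocked path.
        draws = residuals[rng.integers(len(residuals), size=horizon + 1)]
        shocked = draws.copy()
        shocked[0, shock_index] = shock_size
        paths = []
        for eps_path in (draws, shocked):
            path = list(history)
            for h in range(horizon + 1):
                q_lagged = path[-d][q_index]   # regime can change along the path
                path.append(tvar_step(path[-1], q_lagged, params, eps_path[h]))
            paths.append(np.array(path[len(history):]))
        total += paths[1] - paths[0]
    return total / n_draws
```

Averaging the output of `girf` over all histories that fall in one regime would give the regime-specific responses described in the text, and calling it with `shock_size` set to 1%, -1% or 2% of GDP (converted to the units of the model) would expose any size or sign asymmetry.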

3 Estimation Results

The baseline model, described in Section 2, included the differences of log government spending, log net taxes, log output, and a measure of slack. We initially considered six possible measures of slack: the gap between potential output and real output (as measured by the CBO), demeaned capacity utilization (allowing for structural breaks in the mean), unemployment, differences in unemployment, demeaned unemployment (allowing for structural breaks in the mean), and differences in demeaned unemployment. Figures 1 and 2 in Appendix II show the demeaned series versus the raw series. The tests for breaks in mean and the demeaning procedure are described in Appendix III. Tables 1 and 2 in Appendix III show the results of the tests for structural break, and give the summary statistics for the raw series and for the demeaned series. The estimated VAR when using unemployment in levels was not stable, and standard unit root and stationarity tests could not reject the unit root null, even when using the demeaned unemployment series, so we only used the first difference in unemployment as a possible measure of slack.

There was evidence of non-linearity when using all measures of slack we considered. Tables 1-7 in Appendix III show the results of the linearity tests for each model we considered. When using output gap, the best switching variable was $gap_{t-2}$, and the estimated threshold was 5.52. The 95% CI obtained by inverting the LR was [3.12, 5.72]. The periods that were included in the estimated low regime included NBER recession dates, and the dates that Davig and Leeper (2009) identified as periods of active fiscal and active monetary policy. When using demeaned capacity utilization as a measure of slack, there was strong evidence of non-linearity when using lagged output growth as a switching variable. There was no evidence of non-linearity at the 5% level when using lagged capacity utilization, but the p-value was still low: depending on the lag, it was between 0.07 and 0.11. The best switching variable in that case was $\widehat{cap}_{t-1}$, and the estimated threshold was 1.79. In this model, the high regime coincided with periods of very rapid expansions. All other periods were estimated to be in the low regime. When using differences in unemployment as a measure of slack, the best switching variable was $\Delta un_{t-1}$, and the estimated threshold was 0.266. The high regime dates coincided with the dates selected when using output gap as a measure of slack. We estimated the models and obtained impulse responses for all three models: the model using capacity utilization as the switching variable, the model using output gap as the switching variable, and the model using differences in unemployment as the switching variable.
To account for the possibility that unanticipated shocks may have different effects from anticipated shocks, we also re-estimated all models by including the Ramey exogenous military spending variable in the TVAR. As in Ramey (2009), we ordered the military spending variable first. Adding the Ramey spending variable did not affect the results of the linearity tests or the IRFs significantly.

To estimate the effects of government spending and tax cuts on output components and on interest rates, we substituted output with the variable of interest in the baseline VAR, and we also estimated the effects of spending and tax cuts on output components by adding a fifth variable to the baseline model. The point estimates for the threshold and the median impulse responses were very similar in both specifications, but the 95% confidence bands were very wide in the specification with five variables because there were too few observations per regime to estimate a TVAR with five variables. The results given below were obtained using the four-variable model. There were two issues that we needed to consider when estimating the models that included output components and interest rates.

1. Selection of the switching variable: even though there was strong evidence in favor of non-linearity in the baseline model, we wanted to explicitly test whether there is nonlinearity in the other models. We estimated each model individually, searching over a grid of all delay parameters, lags, and thresholds, and then we compared the results to the results obtained in the baseline model. Appendix III presents the results of the linearity tests for all models. There was evidence of non-linearity in all the models that included output components, and the periods belonging to the low and the high regime were almost the same as those we obtained from the baseline model. In the models that included consumption and output gap as a measure of slack, there was no evidence of nonlinearity at the 5% level, but the p-value for the linearity test when using output gap as the switching variable was between 0.07 and 0.09, and the estimated threshold was very close to the estimated threshold from the baseline model (5.49 for the models including consumption vs 5.62 for the baseline model). There was strong evidence in favor of non-linearity when using differences in unemployment and lagged capacity utilization, and both the LR and the LM tests are true to size in finite samples but have low power, so the failure to reject the null is most likely due to the low finite-sample power of the tests. We could reject the null for all other models that included output components, and the estimated switching variables and thresholds were very close to the estimates from the baseline model. There was evidence of non-linearity in the models that included real interest rates constructed using inflation, but the non-linearity only affected the behavior of G and T. The response of interest rates constructed using inflation was not significantly different from 0 in the linear model. Even when considering horizons that were much longer than 20 quarters (up to 100 quarters), the response of interest rates was not significantly different from 0. There was evidence of non-linearity in the models that included interest rates constructed using expected inflation and differenced unemployment, but the likelihood function was bimodal, with one peak at 0.266 and one peak around 0.433, and the 95% confidence interval for the estimated threshold was very wide. Even though we could reject the null of linearity, the 95% and the 68% bands for real interest rates constructed using expected inflation included 0 at all horizons in both regimes.

2. Selection of the threshold: The estimated threshold values were very similar in all models. In the models that included output components, the estimated thresholds were very close to the estimated thresholds from the baseline model. Furthermore, the 95% CI for the estimated thresholds always included the estimated thresholds from the baseline model.

3.1 Linear Model

In all models, government spending was ordered first. This imposes a timing restriction: output and output components can respond to G within a quarter, but government spending does not respond to Y and output components within the same quarter. Changing the order of government spending and taxes did not affect the results significantly. When constructing the impulse response functions to government spending, we assumed that government spending increases by 1% of GDP, and when constructing the impulse response functions to tax shocks, we assumed that taxes increase by 1% of GDP. Government spending and taxes are allowed to respond to their own shocks. Since the model is linear, the response to a tax cut equal to 1% of GDP is exactly equal in magnitude, and opposite in sign, to the response to a tax increase equal to 1% of GDP.

Responses to Government Spending

The response of government spending to government spending does not depend on the measure of slack used. Spending increases by 1% of GDP on impact, the response peaks at 1.2% after 6 quarters, and then it dies out. The cumulative response peaks at 1.8% of GDP after 9 quarters, and it stays flat after that. Taxes increase in response to an increase in G, but the response dies out rather quickly and is not significantly different from zero, even when using asymptotic confidence intervals. Even at much longer horizons (up to 100 quarters), taxes do not increase significantly in response to G.

GDP responds positively to an increase in government spending, but the magnitude and the persistence of the response depend on the measure of slack used. When using demeaned capacity utilization or differences in unemployment, the IRF of GDP net of government spending to an increase in G equal to 1% of GDP peaks at 0.3% of GDP after 6 quarters. After that the IRF stays flat, but the effects of G do not die out. The cumulative response increases, and it peaks at 2% after 20 quarters. The response of total output peaks at 3.8% after 20 quarters. When using the output gap as a measure of slack, the IRF of private GDP peaks at 0.1% after 8 quarters, and it dies out after 10 quarters. The cumulative IRF peaks at 0.3%, and it stays flat afterwards. The cumulative response of total GDP peaks at 2.1% after 10 quarters, and it stays flat at roughly 2% thereafter. Regardless of the measure of slack used, the response of private GDP is always non-negative, and the cumulative response of total GDP is greater than 1%.

Private consumption increases in response to a spending shock, but the magnitude and the persistence of the increase depend on the measure of slack. When using capacity utilization or differences in unemployment, the response of total consumption peaks at 0.5% of GDP after 8 quarters, and then it slowly dies out. The cumulative response of consumption increases throughout the 20 quarters, but the increase becomes small after 12 quarters. The maximum cumulative response of consumption is equal to 0.75% of GDP, and is attained after 20 quarters (the cumulative response after 12 quarters is 0.70% of GDP). Following Ramey (2010), we also estimated the response of consumption of non-durables and services. The response of consumption of non-durables and services is positive, but it depends on the measure of slack used. When using capacity utilization or differences in unemployment, the response of consumption of non-durables and services peaks at 0.4% of GDP after 8 quarters, and then it slowly dies out. The cumulative response increases throughout the 20 quarters, but it becomes relatively flat after 12 quarters. After 3 years, the cumulative increase in consumption of non-durables and services is equal to 0.6% of GDP.
When using the output gap as a measure of slack, the response of consumption of non-durables and services peaks at 0.3% after 8 quarters, it becomes 0 after 15 quarters, and then it turns weakly negative. The cumulative response after 20 quarters is 0.4%. The responses we obtained show that most of the increase in gross consumption comes from the increase in consumption of non-durables and services.

Even when controlling for slack, we still obtained the "investment paradox" described by Blanchard and Perotti (2002) and by Perotti (2008). In our model, gross consumption increases in response to government spending, but gross investment decreases. The magnitude of the decrease depends on the measure of slack used. When using capacity utilization, the cumulative response of investment is -0.07% of GDP after 4 quarters, and then the response dies out over the next 2 years. When using the output gap as a measure of slack, the cumulative decrease is larger and more persistent. Investment decreases by 0.1% of GDP by the 8th quarter, and the decrease is permanent. To be consistent with Ramey (2010) and with Romer and Romer (2009), we also considered consumption of durables plus fixed investment as a measure of investment. When using this alternative measure of investment, we do not get the "investment paradox". The cumulative response of investment peaks at 0.13% of GDP when using differences in unemployment or capacity utilization as a measure of slack, and at 0.03% when using the output gap as a measure of slack. Regardless of the measure of slack, the cumulative response is positive. The differences between the responses of gross investment and of the investment components indicate that the investment paradox may occur because of the response of inventories and residential investment. Furthermore, the responses of consumption and investment indicate that most of the increase in private output comes from consumption.

Interest rates do not respond to increases in government spending. Even after 100 quarters, the response of real interest rates is not significantly different from 0.

Responses to Tax Shocks

Government spending increases in response to an increase in taxes equal to 1% of GDP. The response peaks at 0.3% of GDP after 3 years, but both the asymptotic and the simulated confidence bands include 0. Taxes respond to their own shock: the impact response in period 0 is equal to 1% of GDP and after 1 period it is equal to 0.5% of GDP, but then the response decreases rapidly. After 12 quarters, the cumulative response turns weakly negative. The response of output is weakly positive at first, but then it turns negative after 3 quarters. The response of private GDP is equal to 0.8% of GDP after 20 quarters, and it is very persistent.
The cumulative response of private GDP is -2.3% after 20 quarters. Since the response of G was not significantly different from 0, the cumulative response of total output was roughly equal to -2.3% after 20 quarters. Consumption and investment both decrease in response to a tax increase. The cumulative response of gross consumption is -0.75% of GDP after 20 quarters, and the cumulative response of consumption of services and non-durables is -0.57% of GDP. The cumulative response of gross investment is -2.4% after 20 quarters, and the cumulative response of fixed investment plus consumption of non-durables is equal to -1.3% of GDP. When controlling for slack, the responses of output and output components to taxes were almost identical to the responses obtained by Romer and Romer (2009). The responses we obtained in the linear model were larger than those obtained by Blanchard and Perotti (2002). When controlling for slack, the differences between the responses of consumption and gross output to a tax cut and to an increase in government spending are very small in the linear model. Tax cuts increase fixed investment and consumption of non-durables more than increases in spending do.

Responses to Output Shocks

Government spending does not increase significantly in response to a positive output shock. Net taxes increase in response to an output shock, but the response dies out after 3 years.
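The recursive ordering used in the linear model can be illustrated with a short sketch. The Python below is not the estimation code behind the results above: it assumes a VAR(1) with made-up coefficient and covariance matrices (`A`, `Sigma`) for the ordering [G, T, Y], and every function name is hypothetical. It orthogonalizes the shocks with a Cholesky factor, rescales the impact vector so that the shocked variable moves by exactly 1 (read as 1% of GDP), and accumulates the responses.

```python
import math

def cholesky(S):
    """Lower-triangular L with L L' = S (S symmetric positive definite)."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L

def matvec(A, x):
    """Matrix-vector product for plain nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def recursive_irf(A, Sigma, shock, horizon):
    """Responses of the VAR(1) y_t = A y_{t-1} + u_t to one orthogonalized shock.

    The impact vector is column `shock` of the Cholesky factor, rescaled so
    that the shocked variable moves by exactly 1 on impact.
    """
    L = cholesky(Sigma)
    impact = [L[i][shock] / L[shock][shock] for i in range(len(A))]
    path = [impact]
    for _ in range(horizon):
        path.append(matvec(A, path[-1]))
    return path

def cumulate(path):
    """Running sum of the responses, horizon by horizon."""
    total = [0.0] * len(path[0])
    out = []
    for step in path:
        total = [t + s for t, s in zip(total, step)]
        out.append(total)
    return out

# Hypothetical numbers (NOT estimates from the paper); ordering is [G, T, Y].
A = [[0.8, 0.0, 0.0],
     [0.1, 0.5, 0.2],
     [0.2, -0.1, 0.6]]
Sigma = [[1.0, 0.3, 0.4],
         [0.3, 1.0, 0.2],
         [0.4, 0.2, 1.0]]

irf_g = recursive_irf(A, Sigma, shock=0, horizon=20)  # shock to G
irf_y = recursive_irf(A, Sigma, shock=2, horizon=20)  # shock to Y
cum_g = cumulate(irf_g)
```

Because G is ordered first, `irf_y[0][0]` is zero (spending does not react to an output shock within the quarter), while `irf_g[0]` can have non-zero entries for T and Y, which is the timing restriction described above.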

3.2 Responses by State

3.2.1 Low State

Responses to Government Spending Shocks

The responses of government spending and taxes were similar to the responses of spending and taxes in the linear model. When the economy was in the low state (i.e. when the output gap or unemployment differences were above the threshold, or when capacity utilization was below the estimated threshold), output increased in response to a spending shock. The cumulative response of GDP was bimodal. The first peak was at 1.6% after 4 quarters. After 4 quarters, the IRF becomes weakly negative, but the cumulative effects are still positive. The IRF of output starts increasing again after 3 years, and the cumulative response of output is 1.9% after 16 quarters. The cumulative IRF stays relatively flat after 4 years. Most of the short-run increase in output comes from an increase in consumption. The consumption of non-durables and services increases by 1.7% of GDP during the first 6 quarters, then the effects die out. The response of gross investment and of consumption of durables plus fixed investment is also bimodal. Investment rises during the first year, then it decreases rapidly during the second year. After the third year, investment rises again. The cumulative response of gross investment is equal to 0.2% of GDP after 20 quarters. Figures 11-16 in Appendix 3 show the responses of output components to G. Interest rates do not rise in response to government spending shocks, even at horizons as long as 100 quarters.

Responses to Taxes

Government spending increases during the first 4 quarters, and then it decreases during the next four years. The response of taxes is similar to the response of taxes in the linear model.

The maximal cumulative response of output is 2.3% after 3 years, and it then declines to 1.6% after 5 years. Consumption increases in response to a tax cut. Gross consumption increases by 2.6% of GDP after 3 years, and consumption of non-durables and services increases by 2.4% of GDP after 3 years. The cumulative response of consumption after 5 years is 1.8%. Gross investment increases by 3% during the first 3 years, then the response dies out to 1% after 5 years. The right-hand side of figures 11-16 in Appendix 3 shows the responses of output components to G.

3.2.2 High State

Responses to Government Spending Shocks

In the high state, the response of G to G is slightly less persistent than the response of G to G in the low state, but the shape is very similar. Taxes increase weakly, but the response of taxes is not significantly different from 0 at any horizon, even at horizons as long as 100 quarters. The responses of output are much lower in the high state. The cumulative response of private GDP peaks at 0.58% after 4 quarters, and then it dies out over the next four years. The cumulative response of GDP peaks at 1.2% after 4 quarters, and then it also dies out. These results are consistent with the results obtained by Uhlig (2005), by Ramey (2009), and with the results obtained using the neoclassical model. Government spending has positive effects in the short run, and very small effects in the long run. Both measures of consumption we considered also increase in the short run. Gross consumption increases by 0.9% of GDP after 2 quarters, and then the effects dwindle very quickly. Most of the increase comes from the increase in the consumption of non-durables and services (0.8% of GDP after 2 quarters). The cumulative response of investment is slightly smaller than in the low state (0.3% of GDP), but there is no evidence of crowding out, even at horizons significantly longer than 20 quarters. Interest rates rise slightly in response to G, but the effects were never significantly different from 0.

Responses to Taxes

Government spending increases in response to a tax increase, but the response is not significantly different from 0. Taxes increase by 1% during the initial period, then the response dies out quickly. A decrease in taxes increases output by 1.35% after 20 quarters. The effects die out very slowly. This estimate is very close to the estimates by Romer and Romer, who use narrative tax data to identify tax shocks.

The response of gross investment is not significantly different from 0. Consumption of durables plus fixed investment increases by 1% of GDP after 20 quarters. The increase is gradual and the maximum cumulative effect is attained after 5 years. Gross consumption increases by 1.6% of GDP after 5 quarters, but the effects die out after 2 years. Consumption of non-durables and services increases by 1.3% after 4 quarters, but the effects die out after 2 years. The response of interest rates is never significantly different from 0.
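The state-dependent responses above rest on sorting each quarter into the low or the high state by comparing a switching variable with the estimated threshold. The sketch below shows only that classification step. It is an illustration under stated assumptions: the function name, the one-quarter delay, the threshold value, and the data are all hypothetical, and the direction flag reflects the convention in the text that the low state corresponds to capacity utilization below the threshold but to unemployment differences above it.

```python
def classify_states(switch, threshold, low_when_above, delay=1):
    """Label each period 'low' or 'high' from a lagged switching variable.

    switch         -- list of switching-variable observations
    threshold      -- threshold value (hypothetical here, estimated in practice)
    low_when_above -- True if values above the threshold mean the low state
    delay          -- assumed lag of the switching variable (in quarters)
    """
    states = []
    for t in range(delay, len(switch)):
        above = switch[t - delay] > threshold
        states.append("low" if above == low_when_above else "high")
    return states

# Made-up demeaned capacity utilization: low state when BELOW the threshold.
capacity = [-1.2, 0.4, 0.9, -0.3, -2.0]
states = classify_states(capacity, threshold=-0.5, low_when_above=False)
```

With these made-up numbers only the first classified quarter falls in the low state, because only the first lagged observation lies below the threshold. The regime-specific VAR coefficients would then be estimated separately on the quarters carrying each label.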

3.3 Responses When the System is Allowed to Switch Between Regimes4

The impulse response functions when the system was allowed to switch between regimes were constructed using the method described in Appendix I. When the system is allowed to switch between regimes, the responses depend on the history, the size, and the sign of the shock. The responses of output were slightly lower in the low regime and slightly higher in the high regime than those obtained when the system stays in one state, and there was evidence in favor of sign and size asymmetry, but the bootstrap confidence intervals were very wide and we were not able to draw inference. We are currently working on a Bayesian model in order to get more precise estimates for the VAR coefficients and for the IRFs.

4 Conclusion

The empirical model presents strong evidence in favor of nonlinearity and suggests that the timing of policy shocks matters. The effects of government spending shocks are large and persistent during recessions, but the multiplier is very close to one. Consumption is crowded in by government spending both in recessions and in expansions. The increase in consumption is difficult to reconcile with standard neoclassical and RBC models. Most of the increase in private GDP comes from the increase in consumption. The results we obtained are consistent with the results obtained by Blanchard and Perotti (2002), Perotti (2008) and Pappa (2005), but are at odds with the results obtained using simulated DSGE models and with the results obtained using only Ramey's spending variable. When controlling for slack, there is no evidence that government spending crowds out investment. Investment increases in both regimes, but the increase is small and very temporary in the high regime. The effects of tax shocks were very similar to those obtained by Romer and Romer (2009) using the narrative approach. Our estimates are larger than the estimates typically obtained in the VAR literature. In the low regime, the effects of tax shocks were larger during the first 2 years, but slightly smaller after 5 years. In the high regime, the effects of tax cuts were always larger than the effects of spending shocks.

The analysis with the TVAR approach used here is still at an early stage. In this version of the paper, we do not control for the response of the interest rates directly. A Bayesian extension of the model that allows us to obtain more precise estimates for the IRFs when the system evolves is in progress. The preliminary results for output are similar in shape and magnitude to those we obtained when the system does not evolve, but there was evidence in favor of sign and size nonlinearity. The responses presented here indicate that the timing and the size of policy matter, and that the TVAR model may be a useful tool for capturing those asymmetries.

4 This is where I will add a whole section. We are currently working on the Bayesian model that will let us get more precise estimates for the generalized IRF.

References

[1] Andrews, D. and W. Ploberger (1994): "Optimal Tests when a Nuisance Parameter is Present Only Under the Alternative." Econometrica, Vol. 62, No. 6, pp. 1383-1414.
[2] Atanasova, C. (2004): "Credit Market Imperfections and Business Cycle Dynamics: A Nonlinear Approach." Studies in Nonlinear Dynamics and Econometrics, Vol. 7, Issue 4, Article 5.
[3] Auerbach, A. J. (2009): "Implementing the New Fiscal Policy Activism." American Economic Review: Papers and Proceedings, 99-2, pp. 543-549.
[4] Aiyagari, R., L. Christiano, and M. Eichenbaum (1992): "The Output, Employment and Interest Rate Effects of Government Consumption." Journal of Monetary Economics 30, pp. 73-86.
[5] Blanchard, O. and R. Perotti (2002): "An Empirical Characterization of the Dynamic Effects of Changes in Government Spending and Taxes on Output." The Quarterly Journal of Economics, Vol. 117, No. 4, pp. 1329-1368.
[6] Calvo, G. A. (1983): "Staggered Contracts in a Utility Maximizing Framework." Journal of Monetary Economics 12, pp. 383-398.
[7] Christiano, L. M., M. Eichenbaum, and S. Rebelo (2009): "When is the Government Spending Multiplier Large?" Northwestern University Working Paper.
[8] Cogan, J. F., T. Cwik, J. B. Taylor, and V. Wieland (2009): "New Keynesian Versus Old Keynesian Government Spending Multipliers." Stanford University Working Paper.
[9] Davig, T. and E. M. Leeper (2009): "Expectations and Fiscal Stimulus." CAERP Working Paper 006-2009.
[10] Donayre, L., Y. Eo, and J. Morley (2009): "Fiducial Distribution Confidence Intervals for TAR Processes with an Application to Credit-Market Crises." Washington University in St. Louis Working Paper.
[11] Edelberg, W., M. Eichenbaum, and J. Fisher (1999): "Understanding the Effects of a Shock to Government Purchases." Review of Economic Dynamics 2, pp. 166-206.
[12] Eo, Y. and J. Morley (2009): "Likelihood-Based Confidence Sets for the Timing of Structural Breaks." Washington University in St. Louis Working Paper.
[13] Fatas, A. and I. Mihov (2001): "The Effects of Fiscal Policy on Consumption and Employment: Theory and Evidence." CEPR Discussion Paper No. 2760.
[14] Feldstein, M. (2009): "Rethinking the Role of Fiscal Policy." American Economic Review: Papers and Proceedings, 99-2, pp. 556-559.
[15] Hamilton, J. D. (1999): Time Series Analysis. Princeton, New Jersey: Princeton University Press.
[16] Hansen, B. E. (1992): "The Likelihood Ratio Test Under Non-Standard Conditions: Testing the Markov-Switching Model of GNP." Journal of Applied Econometrics 7, pp. S61-S82.
[17] Hansen, B. E. (1996): "Inference when a Nuisance Parameter is not Identified under the Null Hypothesis." Econometrica 64, pp. 413-430.
[18] Hansen, B. E. (1997): "Inference in TAR Models." Studies in Nonlinear Dynamics and Econometrics 2(1), pp. 1-14.
[19] Hooker, M. A. (1996): "How Do Changes in Military Spending Affect the Economy? Evidence from State-level Data." New England Economic Review, pp. 3-15.
[20] Hooker, M. A. and M. M. Knetter (1997): "The Effects of Military Spending on Economic Activity: Evidence from State Procurement Spending." Journal of Money, Credit, and Banking, 29(3), pp. 400-421.
[21] Hoppner, F. and K. Wesche (2000): "Non-Linear Effects of Fiscal Policy in Germany: A Markov-Switching Approach." Bonn Econ Discussion Papers 9/2000.
[22] Kim, C.-J. and C. R. Nelson (1999): State Space Models with Regime-Switching: Classical and Gibbs-Sampling Approaches with Applications. Cambridge, MA: The MIT Press.
[23] Koop, G. and S. Potter (2001): "Are Apparent Findings of Nonlinearity due to Structural Instability of Economic Time Series?" Journal of Money, Credit, and Banking 31, pp. 317-334.
[24] Koop, G., M. H. Pesaran, and S. M. Potter (1996): "Impulse Response Analysis in Nonlinear Multivariate Models." Journal of Econometrics 74, pp. 119-147.
[25] Mittnik, S. and W. Semmler (2009): "Regime Dependence of Aggregate Demand Effects." Working Paper, Preliminary Draft.
[26] Mountford, A. and H. Uhlig (2009): "What are the Effects of Fiscal Policy Shocks?" Journal of Applied Econometrics 26, pp. 960-992.
[27] Mulligan, C. B. (2010): "Simple Analytics and Empirics of the Government Spending Multiplier and other Keynesian Paradoxes." University of Chicago Working Paper.
[28] Owyang, M. and S. Zubairy (2009): "The Regional Variation in the Response to Government Spending Shocks." Federal Reserve Bank of St. Louis Working Paper 2009-006A.
[29] Pappa, E. (2005): "New Keynesian or RBC Transmission? The Effects of Fiscal Shocks in Labour Markets." CEPR Discussion Paper Series No. 5313.
[30] Perotti, R. (2008): "In Search of the Transmission Mechanism of Fiscal Policy." NBER Macroeconomics Annual 2007, ed. by D. Acemoglu, K. Rogoff, and M. Woodford, pp. 169-226. University of Chicago Press, Chicago, Illinois.
[31] Ramey, V. A. (2010): "Identifying Government Spending Shocks: It's All in the Timing." Quarterly Journal of Economics (forthcoming).
[32] Ramey, V. A. and M. Shapiro (1998): "Costly Capital Reallocation and the Effects of Government Spending." Carnegie Rochester Conference on Public Policy.
[33] Romer, C. and J. Bernstein (2009): "The Job Impact of the American Recovery and Reinvestment Plan." Executive Office of The President of the United States, Council of Economic Advisers Report, January 8, 2009.
[34] Romer, C. and D. Romer (2009): "The Macroeconomic Effects of Tax Changes: Estimates Based on a New Measure of Fiscal Shocks." American Economic Review, forthcoming.
[35] Taylor, J. B. (2009): "The Lack of Empirical Rationale for a Revival of Discretionary Fiscal Policy." American Economic Review: Papers and Proceedings, 99-2, pp. 550-555.
[36] Teräsvirta, T. (1994): "Specification, Estimation, and Evaluation of Smooth Transition Autoregressive Models." Journal of the American Statistical Association 89, pp. 208-218.
[37] Tong, H. (1978): "On a Threshold Model." Pattern Recognition and Signal Processing, ed. by C. H. Chen. Amsterdam: Kluwer.
[38] Tong, H. (1983): "Threshold Models in Non-linear Time Series Analysis." New York: Springer-Verlag.
[39] Tsay, R. S. (1998): "Testing and Modeling Multivariate Threshold Models." Journal of the American Statistical Association 93, pp. 1188-1202.
[40] Van Brusselen, P. (2009): "Fiscal Stabilization Plans and the Outlook for the World Economy." NIME Policy Brief 01-2009.
[41] Weise, C. (1999): "The Asymmetric Effects of Monetary Policy: A Nonlinear Vector Autoregressive Approach." Journal of Money, Credit, and Banking 31, pp. 85-108.
[42] Woodford, M. (2010): "Simple Analytics of the Government Expenditure Multiplier." NBER Working Paper No. 15714.

Appendix I: Generalized Impulse Response Function

The procedure to compute the generalized impulse response functions (GIRFs) follows Koop, Pesaran and Potter (1996). The GIRF is defined as the effect of a one-time shock on the forecast of the variables in the model. The response of a variable following a shock must be compared against a baseline "no shock" scenario:

$$GIRF_y(k, \varepsilon_t, \Omega_{t-1}) = E[Y_{t+k} \mid \varepsilon_t, \Omega_{t-1}] - E[Y_{t+k} \mid \Omega_{t-1}] \qquad \text{(A.1)}$$

where $k$ is the forecasting horizon, $\varepsilon_t$ is the shock, and $\Omega_{t-1}$ are the initial values of the variables in the model. The generalized impulse response function must be computed by simulating the model. The nonlinear model is assumed to be known, i.e. sample variability is ignored. The shock to the $i$-th variable of $Y$ occurs in period 0, and responses are computed for $l$ periods ahead. The shock to government spending is normalized to be equal to 1% (or 2% for large shocks) of GDP (at the time the shock occurs). The GIRF is generated using the following algorithm.

1. Pick a history $\Omega^r_{t-1}$, where $r = 1, 2, \ldots, R$. The history is the actual value of the lagged endogenous variables at a particular date.
2. Pick a sequence of ($m$-dimensional) shocks $\varepsilon_{t+k}$, $k = 0, 1, \ldots, l$. The shocks are drawn with replacement from the estimated residuals of the asymmetric model. The shocks are assumed to be jointly distributed, so if date $t$'s shock is drawn, all $m$ residuals for date $t$ are collected.
3. Using $\Omega^r_{t-1}$ and $\varepsilon_{t+k}$, simulate the evolution of $Y_{t+k}$ over $l+1$ periods. Denote the resulting baseline path $Y_{t+k}(\Omega^r_{t-1}, \varepsilon_{t+k})$, $k = 0, 1, \ldots, l$.
4. Substitute $\varepsilon_{i0}$ for the $i$-th element of the period-0 shock in $\varepsilon_{t+k}$ and simulate the evolution of $Y_{t+k}$ over $l+1$ periods. Denote the resulting path $Y_{t+k}(\varepsilon_{i0}, \Omega^r_{t-1}, \varepsilon_{t+k})$ for $k = 0, 1, \ldots, l$.
5. Repeat steps 2 to 4 $B$ times.
6. Repeat steps 1 to 5 $R$ times and compute the median, the 2.5% quantile, and the 97.5% quantile of the difference between the profile $Y_{t+k}(\varepsilon_{i0}, \Omega^r_{t-1}, \varepsilon_{t+k})$ and the benchmark path $Y_{t+k}(\Omega^r_{t-1}, \varepsilon_{t+k})$.

In this paper (as in Koop et al.) $B$ was set at 100 and $R$ at 500.
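The six steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the paper: it uses a two-variable, two-regime threshold VAR(1) with made-up coefficients, a made-up residual pool, and made-up histories, and every name (`step`, `simulate`, `girf`, the `model` dictionary) is hypothetical. Step 4 is read literally, so the chosen element of the period-0 residual is overwritten by the shock.

```python
import random

def step(y_prev, shock, model):
    """Advance a two-regime threshold VAR(1) by one period."""
    low = y_prev[model["switch_index"]] <= model["threshold"]
    A = model["low"] if low else model["high"]
    return [sum(a * v for a, v in zip(row, y_prev)) + e
            for row, e in zip(A, shock)]

def simulate(history, shocks, model):
    """Path of Y over len(shocks) periods, starting from `history`."""
    path, y = [], list(history)
    for e in shocks:
        y = step(y, e, model)
        path.append(y)
    return path

def quantile(values, q):
    """Simple order-statistic quantile (no interpolation)."""
    ordered = sorted(values)
    return ordered[int(round(q * (len(ordered) - 1)))]

def girf(model, histories, residuals, shock_index, shock_size,
         response_index, horizon, draws, rng):
    """Return (2.5%, median, 97.5%) of the response at each horizon."""
    diffs = [[] for _ in range(horizon + 1)]
    for history in histories:                        # steps 1 and 6
        for _ in range(draws):                       # step 5
            base = [list(rng.choice(residuals))      # step 2: joint draws
                    for _ in range(horizon + 1)]
            shocked = [list(e) for e in base]
            shocked[0][shock_index] = shock_size     # step 4
            y_base = simulate(history, base, model)  # step 3
            y_shock = simulate(history, shocked, model)
            for k in range(horizon + 1):
                diffs[k].append(y_shock[k][response_index]
                                - y_base[k][response_index])
    return [(quantile(d, 0.025), quantile(d, 0.5), quantile(d, 0.975))
            for d in diffs]

# Hypothetical example: variables are [spending, output]; output sets the regime.
model = {
    "switch_index": 1, "threshold": 0.0,
    "low":  [[0.7, 0.0], [0.5, 0.6]],
    "high": [[0.7, 0.0], [0.1, 0.6]],
}
histories = [[0.0, -1.0], [0.0, -0.5], [0.0, 0.5], [0.0, 1.0]]
residuals = [(0.1, -0.2), (-0.1, 0.3), (0.05, 0.1), (-0.05, -0.2)]
bands = girf(model, histories, residuals, shock_index=0, shock_size=1.0,
             response_index=1, horizon=8, draws=100, rng=random.Random(0))
```

Because the regime is re-evaluated every period inside `step`, the simulated paths can cross the threshold, which is what makes the response depend on the history and on the size and sign of the shock. Restricting `histories` to dates from one regime would give responses conditional on starting in that regime.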

Appendix II: Graphs

Switching Variables

Capacity Utilization - Raw Data and Demeaned Series Accounting for Structural Breaks in the Mean, 1967Q1:2009Q4

Figure 1: Unemployment - Raw Data and Demeaned Series Accounting for Structural Breaks in the Mean, 1967Q1:2009Q4

Responses in the Linear Model

Responses to G

Response of T to G - Linear Model (Not converted to % of GDP)

Response of Private GDP to G - Linear Model

Responses of Consumption to G

Responses of Consumption of Non-Durables + Consumption of Services to G - Linear Model

Cumulative Responses of Gross Investment to G - Linear Model

Cumulative Responses of Consumption of Durables + Fixed Investment - Linear Model

Responses to T

Response of G to T - Linear Model (Not converted to % of GDP)

Response of T to T - Linear Model (Not converted to % of GDP)

Response of Private GDP to T

Cumulative Response of Gross Consumption to T

Cumulative Response of Consumption of Non-Durables + Services to T

Cumulative Response of Gross Investment to T

Cumulative Response of Fixed Investment + Consumption of Durables to T - Linear Model

Responses to Y

Responses of G and T to Y - Linear Model

Responses When the System Stays in One State

Responses to G

Responses of G to G (Not converted to % of GDP)

Responses of T to G (Not converted to % of GDP)

Cumulative Responses of Y to G

Cumulative Responses of Gross Consumption to G

Cumulative Responses of Gross Investment to G

Responses to T (Increase in T equal to 1% of GDP)

Responses of G to T (Not converted to % of GDP)

Responses of T to T (Not converted to % of GDP)

Cumulative Responses of Y to T

Cumulative Responses of Gross Consumption to T

Cumulative Responses of C2 to T

Cumulative Responses of I to T

Tables

Testing for Structural Breaks

There is evidence that both capacity utilization and unemployment have structural breaks in the mean, which would make the series non-stationary and therefore unsuitable for use in the TVAR. If there is a structural break in the mean, they cannot be used as switching variables. If the demeaned series turns out to be stationary, and the breaks in mean are considered exogenous, the demeaned series can be used as possible switching variables. Sequential application of the Quandt-Andrews test identifies 5 breaks in capacity utilization. The Bai-Perron test for multiple structural breaks identifies 2 breaks, but the BIC and the LWZ criterion suggested by Bai and Perron select a model with 5 breaks. The break dates selected by the Bai-Perron procedure in the model with 5 breaks coincided with the break dates obtained by using sequential application of the Quandt-Andrews test. We decided to err on the side of caution, and used the model with 5 breaks. The same breaks were identified when using monthly and when using quarterly data, and when changing the trimming percentage from the standard 15% to 5%. Using the demeaned series with 2 breaks does not drastically affect the results of the linearity tests or the estimated coefficients of the TVAR. For unemployment, both the Quandt-Andrews test and the Bai-Perron test identify 4 breaks in the mean, and the same break dates were estimated using both tests.5 When changing the trimming percentage to 5%, both tests identified an additional break date in 2007Q4, but the interval for the estimated date was very wide and it covered the end of the sample. To avoid identification problems, we did not include the last break.
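Once the break dates are taken as given, demeaning the series amounts to subtracting a separate mean within each segment. A minimal sketch, assuming the break dates have already been converted to list positions (the function name and the toy numbers are hypothetical, and the break tests themselves are not implemented here):

```python
def demean_with_breaks(series, break_points):
    """Subtract the segment-specific mean from each observation.

    break_points -- positions at which a new mean regime starts.
    """
    edges = [0] + sorted(break_points) + [len(series)]
    demeaned = []
    for start, end in zip(edges[:-1], edges[1:]):
        segment = series[start:end]
        mean = sum(segment) / len(segment)
        demeaned.extend(x - mean for x in segment)
    return demeaned

# Toy series with one break in the mean at position 2.
demeaned = demean_with_breaks([1.0, 3.0, 10.0, 14.0], [2])
```

The resulting series has mean zero within every segment, so a single threshold can be compared against it across the whole sample.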

1976Q1 1982Q2 1990Q3 1996Q3 2006Q3

p-value for Sequential Quandt-Andrews p-value for Bai-Perron 0.000