Simulating Corporate Marginal Income Tax Rates and Implications for Corporate Debt Policy John R. Graham Duke University and NBER Hyunseob Kim Duke University November 29, 2009
Abstract We study several important issues related to the measurement and use of corporate marginal tax rates. First, we develop an AR(1) method to simulate corporate marginal income tax rates and demonstrate that the AR(1) model improves upon the extant random walk and bin approaches. The new AR(1) approach captures firm-specific features, including mean reversion in taxable income. Second, we calculate marginal tax benefit curves, which are the functions that determine the marginal tax rate for different levels of interest deduction. We integrate under tax benefit functions calculated using the AR(1), random walk, and bin approaches and find roughly similar implications for all methods, namely that the tax benefits of debt average about 13% of book asset value, gross of all costs. Third, we investigate the “kink” in benefit functions based on each of the three methods, where the kink is the point at which marginal tax benefits begin to decline. The kink is sometimes used to infer how conservatively companies use debt. Our analysis indicates that implications about debt conservatism based on the AR(1) model are relatively robust to specification choices, while implications based on the bin model are sensitive to the specification. Overall, we find that all three models provide roughly similar implications about how aggressively companies appear to use debt once reasonable modifications are made to the debt conservatism metric. Importantly, controlling for a company’s interest usage, we find that variables commonly used to proxy for the costs of debt only partially explain that company’s kink (i.e., common cost variables only partially explain why some firms appear to use debt conservatively).
We thank Alon Brav, Cam Harvey, Mark Leary, Per Olsson, Michael Roberts, Bob Winkler and the Texas Tax Reading Group for helpful feedback. Kim gratefully acknowledges financial support from the Kwanjeong Educational Foundation. Any errors are our own.
1. Introduction The corporate marginal income tax rate (MTR) is an important input into capital structure, compensation, the cost of capital, capital spending, and many other corporate decisions. Financial economists commonly measure the economic MTR as the present value tax consequences of earning an extra dollar of income today (Scholes et al., 2008). The tax code allows firms to carry losses that occur today back in time, or alternatively to carry losses many years into the future (e.g., 3 year carryback period and 15 year carryforward period in the 1980 to 1994 sample period studied in the main analysis of this paper). Due to these dynamic features of the tax code, to determine the current-period economic MTR, it is necessary to forecast taxable income 18 years into the future, as well as keep track of the recent history of a firm’s taxable income. Therefore, the forecasting model one uses to predict future taxable income becomes an integral part of the process of estimating current period marginal tax rates. In this paper, we study several important tax-related issues: simulating marginal tax rates, estimating the entire tax benefit function and integrating under it to estimate the tax benefits of debt, and inferring from a firm’s location on its benefit function whether it appears to use debt conservatively. Along the way, we highlight whether implications vary depending on which of three taxable income forecasting methods underlie the tax-related analysis: the well-known random walk approach (Shevlin, 1990; Graham, 1996a and b); the recently proposed bin method (Blouin, Core, and Guay, 2009, which sorts companies into six income growth bins and five asset bins and draws changes in income from these bins to forecast future income); and a new AR(1) method that we develop in this paper. We demonstrate that while forecasting taxable income accurately is important, using firm-specific data when estimating marginal tax rates is at least as important. 
The first tax issue that we investigate involves simulating corporate marginal income tax rates. Historically, tax rate simulation methods have been based on random walk forecasts of taxable income (Shevlin, 1990; Graham, 1996a and b). Blouin et al. (2009) argue that the random walk method is flawed because, among other things, it does not allow for mean reversion in taxable income. Lack of mean reversion could be problematic because once a firm experiences an extremely negative or positive outcome, the random walk model may allow the firm to “get stuck” at this extreme. Blouin et al. develop a nonparametric approach that draws income forecasts for bins of firms that are grouped by similar asset size and return on assets. They demonstrate that the distribution of forecasted income produced by the bin method closely matches the realized
distribution of income. While using taxable income data with nice distributional properties is desirable, we demonstrate that even more important is using a firm-specific process to simulate income (versus using a bin method that treats all companies in a given bin identically). In a nutshell, based on the bin method, each company in a given bin is treated too similarly, with the end result of producing tax rates that are too “mid range” for the entire bin. In contrast, firm-specific processes allow for more extreme tax rates for companies in extreme circumstances. To highlight the importance of using firm-specific information, we follow Blouin, Core, and Guay (BCG) and sort companies into six income bins (two equal-sized bins for loss firms and four equal-sized bins for profitable firms). Consider the distribution of MTRs for companies in the lowest income bin (see Figure 3, Panel B, which is explored in detail in Section 5). Even for these companies that have the largest losses among all firms, the bin method produces very few near-zero marginal tax rates. Instead, most of the bin-estimated MTRs for these severe loss firms are around 7% or 8%. Intuitively, the bin model treats all firms in the loss bin equally in terms of forecasting future income growth, and thus even the “worst-quality” loss firms obtain mid-range forecasts of future taxable income, as do the “best-quality” firms in this bin. 1 In contrast, the random walk (Figure 3, Panel A) and AR(1) (Panel C) methods produce near-zero MTRs for many of these extreme loss firms because each company is modeled with firm-specific information. Of course, it is difficult to make absolute statements about which tax rate simulation method is best without comparing them to a benchmark of the “true” MTR. The word true is in quotations because we do not know the true distribution of future taxable income, and it is therefore not possible to know the true marginal tax rate for a given firm in a given year. 
Consequently, we use three benchmarks for the true MTR in an attempt to triangulate the qualities of the tax simulation approaches. 2 The first benchmark measure is based on actual realizations (or “perfect-foresight” forecasts) of future taxable income; after all, if the key input into marginal tax rates is taxable income forecasts, why not use actual income realizations? One potential drawback of this
1
The dynamic nature of the tax code accentuates this tendency for the bin method to produce MTRs clustered near the center. For example, one firm in a given bin might be assigned a very low draw of income growth for t+1, followed by a very high draw for t+2. Because the tax code allows smoothing of income through time, such a forecast is not much different from an outcome in which the firm instead received two mid-range forecasts for t+1 and t+2. In either case, a mid-range MTR is a likely outcome.
2 Blouin et al. (2009) imply that bin MTRs dominate random walk MTRs but notably, they base their inference only on estimated distributions of taxable income, an input into the tax rate calculations. At no point do they investigate the properties of corporate marginal income tax rates, which is the final product of their analysis. Hence, these authors do not evaluate how well taxable income forecasts are used in marginal tax rate calculations, and therefore they do not address the importance of using firm-specific information in calculating marginal tax rates.
benchmark is that realizations from t+1 onward reflect just the states of the world that happened to occur; they may not fully reflect the expectations managers considered when they made their forecasts in year t. Consequently, we also consider two benchmarks that are based on distributions of future income. The second benchmark creates a year t distribution of future income that is based on the estimated data generating process of realized future income. The third benchmark gets away from realizations altogether and is based on security analysts’ earnings forecasts as of year t. To gauge whether a tax rate model’s output is accurate in the context of estimating MTRs, we compare these three benchmark MTRs (based on realized future firm-year income, distributions of future realized income, and analysts’ forecasts) to MTRs estimated by each of the three tax rate simulation models (random walk, bin, and AR(1)). Panel D of Figure 3 presents the benchmark distribution of MTRs based on perfect-foresight income forecasts for low-income companies. This distribution is u-shaped with the left side of the u much taller than the right side (that is, many more near-zero MTRs than tax rates near the top statutory rate). The random walk method (Panel A) produces a similar distribution of MTRs for these low-income firms. The AR(1) method (Panel C) produces a distribution that is even closer to this benchmark. Notably, the bin method’s distribution is hump-shaped, rather than u-shaped like the benchmark. Intuitively, relative to the benchmark the bin method appears to introduce too much mean reversion in taxable income for many of the near-zero MTR firms, and therefore produces tax rate estimates clustered nearer the middle of the distribution, far away from zero. As described in more detail below, the bin MTRs reflect this same disconnect in comparison to the other two benchmark MTRs. 
When we examine errors in predicting the benchmark MTRs observation by observation, we find that among the three simulation approaches the AR(1) tax rates are closest to the benchmarks. The bin model produces large forecasting errors among loss firms that have low benchmark MTRs as well as among highly profitable firms that have high benchmark MTRs. These errors occur because the bin model induces more mean reversion in taxable income forecasts than do the benchmark models. One overall conclusion of our paper is that when compared to benchmarks of the true marginal tax rate, the AR(1) method performs better than the bin and the random walk methods. The second tax issue that we investigate involves simulating an entire marginal tax benefit function. These tax benefit functions map different levels of interest deductions (on the x-axis) into
simulated marginal tax rates (on the y-axis). That is, benefit functions are a collection of estimated marginal tax benefits (i.e., corporate income tax rates) for various levels of interest deductions. One can estimate the tax savings benefit of debt by integrating under these functions up to the point of actual interest used by a given firm-year. All three simulation methods imply that the tax benefits of debt equal about 13% of book asset value. More specifically, annual interest deductions reduce the average Compustat company’s federal tax payments by about 1.4% of asset value. Capitalizing this annual savings as a perpetuity, with the cost of debt (which averaged 11% during our sample period) as the discount rate, leads to present value tax savings of 13% of asset value (1.4%/11% ≈ 13%). This is particularly interesting right now because the Volcker commission is studying, among other things, whether to eliminate debt interest tax deductibility. Our estimates imply that, all else held constant, such an action would reduce the value of the typical U.S. company by about 13%. We also estimate that the cost of capital for the typical company would have increased by about 100 basis points from 1980 to 1994 if interest deductibility had been eliminated. The third focus of our paper is to investigate whether companies appear to use debt conservatively. Previous work (e.g., Graham, 2000; Blouin et al., 2009) examines where firms operate on their tax benefit functions relative to the “kink” (the point where marginal tax benefits begin to decline) in the function. The kink is measured proportional to the company’s actual debt usage so, for example, a kink of three indicates that a company could triple interest deductions before the marginal tax benefits begin to decline. The larger the kink, the more conservative a company’s actual debt usage appears to be. 3 Graham and Blouin et al. 
compare this measure of conservative debt policy to variables that are used in the literature to measure costs and nondebt tax shields of debt. Blouin et al. (2009) make two claims relative to kink. First, they argue that the average kink based on the bin method is much smaller than kink based on the random walk method, and therefore they argue that companies do not use debt as conservatively as previously thought. We demonstrate that their conclusion only holds if one makes a strong assumption to define kink, and 3
Saying that a company uses debt conservatively is not the same as saying that it uses too little debt. It is possible that a firm that is conservative and uses little debt does so because it faces high costs of debt. Recent research investigates whether “conservative debt policy” companies appear to be in fact “underlevered.” For example, Almeida and Philippon (2007) demonstrate that once one considers that the marginal value of money is higher in distress states, expected distress costs can be large enough to offset estimated net (of personal tax effects) tax benefits of debt, implying that the observed average magnitude of debt usage is not out of line with companies choosing debt optimally in aggregate. However, note that their research does not address cross-sectional implications; that is, it does not address whether in the cross-section high kink firms possess the characteristics of high cost companies.
that under this assumption the BCG bin kink is far from the “true” benchmark kink. That is, Blouin et al. define a kink to occur when the tax benefit function falls by 50 basis points from the y-intercept value of the function (which is the MTR a company would have if it had no interest deductions). 4 If one loosens this constraint and defines kink based on when the function falls by 100 or 200 basis points (which is reasonable given that this is less than one-tenth of the top statutory tax rate), the kinks produced by all three simulation methods are similar at around 2.0 or 2.5. Therefore, debt conservatism implications drawn from kink are similar across the models for reasonable specifications. Blouin et al. (2009) also claim that common cost measures such as earnings volatility are correlated with kink. In particular, they claim that companies that have high kinks also have characteristics such as high earnings volatility, which would imply that these companies are not underlevered but instead face high costs that explain their conservative debt usage. We show that for the most part, the firm characteristics Blouin et al. study are correlated with the amount of interest the company uses in ways that are not surprising given the existing capital structure literature. We control for these well-known patterns in the data by matching companies based on actual interest used. Once interest usage is controlled for, we find only a modest relation between firm characteristics and a company’s kink. That is, comparing two companies matched on interest usage, only some of the characteristics of the firm with the larger kink are associated with higher costs. Our conclusion is that the jury is still out in terms of finding cost variables that explain observed cross-sectional kink patterns that imply that some companies use debt conservatively. This is an important topic for future research. The paper proceeds as follows. 
Section 2 discusses the income properties and simulation features most relevant to accurately estimating MTRs. Section 3 details the taxable income simulation models that we examine in this paper to compute tax rates. Section 4 discusses sample construction and measurement issues related to forecasting taxable income. Section 5 compares the random walk, bin, and AR(1) models in the context of forecasting taxable income and estimating marginal tax rates. Section 6 creates tax benefit curves (as in Graham, 2000) and examines the tax benefits of debt and kink properties. Conclusions are offered in the final section. 4
As we describe below, there are two issues here: 1) focusing narrowly on a 50 basis point drop-off to define kink, and 2) Blouin et al. (2009) define kink relative to a different reference point than does previous research. Regarding the second point, BCG define kink relative to the no-interest MTR (i.e., the y-intercept of the benefit function) while Graham (2000) defines kink relative to a drop off in tax benefits from one interest increment to the next. Therefore, the BCG kink is by definition smaller than the extant kink measure, even if the exact same income forecasts were used.
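The two kink definitions contrasted in this footnote can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the benefit function is represented as a hypothetical list of (multiple of actual interest, MTR) points starting at zero interest, the kink is taken to be the last grid point before the relevant drop occurs, and the same threshold parameter is applied to both definitions. It is not the exact procedure of Graham (2000) or Blouin et al. (2009).

```python
def kink_from_intercept(curve, threshold_bp=50):
    """Last grid multiple before the MTR has fallen `threshold_bp` basis points
    below the zero-interest MTR (the y-intercept reference point)."""
    intercept = curve[0][1]          # assumes curve[0] is the zero-interest point
    kink = curve[0][0]
    for multiple, mtr in curve:
        if intercept - mtr >= threshold_bp / 10_000:
            break
        kink = multiple
    return kink


def kink_from_increment(curve, threshold_bp=50):
    """Last grid multiple before the MTR falls `threshold_bp` basis points
    from one interest increment to the next."""
    for (m_prev, mtr_prev), (_, mtr_next) in zip(curve, curve[1:]):
        if mtr_prev - mtr_next >= threshold_bp / 10_000:
            return m_prev
    return curve[-1][0]              # no qualifying drop on the grid


# Hypothetical benefit function: a slow 30 bp drift per increment, then a sharp drop.
curve = [(0, 0.340), (1, 0.337), (2, 0.334), (3, 0.331), (4, 0.300), (5, 0.250)]

print(kink_from_intercept(curve, 50))    # 1
print(kink_from_increment(curve, 50))    # 3
print(kink_from_intercept(curve, 100))   # 3
```

On this made-up curve, the slow drift accumulates against the y-intercept reference, so the intercept-based kink is smaller at a 50 basis point threshold; loosening the threshold to 100 basis points brings the two definitions into agreement.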
2. Properties of Taxable Income and Desirable Features of Forecasting Models 2.1. Mean Reversion in Taxable Income The previous literature documents that annual accounting income or profitability mean-reverts (Brooks and Buckmaster, 1976; Freeman, Ohlson, and Penman, 1982; Fama and French, 2000). That is, abnormally high (low) earnings tend to decrease (increase) towards a long-term mean. Economic forces such as competition in product markets (Bhattacharya, 1978), transitory components in earnings, and accounting conservatism (Basu, 1997) can lead to mean reversion. The mean-reverting property of income is particularly relevant to simulating marginal tax rates because of the dynamic features of the tax code. 5 Under these features, mean-reverting taxable income implies that a currently profitable firm has a non-trivial probability that its MTR is lower than the top statutory rate (e.g., it may be able to carry back future losses to obtain a refund for taxes paid today), and current-period loss firms can have MTRs greater than zero (because an extra dollar earned today is one fewer dollar carried forward to shield future taxes).
2.2. Long-run Behavior of Taxable Income As discussed in the introduction, estimating the marginal tax rate involves simulating taxable income over a long horizon (e.g., 18 years for the 1980-1994 period studied in this paper). As a result, in contrast to many income forecasting applications, the forecasting model for simulating MTRs needs to capture long-term as well as near-term behavior of taxable income. During 1980 to 1994, the tax code allowed firms to carry current-period losses back three years or forward up to 15 years in the future. 6 Hence, while losses that occur more than three years in the future have no direct impact on the year t MTR for profitable firms, positive future taxable income in the ensuing 15 years could have material impact on the year t MTR for loss firms. 7 For example, if a current-period loss firm earns profits 10 years from today that are larger than accumulated NOL carryforwards at that time, assuming a tax rate of 34% and a discount rate of 5
Note that mean reversion in taxable income would not affect MTRs if no loss carryback or carryforward were allowed. In that case, the MTR would simply be the top statutory rate (zero) for current-period profitable (loss) firms. 6 In 2008, the carryback and carryforward periods were 2 and 20 years for most companies, respectively. In 2009, the carryback period was temporarily extended to five years as a stimulus measure to help loss companies. 7 Here we implicitly assume that profits (losses) in the current period are not completely canceled out by past losses (profits) via loss carryforwards (carrybacks). In fact, this assumption is supported by data. For example, over the 1980-1994 period, only 8.8% of profitable firms have loss carryforwards and 54.6% of loss firms have no past profits against which they can carry back current losses.
7%, the firm’s current-period MTR would be 17.3% ($= 34\% / 1.07^{10}$), which is significantly greater than the tax rate without loss carryforwards (0%). Due to this asymmetry in treating tax losses between the carryback and carryforward rules, MTR estimates tend to be more sensitive to income forecasts for loss firms than for profitable firms. As a result, while accuracy of forecasting future taxable income over the long run is generally important when simulating marginal tax rates, the MTR impact of long-term forecasts is more pronounced for firms that have losses in the current period than for those that have profits.
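The arithmetic in the loss-firm example above can be checked with a few lines of Python. This is only a sketch of that single calculation; the function name and arguments are introduced here for illustration and are not part of the paper.

```python
def loss_firm_mtr(statutory_rate: float, discount_rate: float, years_until_taxable: int) -> float:
    """Present value of the tax eventually paid on one extra dollar earned today
    by a loss firm: the extra dollar reduces the carryforward by one dollar, so
    one more dollar is taxed when the carryforward is exhausted."""
    return statutory_rate / (1.0 + discount_rate) ** years_until_taxable


# Values from the text: 34% tax rate, 7% discount rate, profits arrive in 10 years.
mtr = loss_firm_mtr(0.34, 0.07, 10)
print(f"{mtr:.1%}")  # 17.3%
```

The same function shows how quickly the MTR of a loss firm falls as the first taxable year moves further out, which is why long-horizon forecasts matter most for these firms.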
2.3. Features of Income Forecasting Models to Estimate MTRs Given the mean-reverting property of taxable income and the long-horizon income forecasts used to estimate MTRs, in this section we discuss desirable features of income forecasting models in terms of accurately estimating marginal tax rates.
2.3.1. Modeling Transitory Component of Taxable Income The mean-reverting property of taxable income discussed in Section 2.1 implies that a significant portion of income is transitory. 8 Therefore, a forecasting method should ideally capture mean reversion in taxable income by modeling transitory components of income. If instead the model assumes that all shocks are permanent, then a large positive or negative shock to taxable income could cause forecasted income to “get stuck” at extreme levels, potentially biasing MTR estimates. Thus, given that income forecasts over a long horizon can affect the MTR, modeling transitory components of income is important when simulating MTRs.
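To make the “get stuck” concern concrete, the sketch below compares expected income paths after a large negative shock when shocks are permanent (a random walk with drift) and when they are transitory. The transitory case is written as an AR(1) process, the form the paper adopts in Section 3.3; the starting value and parameter values are hypothetical and chosen only for illustration.

```python
def expected_path_random_walk(x0, drift, horizon):
    """E[x_{t+h}] under a random walk with drift: a shock never decays."""
    return [x0 + drift * h for h in range(1, horizon + 1)]


def expected_path_ar1(x0, mu, rho, horizon):
    """E[x_{t+h}] under x_{t+1} = mu + rho * x_t + e_{t+1}: for |rho| < 1 the
    gap to the long-run mean mu / (1 - rho) shrinks by a factor rho each year."""
    path, x = [], x0
    for _ in range(horizon):
        x = mu + rho * x
        path.append(x)
    return path


# Hypothetical firm: scaled income of -0.30 right after a large negative shock.
rw = expected_path_random_walk(-0.30, drift=0.0, horizon=18)
ar = expected_path_ar1(-0.30, mu=0.02, rho=0.6, horizon=18)
long_run_mean = 0.02 / (1 - 0.6)

print(rw[0], rw[-1])   # expected income stays at the shocked level
print(ar[0], ar[-1])   # expected income moves back toward the long-run mean
```

With zero drift, the random walk forecast remains at the post-shock level for the entire 18-year horizon, whereas the AR(1) forecast closes most of the gap to its long-run mean within a few years.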
2.3.2. Modeling Income Processes Conditional on Firm-specific Information Given that each firm operates in distinct economic conditions (e.g., product market competition, cyclicality of business, firm size), it is desirable to allow individual companies to follow heterogeneous income processes. For instance, the speed of mean reversion and volatility of income can differ by industry because of differing levels of “cyclicality” or “riskiness.” Even individual firms in the same industry can exhibit different characteristics of mean reversion and
8
See Gorbenko and Strebulaev (2009) for the implications of modeling transitory as well as permanent components of earnings for capital structure choices.
income volatility because of heterogeneous product market circumstances, for example. Thus, including firm-specific information in the forecasting process is desirable. One caveat to using firm-specific information is that in some cases data limitations (such as few historical observations) can lead to small-sample issues such as large standard errors. Hence, increased noise in estimated income processes is one disadvantage of using firm-specific approaches. Thus, ideally, the forecasting model should strike a balance between the advantage of conditioning on firm-specific information and potential noise due to small samples.
2.3.3. Modeling Scaled Income In general, using scaled income (e.g., taxable income divided by total assets) instead of dollar-level income is preferable. As Blouin et al. (2009) point out, historical level data can be “stale” relative to capturing current levels of income due to firm growth and inflation. This problem is relevant when estimating MTRs because one needs to forecast taxable income for a long horizon (e.g., 18 years). In particular, because the assumption of statistical stationarity holds better for scaled income than for levels, researchers who want to forecast taxable income using stationary time-series models would prefer to use scaled income.
3. Simulation Models for Corporate Marginal Income Tax Rates In this section, we first describe how the random walk (RW) and bin models forecast taxable income (which is used as an input to estimate marginal tax rates). Then, we develop a first-order autoregressive (AR(1)) model as an alternative to the two extant methods. Before presenting the three income forecasting models, we describe the marginal tax rate simulation procedure. The corporate income MTR is defined as the present value of current and expected future taxes associated with earning one additional dollar of income. Since current losses can be used to offset taxes that were paid in the past (carryback) or that will be paid in the future (carryforward), taxable income forecasts are required to compute current-period MTRs. To incorporate uncertainty about future income into the process, the simulation procedure is repeated a number of times (following the literature, 50 simulations are used in this paper). One then averages across the 50 simulated MTRs to estimate the expected MTR for a given firm-year.
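The procedure just described can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the paper: it assumes a flat statutory rate, applies losses to the oldest eligible year first, ignores the alternative minimum tax and tax credits, and takes the income forecaster as a hypothetical `simulate_future(history, horizon, rng)` callable supplied by the user. The default rate, discount rate, and carry periods are the illustrative values mentioned in the text for 1980-1994.

```python
import random


def pv_taxes(income, current, rate=0.34, disc=0.07, carryback=3, carryforward=15):
    """Present value, as of year index `current`, of taxes paid (net of refunds)
    from `current` onward along one taxable income path."""
    taxed = [0.0] * len(income)   # income still "exposed" to a carryback refund
    nol = [0.0] * len(income)     # unused loss generated in each year
    pv = 0.0
    for y, inc in enumerate(income):
        cash = 0.0
        if inc > 0:
            left = inc
            for k in range(max(0, y - carryforward), y):   # use oldest losses first
                used = min(left, nol[k])
                nol[k] -= used
                left -= used
            taxed[y] = left
            cash = rate * left
        elif inc < 0:
            loss = -inc
            for k in range(max(0, y - carryback), y):      # refund oldest profits first
                used = min(loss, taxed[k])
                taxed[k] -= used
                loss -= used
                cash -= rate * used
            nol[y] = loss                                  # remainder is carried forward
        if y >= current:
            pv += cash / (1 + disc) ** (y - current)
    return pv


def simulated_mtr(history, simulate_future, n_sims=50, horizon=18, seed=0, **tax_kwargs):
    """Average, over simulated income paths, of the change in the present value
    of taxes caused by one extra dollar of income in the current year.
    `history` ends with the current year's taxable income."""
    rng = random.Random(seed)
    current = len(history) - 1
    total = 0.0
    for _ in range(n_sims):
        path = list(history) + list(simulate_future(history, horizon, rng))
        bumped = list(path)
        bumped[current] += 1.0
        total += pv_taxes(bumped, current, **tax_kwargs) - pv_taxes(path, current, **tax_kwargs)
    return total / n_sims
```

Using the same simulated path for the base and the one-dollar-higher case isolates the tax effect of the extra dollar from simulation noise. Any of the three forecasting models discussed below could be plugged in as `simulate_future`.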
3.1. Random Walk Model
Pioneered by Shevlin (1990) and Graham (1996a and b), the random walk (RW) model is widely used to forecast taxable income in estimating corporate marginal tax rates. The method simulates future taxable income using the following firm-specific random walk with drift process:

$$TI_{i,t+1} = \mu_i + TI_{i,t} + \varepsilon_{i,t+1}, \qquad \varepsilon_{i,t+1} \sim \text{iid } N(0, \sigma_i^2) \tag{1}$$

where $TI_{i,t}$ is taxable income of firm i in time t, and $\mu_i$ and $\sigma_i^2$ are the firm-specific drift and volatility of income, respectively. These two parameters are estimated using historical data over the life of the firm up to time t (i.e., using the sample mean and sample variance of changes in taxable income). In the implementation of the RW model, we modify the estimate of $\mu_i$ by constraining it to be non-negative, following Graham (1996b). 9 In the random walk model, every shock ($\varepsilon_{i,t+1}$) to taxable income is permanent and the shocks are independently and identically distributed, and thus mean reversion of income is not present. In addition, given that the drift parameter ($\mu_i$) is estimated using historical data, profitable (loss) firms often have positive (zero) drift estimates. Therefore, income forecasts are likely to stay positive (negative) for profitable (loss) firms, unless future taxable income is perturbed by random shocks ($\varepsilon_{i,t+1}$) towards losses (profits). As a result, marginal tax rate estimates based on income forecasts of the RW model are likely clustered around the top statutory rate (zero) for profitable (loss) firms.
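A minimal Python sketch of the random walk forecast in equation (1), with the non-negative drift constraint, might look as follows. The function names are hypothetical, the sketch uses only the standard library, and it assumes at least three historical observations so that the sample standard deviation of changes is defined.

```python
import random
import statistics


def fit_random_walk(ti_history):
    """Estimate the drift and volatility from first differences of taxable income."""
    changes = [b - a for a, b in zip(ti_history, ti_history[1:])]
    drift = max(statistics.mean(changes), 0.0)   # drift constrained to be non-negative
    vol = statistics.stdev(changes)              # sample standard deviation of changes
    return drift, vol


def simulate_random_walk(ti_history, horizon, rng):
    """One simulated path: TI_{t+1} = drift + TI_t + shock, every shock permanent."""
    drift, vol = fit_random_walk(ti_history)
    path, ti = [], ti_history[-1]
    for _ in range(horizon):
        ti = drift + ti + rng.gauss(0.0, vol)
        path.append(ti)
    return path


# Hypothetical taxable income history (in $ millions) for one firm.
history = [10.0, 12.0, 9.0, 15.0, 14.0]
drift, vol = fit_random_walk(history)
path = simulate_random_walk(history, horizon=18, rng=random.Random(0))
```

Because each draw is added to the previous level, a firm that starts from a loss and has a zero drift estimate returns to positive income only if the random shocks happen to push it there, which is the clustering behavior described above.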
3.2. Bin Model Blouin, Core, and Guay (2009) develop a bin method for simulating taxable income. The bin method forecasts taxable income by estimating the probability of changes in scaled income (ROA) and size (average total assets) based on the empirical distributions of the changes for firms
9
Graham’s (1996a and b) approach sets the drift term in the random walk model to the maximum of the historical drift and zero. (We sometimes refer to his approach as a modified random walk model). Due to this adjustment, negative income firms in Graham’s model are less likely to drift to ever more negative income levels, though it is still the case that loss firms will retain nonpositive income unless they drift positively or are randomly perturbed above zero. Graham (1996b) notes that the “constrained” or modified random walk model performs better than the “unconstrained” model in predicting benchmark perfect-foresight marginal tax rates. In these papers, Graham also models investment tax credits and the alternative minimum tax, which Blouin et al. (2009) and we ignore. Finally, also note that Graham (1996b) experiments with a firm-specific AR(1) process to forecast taxable income but finds no improvement over the modified random walk approach. In this paper, we use an AR(1) process to forecast ROA (taxable income divided by total assets), whereas Graham (1996b) modeled the level of taxable income. As discussed below, we find improvement over the random walk approach.
with similar scaled income and size characteristics. In contrast to the RW model, this approach does not use a firm-specific parametric model of a given company’s income process, but assigns all firms at time t into six (according to income: two for loss firms and four for profitable firms) by five (according to asset size) bins, a total of 30 groups. Then, it infers the distribution of one-year changes in scaled taxable income and average total assets from historical data on changes in those two variables from time t-2 to t-1 for each of the 30 profitability-size sorted bins. In other words, this approach assumes that all firms in a given scaled income/average total assets bin face the same distribution of changes in the two variables. For each firm in a given bin at time t, the bin model randomly picks a (ΔROA, Asset Growth) pair from the empirical distribution for that bin, and assigns the picked data pair to the firm as forecasts of a change in scaled income and asset growth for time t+1, $(\Delta ROA_{i,t+1},\ \text{Asset Growth}_{i,t+1})$.
Then, the firm’s income scaled by assets (i.e., ROA) for time t+1 is forecasted by adding
the forecasted change in ROA to ROA for time t. Similarly, average total assets in time t+1 are forecasted by multiplying average total assets at time t by the forecasted asset growth from t to t+1. Lastly, multiplying the forecasted ROA by average t+1 total assets produces a taxable income forecast for t+1. The bin income simulation process is summarized as follows:

$$ROA_{i,t+1} = ROA_{i,t} + \Delta ROA_{i,t+1}, \tag{2-a}$$

$$TA_{i,t+1} = TA_{i,t} \times AG_{i,t+1}, \tag{2-b}$$

$$TI_{i,t+1} = ROA_{i,t+1} \times TA_{i,t+1}, \tag{2-c}$$

where $ROA_{i,t}$ is taxable income divided by average total assets (return on assets) of firm i in time t, $\Delta ROA_{i,t+1}$ is a change in return on assets from t to t+1, $TA_{i,t+1}$ is average total assets in t+1, $AG_{i,t+1}$ is asset growth from t to t+1, and $TI_{i,t+1}$ is taxable income in t+1. This one-period forecasting procedure is repeated for the number of years required to implement the statutory tax-loss carryback and carryforward periods. 10 For brevity, we only present this intuitive description of the bin income simulation model and refer interested readers to Section 4 of Blouin et al. (2009) for further details.
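The recursion in equations (2-a) to (2-c) can be sketched in Python as below. This is an illustrative reading of the description above rather than the Blouin et al. (2009) implementation: the bin assignment rule `assign_bin` and the table of historical draws `bin_draws` are hypothetical inputs, the firm is assumed to be re-assigned to a bin each forecast year based on its current forecasted values, and asset growth is treated as a gross growth factor, consistent with equation (2-b).

```python
import random


def simulate_bin_path(roa, assets, bin_draws, assign_bin, horizon, rng):
    """Iterate equations (2-a) to (2-c) and return forecasted taxable income by year.

    bin_draws  : dict mapping a bin label to a list of (delta_roa, asset_growth)
                 pairs observed historically for firms in that bin
    assign_bin : function (roa, assets) -> bin label
    """
    path = []
    for _ in range(horizon):
        d_roa, growth = rng.choice(bin_draws[assign_bin(roa, assets)])
        roa = roa + d_roa               # (2-a)
        assets = assets * growth        # (2-b), growth is a gross factor
        path.append(roa * assets)       # (2-c)
    return path


# Toy setup with two income bins and no size dimension (hypothetical numbers).
toy_draws = {
    "loss": [(0.05, 1.00), (0.10, 0.95), (-0.02, 1.02)],
    "profit": [(-0.01, 1.10), (0.02, 1.05), (-0.04, 1.00)],
}
toy_assign = lambda roa, assets: "loss" if roa < 0 else "profit"

path = simulate_bin_path(-0.10, 100.0, toy_draws, toy_assign, horizon=18, rng=random.Random(0))
```

Note that every firm landing in the same bin draws from the same list of pairs, which is the source of the within-bin homogeneity discussed in the text.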
10
Firms can carryback and carryforward their losses for 3 years and 15 years, respectively, during the main sample period that we study (1980-1994). In August 1997 the periods became 2 and 20 years, respectively.
Abstracting from forecasting the growth of total assets, the main difference between the bin and the traditional RW models is that for the bin model the distribution of shocks ($\Delta ROA_{i,t+1}$) varies depending on the current level of income (while in the random walk model in equation (1), the distribution of the random shock ($\varepsilon_{i,t+1}$) is firm-specific). To the extent that the bin model captures the actual difference in the distribution of income changes conditional on the current level of income, the model allows for mean reversion in taxable income. However, because the bin approach treats all income shocks as permanent, as the RW model does, a large shock can have a persistent impact on subsequent income forecasts. In addition, the bin model assumes that the distribution of the shock is identical across all firms in a given bin, ignoring firm-specific information relevant to income forecasting. These two factors could introduce biases into MTR simulations.
3.3. First-order Autoregressive Model
We develop a new model of forecasting taxable income that allows for mean reversion in taxable income conditional on firm-specific information. Specifically, for each firm-year observation (indexed by i), we use a first-order autoregressive (AR(1)) process in equation (3) to simulate taxable income scaled by beginning-year total assets (ROA). 11

$$ROA_{i,t+1} = \mu_i + \rho_i ROA_{i,t} + \varepsilon_{i,t+1}, \qquad \varepsilon_{i,t+1} \sim \text{iid } N(0, \sigma_i^2), \tag{3-a}$$
$$TI_{i,t+1} = ROA_{i,t+1} \times TA_{i,t+1}, \tag{3-b}$$
where $ROA_{i,t}$ is taxable income divided by beginning-year total assets (return on assets) of firm i in time t, $\mu_i$ is the drift parameter, $\rho_i$ is the first-order autoregressive parameter, $\varepsilon_i$ and $\sigma_i$ represent random shocks and the volatility of shocks, respectively, $TA_{i,t+1}$ is beginning-year total assets at t+1, and $TI_{i,t+1}$ is taxable income in t+1.
In order to forecast taxable income using the model, we estimate the parameters of the AR(1) model in equation (3-a) following two steps. First, we estimate the parameters using ordinary least squares (OLS) regressions for each firm-year’s rolling time-series of historical data. To reduce the influence of outliers, we exclude observations from the regression if the absolute value of current or lagged ROA is larger than two. This OLS estimator for AR(1) parameters is efficient as well as consistent because it is a conditional maximum-likelihood (ML) estimator given a deterministic initial value and Gaussian errors (Hamilton, 1994). Note that we need at least four observations to estimate the three parameters ($\mu_i$, $\rho_i$, and $\sigma_i$) properly. Thus, one disadvantage of the first-step AR(1) approach is that it requires four historical observations for a firm-year to remain in the sample. Moreover, some firms have only a few historical time-series observations, which can cause a small-sample bias and/or large standard errors as discussed in Section 2.3.2. 12
In the second step, we address the four observation data requirement and small sample issues by exploiting the panel structure of our data. Specifically, we form panels of firm-year observations by scaled income group and industry 13 and estimate the AR(1) parameters in equation (3-a) for each of the scaled income-industry panels using Blundell and Bond’s (1998) system GMM estimator. 14,15 We allow parameters to vary conditional on scaled income and industry as we expect the speed of mean reversion and the volatility of income shocks to differ depending on current-period profitability and industry.
We use the firm-specific parameter estimates from the first step in most cases. In about one-fourth of the observations we instead use the second-step panel estimates if the first step approach leads to one of the following conditions being met: (i) the absolute value of the firm-specific estimate of the autoregressive parameter ($\rho_i$) is bigger than or equal to one (i.e., the estimated firm-specific process is non-stationary), (ii) the volatility of random shocks ($\sigma_i$) is larger than one, 16 (iii) the long-run mean of the scaled income, given by $\mu_i / (1 - \rho_i)$, is too large in absolute value, 17 or (iv) firm-specific estimates are not available due to a small number of historical observations.
11 In his paper that presents the RW model, Shevlin (1990) also experiments with a firm-specific AR(1) model to allow for mean reversion in taxable income. But, we note that as in Graham (1996b), Shevlin models the level of taxable income using an AR(1) process, while we model taxable income scaled by total assets as an AR(1) process.
12 In an unreported analysis, we adjusted the OLS estimator for a potential small-sample bias using Kendall’s (1954) correction. This adjustment does not alter qualitative results and thus we only report results without the bias correction.
13 We use six income groups (two for loss firms and four for profitable firms) and classify industries based on the two-digit SIC code. The results are robust to the choice of income groups and industry classifications.
14 The Blundell-Bond estimator is designed for dynamic panel data models to address econometric issues in the presence of individual firm fixed effects. It is shown to have improved finite-sample properties and efficiency compared to estimators based on first differencing or firm fixed effects as well as the pooled OLS estimator (Hsiao, 2003). Lemmon, Roberts, and Zender (2008) show that the Blundell-Bond estimator mitigates biases in AR(1) coefficient estimates based on pooled OLS or fixed effects estimators in the capital structure regression context.
15 An alternative to this two-step approach is using the Blundell-Bond panel estimator for all firm-years. However, we take the two-step approach because we want to exploit firm-specific information in the data by allowing income processes to vary by firm whenever possible.
16 We impose this restriction so that firms exhibit economically reasonable magnitude of income volatility. However, the quantitative results are very similar when we change the value of the cutoff for the volatility within a reasonable range.
17 We use 0.6 as the cutoff to classify an absolute value of the long-run mean as “too large.” However, our simulation results are robust to various values of the cutoff.
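The first estimation step and the screening conditions (i) through (iv) can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the residual variance uses a degrees-of-freedom correction that the text does not specify, and the panel (Blundell-Bond) estimates that would replace a rejected firm-specific estimate are not computed here. The outlier screen (absolute ROA larger than two), the volatility limit of one, and the 0.6 long-run-mean cutoff are taken from the text and footnotes.

```python
import numpy as np

def estimate_ar1_ols(roa):
    """Firm-specific OLS estimate of (mu, rho, sigma) in equation (3-a).

    Returns None when fewer than three usable (lagged, current) pairs
    remain, i.e. fewer than four ROA observations if none are screened out.
    """
    roa = np.asarray(roa, dtype=float)
    x, y = roa[:-1], roa[1:]
    keep = (np.abs(x) <= 2) & (np.abs(y) <= 2)   # outlier screen from the text
    x, y = x[keep], y[keep]
    if len(x) < 3 or x.max() == x.min():
        return None
    X = np.column_stack([np.ones_like(x), x])
    (mu, rho), *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - (mu + rho * x)
    sigma = np.sqrt(resid @ resid / (len(x) - 2))  # df correction is an assumption
    return float(mu), float(rho), float(sigma)

def needs_panel_fallback(params, mean_cutoff=0.6):
    """True if conditions (i)-(iv) call for the second-step panel estimates."""
    if params is None:                 # (iv) too few observations
        return True
    mu, rho, sigma = params
    if abs(rho) >= 1:                  # (i) non-stationary process
        return True
    if sigma > 1:                      # (ii) implausibly large shock volatility
        return True
    return bool(abs(mu / (1 - rho)) > mean_cutoff)   # (iii) long-run mean too large

history = [0.10, 0.07, 0.055, 0.0475, 0.04375, 0.041875]
params = estimate_ar1_ols(history)
use_panel = needs_panel_fallback(params)
```

In this noise-free example the history is generated with drift 0.02 and autoregressive parameter 0.5, so the firm-specific estimate passes all four screens.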
With the estimated parameters we simulate scaled taxable income (ROA) using equation (3-a) and create a taxable income forecast as simulated ROA multiplied by beginning-year total assets as in equation (3-b). As described in Appendix A, we use the clean-surplus formula to increase assets going forward. The AR(1) model has the following attractive features for simulating taxable income and estimating MTRs. First, the AR(1) approach conditions on firm-specific characteristics to model mean-reverting taxable income. As discussed in Section 2, income processes are heterogeneous across firms and thus exploiting firm-specific information is advantageous to accurately forecast income. Second, previous studies empirically support using an AR(1) model to forecast scaled income (e.g., ROA and ROE). Specifically, Dechow, Hutton, and Sloan (1999) show that an AR(1) process captures the time-series behavior of return on equity (ROE) reasonably well when sample firms are pooled in the auto-regression. They also show that adding a second lag (i.e., using AR(2)) to the model does not help much in predicting ROE. Third, our two-step AR(1) approach not only allows incorporating firm-specific information into income processes but reduces biases and noise that could be induced by small samples. 18
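To make the simulation step concrete, the sketch below draws ROA paths from equation (3-a) and converts them to taxable income as in equation (3-b). It is a simplified illustration rather than the procedure used in the paper: total assets are held constant instead of being grown with the clean-surplus formula of Appendix A, and the function name and arguments are hypothetical.

```python
import numpy as np

def simulate_taxable_income(roa_0, total_assets, mu, rho, sigma,
                            years, n_paths, seed=0):
    """Simulate taxable income paths from an AR(1) process for ROA.

    Returns an array of shape (n_paths, years). Assets are held fixed,
    which is a simplifying assumption made only for this sketch.
    """
    rng = np.random.default_rng(seed)
    roa = np.full(n_paths, roa_0, dtype=float)
    ti = np.empty((n_paths, years))
    for t in range(years):
        eps = rng.normal(0.0, sigma, size=n_paths)   # iid Gaussian shocks
        roa = mu + rho * roa + eps                   # equation (3-a)
        ti[:, t] = roa * total_assets                # equation (3-b)
    return ti

paths = simulate_taxable_income(roa_0=0.10, total_assets=100.0,
                                mu=0.02, rho=0.5, sigma=0.03,
                                years=18, n_paths=50)
```

With these illustrative parameters, simulated ROA reverts toward the long-run mean of 0.02 / (1 - 0.5) = 0.04, so a firm starting at 0.10 is expected to become less profitable over the forecast horizon.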
3.4. Benchmark Marginal Tax Rates
As discussed in the introduction, we evaluate the merits of competing income forecasting models relative to benchmarks. Furthermore, we focus on the resulting marginal tax rate estimates, not the forecasts of an input (taxable income) into the MTR algorithm. Therefore, to compare how well the different forecasting methods predict the true MTR, we need a benchmark for the “true” marginal tax rate. We note that the true MTR is not observable because we do not know the distribution of future taxable income that corporate managers have in mind when they make their decisions. Therefore, in the following analysis we proxy for the true MTR using three benchmarks: marginal tax rates based on (i) perfect-foresight forecasts of future taxable income, (ii) estimated distributions of future income based on realized income, and (iii) analysts’ forecasts of future earnings.
18 Kothari (2001, p.147) supports the idea underlying our two-step approach to estimating the AR(1) parameters: “One weakness of cross-sectional estimation (of autoregressive models) is that firm-specific information on the time-series properties is sacrificed. However, this is mitigated through a conditional estimation of the cross-sectional regression.”
3.4.1. Benchmark Based on Perfect-foresight Forecasts
Our first proxy for the true MTR relies on the realization of future taxable income (hereafter referred to as the “benchmark based on perfect-foresight forecasts” or BM-PF), which is similar to that used in Graham (1996b). Creating the benchmark MTR based on perfect foresight is initially straightforward: We use realized future taxable income as a perfect forecast of the future, whenever it is available. For instance, assume that we want to forecast 18 years of future taxable income for IBM from fiscal year 1995 to 2012. We use realized income as a (perfect) forecast of the income for the first 13 years (1995-2007). Then we simulate taxable income using forecasting models (i.e., RW, BIN, or AR(1)) for the remaining five years after 2007 (2008-2012). Our primary benchmark is constructed by forecasting the remaining “non-realized” years of taxable income using the bin and AR(1) methods. More specifically, we use the average of income forecasts by the bin and AR(1) models to fill in the non-realized data. To check the robustness of results, we also construct benchmarks based on realized income by using just the bin model (or, separately, just the AR(1) model) to fill in the non-realized data. Given that qualitative results are similar with any of these approaches, below we present the benchmark MTR based on perfect-foresight forecasts when realized data are available, supplemented with the average from the bin and AR(1) forecasting models when realized data are not available.
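The construction of the perfect-foresight income path can be illustrated with a short Python sketch. It assumes, for the purposes of the example only, that unrealized years are marked with NaN and that a single forecast path is available from each of the bin and AR(1) models; the function and argument names are hypothetical.

```python
import numpy as np

def perfect_foresight_path(realized, bin_forecast, ar1_forecast):
    """Use realized income where available, else the bin/AR(1) average.

    realized holds np.nan for years that have not yet been realized.
    """
    realized = np.asarray(realized, dtype=float)
    fill = (np.asarray(bin_forecast, dtype=float)
            + np.asarray(ar1_forecast, dtype=float)) / 2.0
    return np.where(np.isnan(realized), fill, realized)

# Two realized years followed by two years that must be forecast.
path = perfect_foresight_path(
    realized=[10.0, 12.0, np.nan, np.nan],
    bin_forecast=[0.0, 0.0, 8.0, 6.0],
    ar1_forecast=[0.0, 0.0, 10.0, 8.0],
)
```

The robustness variants described above would correspond to replacing the average with the bin forecast alone or the AR(1) forecast alone.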
3.4.2. Benchmark Based on Estimated Distribution of Realized Income
An ideal benchmark MTR would be based on the distribution of future income that managers have in year t as they make tax-related decisions. Hence, one disadvantage of the benchmark introduced in the previous section is that it is based on just one realization (instead of a distribution) of future taxable income. While this problem may be mitigated to some extent in large samples (Graham, 1996b), we construct a second benchmark MTR based on the estimated distribution of future income. We estimate the distribution of future taxable income by assuming that the data-generating process (DGP) for the time-series of future realized income follows an AR(1) model (referred to as “benchmark based on AR(1) distributions” or BM-AR). That is, we estimate AR(1) processes of taxable income using realizations of future income. To mitigate the potential concern that the estimated distribution of future income relies on a single realization of the income stream, we employ the Blundell-Bond (1998) approach to estimate the AR(1) parameters as described in Section 3.3, instead of the firm-by-firm time-series OLS estimates. Then, we create the benchmark MTR based on forecasted taxable income tied to these “true” AR(1) processes for each firm-year.
3.4.3. Benchmark Based on Analysts’ Forecasts of Earnings
Our third proxy for the true MTR is based on security analysts’ earnings forecasts (hereafter we refer to it as “benchmark based on analysts’ forecasts” or BM-AN). One important advantage of using analysts’ forecasts to construct benchmark MTRs is that they are not based on realizations of future income. Also, analysts base their earnings forecasts on economic analysis, as well as statistical analysis of income processes. Institutional Brokers Estimate System (I/B/E/S) analysts do not forecast taxable income, but they do forecast earnings. Therefore, we make the following adjustment to create a forecast comparable to before-interest taxable income (“TIBIT” as described in Section 4) which is used by the simulation methods. 19 Specifically, we construct a measure of before-interest taxable income using analysts’ forecasts as

$$\frac{\text{earnings per share (EPS)} \times \text{number of shares outstanding}}{1 - \text{top statutory tax rate}} + \text{interest expense}.$$
This approach creates forecasts of future taxable income for years for which analysts’ forecasts of EPS are available. Lastly, we supplement the remaining years for which analysts do not provide earnings estimates using the average of taxable income forecasted by the bin and AR(1) models.
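The adjustment from an analyst EPS forecast to before-interest taxable income is simple arithmetic, which the following Python sketch spells out. The function name and the input values are hypothetical and chosen only to keep the numbers round.

```python
def analyst_tibit(eps_forecast, shares_outstanding,
                  top_statutory_rate, interest_expense):
    """Before-interest taxable income implied by an analyst EPS forecast.

    After-tax earnings are grossed up by (1 - top statutory rate) and
    interest expense is added back.
    """
    pre_tax_earnings = eps_forecast * shares_outstanding / (1.0 - top_statutory_rate)
    return pre_tax_earnings + interest_expense

# EPS of $3.30 on 10 million shares, a 34% top rate, and $5 million of
# interest expense (shares and dollar amounts in millions).
tibit = analyst_tibit(3.30, 10.0, 0.34, 5.0)
```

Here forecasted net income of $33 million grosses up to roughly $50 million before tax, so the implied before-interest taxable income is roughly $55 million.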
4. Sample Construction
In our analysis, we focus on firm-year observations with available Compustat data from 1980 to 1994 (so that we have a long ‘realized’ data period against which to compare). Consistent with Blouin et al. (2009), we exclude banks, insurance companies, REITs, ADRs, and foreign registrants. Additionally, we exclude firm-years with total assets less than $1 million because when we scale taxable income by total assets or compute the growth rate of assets, small values of assets can create very influential or erroneous observations (Fama and French, 2000). 20 We require that at least three years (t, t-1, and t-2) of historical taxable income are available to estimate the parameters of the random walk model and to form the income bins for the bin model. These data requirements result in a sample of 78,959 firm-year observations (the main sample).
19 The item closest to TIBIT in the I/B/E/S database is “EBIT” (earnings before interest and taxes). However, EBIT is available only from 1994, which is the last year of our 1980 to 1994 sample period.
20 Blouin et al. (2009) do not impose this restriction on firm size in their analysis even though they also scale by assets.
Following Blouin et al. (2009), we measure taxable income before interest expense is deducted. We also exclude extraordinary components of income from our forecasting analysis because they are non-recurring by nature and thus including them could be problematic when forecasting future taxable income. 21 The resulting measure of taxable income is “taxable income before interest expenses and transitory items” or “TIBIT” as in Blouin et al. (2009). The income is pre-financing (i.e., before interest expenses are deducted) in that it measures taxable income before all financing decisions (Graham, Lemmon, and Schallheim, 1998). Readers interested in further details of the definition of taxable income are referred to Appendix A of Blouin et al. (2009).
At times, we focus on subsamples with additional restrictions to facilitate the comparison of different MTR measures with the benchmarks. Specifically, when we compare simulated MTRs with those produced by the BM-PF and BM-AR benchmarks, we require three years of realized future taxable income (i.e., t+1, t+2, and t+3) data. We impose this restriction because a minimum of three years of realized future income helps ensure that our benchmark is properly based on realized income (considering that the three-year carryback period is important when determining current-period MTRs). Adding this data requirement yields 63,851 firm-year observations. Also, when we examine benchmark MTRs based on analysts’ forecasts of earnings, the sample contains the 29,059 firm-year observations that have I/B/E/S analysts’ earnings forecasts available.
5. Analysis of Income Forecasting Models in Comparison with Benchmark Models
In this section, we analyze simulated taxable income and estimated MTRs using the three income simulation models (RW, BIN, and AR(1)) in comparison to the benchmark models (BM-PF, BM-AR, and BM-AN). The analysis focuses on a given model’s ability to predict true (i.e., benchmark) marginal tax rates.
21 We use taxable income before interest including extraordinary components to estimate historical income and taxes paid.
5.1. Analysis of Simulated Taxable Income
We begin by replicating the first two tables in Blouin et al. (2009). Appendix Table 1 replicates BCG Table 1, which shows summary statistics on the six income groups used for the bin income simulation. Panels A and B present statistics for our main sample and the sample constructed as in BCG (i.e., without the minimum $1 million asset size restriction), respectively. Appendix Table 2 replicates parts of BCG Table 2, and compares the distributions of RW, BIN, and AR(1) taxable income relative to the distribution of realized future taxable income. Overall, our appendix tables replicate the corresponding tables in BCG in various dimensions, including the summary statistics, the median forecast errors, and percentile numbers for the simulated taxable income distribution. 22
In Figure 1 we examine the ability of RW, BIN, and AR(1) to predict the sign of future taxable income (i.e., taxable profits or losses). Given that mean reversion in taxable income affects the marginal tax rate through loss carrybacks and carryforwards, the model’s ability to forecast the likelihood of future tax losses or profits is important for simulating MTRs. If future taxable income switches to profits (losses) for loss (profitable) firms, there would be a reasonable chance that MTRs of the firms will be greater than zero (less than the top rate). Given the long-horizon nature of income forecasting in the MTR estimation (see Section 2.2), we examine each model’s performance up to eight years in the future. Panel A of Figure 1 shows that for the lowest income group (income bin 1), the AR(1) model predicts the frequency of actual taxable profits as well as (t+2) or better than (t+3 onward) the bin and the RW models. For these loss firms, the bin model too often predicts a switch to profits (compared to the actual frequency of positive future income), while the RW model does not switch to profits as often as observed in actual data. Panel B of Figure 1 presents the predictive ability of the three models among firms in the highest-income group (bin 6). For this group of firms, AR(1) again performs consistently better than BIN and RW in predicting the frequency of positive income throughout the forecast horizon.
For firms with high current-period profits, the bin model too often predicts a switch to future losses (compared to the actual frequency of future losses), while the RW model does not switch to losses as often as observed in actual data. Overall, Figure 1 shows that the AR(1) method is the most accurate of the three methods in predicting the frequency of future taxable profits/losses. Figure 1 also suggests that the bin approach too often switches the sign of future income, consistent with too much forecasted mean reversion.
22 Recall that in our main analysis we restrict total assets of firms to be larger than or equal to $1 million, while BCG do not. As shown in Panel B of Appendix Table 1, our replication of BCG is even better if we do not impose this size restriction on the sample.
Before evaluating estimated MTRs, we first clarify that each of the three approaches (random walk, bin, AR(1)) is simply a model to forecast future taxable income, which is an input into predicting current-period MTRs. It is important to keep in mind that the ultimate goal is to estimate MTRs. Therefore, while it is helpful to understand which method produces the best forecast of a key input (i.e., taxable income) into the MTR algorithm, 23 the objective of the entire exercise is to estimate marginal tax rates. Given the nonlinear nature of the MTR algorithm (bounded between zero and about 50%, affected by tax-loss carrybacks and carryforwards, etc.) it is not sufficient to study just taxable income. We need to understand which model produces the best final product (MTRs) – and this needs to be verified by examining the distribution of the final MTR product itself. Blouin et al. (2009) never compare simulated MTRs to benchmark MTRs. We turn to this analysis in the next section.
5.2. Distribution of Marginal Tax Rates
Table 1 examines distributions of MTRs based on the RW, bin, and AR(1) models. Note that in our analysis of MTR distributions, we focus on subsamples described in Section 4, which consist of observations for which (i) at least three years of future income data are available (Panel A) or (ii) analysts’ earnings forecasts from I/B/E/S are available (Panel B). Panel A of Table 1 summarizes the basic MTR distribution results (analogous to BCG Table 3). In general, the bin model estimates very few near-zero marginal tax rates. For example, the MTR is 3% at the first percentile of the BIN distribution of MTRs. In contrast, AR(1) and RW predict tax rates with a value of zero for some firms (0% MTR at the first percentile). Hence, the distribution of the MTR estimated using BIN is less skewed towards the left tail than the AR(1) or RW distributions. This overall distributional comparison might seem to imply that, except for the firms that potentially have zero tax rates, the MTRs produced by RW, AR(1), and BIN are similar. To address this issue, we examine the full distribution of the MTR using histograms by income group, as well as for the full sample.
Figure 2 presents histograms of the MTR distributions for the full sample. 24 The histograms illustrate that the bin method estimates the distribution of MTRs differently than the other two methods. The distributions of the RW and AR(1) MTRs are clustered around the top rate (34%) and also have a small spike near zero. In contrast, the BIN MTR has fewer observations near the top rate, very few observations near zero, and more observations with intermediate values. The intuition is straightforward. The bin model induces more mean reversion in taxable income relative to the RW model and potentially the AR(1) model, and therefore BIN predicts that many profitable (unprofitable) firms become less (more) profitable in the future. Thus, BIN estimates below-top (non-zero) marginal tax rates more frequently for the profitable (loss) firms compared to the other two models. This point is clearer in Figure 3. For the least profitable group of firms (bin 1), both RW and AR(1) predict near-zero MTRs for one- to two-fifths of the observations. In stark contrast, BIN predicts near-zero MTRs for about 1% of the lowest income firms. Similarly, when we compare histograms in Figure 4 for highly profitable firms, AR(1) and RW predict that more than 80% of firms have near-top tax rates, while BIN predicts that only about half of these extremely profitable firms have tax rates at the statutory maximum.
23 Once a taxable income forecast is chosen, all three methods use the same coding to convert the inputs into marginal tax rate estimates.
24 For ease of visual comparison, we rescale computed MTRs in all histograms (but, not in any tables) so that the statutory rate for the top income group is always 34% (which is the 1988-1993 top rate), and separately scale the “claw back” tax rates for mid-range income (that are higher than 34%) to the 1988-1993 claw-back maximums.
However, we note that the analysis so far only compares the distributions of the AR(1) and the RW MTRs relative to that of the BIN MTR, and it would be preferable to make absolute evaluations based on comparisons to a “true” benchmark. To compare the MTR distributions with the benchmark distribution, we turn to the fourth column in Panel A of Table 1, which presents the distribution of benchmark MTRs based on perfect forecasts of future income (BM-PF). Comparing the third and fourth columns indicates that the overall distribution of the MTR computed using the benchmark model, BM-PF, is most similar to the AR(1) distribution. In both of the distributions, some fraction of firms has MTRs of zero, and the 10th, 25th, and other percentile numbers are quite similar.
In contrast, relative to the benchmark, the bin method produces too few zero MTR observations and thus the numbers at the first and 10th percentiles are different from those of the benchmark. The distribution of the RW MTR is also close to the perfect-foresight benchmark distribution. Furthermore, the overall distribution of benchmark MTRs based on estimated distributions of future income in the fifth column (BM-AR) is also quite similar to the distributions of AR(1) and RW tax rates (as well as being similar to the BM-PF benchmark).
The similarity between the distributions of the AR(1) and benchmark MTRs is clear in Panels C through E of Figure 2, which show the distributions for the full sample. The histograms of AR(1) and benchmark (i.e., BM-PF and BM-AR) MTRs share the common u-shape, with a majority of the observations clustered around the top rate. Panels C through E in Figures 3 and 4 emphasize the same point in a more salient way. 25 The AR(1) MTR distribution is very similar to the benchmark MTR distributions based on perfect-foresight forecasts of income and estimated income distributions, particularly conditional on extremely negative or positive income. In contrast, the histograms of the BIN MTRs are different from those of the benchmark MTRs, implying that the bin model produces a distribution of the marginal tax rate that is different from the true distribution (and much more clustered near mid-range MTRs). We note that the histograms of the RW MTRs are also close to those of the benchmark MTRs across samples.
In sum, the results in this section are consistent with results on each model’s ability to predict future profits or losses in Section 5.1: The distribution of MTRs produced by AR(1) is closest to that of the true MTRs, while BIN MTRs are much more clustered near the center of the distributions because the bin approach induces excessive income mean reversion both for profitable and loss firms. Interestingly, although the RW model does not allow mean reversion in income, the MTR distributions for RW are similar to those for the benchmarks. One plausible explanation for this result is that the bias that the lack of mean reversion introduces to MTR estimates is mitigated by the fact that loss firms tend to have sizable loss carryforwards from previous periods while profitable firms generally do not have loss carryforwards (see Section 2.2 for related discussion). Also, of course, the RW method has the advantage of using firm-specific information.
Panel B of Table 1 compares distributions of simulated MTRs to the benchmark distribution based on analysts’ forecasts (BM-AN).
This analysis indicates that the AR(1) distribution of MTRs is closest to the benchmark distribution among the three forecasting models in terms of the mean, skewness, kurtosis, and percentile numbers of distributions. Moreover, Panel F in Figures 2, 3, and 4 suggests that distributions of AR(1) MTRs are closest to the benchmark distributions based on analysts’ forecasts both conditionally (on income) and unconditionally. Given that the BM-AN benchmark is based on an income distribution that differs from the other two benchmarks, the similarity of its marginal tax rate distributions to those for AR(1) provides assurance that the AR(1) model produces MTR estimates that are close to the true MTR.
25 In a separate analysis, we correct for the survivorship bias caused by the disappearance of firms from Compustat in the BM-PF benchmark. Since firms that disappear from Compustat for negative reasons (e.g., liquidation and bankruptcy) would probably have MTRs close to zero, supplementing those firms’ missing future income forecasts using the bin approach (combined with the AR(1) approach like we do in the analysis in the paper) could bias those MTRs toward higher values. In untabulated analysis we find that correcting this bias in the benchmark makes the distribution of the benchmark MTR even closer to that of the AR(1) MTR, and even less similar to that of the BIN MTR, particularly for the lowest income group.
5.3. Predictive Ability - Firm-by-firm Marginal Tax Rates
The results in the previous section show that the MTR distribution created using AR(1) is closer to the benchmark MTR distributions than are the distributions created using the RW and BIN approaches. While this might suggest that the AR(1) model is the closest to the true model, one important question remains: Which forecasting model predicts firm-by-firm marginal tax rates most accurately? Similarity in distributions only suggests, and does not guarantee, that the firm-by-firm values underlying those distributions are also similar, because individual observations could “switch places,” leaving the distribution looking accurate while individual observations have large errors. Hence, we now examine whether the AR(1) model indeed predicts firm-by-firm benchmark marginal tax rates more accurately than the bin and RW models. Panel A of Table 2 presents the fractions of firm-year observations with an absolute prediction error (|estimated MTR - benchmark MTR|) less than 2% for each of the three benchmark MTRs (BM-PF, BM-AR, and BM-AN). 26 For the full sample and all subsamples, AR(1) has the greatest proportion of observations with small errors in terms of predicting benchmark MTRs. Focusing on the first three columns in which the true MTR is proxied by the MTR based on perfect forecasts of income (BM-PF), the absolute prediction error by AR(1) is larger than 2% in only 22% of the firm-year observations for the full sample, while the errors by BIN and RW are larger than 2% in absolute value for 35% and 29% of the observations, respectively. The rate of “close” prediction is notably higher for AR(1) than BIN for all income groups. RW performs almost as well as AR(1) in the two highest income groups, but not as well in the other income groups.
The middle and rightmost columns of Table 2, Panel A also indicate that the AR(1) method produces more small-error observations than the other methods for both of the benchmarks based on the estimated distribution of future income (BM-AR) and analysts’ forecasts (BM-AN). For all three benchmarks, then, AR(1) hits closer to the benchmark target than do the RW and bin approaches. Figure 5 presents similar evidence from a different perspective. It shows the mean forecasting error of the three methods by subsample. We sort benchmark BM-PF MTRs into deciles (ten equal-sized groups) and then calculate the average forecasting error for each decile. Previous analysis indicates that BIN (RW) induces too much (little) mean reversion in taxable income.
26 The results are robust to different cutoffs (e.g., 1%, 3%, or 5%).
Consistent with this result, Panel B of Figure 5 shows that BIN predicts low MTRs poorly for low income groups (i.e., large error relative to the benchmark MTR). In contrast, Panel C shows that among high income firms, the RW model produces too few low MTRs. The same implications can be detected in the full sample analysis shown in Panel A. In contrast to the other two models, the AR(1) model produces smaller errors than BIN (RW) for low MTRs in the lowest (highest) income group. Importantly, the mean forecast error of AR(1) for the full sample (Panel A of Figure 5) is the smallest in absolute magnitude among the three models in nine of the ten deciles. This result also holds when we use BM-AR or BM-AN as the benchmark for the true MTR (unreported). In sum, consistent with results for income forecasts and distributions of MTRs in previous sections, AR(1) performs consistently well in predicting firm-by-firm benchmark MTRs for various samples, even in cases where the bin or RW models perform relatively poorly.
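The two firm-by-firm accuracy measures used in this section, the share of observations with an absolute prediction error below 2% and the mean forecast error by benchmark-MTR decile, can be computed as in the following Python sketch. The function names are hypothetical, the forecast error is signed as estimated minus benchmark (an assumption, since the sign convention is not stated in the text), and the MTR values in the example are made up for illustration. Because many benchmark MTRs are tied at zero or at the top rate, the sort-based grouping below assigns tied observations to groups arbitrarily.

```python
import numpy as np

def share_within(estimated, benchmark, cutoff=0.02):
    """Fraction of observations with |estimated - benchmark| below cutoff."""
    err = np.abs(np.asarray(estimated, dtype=float)
                 - np.asarray(benchmark, dtype=float))
    return float(np.mean(err < cutoff))

def mean_error_by_group(estimated, benchmark, n_groups=10):
    """Mean (estimated - benchmark) within groups sorted on the benchmark."""
    est = np.asarray(estimated, dtype=float)
    bm = np.asarray(benchmark, dtype=float)
    order = np.argsort(bm, kind="stable")       # sort on benchmark MTR
    groups = np.array_split(order, n_groups)    # near-equal-sized groups
    return [float(np.mean(est[g] - bm[g])) for g in groups]

estimated = [0.34, 0.30, 0.10, 0.00]   # illustrative simulated MTRs
benchmark = [0.34, 0.34, 0.05, 0.00]   # illustrative benchmark MTRs
close_share = share_within(estimated, benchmark)
errors = mean_error_by_group(estimated, benchmark, n_groups=2)
```

In this toy example half of the observations fall within the 2% band, and the mean error is positive in the low-benchmark group and negative in the high-benchmark group, the pattern one would expect from a model that induces too much mean reversion.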
5.4. Robustness Check – 1970s Sample We examine whether the key findings hold for a different sample by performing the analysis using firm-year observations from 1970 to 1979. 27 Panel D of Figure 5, which parallels Panel A of Figure 5, presents the average error in predicting the perfect-foresight benchmark MTR (BM-PF) for the full sample in the 1970s. Notably, the patterns in Panel A (for 1980 to 1994) are also apparent in Panel D (for the 1970s): The AR(1) model has the smallest mean error in predicting the benchmark MTRs in every decile. In untabulated analyses, AR(1) also shows superior performance in predicting the benchmark MTRs for various income subsamples. Panel B of Table 2 presents the proportion of firm-years for which each model predicts the benchmark MTR with a small error (i.e., a prediction error within ±2%). Although the margin is smaller than that for the main sample from 1980 to 1994 28 (see Panel A of Table 2), AR(1) continues to perform better than the other two models in predicting benchmark MTRs in all the income bins. In short, this analysis suggests that among the three income forecasting models that we examine, the AR(1) model estimates marginal tax rates that are closest to the benchmark MTR for the 1970s sample, as it does for the main sample from 1980 to 1994.
27 The NOL carryforward period was five years from 1970 to 1975 and changed to seven years in 1976. The carryback period was three years throughout the 1970s.
28 Given that MTR estimates are more sensitive to income forecasts as the length of carryback or carryforward periods increases, this result may be a consequence of a shorter carryforward period for 1970-1979 compared to our main sample period, 1980-1994 (five to seven vs. 15 years).
6. Marginal Tax Benefit Functions So far, we have examined how different income simulation models predict firm-by-firm benchmark marginal tax rates as well as the distribution of the benchmark MTRs. We now turn our attention to a second tax-related issue, which is based on tax benefit functions like those calculated in Graham (2000). In particular, we investigate what effect, if any, the tax rate measurement issues described above have on inference based on benefit functions in aggregate or for the typical firm in the sample. A marginal tax benefit function measures the tax savings benefit associated with deducting interest expense for the next dollar of interest deduction. For example, a profitable firm might save $0.34 (the top rate during our sample) in taxes for each of the initial dollars of interest it deducts from taxable income, so its tax benefit function would be flat at a value of 0.34 for initial interest deductions. As the firm (hypothetically) increases its level of interest deductions, at some point the deductions become large enough that the probability that the firm will experience nontaxable states increases and thus the expected tax benefit of the incremental deductions begins to fall. At this point, the marginal benefit function becomes downward sloping (i.e., the marginal benefit of the next dollar of interest falls below $0.34) because (ignoring carrybacks) interest deductions are often not fully valued in states where interest completely offsets taxable income (or taxable income is negative). This point where the marginal tax benefit function begins to slope downward is known as the “kink” in the function. If a firm were to lever up to the kink in its function, the firm could continue to accrue full tax benefits (i.e., not reduced by the possibility of entering nontaxable states of the world) up to that amount of debt. (This statement of course ignores all costs of debt.) 
The kink measure can be interpreted as a measure of debt conservatism: If a firm’s kink is bigger than one, the company is not exploiting its interest tax shields to the point of declining marginal benefit; if another firm’s kink is less than one, the firm uses debt aggressively enough that it sometimes experiences nontaxable states (and does not reap the full tax benefit on the last dollars of interest deduction). It is important to point out that the kink measure of debt conservatism does not in and of itself indicate that a firm is behaving irrationally or is underlevered in a normative sense. It is possible that the costs of debt are high for a firm with a large kink (see van Binsbergen, Graham, and Yang, 2009). Alternatively, it is possible that some
firms with large kinks are suboptimally levered but that is difficult to prove and is well beyond the scope of this paper.
6.1 Tax Benefits of Debt and Cost of Capital In this section, we examine the value of tax benefits of debt by integrating the area under the tax benefit function. First, we create tax benefit functions using each of the RW, bin, and AR(1) models and the perfect-foresight benchmark (BM-PF) model. Second, we compute the value of tax benefits of debt by integrating the area under the benefit function up to the actual amount of interest, and divide the computed tax savings by book value of assets. Panel A of Table 3 presents the estimated average tax benefits of debt scaled by book assets for the full sample by year. The first column in the panel shows that based on the benchmark (BM-PF) benefit function, the annual tax savings due to deducting debt interest amount to 2% to 0.9% of firm value from 1980 to 1994. The decrease in the top statutory tax rate from 46% to 34% due to the 1986 Tax Reform Act is evident in the data. 29 The capitalized value of tax benefits of debt for the typical (i.e., average) Compustat firm is about 12.7% of asset value for the benchmark over 1980 to 1994. (Capitalized savings are calculated as a perpetuity using each year’s Moody’s corporate bond interest rate as discount rate.) Importantly, for all three simulation methods (RW, bin, AR(1)) the tax benefits of debt also average about 12.7%. Thus, regardless of which simulation approach produces the best point MTR estimates (the topic studied in Section 5), the tax benefit of debt implications do not vary by simulation method. Though we have not presented specific tax benefit functions yet, this result suggests that the tax benefit functions are similar for all simulation methods, at least on average. The aggregate value of interest tax benefits is particularly interesting given that the Volcker commission is currently (in December 2009) considering whether to eliminate the tax deductibility of debt interest as part of broad tax reforms.
Based on our estimates of the capitalized value of tax benefits, the change in the tax code would reduce the value of the typical U.S. firm by about 13% of asset value, which is equivalent to a $4 trillion decrease in value for the aggregate Compustat firms in 2008.
29 Given that the annual tax savings estimated using the three simulation models (i.e., RW, BIN, or AR(1)) are very similar to estimates based on the benchmark model, we only report the benchmark of annual tax savings.
Similarly, as shown in Panel B of Table 3, the cost of capital would have increased by about 100 basis points for the typical Compustat firm from 1980 to 1994 if the interest deductibility had been eliminated. For example, under the assumption of a 7% equity premium and the average marginal tax rate for the sample (34%), the weighted cost of capital for the typical company would have increased from 13.0% to 14.1%.
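As a rough illustration of the two-step calculation described in this section, the following Python sketch integrates a marginal tax benefit curve up to the actual level of interest and then capitalizes the annual savings as a perpetuity. The function names, the trapezoid rule, and every number in the example (the curve, the interest expense, the 10% discount rate, and the book assets) are hypothetical and are not taken from the paper's data or code.

```python
def annual_tax_savings(multiples, rates, actual_interest, upto=1.0):
    """Area under a marginal tax benefit curve, in dollars.

    multiples: interest levels as multiples of actual interest (ascending).
    rates: marginal tax benefit at each multiple (e.g., 0.34).
    upto: integrate from 0 to this multiple (1.0 = actual interest).
    """
    area = 0.0
    for i in range(len(multiples) - 1):
        lo, hi = multiples[i], multiples[i + 1]
        if lo >= upto:
            break
        end = min(hi, upto)
        # Linearly interpolate the benefit rate at the end of this slice.
        rate_end = rates[i] + (rates[i + 1] - rates[i]) * (end - lo) / (hi - lo)
        # Trapezoid rule on the slice [lo, end].
        area += 0.5 * (rates[i] + rate_end) * (end - lo)
    return area * actual_interest


def capitalized_value(annual_savings, discount_rate):
    """Value of a level perpetuity of annual tax savings."""
    return annual_savings / discount_rate


# Hypothetical firm: flat benefit of 0.34 up to 1.0x interest, declining after.
multiples = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
rates = [0.34, 0.34, 0.34, 0.30, 0.20, 0.10]
savings = annual_tax_savings(multiples, rates, actual_interest=10.0)
capitalized = capitalized_value(savings, discount_rate=0.10)
share_of_assets = capitalized / 250.0  # hypothetical book assets of 250

print(round(savings, 4), round(capitalized, 4), round(share_of_assets, 4))
```

In this made-up example the firm saves 0.34 per dollar on all of its actual interest, so the annual savings equal 0.34 times interest expense; the perpetuity step then scales that figure by the inverse of the discount rate.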
6.2. Average Tax Benefit Functions Results in the previous section indicate that different simulation models produce similar estimates for the value of tax benefits of debt. In this section, we extend this analysis by comparing the average tax benefit functions estimated using the three forecasting models with the average benchmark (BM-PF) benefit function. Specifically, using each of the simulation models, we compute the average marginal tax benefit function across all firm-years in the sample for each level of interest deduction (from 0.0 to 8.0 times the actual level of interest). Figure 6 plots the average tax benefit curves. In the figure, we show average benefit curves for firm-years having marginal tax rates greater than or equal to 15% (Panels A, B, and C – high MTR sample) and smaller than 15% (Panel D – low MTR sample). 30 The average benefit functions in Panel A (for the full, high-MTR sample) indicate that the RW, bin, and AR(1) benefit curves are very close to the benchmark (BM-PF) benefit function as well as to each other. Although the benefit function based on the bin model lies slightly below those based on the other models, the differences are negligible. Panels B and C further indicate that all four tax benefit functions are roughly similar, on average, for the least and most profitable firms. Turning to the low MTR (i.e., smaller than 15%) sample, Panel D shows that the tax benefit function estimated using the bin model is the largest. This is not surprising because earlier we showed that the bin model induces too much mean reversion in taxable income for loss firms and thus tends to forecast higher MTRs than the benchmark as well as the other models. In sum, results in Figure 6 corroborate our argument that using different income forecasting models on average leads to similar implications about the tax benefit of debt. To the extent that any of the methods stands out, this occurs in Panel D for the bin tax benefit functions, which is noticeably
30 We choose the 15% cut point following Graham (2000), who sets kink to zero for firms having MTRs less than 15%. However, the quantitative results hold when we use other cut points such as 10% and 20%.
above the benchmark curve, in contrast to the close proximity of the curves estimated using the RW and AR(1) models.
6.3. Debt Conservatism The analysis in the previous sections indicates that conclusions based on tax benefit functions are similar for all three simulation methods. Given this, it would be surprising if implications about debt conservatism (i.e., where a firm operates on its benefit function relative to the kink) vary much across the three methods. To investigate whether different simulation models lead to differing implications about how aggressively companies use debt, we focus on the kink in marginal tax benefit functions. We do this using benefit functions calculated using the three different income simulation models as well as a benchmark model. 31 We start with the definition of “kink” used in the previous literature (i.e., the point where the benefit curve first declines from one interest increment to the next by at least 0.5% as in Graham, 2000, p. 1915). 32 Given that the choice of the 0.5% cutoff is ad-hoc, we also examine cut-offs of 1.0%, 1.5%, and 2.0%. These are all reasonable given that they are all well below one-tenth of the maximum tax rate. That is, even the largest of these alternative cutoffs only requires the benefit function to fall by about 1/20th of the maximum starting point to declare the marginal benefit function as downward sloping. As we show below, smaller cut-offs (like 0.5%) can create kink measures that are overly sensitive and misleading in terms of categorizing firm behavior relative to benefit functions. To illustrate the sensitivity of kink to the chosen cutoff, Panel A of Figure 7 presents tax benefit functions based on each of the three income simulation models, and the benchmark benefit function based on perfect-foresight forecasts (BM-PF), for a sample firm (AAR Corp. in fiscal year 1980). If we define kink using a 0.5% cutoff, AAR’s kinks computed using the bin, AR(1), RW, and benchmark BM-PF methods are 0.8, 1.8, 2.2, and 2.4, respectively. If one were to
31 In the analysis, we focus on benchmark benefit functions based on perfect-foresight forecasts of income (BM-PF), given that benchmark curves are very similar when we use an alternative benchmark measure (e.g., BM-AR).
32 Blouin et al. (2009) use a different definition of kink in their analysis. They define the kink in the benefit function as the first interest increment at which the firm’s after-interest MTR is at least 50 basis points lower than the firm’s before-interest MTR (i.e., marginal tax rate with no interest, the intercept of the benefit function on the y-axis). By definition, this approach produces smaller kinks than those in Graham (2000) and related papers (and therefore, the comparison in the Blouin et al. tables 5 and 6 to the Graham (2000) kink is apples to oranges). If we were to adopt the Blouin et al. definition in this paper, the relative comparison across simulation methods does not change, nor do the overall qualitative implications.
narrowly focus on these kink numbers, one might conclude that AAR uses debt aggressively based on the kink determined by the bin model but somewhat conservatively based on the other models. However, these numbers hide the fact that all four methods produce marginal tax benefit functions that are similar, as shown in the figure. That is, for all the benefit curves, AAR enjoys almost full tax benefits up to double the actual interest deduction. It is visually clear that a reasonable measure of the point where marginal tax benefits “drop off” (i.e., the kink in benefit functions) would be about two, not 0.8 as implied by bin. Thus, the 0.5% cutoff leads to a kink that is “too small” for the bin approach. Note that for the AR(1), RW, and benchmark BM-PF models, the benefit functions in Panel A tend to stay flat before reaching a kink, and then drop steeply downward; therefore, the choice of the cutoff value to define kink makes much less difference for these three methods. In contrast, the BIN marginal benefit curve slowly begins to slope downward as interest deductions increase. Two conclusions can be drawn from these observations. First, at least for AAR Corp. in 1980, the random walk marginal benefit curve will produce a kink that is similar to the benchmark kink regardless of the chosen cutoff (including the 0.5% cutoff value). Second, for the bin model that produces benefit curves that slope downward gradually, the value of kink can be very sensitive to the chosen cutoff. In particular, a small cutoff might detect a very gradual downward slope in the benefit function and produce a kink that is numerically small, but this does not necessarily reflect aggressive debt policy. In contrast, if the cutoff bandwidth were larger, all three models appear to produce similar kinks. 33 In Panel B of Figure 7, we examine how gradually sloping benefit curves can generally affect estimates of kink.
Specifically, we focus on observations for which the benchmark (BM-PF) kink is equal to 8.0 (the maximum possible) under the cutoff value of 0.5%, and plot average benefit curves created by the RW, bin, and AR(1) models. Panel B plots the average functions for which kinks estimated by each of the models (using the 0.5% cutoff value) is between 1.0 and 2.0. In Panel B the simulation models produce gradually sloping benefit curves which are not very different from the benchmark curve on average; therefore, the small kink implied by the simulation methods is due to using a small bandwidth to define kink – because a small decrease in the function is detected as the kink. While in Panel B the three benefit curves created by RW, BIN,
33 For example, if the cutoff were 200 basis points in Panel A of Figure 7, the kink for the bin, AR(1), RW, and benchmark models would be 1.8, 1.8, 2.2, and 2.4, respectively.
and AR(1), respectively, start to slope downward between 1.0 and 2.0 times of interest expense, the slope is very gradual – in fact, the marginal tax benefit of debt is still more than 25% at 8.0 times of interest. We explore this point in the full sample in Panel C of Figure 7 (and Table 4). In this analysis, we measure the kink in tax benefit functions using differing cutoff values (rather than focusing on a fixed 50 basis point cutoff value). By varying the cutoff used to define kink within a reasonable range, we aim to measure the “economic kink” in the benefit function, and avoid inadvertently detecting a point where the curve decreases only gradually. Panel C plots the average kink estimated using the three income simulation models and the benchmark model under various cutoff values from 0.5% through 2.0%. The panel shows that the value of kink based on the bin model is very sensitive to the choice of the cutoff value. Based on a 0.5% cutoff like that used in Blouin et al. (2009), the bin kink is far too low in comparison to the “true” benchmark kink of about 2.5. When we increase the cutoff bandwidth from 0.5% to 2.0%, the average kink more than doubles from 1.27 to 2.73 for the bin approach. In contrast, measured kink is not as sensitive for the AR(1) or RW models. When we use 0.5% as a threshold to define kink, each model computes a numerically different kink, and thus on the surface implications about corporate debt conservatism might appear to vary by simulation method. However, as we increase the cutoff to a reasonable value such as 1.5% or 2% in order to avoid the measurement issue described above, the average kink in benefit functions is around 2.5 for all models. Thus, it appears that on average U.S. public corporations use debt conservatively. 
34 Therefore, Blouin et al.’s (2009) argument that firms do not seem to use debt as conservatively as previously thought appears to be driven by using the bin simulated MTR combined with their specific choice of cutoff (i.e., 0.5%); it is not driven by economically meaningful differences in the benefit functions in BCG (2009) versus those in previous studies. In sum, our evidence suggests that the bin model’s implication about corporate debt policy is sensitive to the cutoff specification to define kink, while implications from the random walk and AR(1) models are more robust to these specification issues. Furthermore, we find that when using
34 As stated before, because we have ignored the cost of debt in this analysis, this does not necessarily imply that these firms use debt suboptimally. Perhaps costs of debt justify what might appear, based on gross tax benefits, to be conservative debt policy. See van Binsbergen et al. (2009) for a study that deduces what the costs of debt must be to justify the observed debt choices and marginal tax benefit functions.
a definition of kink that is less sensitive to gradually sloping marginal benefit functions, the average kink for sample firms is between two and three, which is consistent with findings in Graham (2000).
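The sensitivity of kink to the cutoff can be illustrated with a short Python sketch. It assumes that kink is recorded as the last interest multiple before the benefit curve first drops by at least the cutoff from one increment to the next, and that a curve with no such drop receives the maximum multiple on the grid. The function name, the grid, and both curves are hypothetical and only mimic the gradual (bin-like) and steep (RW-like) shapes discussed above.

```python
def kink(multiples, rates, cutoff=0.005):
    """Return the interest multiple at which the benefit curve first
    declines by at least `cutoff` between adjacent increments.

    If the curve never declines by that much, return the largest multiple.
    """
    for i in range(len(rates) - 1):
        if rates[i] - rates[i + 1] >= cutoff:
            return multiples[i]
    return multiples[-1]


# Hypothetical grid of interest multiples and two made-up benefit curves.
multiples = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0, 3.0, 4.0]
gradual = [0.340, 0.340, 0.340, 0.334, 0.328, 0.322, 0.290, 0.200]
steep = [0.340, 0.340, 0.340, 0.340, 0.340, 0.340, 0.250, 0.150]

for cutoff in (0.005, 0.010, 0.020):
    print(cutoff, kink(multiples, gradual, cutoff), kink(multiples, steep, cutoff))
```

With these made-up curves, the gradually sloping function yields a kink of 0.8 under the 0.5% cutoff but 2.0 under the 2.0% cutoff, while the steep function yields 2.0 under every cutoff, which is the pattern the text attributes to the bin model versus the other models.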
6.4. High Kink and Debt Conservatism Benefit functions based on all three simulation models suggest that firms on average could increase their debt usage by two to three times before experiencing sharp decreases in tax benefits. This result begs the question of why some firms use debt conservatively (i.e., have high kinks), thereby leaving unexploited tax benefits on the table. Do these firms face high costs of debt? Graham (2000, Table 3) investigates this question by examining relations between kink and variables commonly used to proxy for the cost of debt. He finds that firms having high kinks are larger, more profitable, and pay more dividends compared to those with low kinks. Blouin et al. (2009) extend Graham’s analysis by examining additional variables that represent the ability to bear debt and the existence of non-debt tax shields. Panel A of Table 5 replicates part of Table 8 of Blouin et al., comparing firms that have kinks (based on the 0.5% cutoff value) greater than or equal to four (or as they call them, “Apparently Under-levered (AUL) Firms”) to a sample of companies that are in the top two income groups (groups 5 and 6) and have kinks less than two. The AUL companies are more likely than the comparison sample to hold more cash, have higher earnings volatility, are older and more likely to be in the technology and pharmaceutical industries. Thus, we are able to generally replicate the Blouin et al. analysis. 35 These characteristics generally appear to be consistent with the argument that AUL firms face higher costs, or do not have as much debt capacity, as do comparable firms. Notice that in Panel A IOB (interest expense divided by book assets) is much lower for the AUL firms than for the comparison sample. Therefore, it is hard to isolate whether the characteristics described in the previous paragraph identify companies with low interest, high kink, or both. 
It is not surprising that interest and kink are negatively correlated, nor that the characteristics in the panel are correlated with low interest debt policy given the vast capital structure literature that documents similar results (see Frank and Goyal, 2009, for a summary).
35 Note that our sample period from 1980 to 1994 is different from the 1980 to 2007 sample BCG (2009) study for the AUL analysis. We therefore do not present results for the proportion of foreign assets and the indicator for high stock option deductions which Blouin et al. examine because data on those variables are sparse during our sample period. We also note that our result for differences in accrual and cash tax rates is different from BCG’s (2009) Table 8, potentially due to the difference in sample period. However, our results for other variables are consistent with BCG’s.
In order to separate the low interest effect from the high kink effect, in Panel B we replicate the experiment except that for every AUL company we find an IOB-matched company among firms that have kink less than four and are in the top two income groups. 36 This matching produces a mean IOB of 0.7% for both the AUL and comparison sample. This allows us to isolate differences in kink across the two samples, holding interest usage constant. Holding interest constant, we investigate whether common measures of cost explain why some companies (the AUL firms) use debt conservatively (i.e., have kinks averaging 7) while other companies (the comparison firms) use debt more aggressively (have kinks averaging 1.3). The results in Panel B show that the AUL firms and the matched firms have relatively similar characteristics despite the large difference in kink. As we move from Panel A to Panel B, differences between the AUL and matched firms in book and market leverage, q, and age become insignificant or much less significant (based on Fama-MacBeth standard errors). Moreover, differences in earnings and assets volatilities, cash holdings, and the proportion of pharmaceutical firms reverse sign from Panels A to B, indicating that the AUL firms exhibit lower volatilities, hold less cash, and are less represented in the pharmaceutical industry than the matched firms. Only a few variables such as dividend payout and the proportion of technology-intensive firms maintain the original sign and significance in Panel B. Overall, then, the evidence is mixed and one would not conclude that common cost variables pervasively explain why high-kink AUL companies use debt less aggressively. Given our conclusion in Section 6.3 that kink based on the bin simulation model is very sensitive to the choice of the bandwidth to define kink, we base Panels C and D on the kink in perfect-foresight benchmark benefit curves (i.e., BM-PF), which is more robust to this measurement issue. 
In general, results in Panels C and D are similar to those in Panels A and B. Differences in most of the characteristics become insignificant or flip sign in Panel D, where we control for the level of interest (i.e., IOB); only ROA, q, dividend payout, and age are significant in Panel D (and generally speaking, the link between several of these variables and debt costs is not unambiguous anyway). Notably, the proportions of technology and pharmaceutical firms are higher for the matched sample than the AUL sample, as are earnings and asset volatility.
36 We sample matched firm-years with replacement given the relatively small number of firms having comparably high profitability in the match (i.e., kinks less than four) sample.
Overall, the analysis in Table 5 suggests that few of the characteristics examined by Blouin et al. (2009) seem to explain high-kink firms’ conservative usage of debt once the level of interest expenses is controlled for. That is, few of the characteristics distinguish firms that use debt conservatively (i.e., have high kinks) from those that use debt more aggressively (i.e., have low kinks), among firms that have relatively low levels of interest expense. Thus, the cross-sectional puzzle as to why some firms operate with high kinks has not been entirely resolved. This is an important area for future research.
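The IOB-matching step used for Panels B and D can be sketched in Python as follows. The paper does not spell out the matching rule, so this sketch assumes a nearest-neighbor match on IOB, drawn with replacement from firm-years that have kink below four and sit in income groups 5 or 6. The record layout, field names, and sample values are all hypothetical.

```python
def is_comparison_candidate(firm):
    """Comparison pool: kink below four and in the top two income groups."""
    return firm["kink"] < 4 and firm["income_group"] in (5, 6)


def match_on_iob(aul_firms, candidates):
    """Map each AUL firm-year id to the id of the eligible candidate with
    the closest IOB. Matching is with replacement, so one candidate may
    serve as the match for several AUL firm-years."""
    pool = [f for f in candidates if is_comparison_candidate(f)]
    if not pool:
        raise ValueError("no eligible comparison firm-years")
    return {
        a["id"]: min(pool, key=lambda f: abs(f["iob"] - a["iob"]))["id"]
        for a in aul_firms
    }


# Hypothetical firm-years (IOB = interest expense divided by book assets).
aul = [{"id": "A", "iob": 0.006}, {"id": "B", "iob": 0.008}]
candidates = [
    {"id": "X", "iob": 0.007, "kink": 1.2, "income_group": 5},
    {"id": "Y", "iob": 0.0065, "kink": 5.0, "income_group": 6},  # kink too high
    {"id": "Z", "iob": 0.020, "kink": 0.8, "income_group": 6},
    {"id": "W", "iob": 0.006, "kink": 1.0, "income_group": 3},   # income too low
]
matches = match_on_iob(aul, candidates)
print(matches)
```

Because the eligible pool is small, both hypothetical AUL firm-years end up matched to the same comparison firm-year, which is what sampling with replacement permits.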
7. Conclusion In this paper, we examine the properties of taxable income and features of income forecasting models relevant to simulating the marginal tax rate, which is an important input for many corporate decisions. We develop an AR(1) model of forecasting taxable income that incorporates the mean-reverting property of income and conditions on firm-specific information. We show that this model produces marginal tax rates that are similar to benchmark “true” MTRs. In contrast, the traditional random walk model of forecasting taxable income does not incorporate mean reversion, and hence some firms get stuck at very high or very low levels of taxable income and MTRs. At the other end of the spectrum, the bin model of forecasting taxable income allows mean reversion in taxable income but treats all firms in a given bin the same and assumes all shocks to income are permanent, producing too many MTRs that are clustered near the center of the distribution. In addition to the accuracy of simulating marginal tax rates, we also examine the implications of using different income forecasting models when calculating marginal tax benefit functions. We find that the average tax benefit function estimated using any of the simulation models is very similar to the benchmark benefit function. The common implication of these tax benefit functions is that many firms operate on the flat (or near-flat) portion of their tax benefit curve, which means that they could increase interest deductions by two or three times before beginning to experience a notable decline in marginal benefit (i.e., many firms have kinks greater than or equal to two). This is not to say that these firms necessarily use debt too conservatively because the cost of debt may justify conservative debt usage at these firms (see van Binsbergen et al., 2009). We also study whether variables commonly used to measure costs of debt help explain why some firms use debt conservatively. 
There is some evidence that conservative companies have
high growth options (as measured by q), consistent with these being high cost of debt companies. However, several other variables such as earnings volatility imply that high kink companies face lower costs than comparable companies, which indicates that more research is needed to understand the cross-sectional capital structure puzzle regarding the conservative use of debt.
Appendix A – Clean Surplus Formula We use the clean-surplus formula to increase assets going forward:
$$TA_{i,t+1} = TA_{i,t} + TI_{i,t} \times \left(1 - \tau_{i,t}\right) - D_{i,t}, \qquad (4)$$
where TAi,t is beginning-year total assets at time t, TIi,t is forecasted taxable income in t, τi,t is an effective tax rate in t, and Di,t is dividend payout in t for firm i. The formula is commonly used to forecast book assets, given earnings and dividend policy forecasts (e.g., Lee, Myers, and Swaminathan, 1999). It implies that total assets at time t+1 are the sum of time-t total assets and after-tax income, after subtracting dividends paid out in time t. We use the top statutory tax rate value in time t for τi,t. 37 In order to estimate the firm-specific payout term (Di,t), we multiply an estimated payout ratio by the forecasted taxable income for positive income forecasts, and assume the same amount of dividends as in the previous year for negative income forecasts. Following Lee et al. (1999), we estimate the payout ratio for each firm year by dividing dividends from the previous year by after-tax income in the same year. For firms having negative earnings in the previous year, we divide the dividends by (0.06 × total assets) to estimate the payout ratio. 38 We set payout ratios less (greater) than zero (one) to zero (one). Note that for a given firm-year observation, the same estimated payout ratio is used for all 50 simulations. The clean-surplus formula should provide a reasonable approximation of growth in assets when firms do not change capital structure dramatically (e.g., by issuing a large amount of seasoned equity or debt, or paying out large dividends).
37 Alternatively, we could use a tax rate more specific to the level of expected taxable income. However, given the low threshold for the top tax rate ($0.075-0.1 million during the main sample period), such a change would not lead to significant differences in our analysis. In an unreported analysis, we confirm that our results are very robust to the tax rate used.
38 Lee et al. (1999) argue that “the long-run return-on-total-assets in the United States is approximately 6 percent.”
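A minimal Python sketch of the asset roll-forward in Appendix A is given below. It follows the verbal description of equation (4) and the payout rules, but the function names and the example numbers are hypothetical. Treating zero prior-year earnings and zero forecasted income the same way as the negative cases is an assumption made here to avoid division by zero, since the text only covers strictly positive and negative values.

```python
def payout_ratio(prev_dividends, prev_after_tax_income, prev_total_assets):
    """Estimated payout ratio, bounded to the interval [0, 1]."""
    if prev_after_tax_income > 0:
        ratio = prev_dividends / prev_after_tax_income
    else:
        # Non-positive earnings: scale dividends by 6% of total assets
        # (the zero-earnings case is an assumption of this sketch).
        ratio = prev_dividends / (0.06 * prev_total_assets)
    return min(max(ratio, 0.0), 1.0)


def next_total_assets(ta, ti, tau, payout, prev_dividends):
    """Equation (4): TA_{t+1} = TA_t + TI_t * (1 - tau_t) - D_t."""
    if ti > 0:
        dividends = payout * ti      # payout ratio times forecasted income
    else:
        dividends = prev_dividends   # same dividends as in the previous year
    return ta + ti * (1.0 - tau) - dividends


# Hypothetical firm: assets 100, last year's dividends 2 on after-tax income 10.
ratio = payout_ratio(prev_dividends=2.0, prev_after_tax_income=10.0,
                     prev_total_assets=100.0)
ta_profit = next_total_assets(ta=100.0, ti=20.0, tau=0.34,
                              payout=ratio, prev_dividends=2.0)
ta_loss = next_total_assets(ta=100.0, ti=-10.0, tau=0.34,
                            payout=ratio, prev_dividends=2.0)
print(ratio, round(ta_profit, 4), round(ta_loss, 4))
```

Since the same estimated payout ratio is used for all simulations of a given firm-year, a caller would compute `payout_ratio` once and reuse it across simulated income paths.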
Appendix Table 1 - Summary Statistics for Six Income Groups The summary statistics are computed using Compustat firm-year observations from 1980 to 1994. Panel A presents statistics for the main sample used in this paper and Panel B presents our replication of BCG Table 1. We scale all variables in this table by average total assets at year t. Each value shown is computed by first computing the variable’s median for each year, and then taking the average of the median values. TIBIT is taxable income before interest expense, special items, extraordinary items, and discontinued operations. TIBI is taxable income before interest expense. μt is the drift in the random walk model computed by the mean change in TIBITt for all available data up to and including year t. σt is the volatility in the random walk model computed by the standard deviation of the change in TIBITt for all available data up to and including year t. (+) t+k is the fraction of observations that have positive realized income in year t+k (among survivors). Income groups are formed by partitioning the negative (positive) income subsamples into two (four) equal-sized groups by ranking on TIBIT at year t-2 scaled by average total assets at year t-2. These cutoffs are used to partition TIBIT at year t scaled by average total assets at year t.
Panel A: Main Sample (N = 78,959)
Income group   N        TIBITt    TIBITt - TIBITt-1   TIBI t-1 - TIBIT t   μt      σt       (+) t+1   (+) t+2   (+) t+3
1              8,608    -25.23%   -12.75%             10.77%               0.00%   17.45%   22%       32%       36%
2              7,622    -3.65%    -4.37%              3.17%                0.00%   8.65%    47%       56%       59%
3              17,182   4.38%     -0.72%              1.08%                0.12%   4.24%    85%       84%       85%
4              15,915   8.99%     0.64%               0.64%                0.49%   2.81%    94%       92%       92%
5              15,054   13.38%    1.79%               0.44%                0.85%   3.12%    96%       94%       92%
6              14,578   22.30%    4.91%               0.71%                1.86%   4.47%    96%       93%       92%
Panel B: BCG Sample (N = 81,595)
Income group   N        TIBITt    TIBITt - TIBITt-1   TIBI t-1 - TIBIT t   μt      σt       (+) t+1   (+) t+2   (+) t+3
1              9,408    -28.75%   -11.84%             11.10%               0.00%   20.92%   21%       31%       35%
2              8,586    -4.06%    -4.34%              3.08%                0.00%   9.20%    47%       55%       58%
3              17,140   4.30%     -0.71%              1.05%                0.11%   4.34%    85%       84%       84%
4              16,267   8.94%     0.64%               0.64%                0.49%   2.84%    94%       92%       92%
5              15,378   13.40%    1.80%               0.42%                0.85%   3.15%    96%       94%       92%
6              14,816   22.44%    5.03%               0.64%                1.89%   4.59%    96%       93%       92%
Appendix Table 2 - Ability of RW, BIN and AR(1) Simulation Models to Capture Distribution of Five-year-ahead Realized Future Taxable Income This table examines the ability of the random walk, bin, and AR(1) income simulation models to capture the distribution of five-year-ahead future realized taxable income from 1980 to 1994. Simulated distribution is created by drawing from the random walk, bin, and AR(1) models, respectively. 100 draws per firm from each distribution is used to generate percentiles shown below. Then, TIBIt+k is compared with these bins. Panel A (Panel B) shows proportion of TIBIt+k that fall in designated percentiles for the full (highest-income) sample. Forecast error is TIBIt+k – simulated TIBIt+k divided by average total assets. Each value shown in the table is computed by first computing that variable’s value for each year, and then taking the average of the values.
N
Median Forecast Error
RW
56,119
0.24%
19.56%
Bin
56,114
-0.99%
6.66%
AR(1)
55,945
-2.59%
RW Bin AR(1)
11,299 11,299 11,274
-6.17% 2.62% -5.71%
Forecasting Model
%95th pctile (should be 5%)
Estimated Sim. SD/Actual SD
33.25%
20.41%
51.20%
50.86%
5.89%
93.18%
20.20% 32.33% Panel B: High-income Sample
16.52%
54.95%
21.38% 7.36% 14.93%
37.10% 90.89% 53.85%
33.47% 6.17% 23.20%
21.01% 48.65% 30.26%
References
Almeida, Heitor and Thomas Philippon, 2007, The risk-adjusted cost of financial distress, Journal of Finance 62, 2557-2586.
Basu, Sudipta, 1997, The conservatism principle and the asymmetric timeliness of earnings, Journal of Accounting and Economics 24, 3-37.
van Binsbergen, Jules H., John R. Graham, and Jie Yang, 2009, The cost of debt, Working paper, Duke University.
Bhattacharya, Sudipto, 1978, Project valuation with mean reverting cash flow streams, Journal of Finance 33, 1317-1331.
Blouin, Jennifer, John E. Core, and Wayne R. Guay, 2009, Have the tax benefits of debt been overstated?, Journal of Financial Economics Forthcoming.
Blundell, Richard and Stephen Bond, 1998, Initial conditions and moment restrictions in dynamic panel data models, Journal of Econometrics 87, 115-143.
Brooks, LeRoy D. and Dale A. Buckmaster, 1976, Further evidence of the time series properties of accounting income, Journal of Finance 31, 1359-1373.
Dechow, Patricia M., Amy P. Hutton, and Richard G. Sloan, 1999, An empirical assessment of the residual income valuation model, Journal of Accounting and Economics 26, 1-34.
Fama, Eugene and Kenneth French, 2000, Forecasting profitability and earnings, Journal of Business 72, 161-175.
Frank, Murray Z. and Vidhan K. Goyal, 2009, Capital structure decisions: Which factors are reliably important?, Financial Management 38, 1-37.
Freeman, Robert N., James A. Ohlson, and Stephen H. Penman, 1982, Book rate-of-return and prediction of earnings changes: An empirical investigation, Journal of Accounting Research 20, 639-653.
Gorbenko, Alexander S. and Ilya A. Strebulaev, 2009, Temporary versus permanent shocks: Explaining corporate financial policies, Working paper, Stanford University.
Graham, John R., 1996a, Debt and the marginal tax rate, Journal of Financial Economics 41, 41-73.
Graham, John R., 1996b, Proxies for the corporate marginal tax rate, Journal of Financial Economics 42, 187-221.
Graham, John R., 2000, How big are the tax benefits of debt?, Journal of Finance 55, 1901-1941.
Graham, John R., Michael L. Lemmon, and James S. Schallheim, 1998, Debt, leases, taxes and the endogeneity of corporate tax status, Journal of Finance 53, 131-162.
Hamilton, James D., 1994, Time-series Analysis, Princeton University Press, Princeton, N.J.
Hsiao, Cheng, 2003, Analysis of Panel Data, Cambridge University Press, Cambridge, U.K.
Kendall, M.G., 1954, Note on bias in the estimation of autocorrelation, Biometrika 41, 403-404.
Kothari, S.P., 2001, Capital markets research in accounting, Journal of Accounting and Economics 31, 105-232.
Lee, Charles M. C., James Myers, and Bhaskaran Swaminathan, 1999, What is the intrinsic value of the Dow?, Journal of Finance 54, 1693-1741.
Lemmon, Michael L., Michael R. Roberts, and Jaime F. Zender, 2008, Back to the beginning: Persistence and the cross-section of corporate capital structure, Journal of Finance 63, 1575-1608.
Scholes, Myron S., Merle M. Erickson, Edward L. Maydew, and Terrence J. Shevlin, 2008, Taxes and Business Strategy: A Planning Approach, Prentice Hall, Upper Saddle River, NJ.
Shevlin, Terry, 1990, Estimating corporate marginal tax rates with asymmetric tax treatment of gains and losses, The Journal of the American Taxation Association 11, 51-67.
Figure 1 - Ability of Income Simulation Models to Predict Sign of Future Income We simulate future taxable income 50 times from year t+1 to t+8 using the random walk, the bin, and the AR(1) models. This figure compares the frequency of positive taxable income forecasts produced by each of the three simulation models with the actual frequency from year t+1 to t+8 for the lowest-income (Panel A) and highest-income (Panel B) samples.
Panel A: Lowest-income Sample [line plot; horizontal axis: forecast year, t+1 to t+8; vertical axis: 0% to 60%; series: RW, BIN, AR(1), Actual]
Panel B: Highest-income Sample [line plot; horizontal axis: forecast year, t+1 to t+8; vertical axis: 75% to 100%; series: RW, BIN, AR(1), Actual]
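As a rough illustration of the comparison in Figure 1, the sketch below simulates income paths under a mean-reverting AR(1) process and under a driftless random walk, and tabulates the share of positive income forecasts at each horizon. It is not the estimation code behind the figure: the function names, the parameter values (`mu`, `rho`, `sigma`), and the starting income are hypothetical, and the bin model is omitted.

```python
import random


def simulate_paths(y0, horizon, n_sims, rho, mu, sigma, seed=0):
    """Simulate income paths y[k] = mu + rho * (y[k-1] - mu) + eps[k].

    rho < 1 gives mean reversion toward mu (AR(1));
    rho = 1 gives a driftless random walk (mu then has no effect).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(n_sims):
        y, path = float(y0), []
        for _ in range(horizon):
            y = mu + rho * (y - mu) + rng.gauss(0.0, sigma)
            path.append(y)
        paths.append(path)
    return paths


def share_positive(paths):
    """Fraction of simulated paths with positive income at each horizon."""
    n = len(paths)
    return [sum(p[k] > 0 for p in paths) / n for k in range(len(paths[0]))]


# Hypothetical loss-making firm: income starts below zero.
ar1 = simulate_paths(y0=-1.0, horizon=8, n_sims=50, rho=0.6, mu=0.5, sigma=1.0)
rw = simulate_paths(y0=-1.0, horizon=8, n_sims=50, rho=1.0, mu=0.0, sigma=1.0)
print("AR(1):", share_positive(ar1))
print("RW:   ", share_positive(rw))
```

Under these assumed parameters the AR(1) paths drift back toward the positive long-run mean, while the random walk paths stay centered on the negative starting value, which is the kind of difference the lowest-income panel is designed to expose.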
Figure 2 - Marginal Tax Rate Histograms for Full Sample This figure depicts the histograms of the marginal tax rate distribution for the full sample computed using the random walk, bin, and AR(1) models, and the benchmark models based on perfect-foresight forecasts (BM-PF), estimated distributions of future income (BM-AR), and analysts’ forecasts (BM-AN). For the histograms (though not in tables in this paper), we rescale all MTRs such that the top statutory rate is always 34% (the 1988-1993 top rate), and restrict the sample to the firm-year observations that have at least three years of realized future taxable income data (Panels A to E, N = 63,851) or analysts’ earnings forecasts (Panel F, N = 29,059).
Panel A: Random Walk Model
Panel B: Bin Model
Panel C: AR(1) Model
Panel D: Perfect-foresight Benchmark
Panel E: Distribution Benchmark
Panel F: Analyst Benchmark
[Each panel is a histogram; horizontal axis: marginal tax rate, 0.00 to 0.48 in steps of 0.04; vertical axis: 0 to 80.]
Figure 3 - Marginal Tax Rate Histograms for Lowest-income Sample This figure depicts the histograms of the marginal tax rate distribution for the lowest-income sample computed using the random walk, bin, and AR(1) models, and the benchmark models based on perfect-foresight forecasts (BM-PF), estimated distributions of future income (BM-AR), and analysts’ forecasts (BM-AN). For the histograms (though not in tables in this paper), we rescale all MTRs such that the top statutory rate is always 34% (the 1988-1993 top rate), and restrict the sample to the firm-year observations that have at least three years of realized future taxable income data (Panels A to E, N = 63,851) or analysts’ earnings forecasts (Panel F, N = 29,059).
Panel A: Random Walk Model
Panel B: Bin Model
Panel C: AR(1) Model
Panel D: Perfect-foresight Benchmark
Panel E: Distribution Benchmark
Panel F: Analyst Benchmark
[Each panel is a histogram; horizontal axis: marginal tax rate, 0.00 to 0.48 in steps of 0.04; vertical axis: 0 to 45.]
Figure 4 - Marginal Tax Rate Histograms for Highest-income Sample This figure depicts the histograms of the marginal tax rate distribution for the highest-income sample computed using the random walk, bin, and AR(1) models, and the benchmark models based on perfect-foresight forecasts (BM-PF), estimated distributions of future income (BM-AR), and analysts’ forecasts (BM-AN). For the histograms (though not in tables in this paper), we rescale all MTRs such that the top statutory rate is always 34% (the 1988-1993 top rate), and restrict the sample to the firm-year observations that have at least three years of realized future taxable income data (Panels A to E, N = 63,851) or analysts’ earnings forecasts (Panel F, N = 29,059).
Panel A: Random Walk Model
Panel B: Bin Model
Panel C: AR(1) Model
Panel D: Perfect-foresight Benchmark
Panel E: Distribution Benchmark
Panel F: Analyst Benchmark
[Each panel is a histogram; horizontal axis: marginal tax rate, 0.00 to 0.48 in steps of 0.04; vertical axis: 0 to 90.]
Figure 5 - Average Prediction Errors for Marginal Tax Rates Taxable income is simulated using each of the random walk, bin, and AR(1) models and the marginal tax rate is computed based on the simulated income. For each firm-year observation, the procedure is repeated 50 times to incorporate the uncertainty in income simulation, and the average of the 50 MTRs is taken as the estimate of the expected MTR. The benchmark MTR is computed using perfect-foresight forecasts of future income supplemented by the bin and AR(1) simulated income (BM-PF). For each firm-year observation, the prediction error is computed as “estimated MTR - benchmark MTR”. We sort the benchmark MTRs into 10 equal-sized bins and report the average of the prediction errors for each bin and each method. We restrict the sample to the firm-year observations that have at least three years of realized future taxable income data. Panel A (Panel D) presents the average prediction error for the full sample from 1980 to 1994 (from 1970 to 1979) and Panels B and C present the average error for the lowest- and highest-income samples from 1980 to 1994.
Panel A: Full Sample from 1980 to 1994 [vertical axis: Mean Prediction Error, -4.0% to 10.0%]
Panel B: Lowest-income Sample from 1980 to 1994 [vertical axis: Mean Prediction Error, -20.0% to 10.0%]
Panel C: Highest-income Sample from 1980 to 1994 [vertical axis: Mean Prediction Error, -3.0% to 6.0%]
Panel D: Full Sample from 1970 to 1979 [vertical axis: Mean Prediction Error, -3.0% to 4.0%]
[In each panel, the horizontal axis is Benchmark MTR Group, 1 to 10; series: BIN, AR(1), RW.]
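The grouping step described in the Figure 5 caption can be sketched as follows. This is a minimal illustration under our own assumptions, not the code used to produce the figure: the function name is hypothetical, and when the sample size is not a multiple of the number of groups the groups are only approximately equal in size.

```python
def mean_error_by_benchmark_group(estimated, benchmark, n_groups=10):
    """Average prediction error (estimated MTR - benchmark MTR) per group.

    Observations are sorted by benchmark MTR and split into n_groups
    groups of (near-)equal size; group 0 holds the lowest benchmark MTRs.
    """
    # Pair each benchmark with its prediction error, then sort by benchmark.
    pairs = sorted(
        (b, e - b) for e, b in zip(estimated, benchmark)
    )
    n = len(pairs)
    sums = [0.0] * n_groups
    counts = [0] * n_groups
    for i, (_, err) in enumerate(pairs):
        g = i * n_groups // n  # group index by rank
        sums[g] += err
        counts[g] += 1
    # Empty groups (possible only when n < n_groups) are reported as NaN.
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Hypothetical MTRs for four firm-years, split into two groups.
est = [0.05, 0.10, 0.20, 0.25]
bench = [0.00, 0.10, 0.20, 0.30]
print(mean_error_by_benchmark_group(est, bench, n_groups=2))
```

A positive group mean indicates that the simulation model overstates the benchmark MTR for that group on average, and a negative mean indicates that it understates it.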
Figure 6 - Average Tax Benefit Function This figure plots average marginal tax benefit functions based on the random walk, the bin, and the AR(1) models, and the benchmark based on perfect-foresight forecasts of income (BM-PF). Panels A, B, and C (D) plot average benefit functions for observations whose MTR at zero interest deduction is at least (less than) 15%. For each level of interest deduction from 0.0 to 8.0, we compute the average of marginal tax benefits across all firm-years in the sample.
Panel A: Full Sample (High MTR) [vertical axis: Tax Benefit, 5% to 45%]
Panel B: Lowest-income Sample (High MTR) [vertical axis: Tax Benefit, 5% to 45%]
Panel C: Highest-income Sample (High MTR) [vertical axis: Tax Benefit, 20% to 45%]
Panel D: Full Sample (Low MTR) [vertical axis: Tax Benefit, 0% to 16%]
[In each panel, the horizontal axis is Interest Deduction, 0.0 to 8.0; series: BM-PF, RW, AR(1), BIN.]
Figure 7 - Kink and Debt Conservatism We define kink as the point where the tax benefit function begins to decline (i.e., the point where the tax benefit curve first declines by at least 0.5%, 1%, 1.5%, or 2% from one interest increment to the next). Kink can be interpreted as a measure of debt conservatism in the sense that a firm’s marginal tax benefit of debt remains (almost) the same even if the firm were to increase its debt level in proportion to its kink, so a large kink means interest can be increased more without losing incremental benefits. We estimate the marginal tax benefit functions by computing the marginal tax rates determined by deducting 20% to 800% of actual interest expenses (interest expense plus one-third of rent expense) from taxable income. We set kinks above 8.0 to 8.0. Panel A plots four tax benefit functions computed using the random walk, the bin, the AR(1), and the benchmark BM-PF models for AAR Corp. in fiscal year 1980. Panel B plots the average of the random walk, bin, and AR(1) benefit functions when the benchmark (BM-PF) kink is 8 and kinks are between 1 and 2 for the three simulation models. Panel C plots the average kinks in the benefit function across all firms in the sample computed using the three simulation models and the BM-PF benchmark model for differing cutoff values.
Panel A: Tax Benefit Functions for AAR Corp. in 1980 [horizontal axis: Interest Deduction, 0.0 to 8.0; vertical axis: Tax Benefit, 0% to 50%; series: BM-PF, RW, AR(1), BIN]
Panel B: Average Benefit Functions (BM-PF Kink = 8 and 1 ≤ Other Kinks < 2) [horizontal axis: Interest Deduction, 0.0 to 8.0; vertical axis: Tax Benefit, 0% to 40%; series: BM-PF, RW, AR(1), BIN]
Panel C: Average Kink for Differing Cutoffs [horizontal axis: Kink Cutoff, 0.5% to 2.0%; vertical axis: Average Kink, 1.0 to 3.0; series: BM-PF, RW, AR(1), BIN]
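The kink definition above can be made concrete with a short sketch. It rests on assumptions of our own: the benefit function is given on a grid of interest multiples, the kink is located at the last grid point before the first sufficiently large decline, and the function name, grid, and benefit values are hypothetical.

```python
def kink(multiples, benefits, cutoff=0.005, cap=8.0):
    """Locate the kink of a marginal tax benefit function.

    multiples: increasing grid of interest deductions, expressed as
               multiples of actual interest expense (e.g. 0.2 to 8.0).
    benefits:  marginal tax rate at each grid point.
    cutoff:    minimum decline between adjacent grid points that counts
               as the function "beginning to decline".
    Returns the last multiple before the first decline of at least
    `cutoff`; if no such decline occurs, returns `cap`.
    """
    for i in range(1, len(benefits)):
        if benefits[i - 1] - benefits[i] >= cutoff:
            return min(multiples[i - 1], cap)
    return cap


# Hypothetical benefit function on a coarse grid.
grid = [0.0, 1.0, 2.0, 3.0, 4.0]
mtr = [0.34, 0.34, 0.336, 0.30, 0.20]
for c in (0.003, 0.005):
    print(c, kink(grid, mtr, cutoff=c))
```

In this example the small decline between 1.0 and 2.0 triggers the kink only under the tighter cutoff, so a wider cutoff produces a larger kink. This is the mechanism by which measured debt conservatism can depend on the cutoff.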
Table 1 - Distribution of Marginal Tax Rates Taxable income is simulated using each model and then the marginal tax rate is computed based on the simulated income. For each firm-year observation, this procedure is repeated 50 times to incorporate the uncertainty in income simulation, and the average of the computed MTRs is taken as the estimate of the expected MTR. Panel A compares the distributions of MTRs estimated using the random walk, bin, and AR(1) income simulation models with the distributions of benchmark MTRs based on perfect-foresight forecasts of future income (BM-PF) and estimated distributions of future income (BM-AR). In this panel, we restrict the sample to the firm-year observations that have at least three years of realized future taxable income data. Panel B compares the distribution of the MTR estimated using the three income simulation models with the distribution of benchmark MTRs based on analysts’ earnings forecasts (BM-AN). In this panel, we restrict the sample to the firm-year observations that have analysts’ earnings forecasts data from I/B/E/S.
Panel A: Comparison to Benchmark MTRs Based on Perfect-foresight Forecasts and Estimated Distributions of Income
            RW       BIN      AR(1)    BM-PF    BM-AR
N           63,851   63,851   63,851   63,851   63,851
Mean        34%      34%      34%      35%      34%
Std Dev     13%      12%      13%      13%      14%
Skew        -1.2     -1.1     -1.4     -1.3     -1.3
Kurt        0.5      0.5      1.0      1.0      0.9
0%          0%       0%       0%       0%       0%
1%          0%       3%       0%       0%       0%
10%         10%      14%      9%       9%       7%
25%         32%      32%      34%      34%      34%
50%         34%      34%      35%      35%      35%
75%         46%      45%      46%      46%      46%
90%         46%      46%      46%      46%      46%
99%         50%      48%      50%      51%      51%
100%        51%      51%      51%      51%      51%
Panel B: Comparison to Benchmark MTRs Based on Analysts’ Income Forecasts
            RW       BIN      AR(1)    BM-AN
N           29,059   29,059   29,059   29,059
Mean        35%      36%      36%      37%
Std Dev     11%      9%       10%      9%
Skew        -1.6     -1.3     -1.7     -1.7
Kurt        2.9      2.9      4.1      5.2
0%          0%       0%       0%       0%
1%          0%       5%       0%       0%
10%         24%      30%      30%      34%
25%         34%      33%      34%      34%
50%         34%      34%      34%      35%
75%         46%      45%      46%      46%
90%         46%      46%      46%      46%
99%         46%      46%      46%      46%
100%        51%      51%      51%      51%
Table 2 - Ability of RW, BIN, and AR(1) Simulation Models to Predict Benchmark Marginal Tax Rates Taxable income is simulated and the marginal tax rate is computed as described in Table 1. For each firm-year observation we compute the prediction error as “estimated MTR - benchmark MTR”. The fraction of observations with an absolute prediction error of less than 2% is reported using each of the benchmarks for samples from 1980 to 1994 (Panel A) and from 1970 to 1979 (Panel B). In this table, we restrict the sample to firm-year observations that have at least three years of realized future taxable income data (analysts’ earnings forecasts data from I/B/E/S) when the benchmark is BM-PF or BM-AR (BM-AN).
Panel A: Samples from 1980 to 1994
               Benchmark: BM-PF                Benchmark: BM-AR                Benchmark: BM-AN
Income group   N        RW    BIN   AR(1)      N        RW    BIN   AR(1)      N        RW    BIN   AR(1)
1              5,514    33%   16%   41%        5,514    33%   15%   42%        1,183    24%   16%   28%
2              5,567    38%   37%   47%        5,567    38%   36%   49%        1,642    40%   44%   52%
3              13,950   52%   52%   70%        13,950   52%   52%   70%        5,872    62%   66%   79%
4              13,493   82%   77%   89%        13,493   82%   77%   89%        6,259    88%   88%   93%
5              12,820   89%   82%   92%        12,820   89%   82%   92%        6,651    94%   93%   96%
6              12,507   92%   80%   92%        12,507   92%   80%   92%        7,452    97%   91%   97%
All            63,851   71%   65%   78%        63,851   71%   64%   79%        29,059   81%   80%   87%
Panel B: Samples from 1970 to 1979 (Robustness Check)
               Benchmark: BM-PF
Income group   N        RW    BIN   AR(1)
1              971      31%   32%   43%
2              981      51%   60%   62%
3              8,112    61%   74%   80%
4              8,078    90%   93%   94%
5              8,205    94%   96%   96%
6              8,282    97%   96%   98%
All            34,629   83%   87%   90%
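The accuracy statistic reported in Table 2 is simple to compute once estimated and benchmark MTRs are in hand. The sketch below is illustrative only; the function name and the sample values are hypothetical.

```python
def share_within(estimated, benchmark, tol=0.02):
    """Fraction of observations whose absolute prediction error
    |estimated MTR - benchmark MTR| is strictly less than tol."""
    hits = sum(abs(e - b) < tol for e, b in zip(estimated, benchmark))
    return hits / len(estimated)


# Hypothetical MTRs for four firm-years.
est = [0.34, 0.30, 0.10, 0.46]
bench = [0.35, 0.34, 0.10, 0.40]
print(share_within(est, bench))
```

Applying the function separately to each income group and each benchmark would give entries with the same layout as the table.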
Table 3 - Tax Benefits of Debt and the Impact of Eliminating Interest Tax Deductibility For each year, Panel A presents the mean value of the annual and capitalized (discounted using the average corporate bond rate from Moody’s) tax benefits of debt estimated using the RW, bin, and AR(1) models, and the benchmark model based on perfect-foresight forecasts (BM-PF) divided by total book assets. Panel B presents the impact of the potential elimination of interest tax deductibility on the weighted cost of capital for the average Compustat firm from 1980 to 1994 depending on the assumptions on the equity risk premium (4% vs. 7%) and corporate marginal tax rates (the mean top statutory rate vs. the sample mean MTR).
Panel A: Value of Tax Benefits of Debt by Year
        Annual Savings   Capitalized Savings
Year    BM-PF            RW      BIN     AR(1)   BM-PF
1980    2.0%             15.4%   15.8%   15.5%   15.6%
1981    2.1%             13.8%   13.9%   13.9%   13.6%
1982    1.9%             12.3%   12.6%   12.6%   12.4%
1983    1.6%             12.7%   12.3%   12.9%   12.8%
1984    1.6%             12.2%   12.3%   12.4%   12.1%
1985    1.6%             13.0%   13.2%   13.1%   13.0%
1986    1.5%             15.8%   15.2%   15.8%   15.8%
1987    1.3%             14.4%   13.8%   14.2%   14.4%
1988    1.2%             11.3%   11.3%   11.2%   11.3%
1989    1.2%             12.5%   12.5%   12.2%   12.3%
1990    1.1%             11.3%   11.1%   11.1%   11.3%
1991    1.1%             11.8%   11.6%   11.5%   11.8%
1992    1.0%             11.7%   11.3%   11.4%   11.8%
1993    0.9%             12.2%   12.4%   12.2%   12.6%
1994    0.9%             11.5%   11.2%   10.8%   11.4%
Total   1.4%             12.7%   12.6%   12.6%   12.7%
Panel B: Impact of Eliminating Interest Tax Deductibility on Cost of Capital
                        Equity Risk Premium
Marginal Tax Rate       4.0%            7.0%
40% (mean top rate)     10.7% → 12.0%   12.8% → 14.1%
34% (sample mean)       10.9% → 12.0%   13.0% → 14.1%
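To show how a marginal benefit function translates into a tax-savings figure of the kind reported in Panel A, the sketch below integrates under a benefit function up to 100% of actual interest and scales the result by book assets. It is a simplified illustration under stated assumptions: the benefit function is treated as piecewise linear between grid points, capitalization is approximated as a level perpetuity at a single discount rate (our assumption; the caption says only that savings are discounted at the average corporate bond rate), and all names and input values are hypothetical.

```python
def area_under(multiples, benefits, upper=1.0):
    """Trapezoid-rule area under a piecewise-linear benefit function
    from multiples[0] to `upper` (in multiples of actual interest)."""
    area = 0.0
    for i in range(len(multiples) - 1):
        x0, y0 = multiples[i], benefits[i]
        x1, y1 = multiples[i + 1], benefits[i + 1]
        if x0 >= upper:
            break
        if x1 > upper:
            # Interpolate the curve at the upper limit.
            y1 = y0 + (y1 - y0) * (upper - x0) / (x1 - x0)
            x1 = upper
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area


def annual_tax_benefit(multiples, benefits, interest, assets):
    """Gross annual tax savings from interest, as a share of book assets."""
    return area_under(multiples, benefits) * interest / assets


def capitalized(annual, discount_rate):
    """Perpetuity value of a level annual benefit (a simplifying assumption)."""
    return annual / discount_rate


# Hypothetical firm: interest of 5 on assets of 100, discount rate 10%.
grid = [0.0, 0.5, 1.0, 2.0]
mtr = [0.34, 0.34, 0.30, 0.20]
annual = annual_tax_benefit(grid, mtr, interest=5.0, assets=100.0)
print(annual, capitalized(annual, 0.10))
```

Because the area is taken only up to the firm's actual interest, the shape of the function beyond 1.0 affects the kink but not this gross benefit measure.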
Table 4 - Average Kink for Varying Bandwidths This table presents the average of kinks in individual firms’ benefit functions using differing cutoffs to define the kink in the tax benefit function for the random walk, bin, and AR(1) simulation models, and the benchmark based on perfect-foresight forecasts of future income (BM-PF). We define kink as the point where the tax benefit function begins to decline (i.e., the point where the tax benefit curve first declines by at least 0.5%, 1%, 1.5%, or 2% from one interest increment to the next). Kink can be interpreted as a measure of debt conservatism because a firm’s marginal tax benefit of debt remains (almost) the same even if the firm increases its debt level proportional to its kink. We estimate the marginal tax benefit function by computing the marginal tax rates determined by deducting 20% to 800% of actual interest expenses (interest expense plus one-third of rent expense) from taxable income. We set kinks above 8.0 to 8.0.
Bandwidth   RW     BIN    AR(1)   BM-PF
0.5%        2.10   1.27   1.55    2.48
1.0%        2.46   2.04   2.15    2.56
1.5%        2.71   2.45   2.47    2.62
2.0%        2.86   2.73   2.69    2.66
Table 5 - Apparently Under-levered Firms and Debt Conservatism All values and standard errors are computed using the Fama-MacBeth procedure from 1980 to 1994. “Kink,” “Tech,” and “Pharm” are the average of the annual means, and the other variables are the average of annual medians. In Panels A and B (Panels C and D) “AUL” firms have BIN (BM-PF) kinks of four or greater. “Comparison Sample” in Panels A and C represents firms that are in the top two income groups with kinks less than two, and “Matched Sample” in Panels B and D represents firms that are matched to the AUL firms based on interest over book assets (IOB) among firms that are in the top two income groups and have kinks less than four. “Assets” is total book assets; “Book Leverage” is total debt over total book assets; “Market Leverage” is the ratio of debt to the market value of assets; “IOB” is interest expenses divided by book assets; “Kink” is the kink in tax benefit functions; “ROA” is taxable income scaled by book assets; “Div” is dividends scaled by book assets; “Cash” is the ratio of cash and cash equivalents to book assets; “Q” is the ratio of the market value of assets to book assets; “Tech” is one if the firm’s SIC is 3678, 7372, 7370, 3674, 3577, 3571, or 3572, and zero otherwise; “Pharm” is one if the firm’s SIC is 2835, 2834, or 836, and zero otherwise; “Earnings Vol.” is the volatility of the ratio of EBIT to average assets; “Age” is the number of years the firm appears in CRSP; “Tax Diff” is the difference between the accrual effective tax rate and the cash effective tax rate scaled by the top statutory tax rate; “Asset Vol.” is the volatility of the unlevered equity returns. Reported t-statistics are based on Fama-MacBeth standard errors.
Panel A: Replication of Blouin et al. (2009) Table 8
Variable          Comparison Sample   AUL Firms   Difference   t-stat
N                 18,548              8,540
Assets ($m)       80                  88          7.71         0.86
Book Leverage     28.3%               2.4%        -25.9%       -39.89
Market Leverage   19.6%               1.3%        -18.2%       -19.62
IOB               4.7%                0.7%        -3.9%        -22.87
Kink              0.60                6.99        6.39         389.81
ROA               14.3%               18.4%       4.1%         12.44
Div               0.4%                1.6%        1.3%         6.19
Cash              5.2%                18.0%       12.8%        17.39
Q                 1.31                1.74        0.43         5.60
Tech              4.4%                6.6%        2.2%         6.22
Pharm             1.6%                3.3%        1.7%         7.11
Earnings Vol.     5.7%                6.8%        1.1%         5.97
Age               11.07               12.33       1.27         1.75
Tax Diff          4.0%                1.6%        -0.02        -2.39
Asset Vol.        34.3%               36.5%       2.2%         1.09
Panel B: Comparison of “AUL” Firms with Matched Firms Based on IOB
Variable          Matched Sample      AUL Firms   Difference   t-stat
N                 8,538               8,540
Assets ($m)       30                  88          58.22        7.10
Book Leverage     2.7%                2.4%        -0.3%        -0.34
Market Leverage   2.4%                1.3%        -1.1%        -1.58
IOB               0.7%                0.7%        0.0%         0.01
Kink              1.31                6.99        5.68         280.59
ROA               15.1%               18.4%       3.3%         6.22
Div               0.5%                1.6%        1.1%         4.10
Cash              23.0%               18.0%       -5.0%        -2.66
Q                 1.48                1.74        0.26         2.39
Tech              4.9%                6.6%        1.7%         4.67
Pharm             4.0%                3.3%        -0.7%        -2.44
Earnings Vol.     8.0%                6.8%        -1.2%        -3.26
Age               12.27               12.33       0.07         0.07
Tax Diff          5.3%                1.6%        -3.7%        -1.56
Asset Vol.        43.4%               36.5%       -6.9%        -2.57
Panel C: Replication of Blouin et al. (2009) Table 8 Using Benchmark BM-PF Kink
Variable          Comparison Sample   AUL Firms   Difference   t-stat
N                 8,427               18,815
Assets ($m)       45                  122         77.17        9.36
Book Leverage     36.3%               9.1%        -27.2%       -26.69
Market Leverage   25.7%               5.2%        -20.5%       -17.38
IOB               6.6%                1.7%        -4.9%        -20.01
Kink              0.80                6.83        6.02         346.46
ROA               13.9%               15.6%       1.6%         4.77
Div               0.0%                1.5%        1.5%         8.96
Cash              4.9%                11.5%       6.6%         16.08
Q                 1.25                1.54        0.28         4.84
Tech              5.2%                5.3%        0.1%         0.28
Pharm             1.6%                3.1%        1.5%         6.25
Earnings Vol.     7.2%                6.0%        -1.2%        -5.26
Age               9.73                13.13       3.40         5.12
Tax Diff          3.2%                3.2%        0.00         -0.02
Asset Vol.        37.4%               34.6%       -2.8%        -1.27
Panel D: Comparison of “AUL” Firms with Matched Firms Based on IOB Using Benchmark BM-PF Kink
Variable          Matched Sample      AUL Firms   Difference   t-stat
N                 18,623              18,631
Assets ($m)       30                  122         92.53        13.18
Book Leverage     8.8%                9.0%        0.1%         0.15
Market Leverage   6.0%                5.1%        -0.8%        -1.00
IOB               1.7%                1.6%        0.0%         -0.23
Kink              1.55                6.83        5.28         244.01
ROA               14.5%               15.6%       1.0%         2.82
Div               0.2%                1.5%        1.4%         6.98
Cash              13.9%               11.6%       -2.3%        -2.99
Q                 1.36                1.54        0.19         2.55
Tech              6.2%                5.3%        -1.0%        -2.66
Pharm             3.5%                3.1%        -0.4%        -1.60
Earnings Vol.     8.6%                6.0%        -2.6%        -6.12
Age               9.87                13.07       3.20         5.14
Tax Diff          5.0%                3.2%        -0.02        -0.75
Asset Vol.        46.8%               34.6%       -12.2%       -4.73
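The Fama-MacBeth statistics referred to in the Table 5 caption can be sketched as follows: a statistic (for example, the difference in medians between AUL and comparison firms) is computed separately for each year, and inference is based on the time series of those annual values. The sketch uses the plain time-series standard error with no adjustment for autocorrelation, which is our simplifying assumption; the function name and the input values are hypothetical.

```python
import math
import statistics


def fama_macbeth(annual_values):
    """Return (mean, standard error, t-statistic) for a series of
    annual cross-sectional statistics.

    The standard error is the sample standard deviation of the annual
    values divided by the square root of the number of years.
    """
    t = len(annual_values)
    mean = statistics.fmean(annual_values)
    se = statistics.stdev(annual_values) / math.sqrt(t)
    return mean, se, mean / se


# Hypothetical annual differences in a variable (AUL minus comparison).
diffs = [0.010, 0.020, 0.030]
mean, se, t_stat = fama_macbeth(diffs)
print(mean, se, t_stat)
```

A variable whose annual difference is very stable across years, such as the kink itself, will produce a very large t-statistic under this procedure even when the average difference is modest.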