Bayesian Selection of Risk Factors and Estimation of ...

1 downloads 0 Views 515KB Size Report
Jan 1, 2018 - risk-averse investor holds a portfolio without any residual risk and δ is the risk premium on the ..... a fast and efficient way to identify promising variables witch have high marginal probability of γj = 1. In .... Table 1: List of potential Risk Factors ...... around a long run average with episodes of high volatility.
Bayesian Selection of Risk Factors and Estimation of Risk Premiums in the APT model Rachida Ouysse



Robert Kohn†

November 11, 2008

Abstract Empirical tests of the arbitrage pricing theory using measured variables rely on the accuracy of standard inferential theory in approximating the distribution of the estimated risk premiums and factor betas. The techniques employed thus far are only accurate conditional on the factor structure being deterministic and correctly specified. We adapt recent advances in Bayesian analysis to an approximate factor model to investigate the role or otherwise of measured economic variables in the pricing of securities. Posterior distributions of functions of risk premiums and factor betas provide exact statistical tools to handle the severe smallsample problems and post model selection inference. We present new empirical evidence of time-varying factors’ risk premiums. Our findings show that expected compensation for bearing economy wide risk is time varying and changing with the business cycle. Risk premiums for exposure to economic factors are higher during recessions and financial distress with higher volatility during contraction phases. This study also provides strong evidence for market reward for ’Economic’ risk measured by unanticipated inflation, unemployment rate and changes in industrial production. Finally, we find no robust empirical evidence to support the viability of Fama and French size and book-to-market factors as pervasive risk factors in the pricing of Industry and size portfolio returns. JEL Classification: C1, C22, C52 Keywords: Arbitrage pricing theory, Risk premium, Variable selection, Posterior density, Bayes factors, MCMC.



Corresponding author, School of Economics, The University Of New South Wales, Sydney 2052 Australia. Email: [email protected]. † School of Economics, The University Of New South Wales, Sydney 2052 Australia. Email: [email protected]

1

2

1

Selection of Risk Factors

Introduction

Since the inception of the arbitrage pricing theory (APT) by Ross [43], there has been an extensive body of work investigating the pricing of securities and the risk premiums’ sensitivities to changes in the state of the economy. After the overwhelming empirical failure of the market beta to explain the cross-section variation in asset returns, multi-factor models like the APT became at the center of empirical investigation in the last two decades. The APT has the attractive feature of minimal assumptions about the nature of the economy. However despite its popularity, the APT tractability comes at the cost of certain ambiguities such as an approximate pricing relation and an unknown set of pervasive factors. In a pioneering study, Chen, Roll and Ross [14] assert, The theory has been silent, however, about which events are likely to influence all assets. A rather embarrassing gap exists between the theoretically exclusive importance of systematic “state variables” and our complete ignorance of their identity. (Chen, Roll and Ross, 1986 p. 384.) Despite the work of many, selection of the ”appropriate” set of factors remains one of the most challenging tasks in modern empirical finance. The bulk of the empirical literature on the APT has tried to answer two questions. First, how many factors exist in the return generating process for the universe of assets and what are they? Second, how closely does the ex post market pricing relationship conform with the APT implications ex ante predicted by the theory. As noted by, Kryzanowski & To [37], unambiguous empirical testing of the APT requires validation of a series of assumptions. One of which is that security returns are characterized by an explicit factor structure composed of at least one common factor which impacts the pricing of all security returns. Ambiguities about the factor structure will result in ambiguous conclusions about the validity of the APT and tests of its implications. In this paper we provide both empirical and methodological contributions to the existing literature. First, we adapt recent advances in Bayesian variable selection to an approximate factor model with observable factors. Unlike other studies, identification of the factors , estimation and testing of the asset pricing theories are investigated simultaneously by treating the set of factors as an unobserved random variable. Post model selection inference about the APT implications and the pricing relationship are based on probability distributions that integrate the uncertainty about the set of factors in the returns’ data generating process. This structure allows for flexible covariance matrix of the idiosyncratic component of the return generating process, which fits well with the approximate pricing version of the APT model. Second, we present new empirical evidence of time-varying factor risk premiums and we demonstrate that pricing of industry and Size portfolio returns can ultimately be determined by systematic economic risks instead of firm-specific variables. Much work has been done to formalize the search of pervasive factors when the latter are latent. This paper is related to a much smaller literature on the selection of measured economic risk factors in an approximate factor model. There are two main streams in the literature addressing factor selection in the APT. The first uses latent (unobservable) factors as sources of common variations. These common factors are estimated from sample covariance matrices using statistical techniques like factor analysis and principal components. See among others the work of Connor & Korajczyk [17],[18], Chamberlain & Rothschild ([13]. Recent advances in high-dimensional factor models have treated different aspects of the estimation of the latent factors and factor loadings (betas). Bai & Ng [4] use principal component analysis and address consistent estimation of the number of latent factors in an approximate factor model where

Ouysse& Kohn

3

both the number of securities and observations is large. Bai [3] developed asymptotic theory for inference about the estimated factors and factor loading. He also developed inferential theory for the estimated systematic risk. Uncertainty about the number of factors can be ignored for large dimensions because this number can be consistently estimated. Simulation study in Bai [3] shows that large theory is a good approximation for panel dimensions encountered in practice. Although latent factor analysis has its merits, the estimated statistical factors and their prices cannot be given economic interpretation. Rather than using only asset prices to explain asset prices1 , the second view uses measured macroeconomic factors to introduce additional information, therefore providing a link between asset-price behavior and macroeconomic events. Chen, Roll & Ross [14] combined a set of macroeconomic and financial variables as “likely” candidates for pervasive risk in asset returns. Using a series of t-test for the significance of average risk premiums for each variable allowed into the regression, they identify five pervasive risk factors: shocks to real industrial production, yield spread and default risk, unexpected inflation and change in expected inflation. Using firm characteristics as a proxy for the firms’ sensitivity to systematic risk in the economy, Fama & French [23] show that the variation in returns on stocks and bonds can be explained by five factors based on size and book-to-market. Despite the plethora of related work that followed, the literature on the APT with measured factors is yet to address two issues. (1) Formal statistical method for factor selection: the number of factors is often assumed and arbitrarily pre-specified and only few specifications are tested. (2) Post factor selection inference: testing the validity of the pricing model and its implications is conditional on the model specification being deterministic. The use of the standard statistical tools does not take into account the pretesting exercise that goes on before estimating the model. As argued by Fama (1991), 2 ..there is a danger that measured relations between returns and economic factors may be spurious,the result of special features of a particular sample (factor dredging). Thus the Chen, Roll and Ross tests, and future extensions, warrant extended robustness checks. (Fama, 1991, p. 1595). Ouysse [40] proposes a formal econometric procedure to consistently select the set of pervasive measured factors in a high-dimensional factor model. The factors are selected using a combination of information criteria similar to those used in the latent factors case and a ranking procedure to reveal their significance. However, she does not address the post-selection properties of the model estimates. The APT implies nonlinear restrictions on the model parameters, which make it very difficult and complex to derive the asymptotic distribution of the restricted estimates, let alone, the distribution of post-selection restricted estimates. Bayesian approach enables exact inference by making it possible to implicitly incorporate model uncertainty and to derive post-selection distributions of any functions of the parameters. In particular, this method makes it possible to derive the exact post-selection posterior distributions for the pricing error and for the systematic risk premiums and factor betas. Geweke and Zhou [34](hereafter GZ) use a bayesian framework to analyze the APT. The authors propose the use of the pricing error to test the implication the APT that the expected returns are approximately linear function of the risk premium on systematic factors. The authors use latent factors and do not perform a selection of the appropriate number of factors. They 1

Statistical factors are rotations of the excess returns. Therefore these estimated latent factors can be interpreted as excess returns on portfolios of securities. The weights are determined by the rotation. 2 This was also cited by Antoniou et al. (1998)

4

Selection of Risk Factors

borrow the results from the asymptotic principal component analysis of Connor and Korajczyk [17] and propose the use of 1 to 4 factors. Nardari and Scruggs [39] (hereafter NS) extends GZ latent factors framework to allow for heteroscedasticity in factor and idiosyncratic shocks, and time varying expected returns. NS uses Bayes factors to test APT restrictions and finds evidence for three priced latent factors interpreted as, stock market factor, size factor and small stock factor. This article is closely related to GZ and NS in its statistical analysis and to Ouysse[40] in its use of observable factors3 . We consider only measured variables representing economic and market conditions as potential risk factors. We employ recent advances in Bayesian variable selection techniques to search over the sample space of all possible combinations of factors. Rational expectations view of the arbitrage pricing model implies that predictability is driven by market reward for exposure to economy wide risk factors. To the extent that expected returns are driven by economic fundamentals, predictability in this context can be associated to time series dynamics of the fundamentals. The focal point of this paper goes beyond the mere identification of the rewarded risk in the financial markets. Of primary interest to this article is to study the link between changes in factor risk premiums and changes in the business cycle from recession to expansion. To anticipate the results, we find that the returns on 137 industry and Size based portfolios conform to an approximate factor structure with economy wide systematic risk. We consider three different sets of test portfolios, namely, 10 decile-sorted portfolios, 45industry sorted portfolios and 100 book-to-market (BM) and size-sorted portfolios. To our best knowledge, this is by far the largest cross-section in the empirical literature. We find that the data provide little evidence for Fama and French (FF) size and BM factors as viable sources of priced risk for the cross-section of the returns in our sample. Our prior settings give a non informative equal probability for each variable to be in the ”true” model. The data robustly drive the posterior probabilities of inclusion of the FF factors to zero. We analyze market asymmetries in rewarding risk exposure between January and nonJanuary months4 . We find some evidence of higher premium for the month of January thus supporting some of the previous evidence in the literature. This study also relaxes the assumption of constant risk premiums and analyzes the business cycle behavior exhibited by risk premiums associated with different sources of risk. We investigate the changes in the expected compensation for exposure to economic. Using the time series of the posterior factors’ risk premiums, we argue that the compensation for economic risk is larger during recessions and time of financial distress. The time series behavior also shows higher volatility during phases of economic contraction. The paper is organized as follows. In Section II, we provide an review of the asset pricing model and its implications. Section III describes the Bayesian methodology for factor selection and inference about the parameters. The empirical results, presented in Section IV address four issues: (i) Sensitivity of the model size to choice of tuning parameters of the prior densities; (ii) comparison between Bayesian and classical estimation of expected returns and risk analysis; (iii) time-varying factor risk premiums; and (iv) excess returns predictability and economic fundamentals. We conclude the paper with section IV containing a summary and suggestions for future research.

3 4

This article also differs from NS in using data dependent and empirical Bayes prior setting. A stylized fact in the literature is the opposite sign of risk premium between January and non-January months.

5

Ouysse& Kohn

2

Asset Pricing model

2.1

Econometric model and the APT implications

In the finance literature, the debate on what drives excess returns continues. A large number of studies use factor models to identify the common sources of systematic risk in expected returns. The factors used are classified into latent factors estimated through statistical methods as principal components and factor analysis and observable factors based on the sensitivity of stocks returns to economic and financial news. The use of observable variables to explain excess returns is particularly appealing. The estimated factor loadings have a meaningful interpretation. The estimated pricing relationship can be used to stimulate the financial markets through the pervasive economic and financial variables. The ability to predict excess returns with tangible factors can be useful for portfolio management and stock market investment decisions. Unlike the Capital Asset Pricing model, the AP T assumes that financial markets are competitive and frictionless and allows for multiple risk factors to enter the return generating process for asset returns. In a rational asset pricing model with multiple beta, expected returns of securities are related to their sensitivities to changes in the state of the economy. Let rt be an N − vector of security returns in excess of the free rate at time t and consider the following approximate 5 k−factor pricing (econometric) model, rt = α + ΛFt + et

(1)

where E(et |Ft ). The matrix of sensitivities Λ ≡ (λ1 , ..., λi , ..., λN )0 (also known as betas or loadings) of the N assets to the k common factors represents the amount of exposure of each asset to the underlying common risk factors Ft . In the case of traded factors, the APT implies that the asset risk premiums is related to the returns on the traded factors. In the case of non-traded factors, the pricing relationship is established using the relationship between the first moment of the returns generating process and that of the risk factors. The econometric model in (1) implies, E(rt ) = α + ΛµF The competitive equilibrium interpretation of the APT [See Connor and Korajczyk [17]] and Ross’s absence of riskless arbitrage opportunities imply the now well-known restrictions on (1), namely µi ≈ δ0 + λ0i δ

(2)

Asset i expected (excess) return µi has two components, the mispricing vector with respect to the the k−f actor model δ0 , and the factor-related component, λ0i δ. Alternatively, the constraints implied by the APT can be interpreted as δ0 6 being the riskless return, provided at least one risk-averse investor holds a portfolio without any residual risk and δ is the risk premium on the k factors. The excess return is also a measure of asset realized (historical) risk premium. The APT therefore provides a decomposition of asset’s risk premium into its exposure to risk factors 5

An approximate factor structure, first introduced by Chamberlain and Rothschild (1983)[13], relaxes the static factor model assumption of diagonal idiosyncratic covariance matrix to allow for a limited amount of cross-section dependence. The APT is based on the pricing relation for a countably infinite vector of returns to a countably infinite set of traded assets 6 The riskless rate will be measured by the 30-day Treasury-bill rate that is known at the beginning of each period month.

6

Selection of Risk Factors

and the market reward for these exposures. Combining (2) and (2), we obtain a restricted factor structure in the form of, rt = δ0 ιN + Λδ + Λ(Ft − µF ) + et

(3)

where δ are the return or price for the exposure to the factors risk. Equation (3) implies a set of restrictions on the estimates of the intercept term in the unrestricted econometric model: α = δ0 ιN + Λ(δ − µF )

(4)

With no loss of generality, the factors are normalized to have mean zero, µF = 0 and unit variance, diag(ΣF ) = 1. Testing the APT implications is equivalent to testing the set of nonlinear relationships between the expected excess return and the factor betas, µ = α = δ0 ιN + Λδ

(5)

Equation (2) represents an approximate7 relationship between the expected asset returns and their risk exposures. It implies that the risk premium in excess of the market premium, is a function of the factor betas and the market price for factors risk. Exact arbitrage pricing obtains when (2) holds with equality. Geweke and Zhou (1996)[34] proposes a measure of pricing error given by the average of squared deviations from the restriction across assets. The pricing error is measured by N X

(αi − δ0 − λi1 δ1 − ... − λik δk )2 /N

Q2N

=

QN

→ 0 as N → ∞

i=1

The authors argue that for Connor’s equilibrium APT, Q is equal to zero. For Ross’s asymptotic APT, the pricing error will converge to zero as the number of assets approaches infinity. Although for fixed N , the pricing error is not necessarily small, the authors suggest the use of the pricing error to examine some of the testable implications of the APT8 . In this paper we consider and additional measure of the pricing error which takes into account the cross section correlation structure in the idiosyncratic term and is given by, ³ e 2N Q

=

e Λ

=

eN Q 7 8

e 0δ α−Λ

´0

³ ´ e 0δ Σ−1 α − Λ

N (ı, Λ); δ = (δ0 , ..., δk )0

→ 0 as N → ∞

Connor (1984)[16] replaces the approximation with equality under the assumption of competitive equilibrium. The equilibrium version of the APT implies that for an N −vector of security returns Rt , E(Rt ) = rF t ıN + Λδt

(6)

where rF t represents the return on a risk free asset, e is a vector of ones, δt is a k−vector of factor risk premiums. Combining equations (2) and (6) gives, e +v r = ΛX e is the T × k realizations of (xt + δt ), where the N × T matrix of excess returns is given by r = R − IN rF0 and X for some risk factor xt . The pricing theory imposes a testable cross-equation restriction on the parameters of a multivariate regression of asset excess returns on the factors. Let α be the vector of intercepts in a regression of asset excess returns on the factors. The pricing theory implies that α should be identically zero.

7

Ouysse& Kohn

This paper is also interested in the finite sample behavior of the APT compared to the asymptotic predictions. Chamberlain and Rothschild [13] showed convergence of the risk premiums as the number of assets N grows to infinity. This is implied by the uniqueness of the approximate factor structure and the correct specification of the model. Let ΨN be the covariance matrix for the N returns, then the factor structure implies a variance decomposition in the form, ΨN = ΛN ΣF Λ0N + ΣN

(7)

where ΣF = cov(Ft ). The subscript N is needed to show that the factor structure will depend on the number of assets. It is necessary to keep in mind that the APT assumes the universe of all assets traded in the market and therefore, we should expect convergence as N is infinite or at lease as N represents all population of assets. The existence and uniqueness of the approximate factor structure requires that the largest K eigenvalues of ΨN be unbounded with respect to N , the remaining eigenvalues are constant. As clearly pointed out in Geweke & Zhou[34], Bayesian technique requires estimation of the N × N covariance matrix of returns. Therefore N ≤ T − K to avoid singularity. The two-Pass approach does not face this problem in the estimation of parameters of the model and in principal N can be any large number. However, issues of inference and efficient estimation in large N factor models are still pressing research topics in the literature. APT model is also used in the literature to asses the returns predictability. The model implies a market-wide price for the beat measured by the risk premium. From the variance decomposition in (7), it is clear that predictable variation of returns can be associated to changes in the betas and their price. To investigate the ability of the APT to explain the predictability of Stock returns we measure the predictable variance captured by the factor model by the variance of the common component attributed common factors and their loading. Define the variance ratio diag (ΛN ΣF Λ0N ) D1 = diag(ΨN ) where diag(A) refers to the diagonal elements of a symmetric matrix A. D1 provides a measure of the proportion of the predictable variance explained by the model. Ferson and Korajczyk (1995) [29] used a similar measure defined as V R1i =

var (Et (λ0i Ft )) var(Et (rit ))

where Et (.) denotes the conditional mean. In [29], the authors used measured (predetermined) economic variables as the conditioning information set and constructed mimicking portfolios for factors to represent risk factors Ft in a conditional APT. In this article, we limit our analyzes to the unconditional APT and therefore factor betas are time invariant. However the article will investigate the link between risk and predictability by studying the time series behavior of economic risk premium associated to each risk factor.

2.2

Review of the Empirical literature on APT

In the last three decades, considerable empirical work has been done to understand the dynamics of assets returns and bonds and the sources of the now stylized fact, rates of returns predictability over time. The bulk of the literature is concerned with testing the implications of the APT model. There is an ongoing debate whether these implications are testable and what model specification is best for econometric implementation? The variety of techniques used in the APT literature is related to the econometric advances related to both factor models and multivariate regression models.

8

Selection of Risk Factors

The APT is a perfect candidate for analysis using factor models. The risk factors are common across all the assets and the factor loadings or betas are asset specific. Research using factor models has mainly examined the case of latent factors. The factors are estimated via methods of principal components. The econometric literature on factor models is now well equipped to deal with the different aspects of estimation and inference in factor models as well as factor selection. The reader is referred to Bai (2003) [3] and Bai and Ng (2002) [4]. Traditional factor analysis uses a likelihood ratio to test the restrictions implied by the APT. This involves computing the maximum likelihood estimates under the nonlinear restrictions in (2) . This is a very difficult task in practice and the model inference is non standard. Indeed, [2] shows that the asymptotic distributions of the model parameters estimates are very complex in factor analysis, and the constrained estimates should be even more complex. This complexity makes it difficult to derive the asymptotic distribution of the likelihood ratio tests. The latent factor models fall short of offering a clear link between the estimated statistical factors and the observed economic variables. This complicated conclusion when trying to measure the price the market pays for risk exposure associated with a specific economic variable. Many researchers in the empirical finance literature surmount the complexity of the estimation under nonlinear restrictions by using a two-pass approach. In the first pass, either the factor loading or the factors are estimated. In order to estimate the factors betas (factor loading) of assets, excess returns are regressed against the common factors using the time series from t − 60 n ok=1,..,K bik,t−1 . The estimates λ bik to t − 1 to get the conditional betas λ are then used in the i=1,..,N

second pass. Treating these estimates as the true variables, the APT restrictions in (2) become linear constraints on the coefficients of the multivariate regression. In fact, these restrictions imply zero intercepts and can be tested using the standard methods. To estimate the risk premiums, a cross sectional regression model is utilized for each time point to get the time series of each risk premiums. For each month t of the next 12 months, perform cross section regressions: rit = δ0t +

K X

bik,t−1 δkt + εit λ

(8)

k=1

with i = 1, .., N and get an estimate of the sum of risk premium δbkt for month t associated to variable k, t = 1, .., 12. The two-pass steps are repeated for each time period in the sample. The time series means of the series of of risk premium estimates associated to each variable are then tested for their significance. A factor with statistically significant risk premiums is priced by investors in the market. This method is based on the estimates from the first pass being consistent. However, in small samples, this procedure suffers from errors in variables problem. The uncertainty about the first pass estimates can lead to misleading inference. This literature however does not offer any formal guidance on how to choose the variables in the first pass and therefore subject to model miss-specification bias. The number of risk factors is arbitrarily pre-specified and only few specifications are tested. The statistical properties of the model estimates are practically unknown and the inference about the risk premiums and miss-pricing are potentially misleading. Variable selection adds an extra source of uncertainty to the model and an extra dimension to the complexity of the first method and to the unreliability of the inference in the second method. The literature of factor models with observable factors has received little attention,

9

Ouysse& Kohn

Ouysse [40] developed an information criterion to produce consistent estimates of the set of pervasive common risk factors in a large Panel with observable factors. Conditional on the selection procedure being consistent, the factors can be considered as known and conventional multivariate regression inference is applicable. This is true only asymptotically and in finite samples the distributional properties of the estimated set of factors need to be addressed to enable proper inference on the post-selection model parameters. The issue of incorporating model uncertainty is still an open research question in the classical literature.

3 3.1

Bayesian analysis of the APT Factor selection

Bayesian variable selection methods have become increasingly popular in recent years due to the advances in MCMC computational algorithms. In this paper we will make use of a variable selection algorithm witch will allow us to efficiently span the space of all possible subsets of factors. The model admits the following matrix representation, Y

N ×T

= α + Λ N ×1

F0 + ε

(N ×k)(k×T )

N ×T

(9)

where, Y = (r1 , .., rt , ..., rT ) is the collection of excess returns, and the error term ε ≡ (e1 , ..., eT ). In our analysis, it is convenient to work with the pooled form of the model in equation (9). Let y = vec(Y 0 ), β = vec([ α Λ ]0 ), and ² = vec(ε0 )9 and let X ≡ [ ıT F ], y N T ×1

= (IN ⊗ X) β

+

N T ×N k N k×1

²

N T ×1

(10)

The idiosyncratic factors are assumed to be uncorrelated with the systematic factors X with mean zero and covariance matrix Σ ⊗ IT 10 . The factors in (10) are unknown but are assumed to be elements of a finite set of potential variables. Let K be the total number of potential variables represented by the columns of the matrix X. Further, let X 0 be the set of pervasive½factors in the data generating process (DGP). 1 if Xj ⊆ X 0 Define the Bernoulli random variable γj as: γj = . Therefore, γ = {γj }K j=1 , 0 otherwise is a selector vector over the column of X. Let qγ be the ”Binomial ”random variable representing the number of variables in the selected model, therefore X qγ = γi i=1,..,K

For a given value of the latent variable γ, define βγ as the subset of all nonzero elements of β and let Xγ be the collection of those columns corresponding to γi = 1. The latent variable γ therefore defines a model, Mγ , which is defined by y N T ×1

= (IN ⊗ Xγ ) βγ + N T ×N qγ N k×1

²

N T ×1

(11)

9 Let A = (A1 , ..., Am ) be a p×m matrix. The operator vec(A) = (A01 , A02 , ..., A0m )0 is equivalent to concatenating the columns of A to form an m × p − column vector. 10 This notation allows for cross section heteroscedasticity. If in addition there is serial correlation, the covariance matrix will be of the more general form, E(²²0 ) = Σ ⊗ Γ. The Kronecker product (denoted ⊗) of two matrices A = [aij ] and B = [bij ] is defined as A ⊗ B = [aij B]

10

Selection of Risk Factors

To be consistent with the APT specification in (9), we will allow for the intercept to be part of the DGP model, therefore γ1 = 1 with probability 1. The idea behind the Bayesian approach is to start from some prior belief about the quantities of interest and to use observed sample information to update these beliefs and to construct posterior probability densities. For a given γ, we define a set of priors for each model Mγ . Let p(β, Σ|γ) be the joint prior on the model parameters, the marginal likelihood p(y|γ)p(γ) p(γ|y) = P γ p(y|γ)p(γ) The marginal likelihood of the observed data under a specific model that is characterized by γ, Z Z p(y|βγ , Σ, γ)p(βγ |Σ, γ)p(Σ|γ)dβγ dΣ (12) p(y|γ) = Σ

β

Analytical expressions for the full marginal likelihood are possible to obtain for a range of conjugate priors on the model parameters. To start this recursive hierarchical structure, it is necessary to compute the posterior distribution of the latent variable γ. This is readily obtained from the following p(γ|y) ∝ p(γ)p(y|γ) (13) where p(γ) is the prior on γ. Each γj is a Bernoulli random variable. If we denote by, P (γj = 1) = πj , then K Y γ p(γ) = πj j (1 − πj )(1−γj ) j=1

For a prior reflecting equal treatment of the factors, πi = 12 and therefore p(γ) = 21K . In order to make inferences about γ one needs to be able to compute or at least simulate from the density in (13). Markov Chain Monte Carlo methods play a crucial role to enable sampling from the full conditionals without knowledge of the constant of proportionality in (13). Bayesian approach is motivated by the need to incorporate model uncertainty in the postselection estimates. Equation (13) describes the posterior distribution of the random variable γ conditional on the observed returns. Here we are keeping the design matrix fixed. All other parameters in the model are integrated out and the resulting density takes into account the inherent uncertainty. There is a hierarchy in the building up of inference in this framework. Given a model Mγ , the design matrix, Xγ represents the collection of factors Xj for which γj = 1. Conditional on γ, the full conditional of the covariance matrix, p(Σ|y, Xγ , γ) is fully specified. Lastly, the full conditionals of β, p(β|Σ, y, Xγ , γ) is derived. See Appendix for the derivation of the full conditionals.

3.2

Normal-Wishart Prior

The diffuse prior was first introduced into bayesian multivariate analysis by Geisser and Cornfield (1963) [32]. It is a prior of ’minimum prior information’. However, if one has prior information on β, an informative prior should be used. Indeed, Monte Carlo integration makes it possible to work with many choices of priors. In this section we will derive the full conditionals under a general form of Normal-Wishart prior. The amount of prior information and its importance relative to the sample information will be determined by the covariance matrix of the prior density of β.

11

Ouysse& Kohn

We adapt existing results from bayesian analysis of multivariate regression to the context of a factor model and present the result in the following lemma. (See among others Brown, Vannucci & Fearn [9]) ¢ ¡ Lemma 3.1 Consider the Normal-Wishart prior for β and Σ, β|Σ ∼ N β0 , Σ ⊗ H β and ¡ ¢ Σ−1 ∼ WN m, Φ−1 Wishart distribution with location Φand scale parameter m > N +111 .Under uniform priors on γ, the full conditionals are given by ´ ³ ´ ³ 1. β|y, Ω, γ ∼ N βeγ , Σ ⊗ D−1 where Dγ = Xγ0 Xγ + H −1 β βeγ βbGLS

³

´ ¡ ¢ IN ⊗ Dγ−1 H −1 β0 + IN ⊗ Dγ−1 Xγ0 Xγ βbGLS β h ¡ ¢−1 0 i = IN ⊗ Xγ0 Xγ Xγ y =

2. Ω|y, γ ∼ WN (m + T, (S(γ) + Φ)−1 ) where ¡ ¢ S(γ) = (Y − B0 W )0 IT − Xγ Dγ−1 Xγ0 (Y − B0 W ) + B00 LB0 £ ¤−1 Wγ = IT − XD−1 X 0 XDγ−1 H −1 β L = Σ−1 ⊗ Vγ Vγ

−1 −1 −1 −1 −1 0 = H −1 β − H β D H β − H β Dγ X Wγ

β0 = vec(B0 ) ¯ ¯− N 3. p (y|γ, X) ∝ ¯H β ¯ 2

¯ ¯− N2 (T +m) m ¯ 0 −1 ¯ X X + H |Φ| 2 |Φ + S|− 2 ¯ β ¯

See Appendix for a thorough derivation. In order to implement the selection process, the hyperparameters β0 , H β , m and Φ need to be specified. Assuming that no subjective information about these parameters is available, their values will be set in order to minimize their influence. For the parameters entering the prior on Σ, we set m = N + 2, which reflects minimum amounts of prior information and we set b M LE + aIN , where the second term is added to remedy short rank cases 12 . For the Φ = Σ prior on β, two values of the prior mean are considered in this MCMC analysis. The default prior β0 = 0 which reflects indifference between positive and negative values and, a somewhat more informative prior which centers the prior distribution around the generalized least squares estimator β0 = βbGLS . The covariance matrix H β determines the amount of information in the prior and will influence the likelihood covariance structure. In the literature, the specification is simplified to H β = cV β . The preset form V β , can be chosen to either replicate the correlation structure of the likelihood by setting V β = (X 0 X)−1 , this is also the g-prior recommended by Zellner (1986)[46]; or to weaken the covariance in the likelihood by setting, V β = IK , which implies that the components of β are conditionally independent. The scalar c is a tuning parameter controlling the amount of prior information. The larger the value of c, the more diffuse (more flat) is the prior over the region of plausible values of β. The value of c should be large enough to reduce the prior 11 12

Φ Given this prior, the first moment of Σ is E(Σ) = m−N and therefore the scale parameter m > N + 1. −1 2 In the MCMC experiment, we set a = s , the sum of squared residuals from the pooled regression

12

Selection of Risk Factors

influence. However, excessively large values can generate a form of the Bartlett-Lindley paradox by putting increasing probability on the null model as c → ∞. In the literature, different values of c were recommended depending on the application at hand. Recommendations varied with the choice of the optimality criteria. Fixed choices of c were shown to (asymptotically) correspond to fixed dimensionality penalties such as the Akaike AIC, the Cp Mallows, Bayes BIC 13 and the Risk inflation RIC criteria. Empirical Bayes criteria on the other hand were designed to have dimensionality penalties that are data dependent. The reader is referred to George and Foster (1994 [30],2000 [31]). George and Foster (2000)[31] (GF hereafter) revealed the connection between the use of information criteria and posterior density of γ. Let CI(k) =

SSγ − g(c)kγ σ2

be the penalized sum of squared criterion for model with kγ factors. In this article, we fix the prior probability of inclusion of factor j to πj = 0.5. In this case, GF link between the two approaches is through the penalty function g(c), g(c) =

1+c log(1 + c) c

Making c data dependent is in some way similar to making the penalty in the frequentist literature depend on both N and T . In this study we will consider three choices of c ∈ {4, max{T, K 2 }, b cEB }14 , where b cEB = max{Fγ − 1, 0} Fγ

=

Rγ2 /kγ (1−Rγ2 )/(n−1−kγ )

A local empirical Bayes estimate b cEB for c is required for each model. Therefore, this approach generates values of c that are data dependent, (See, [41] for a thorough discussion). As pointed out in Chipman, George and McCulloch (2001)[15], there is an asymptotic correspondence between these choices of c and the classical information criteria AIC, BIC and RIC respectively when V β = (X 0 X)−1 . The case of c = n corresponds to the so called unit information priors which corresponds to choosing priors with the same amount of information about β as that contained in one observation. This prior leads to Bayes factors with BIC behavior. A risk information criterion (RIC) corresponds to a choice of c = K 2 as shown in Foster and George (1994)([21]). ³ ´ Given that D = X 0 X + H −1 , one can see that under V β = IK the posterior correlations will β be less than those of the design correlation. The posterior correlations are however identical to those of the design matrix under the prior V β = (X 0 X)−1 . The conditionally independent prior for β, ³ ´ β|Σ ∼ N βbGLS , Σ ⊗ cIK (14) is equivalent to the prior B ∼ N (Σ, cIK ) , a matrix variate representation used by Brown et al. (1998), where cIK is the covariance matrix of B|Σ. Therefore, the columns of B in 9 are independent under this prior. 13 The AIC minimizes an Expected information distance. The Cp is motivated as an unbiased estimate of predictive risk. The BIC is asymptotically equivalent to Bayes Factor. 14 In a simulation study of the effect of the choice of c on the posterior probability of the true model, Fernandez, Ley and Steel (2001)[27] found that the effect depends on the true model and noise level and they recommend the use of c = max{T, K 2 }.

13

Ouysse& Kohn

Given this prior, the posterior densities for the parameters and the covariance matrix are given by à · µ ¶¸−1 ! 1 0 β|Σ, y, γ, X ∼ N βbγ , Ω ⊗ Iq + Xγ Xγ (15) c γ where

h ¡ ¢−1 0 i βbγ = IN ⊗ Xγ0 Xγ Xγ y

(16)

The posterior density of the inverse covariance matrix, conditional on the sample information, ³ ´ Ω|r, γ, X ∼ WN m + T, (S(γ) + Φ)−1 (17) where the location matrix depends on the sample data through the sum of squers residuals S(γ) = Y 0 (I − Xγ (Xγ0 Xγ )−1 Xγ0 )Y. Given the same uniform prior on the indicator variable, the posterior density is adjusted in light of the informative prior in (14). The posterior density for γ, conditional on the data, is ¯ ¯ N ¯ 0 m δ 1 ¯¯− 2 |Φ| 2 |S(γ) + Φ|− 2 ¯Xγ Xγ + c Iqγ ¯

−(N qγ /2) ¯

p(γ|y, X) ∝ c

Now lets consider a different informative prior for the parameters of the model, ¡ ¢ β|Σ ∼ N 0, Σ ⊗ c(X 0 X)−1

(18)

(19)

The parameter c measures the amount of information in the prior relative to the sample. Setting c = 50 gives the prior the same importance as 2% of the sample. Given this prior and given the Wishart prior the full conditional densities for the unknown parameters of the model are given by, µ ¶ c 0 −1 e β|Σ, y, γ, X ∼ N βγ , Σ ⊗ (X X) (20) 1+c ¡ 0 ¢ c (Xγ Xγ )−1 Xγ0 ⊗ IN y. The full posterior density for the covariance matrix is a where βeγ = 1+c wishart, µ ³ ´−1 ¶ e Ω|y, γ, X ∼ WN m + T, S(γ) + Φ (21) h i c e where S(γ) = Y 0 I − 1+c Xγ (Xγ0 Xγ )−1 Xγ0 Y. Finally, given the same uniform prior on the indicator variable,the posterior density for γ, conditional on the data is as follows ¯ ¯− (T +m) N qγ m ¯ 2 e ¯¯ p(γ|y, X) ∝ (1 + c)− 2 |Φ| 2 ¯Φ + S(γ) (22) Model uncertainty is a very complex problem. In the literature, it is customary to construct some objective function which represents the cost to the researcher if the chosen model is wrong. In econometrics, the objective function often takes the form of an information criterion representing a trade off between parsimony and fit. In other instances, the objective function is model specific and is a result of solving a decision theoretic problem. The same is true for the Bayesian approach. One can use a statistical rule to solve for model uncertainty, or use a decision theoretic criteria if available. The mode of the posterior distribution γ|r defined as, γ b|r, X ≡ γmode = argmax p(γ|r)

14

Selection of Risk Factors

is one candidate for model discrimination. The promising model is taken to be the one with highest posterior probability. This is similar to hypothesis testing between competing models, the one with highest posterior odds is interpreted as the one best describing the data generating process. There exist however a loud critique opposed to the use of posterior odds ratio and Bayes factors to discriminate between models. Many probabilistic models give rise to probability distributions varying greatly over a high-dimensional space. The posterior distribution of γ might be spread out over many values and the modal value might not necessarily be dominant. Bayesian model averaging BMA is an alternative to modal approach and it advocates taking the expected value of γ under the posterior probability density. By averaging over all possible models using the posterior density as weighting density, BMA provides better average predictive ability than using any single model. In our case, this is done by γ b|r, X = E(γ|r, X) This is very useful if the goal is exclusively forecasting and prediction. This will not be feasible in situations where a decision maker can only operate under one specific model. In the AP T model, it is possible to use a model specific criterion to discriminate between models. One possibility is to use the pricing error as the cost function and therefore γ = argmin Q2N To compare the performance of different models, we compute the forecast mean squared error based on Out of sample forecasts from a hold out sample. For an s periods ahead forecast, let xγ be the s × kγ design matrix for the hold out sample. The point forecast b rout for the hold out sample excess return is then computed as b rout = (IN ⊗ xγ ) βeγ where βeγ is the Bayesian estimate for βγ which is the mean of the full conditional density of β. Using the result for the specific data dependent g-prior case in equation (20) we can write, b rout |γ, xγ , r =

¢¡ ¢ c ¡ IN ⊗ x0γ (x0γ xγ )−1 x0γ ⊗ IN r 1+c

For each security i, the mean squared error is defined as ¡ ¢2 M SEi = E rit − rout it for t = T + 1, .., T + s and s is the hold out sample size. In the case of N securities, we will use the following measure for the overall out-of-sample forecast performance of the model for all securities, T M SE = M SE 0 × Σ−1 × M SE

where M SE = {M SEi }i=1:N .

3.2.1

Metropolis-Hastings Scheme

Markov Chain Monte Carlo MCMC methods are used to explore the posterior distribution γ and provide a fast and efficient way to identify promising variables witch have high marginal probability of γj = 1. In this study we use the MCMC algorithm proposed by Kohn, Smith and Chan (2001)[36] (KSC hereafter). This is motivated by the fact the search space can be very large if we have a large number of potential factors. If there are K candidate regressors, there are 2K (in this paper, the figure is221 ) possible combinations. The scheme in KSC reduces the computational burden by decreasing the algorithm visits

15

Ouysse& Kohn

to useless subsets. Therefore, it is possible to identify useful γ−vectors even if one doesn’t sweep the whole sample space. Using Metropolis-Hastings scheme, an ergodic Markov Chain γ (0) , γ (1) , ..., γ (s) of iterates of γ from the full conditional is generated. The sampling distribution of this chain of iterates is used to approximate (i) the distribution p(γ|r). Each element γk in the indicator variable is generated at a fixed or random order from the full conditional distributions. The KSC scheme as described in [36] proceeds as follows. 1. Initialize γ (0) . 2. For i=1,...,s, (a) Generate a random permutation j = (j1 , .., jk ) of (1, .., K). (b) For l = 1, ..., K, P i. Generate a proposal value γjl for γjl from the conditional prior distribution for γjl ,

³ ´ (i+1) (i) p γjl |γ6=jl = (γl (j)) (i+1)

(i)

(i)

(i+1)

ii. With γ C = (γl (j)) and γ P = (γl (j)), set γjl

(P )

= γjl

p(y|γP ) p(y|γ C )

(i)

= γjl otherwise.

For the Normal-Wishart g-prior case, the random mechanism under the general informative prior (See Appendix), ¯ ¯ ¯ ¯ ¯ 0 −1 ¯ (1) X X + H ¯ ¯ ¯ (T + m) ¯ β H β (1) γ(1) γ(1) N N |Φ + S(1)| ¯− ¯ ¯− log(π) = − log ¯¯ log log ¯ ¯ ¯ 2 2 2 |Φ + S(0)| H β (0) + H (0)−1 ¯ ¯X 0 X γ(0)

γ(0)

β

this term represents the ratio of the marginal likelihood of the data conditional on the model. In the case of the data dependent g-prior, ¯ ¯ ¯e ¯ S + Φ ¯ ¯ γ(1) N (T + m) ¯ log(π) = − log(1 + c) − log ¯¯ ¯ 2 2 ¯Seγ(0) + Φ¯ The MCMC sampling also generates chains for all the other parameters and functions of parameters in the model. At each iteration j of the sampling scheme, conditional on γ (j) , the other quantities are generated following the recursive structure of the full conditionals as described in the Lemma (3.1). MCMC uses an argument similar to the Law of Large numbers applied to the sequence of iterates, which are weakly independent if convergence is achieved ( allow for a burn-in period before we start the sampling), to claim convergence of the sampling moments to the population moments. If θ is a parameter PM 1 es es of interest, then E(θ) is approximated by θe = M s=m+1 θ , where θ are iterates from p(θ|y) and m is the burn-in sample. Another property which is inherent to the maximum likelihood principle is invariance. This is particularly important here because we can derive the distribution of different functions of the parameters by simply applying the mapping to the iterates from MCMC and using the resulting sampling distribution to approximate the probability distribution of the mapping of the unknown parameters. In other terms, the sequence {g(θes )}s=1:M will converge to the full conditional p(g(θ)|data). Based on this result, the sampling distribution of the pricing errors and the risk premiums are readily available once we obtain the iterates for β and Σ.

16

Selection of Risk Factors

Table 1: List of potential Risk Factors Measured Economic Variables Januaruy Effect Dummy JAN Consumption CG Term Structure UTS Risk Premium URP Expected Inflation EI Unexpected Inflation UEI Change in EI DEI Monthly Production Growth MP Annual Production Growth YP Unemployment rate UNEM Change in Producer Price PPI

4

Measured Financial Variables

Market portfolio Small Minus Big High Minus Low Momentum Factor Value weighted return Equally weighted return Missing Factor

MARKET SMB HML MOMT VWRET EWRET MISS

Data description

In this section we examine the empirical relationship between economy wide risk and the pricing of securities traded in the US stock exchange. For the purpose of comparison, our set of potential risk factors includes economic indicators previously examined by Chen, Roll and Ross [14]. One of the aim of this paper is to evaluate the performance of the measured economic factors relative to portfolio based factors similar to those examined by Fama and French ([23], [24],[25]). The two sets are combined in this study to form the ”universe” of potential risk factors. The data for both security returns and the factors are monthly, covering two periods. The first is from January 1960 to December 2003 with 528 observations, and the second is from January 1990 to December 2004 with 168 observations. For comparison of forecast performance of different specifications, we use the 24 monthly observations from January 2004 to December 2006 for the out-of-sample forecast analysis.

4.1

The factors

The macroeconomic variables we use as proxy for the economy wide pervasive risk are shown in Table 1. • Measured Economic Variables: Since the work of Chen, Roll & Ross [14], many studies have examined the relationship between the economic indicators and the financial markets. This list of representative variables is by no means exhaustive of the relevant economic risks but the literature provide some motivation for their ability to proxy for state of the economy. Chen et. al [14] argue that systematic risk factors are variables that change the discount factor and the expected cash flows. Real and nominal forces account for the changes in cash flows. Changes in the discount factor are related to changes in the marginal utility of wealth. The latter is inversely related to changes in the aggregate marginal utility and consumption betas were found useful in the pricing of cross-section of returns. (See among others the discussion in [28]). CG is the growth rate of real per capita of personal consumption expenditures for nondurable goods. Unanticipated changes in price-level (UEI ) could be a source of systematic risk to the extent that pricing is done in real terms. Changes in the expected rate of inflation (DEI ) would affect the nominal level of the expected cash flow and interest rate. General average inflation rate (EI ) could also affect asset valuation through changes in relative prices. Industrial production is related to capacity utilization and is considered coincident indicator, meaning that changes in its level usually reflect similar changes in the overall economic activity. Chen et. al use both monthly (MP ) and yearly (YP ) changes in the industrial production reflecting both the contemporaneous short term effect on stock returns and the long term anticipated changes in the industrial activity.

Ouysse& Kohn

17

Unanticipated changes in risk premiums reflecting movements in the risk of corporate default are measured by the difference between the return on corporate bonds rated Baa and the return on long term U.S. government bonds (URP ). To capture unanticipated changes in the return on long government bonds, Chen et al. [14] use the term structure (UTS ) which is the difference between the return on the long bonds and the risk free rate. Ferson and Harvey [28] describe it as reflecting ”the changing sloe of the treasury yield curve.” To the list of variables put forward by Chen et al. [14], we add changes of the producer price index (PPI ), unemployment rate (UNEM ) and the JAN dummy. We follow closely the definition used in Chen, et al. [14] to construct these variables. Sources and definitions are provided in the Data Appendix. • Firm Characteristics variables: Based on the argument that size and book to market equity are related to economic fundamentals, Fama and French ([23], [24], [25]) introduced the use of firm characteristics to construct factor portfolios. They argue that variables that are related to average returns, such as size and book to market equity must proxy for sensitivity to common risk factors in returns. They attribute the superior returns of size and book-to-market portfolios to compensation for extra market risk not captured by the CAPM and therefore is correlated with the idiosyncratic risk. They find evidence for both size and book-to-market effect. These pricing anomalies have been sometimes attributed to the inability of the CAPM to explain expected returns from some investment strategies based on firm characteristics due largely to the fact that the market index is size-biased and valuation blind. Fama & French propose a three-factor model in which the three factors are (i) the excess return on a broad market portfolio (MARKET ); (ii) the difference between the return on a portfolio of high-book-to-market stocks and the return on a portfolio of low-bookto-market stocks (HML); (iii) the difference between the return on a portfolio of small size stocks and the return on a portfolio of large size stocks (SMB ). The empirical literature also found mixed evidence for momentum effect. Stocks with higher returns in the previous months tend to have higher future returns than stocks with lower returns in the past 12 months. Momentum factor is seen in the literature as a further correction for the market index biased: market capitalization shows where the market has been putting its money for years, while momentum shows where it has been putting it lately. The on Fama and French three factors and the Momentum factor are kindly provided by Kenneth French from the data library at, (http : //mba.tuck.dartmouth.edu/pages/f aculty/ken.f rench/datal ibrary.html). Finally, we add the ”Missing” factor to account for any risk factors not represented in the set of variables in our list. This ”statistical” factor is constructed as the residual portfolio from a linear projection of the value weighted market index onto the space spanned by all the variables in the list. Table 10 shows the correlation structure of the 18 factors. It is clear that there is high correlation between expected inflation EI, Unexpected inflation UEI and, change in Expected inflation DEI with coefficient correlation between 0.93 and 0.99. These inflation variables are also highly correlated with the value weighted return (including dividends) portfolio (V W RET D) and with the missing factor portfolio. There is also a relatively high correlation of 0.80 between monthly industrial production growth MP and book-to-market factor HML, and a mild correlation of 0.66 between HML and consumption growth CG. This is very important characteristic of these factors as they all might be a proxy for some latent risk factors. The literature so far has addressed the pricing value of the equity based factors separately from the pure economic variables. The correlation structure between the two set of factors will play a role in the partial effect of each factor in the pricing relationship.

4.2

The returns

We study three types of portfolio returns: Industry portfolios, firm characteristics portfolios and decile portfolios.

18

Selection of Risk Factors

Industry portfolios: We include 12 (12-Industry) and 45 (45-Industry) portfolios of NYSE, AMEX, and NASDAQ firms grouped by two-digit and four-digit standard industrial classification (SIC) respectively. Firm characteristics portfolios: 100 common stock portfolios are the intersections of 10 portfolios formed on size (market equity) and 10 portfolios formed on the ratio of book equity to market equity (B/M). The size breakpoints are the NYSE market equity deciles and the B/M breakpoints are NYSE deciles. The returns series on the ”industry” and ”firm characteristics” portfolios are from Kenneth French data library. Decile portfolios: We also include ”decile” portfolios which are value-weighted averages formed according to size deciles on the basis of market capitalization. The data on the decile returns are provided by the Center for Research in Security Prices (CRSP).

5

Description of the MCMC output

We use a Metropolis-Hastings sampler to draw an ergodic Markov chain sequence for γ, β and Ω. These parameters are drawn from their full conditional distributions as described in Lemma 3.1. For the purpose of comparing different aspects of model performance, we compute the following sample statistics: 1. The posterior mean of the indicator variable γ, E(γ|y, F ) is estimated by the sample mean of the iterates. The latter is called γ, the elements in which will represent an estimate for the posterior probability of each variable being in the true data generating process. 2. Iterates for the risk premiums, δk for each variable k with nonzero posterior probability. In addition of reporting their posterior mean, δ|y, and standard deviation, Std(δ|y), we plot their histogram and a kernel density estimator of the posterior distribution. To assess significance of the risk premiums, we also compute the Bayesian confidence intervals for 95 percent confidence level. In addition we will allow for time varying risk premiums and generate their estimates along with their confidence intervals. 3. Since variable selection will depend on the loss function considered, we report the model estimates for a selection based on the highest posterior model probability p(y|γ) which we denote δbmode . e 2 , and their sample moments and 4. The histogram of the iterates for the pricing errors, Q2 and Q credible interval. 5. Bayesian estimates of the returns’ first and second moments. The expected asset returns computed M P [j] [j] 1 as the posterior means of the αi , i.e.αipm = M αi , where αi is a random draw from the j=1

posterior density of α|y, γ. The posterior mean, Σpm , of the Σ iterates represents the Bayesian estimate for the idiosyncratic risks. The Bayesian ¡ ¢estimate for the total risk is given as the sum 0 of the estimate for the systematic risk Bpm cov Xγ0 Bpm and the estimate for the non-diversifiable Σpm . We report the proportion of systematic risk to the total risk denoted D1 and given the ratio of the diagonal elements of the systematic risk to the diagonal elements of the total risk, 0 diag(Bpm X 0 XBpm ) , where Bpm is posterior mean of the iterates of B, which is a V[ ar1 = diag B 0 X 0 XB ( pm pm +Σpm ) Bayesian estimate for the factors’ betas B. We set the warming period to 200000 iterations and the sampling period to 50000. The initial values for the indicator variable are one for the constant term and zero for all other variables. The results were unchanged with different starting values. The initial value for the idiosyncratic covariance matrix b M LE + s2 IN 15 , where Σ b M LE is the is 0.2 × IN . The location matrix for the Wishart distribution is Φ = Σ 15 2

s is the sample variance from the pooled regression of y on all factors, s2 =

(y−(IN ⊗X)βbOLS )0 (y−(IN ⊗X)βbOLS ) N T −1

.

19

Ouysse& Kohn

³ ´ b PK γj |y, X and the posterior mean estimate Table 2: Posterior average model size k = E j=1 b cEB of the empirical Bayes tuning parameter c and the corresponding importance sampling b cEB parameter ηb = 1+b for the case of prior covariance matrix V β = (X 0 X)−1 . cEB Panel A: c = b cEB , ηb = N b cEB ηb kγ RT M SE

12 0.53 34.6% 10.83 0.0663

T=168 43 30.05 96.8% 5 0.0621

12 4.70 0.0663

T=168 43 8.41 0.0659

93 136 99.3% 4.60 0.0538

137 68 98.5% 6 0.1111

b cEB 1+b cEB

12 0.57 36.3% 11 0.0646

T=528 43 180.3 99.4% 8.16 0.0626

93 468 99.8% 5 0.0486

137 323.9 99.69% 5 0.0890

T=528 43 8.16 0.0640

93 9 0.0372

137 10 0.0720

Panel B: c = 4, ηb = 20% N kγ RT M SE

N kγ RT M SE

T = 168 12 1 0.0997

93 9.33 0.0541

c = 441 43 4.58 0.0800

137 20.03 0.1382

12 4 0.0885

Panel C: c = max{K 2 , T } ηb = 99.77% T = 528 c = 528 137 12 43 5 1 5 0.1280 0.0879 0.0704

ηb = 99.81% 137 5 0.1545

maximum likelihood estimator for the error covariance matrix from a normal regression model of R on a constant and all factors in F , ³ ´³ ´0 b M LE = 1 R − α b M LE F 0 R − α b M LE F 0 Σ bM LE − Λ bM LE − Λ T and the scale parameter m = N + 2. We denote the sample means of the MCMC iterates of any quantity θ of interest with an index θbpm . The results reveal some interesting behavioral patterns of the choices of the tuning parameters in the prior for β. Of primary concern are the parameter c in the prior variance H β = cV β and the choice between a data dependent (V β = (X 0 X)−1 ) and a conditionally independent (V β = IK ) prior. We consider combinations of N in {12, 43, 93, 137} representing respectively, the 12−Industry portfolios, the 48−Industry portfolios, Fama and French intersection firm characteristics portfolios, and finally a combination of the last two sets of returns. The sample size represents the two periods 19960 : 01 − 2003 : 12 and 1990 : 01 − 2003 : 12 and therefore T ∈ {167, 528}. The posterior distribution summarizes all we know about the unknown parameters after analyzing the data. Analyzing the full posterior distribution is not feasible in our problem because of the large number of parameters. The analysis of the empirical results will focus on specific aspects of different group of portfolios using statistics that summarize information about the (empirical) posterior density. For the sake of argument, we use the posterior mean motivated by a bayesian averaging argument, the mode of the density and, the highest probability interval also known as credible interval as a mean to make inference about the highly likely values of the unknown parameter.

6

Posterior sensitivity to the prior and to panel dimension

From the analysis of the results in Table 2-Panel A, clear finite sample patterns emerge in the behavior of the local empirical Bayes b cEB and q γ , the posterior average number of factors in the posterior model..

20

Selection of Risk Factors

These are summarized in what follows. Posterior Result 1: Sensitivity of empirical Bayes to N and T 1. For fixed N , increasing the sample size results in a decrease in the value of b cEB :

bcEB ∂b ∂T

|N fixed > 0 ∂N ∂T where ηb =

b cEB 1+b cEB

Posterior Result 2: Sensitivity of E(qγ |y) to the prior and sample data 1. The posterior mean of the binomial random variable qγ is an increasing function of N for fixed penalty c and a decreasing function of N for the case of the empirical Bayes penalty b cEB . 2. Small values of b cEB or c (≤ 4) result in large posterior model and vice versa, large values of c result in small model size:

b ∂q γ ∂c

1. The tuning parameter c represents the information in the prior relative to the sample. For N = 12, the prior is given the same importance as 75% of the sample. This relative importance decreases rapidly to about 4% as N doubles the size to N = 43. Further increase to N = 93 results in the prior importance reaching 0.7%. However for a given N , an increase in the sample size from T = 168 to T = 528 results in relatively smaller change in the relative importance of the sample information. For example, N = 12, the change in ηb was marginal from 34.6% to 36.3%. This suggests that unless there are enough assets returns in the data, sample information from large time series is not sufficient to dissipate the relative importance of the prior information. This result is in line with the convergence result for N → ∞ in CR [13]. Identification of the factors is sensitive to N and only when all assets in the universe are included in the sample that one can hope to identify the correct factor structure behind the AP T . More evidence is extracted from the data with a large number of assets and not necessarily from long time series of a small number of assets. The average model size q γ reflects the size of the penalty. When the data does not have enough evidence to perform selection, higher importance is given³to the prior ´ information. P21 γ = 21 × πj = (21)(0.5) = 10.5. In The prior expected number of factors is equal to E j ³j=1 ´ P21 the case of N = 12, the posterior model size is q γ = E γ |y, X = 10.83 which reflect the j=1 j small information in the sample and the high importance given to the prior. As the number of assets increases, more evidence of common dynamics is found in the data and less importance is given to the prior information. The average model size decreases as N increases. However, increasing T has marginal effect on k and if any did result in an increase of the posterior average number of factors in the promising model. In Table 2-Panel B, the penalty is fixed with c = 4 and g(c) ' 2. In this case, the prior information is given the same importance as 20% of the sample for all values of N and T . The average model size is noticeably larger than in the empirical Bayes case for N > 12 and T having practically no effect on k. Compared to the local empirical Bayes, the fixed penalty case of c = 4 results in a fundamentally different response of k to changes in N . For fixed T , increasing N results in significant increases in the posterior average size.

21

Ouysse& Kohn

Table 3: Sample means and Bayesian Estimates for the annualized historical risk premium

T = 168 T = 528

A: Industry Portfolios, N = 43 αpm RP c=4 cEB 8.34% 8.60% 8.60% 7.06% 6.80% 7.06%

B: FF Firm Portfolios, N = 93 αpm RP c=4 cEB 8.99% 10.30% 8.99% 5.91% 5.41% 5.91%

C: combined Returns, N = 137 αpm RP c=4 cEB 8.73% 5.54% 8.47% 6.19% 6.15% 6.17%

The last point in Result 2 can also be clearly seen in Table 2- Panel A. As values of b cEB became EB larger, the posterior average size of the model becomes smaller. For example, b c = 0.53 corresponds to qbγ = 10.83 and b cEB = 30.05 corresponds to qbγ = 5. Fernandez, Ley and Steel [27] recommended penalty (c = max{K 2 , T }) tends to choose the null model (q γ = 1) for small panels. For short time series, this penalty is equivalent to the RIC penalty and is it behaves as the unit information prior for large time series. The RT M SE are computed for the out-of-sample 2 years period 2004 : 01 − 2005 : 12. As it is clear from the results in Table 2, all models have relatively similar out-sample performance with RT M SE ranging from a minimum of 3.72% (for q γ = 9, T = 528, N = 93) to 13.7% (for q γ = 20.03, T = 168, N = 137). The empirical Bayes prior setting has slightly better out of sample performance while the two extreme values of RT M SE occurred for the case of c = 4. It is also not surprising that large models tend to perform well in terms of forecasting. In light of Result 1 and Result 2 and the discussion that followed above, analysis of the posterior factor structure for the Industry/FF returns will focus primarily on the case of c = 4 and the empirical Bayes prior c = b cEB .

7

Empirical content of the pricing model

7.1

Bayesian versus Frequentist approach

Estimation of expected risk premium. Table 3 represents estimates for the annualized 16 historical risk premium representing what investors, on average require as a premium over the risk free rate for an investment with average risk for each factor. The estimates of the risk premium using cross-section sample averages are ranging from an annualized rate of 3.91% to 7.83% not far from what was previously reported in other studies. For example, the findings of Dimson et. al [20] show evidence of high risk premium relative to both bonds and bills with an average of 7.5% per annum for the US equity premium relative to bills. Their results also support our finding of high risk premium in the 90’s which was probably driven by the high returns investors enjoyed prior to the ”technology burst” in the year 2000. Panel A of Table 4 report the summary statistics for the point estimates of the monthly (realized) risk premium for the 12-industry returns over the period 19960 : 01 − 2003 : 12. The MLE estimates of the historical risk premium (RP ) are quantitatively comparable to the Bayesian estimates (αpm ). However, the sample standard deviation (sRP ) of RP are much higher than sα , the posterior mean of the variance of the iterates17 . This implies that even if the distribution of the point estimates are the same, inference using the two methods will be different. The health sector (Hlth) appears to have the highest premium using both techniques with a sample average of 0.64% per month (7.96% per annum) and a posterior mean of 0.78% per month (9.77% per annum). These values are not very useful unless accompanied by some probabilistic interval. The Bayesian credible interval corresponding to the highest density region (HDR) 16

The annualized risk premium rpa is computed from the monthly estimates rpm using the formula: rpa = (1 + rpm )12 − 1 17 Iterates are random draws from the posterior distribution of the parameters. For the parameter β = vec(α, Λ), these iterates are taken from the posterior density p(β|r, γ) as given in equation (20)

22

Selection of Risk Factors

Figure 1: Comparison of the Classical and Bayesian estimates for the mean and variances for the returns series. −3

25

x 10

0.012

s2Y ;T = 168 c = bcEB ;T = 168 c = bcEB ;T = 528 c = 4;T = 528

SY2 ; T = 528 Ψb pm ; c = bcEB ; T = 528

20

0.01

Ψb pm ; c = 4; T = 528 Ψb pm ; c = bcEB ; T = 168

15

0.008

SY2 ; T = 168 Ψb pm ; c = 4; T = 168

10

0.006

5

0.004

0

0.002

−5 0

5

10

15

20

25

30

35

(a) Risk Industry

40

45

0 0

10

20

30

40

50

60

70

80

90

100

(b) Risk Fama & French

for αHlth is equal to [0.43%;1.35%] showing a strong evidence from the data of a high positive expected risk premium. In fact we computed the posterior mean of the risk premiums iterates corresponding to the highest probability model and we found αmode = 1.35% for Hlth. The 90% confidence interval based on the asymptotic normality is equal to [−0.24%;1.52%], which is much wider and less informative relative to the Bayesian HDR. This lack of precision can also be seen in the estimates for the expected risk premium for NoDur. The sample mean is a poor estimator in this case as it corresponds to a margin of error of about 4.33%. Using the bayesian approach, one can still; state that the 90% of the iterates of the risk premium fell in the interval [−0.17;1.55] which is more informative with a range of only 1.72% compared to 8.66% using the asymptotic interval. Although there are quantitative differences about the point estimates of expected risk premium, some qualitative similarities emerged. For example within the 43-Industry returns, Coal industry appears to have the highest realized risk premium according to maximum likelihood estimates, followed by ”Smoke”and Financial ”Fin”. The estimates for the posterior means of the expected risk premium suggest that ”Mines” industry has by far the highest risk premium, followed by ”Banks” and ”Drugs”. For portfolio management purposes, one can argue that the financial ”Fin” and banking ”Banks” sectors as well as the ”Coal” and ”Mines” sectors will bear the same type of risk with similar exposures. Estimation of portfolios’ risk. Figure 1 represents plots of the diagonal elements of the MLE and the posterior mean Bayesian estimates for the covariance matrix ΨN for the 43-Industry (Panel a) and Size (Panel b) portfolios returns. Results clearly indicate that the choice of the prior tuning parameter c in the variance for the prior on β does not affect the results and both the empirical Bayes prior and the data-independent prior lead to quantitatively similar measures for risk. Estimates using the sample variance for both the 43Industry and Size returns lead to significantly higher risk estimates. For example, the Bayesian estimates for risk for the combined portfolio returns is about 2.64% (case of c = 4) while the average cross-section standard error is about 5.92% per month. If one abstracts from the magnitude of the risk estimates, all the estimation methods points to Coal as the riskiest investment portfolio with a monthly standard deviation of 1.252%.

23

Ouysse& Kohn

Figure 2: Histogram of the standardized values of the time-varying (factor j) risk premiums estimates δbt,j for size and industry returns over the sample period 1990 : 01 − 2003 : 12. The prior for β has a variance with Hβ = cmax (X 0 X)−1 , and cmax = max{K 2 , T }, N = 137. 0.5

0.4

0.4

0.4

0.35

0.35

0.35

0.3

0.3

0.3

0.25

0.25

0.25

0.2

0.2

0.2

0.15

0.15

0.15

0.45

0.45

0.4

0.4

0.35

0.35 0.3 0.3

0.25 0.25

0.2 0.2 0.15 0.15 0.1

0.1

0.1

0.05

0.05

0.05

0.1

0.1

0.05

0 −4

−2

0

2

4

0 −4

δ0

The standardized value of δbt is equal to

−2

0

EI

bpm δbt −δ , std(δbt )

2

4

0 −5

0

DEI

5

0 −5

0.05

0

VWRET

5

0 −4

−2

0

2

4

EWRET

where δbt is the K × 1 vector of the posterior mean of the iterates of the risk

premium for the K factors for time t. For comparison to the asymptotic approximation, the graph also shows the standard normal density plot (dotted line).

7.1.1

Turn of the Year effect

In a seminal article, Rozeff and Kinney [44] reported evidence of a seasonal pattern in stock market returns. January returns appeared to be more than eight times higher than a typical month. From 1904 through to 1974, the average stock market return during the month of January was 3.48 percent whereas the monthly return for the remaining months of the year was 0.42%. In this study we incorporate a JAN dummy (which takes one if the month is January and zero otherwise) as a likely risk factor and we perform the variable selection simultaneously with the estimation of the JAN betas and the associated risk premium. Because the risk factors are non-traded economic variables, the estimation, the estimation of the January effect on the pricing has also two components, the January beta and the January premium. The betas reflects the seasonal patterns in the time series dynamics of the excess returns while the premium will reflect the effect or otherwise of the JAN in the pricing of the cross section of the returns. We will report the product of the two component to measure the January component in the realized risk premium. The posterior probability of the JAN dummy to be in the pricing model is different from zero. In Table 4, the posterior mean of γJAN is equal its prior value of 0.5. The corresponding premium per unit of loading is 0.0633%. The January component in the excess returns for the 12-Industry returns is about 0.02%. For the 48-Industry returns, the JAN dummy has only a 3% likelihood to be in the proper pricing model. The product of the posterior average premium and the loadings gives a January premium of 0.03%. No where else the evidence from the data supports the inclusion of the turn of the year effect other than in the size portfolio returns and their combination with the industry returns. Indeed, in Table 5 and Table 6 the posterior mean of γJAN is larger than 90%. The unit premium δbpm is positive and of order 0.0181% and 0.0136% resulting in a January pricing component of 0.15% for the combined returns over the 1990 : 01−2003 : 12 and 0.28% for the size factor over the 1960 : 01−2003 : 12. The evidence for this seasonal pattern is strong only in the returns on Fama & French size portfolios. This fact was previously reported in the literature. Haug and Hirschey [35] found a persistent January effect for small-cap stocks in equally weighted returns for the period 1802-2004. The authors documented an ”anomalous” pattern of monthly returns for portfolios based on the Fama and French (1993) size and

24

Selection of Risk Factors

Table 4: Expected returns, risk premiums and factor betas. The model is a K−factor for excess returns on 12 Industry portfolios for the period 1960:01-2003:12 (T=528): rt = α + ΛFt + et , where α = E(rt ). The prior for β = vec(α, Λ) is N (0, c(X 0 X)−1 ) where c = 4 and Xt = (1, Ft )0 . Panel A Y%

RP %

sRP %

αpm %

sαpm %

NoDur Durbl Manuf Enrgy Chems BusEq Telcm Utils Shops Hlth Money Other

1.118 0.929 0.954 1.100 0.915 0.972 0.894 0.872 1.065 1.084 1.090 0.871

0.63 0.42 0.43 0.55 0.42 0.48 0.41 0.35 0.57 0.64 0.60 0.35

4.33 1.41 1.16 1.09 1.04 0.75 0.73 0.62 0.61 0.54 0.43 0.34

0.59 0.40 0.46 0.66 0.32 0.44 0.54 0.36 0.54 0.78 0.57 0.36

0.58 0.12 0.07 0.21 0.24 0.11 0.21 0.10 0.31 0.25 0.07 0.10 Panel B

Factors

γ

δbj,pm

90%P I

δj

sδj %

δ0 JAN SM B CG U RP U N EM EW RET M OM T

1 0.5 0.1 0.7 0.1 0.9 0.1 0.6

0.0052 0.0633 -0.0461 -0.2526 -0.0583 -0.0678 -0.0212 0.0524

[ 0.0032 ; 0.0056] [-0.2580 ; 0.2662] [-0.0522 ; 0 ] [-0.4709 ; 0 ] [ 0 ; 0.1398] [-0.5286 ; 0.0234] [ 0 ; 0.0551] [-0.1554 ; 0.1992]

0.0044 0.0116 -0.0052 -0.1378 0.0140 -0.2500 0.0055 0.0113

0.0423 1.7453 0.3121 3.7798 0.7791 4.5690 0.6662 3.2388

βbCG

βbU N EM

βbM OM T

Sriski %

0.0013 0.0023 -0.0001 -0.0006 0.0017 -0.0016 0.0012 0.0020 0.0012 -0.0012 0.0010 0.0010

-0.0044 -0.0020 -0.0005 0.0001 -0.0018 -0.0013 -0.0027 -0.0016 -0.0028 -0.0049 -0.0031 -0.0020

0.0001 -0.0022 -0.0004 0.0027 -0.0013 -0.0000 0.0013 0.0003 0.0002 -0.001 0.0008 -0.0002

51.99 74.86 83.82 92.79 92.29 85.31 92.84 89.88 96.72 97.37 96.97 95.05

δj

sδCON

δj

sδEXP

-0.0107 -0.2317 -0.0247 -0.2584 -0.0654 -1.3143 0.1449 0.1485

0.0431 1.8032 0.3131 4.5129 0.7706 5.7435 0.7437 3.4179

0.0080 0.0884 -0.0004 0.0429 0.0101 0.1970 -0.0749 0.0513

0.0424 1.7042 0.2867 3.1259 0.7588 4.3154 0.5889 3.1581

90%P I [-0.17 [ 0.25 [ 0.36 [ 0.38 [-0.24 [ 0.27 [ 0.31 [ 0.21 [-0.23 [ 0.43 [ 0.41 [ 0.18

; ; ; ; ; ; ; ; ; ; ; ;

1.55] 0.60] 0.60] 1.10] 0.56] 0.61] 1.02] 0.52] 1.00] 1.35] 0.69] 0.50]

CON

j

EXP

j

The table provides the sample means Y of the returns and its sample standard deviation sY , the sample average of historical PT premium RP = T1 t=1 rt , the posterior means of the expected excess returns αpm and their standard errors sαpm , and the proportion of the systematic risk relative to the total risk, Sriski = (1 − V R1i )100%. The table also reports the posterior means βbj , of the factor loadings corresponding to factor j. The probability interval P I corresponds to the highest probability region calculated from the empirical distribution of the MCMC iterates. For example, % of iterates in the 90%P I is equal to 90%)

BE/M E factors. They show that both factors contribute to the continuing January effect.

7.1.2

Time varying price for economic risk

An analysis of the monthly time series of risk premiums indicate that overall the risk premiums strongly fluctuate over the business cycle. To this effect, we measure the average risk premium associated to CON EXP each risk during recession and expansion (denoted δbpm and δbpm respectively). We also plot the time-varying premiums for the ”pervasive” risk factors in the posterior mean model. Figure 3 presents plots of the risk premiums iterates for factors with probability one to be in the ”true” model. It is clear that the market premium for exposure to these economic risk factors has been fluctuating around a long run average with episodes of high volatility. A noteworthy observation is the change in the dynamics of the series around he year 2000 with an apparent increase in the volatility. The graphs also show the time series dynamics of the mispricing parameter δ0 . The ability of the posterior model estimated over the period 1960 − 2003 to explain the realized risk premium declines starting in the mid 1998 with a sharp incidence of mispricing in January 1988. The estimated mispricing from the posterior model corresponding to data from 1990 − 2003 does a better job with values within [−0.05%; 0.05%]. This suggests that the factor structure itself could be time-varying or at least that the market’s valuation of risk is changing over time. This in an important finding which separates the time series dynamics of

25

Ouysse& Kohn

Figure 3: Time Series plot of the iterates of the posterior risk premium δjt for factor j and month t. The plots are for industry returns for the period 1960 : 2003. 0

0

δ

δb0t

0.1

−0.1 −0.2 01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

01/04

0.1 0 −0.1 −0.2

MP

CG

0

−0.5 0

01/64

01/68

01/72

01/76

01/80

01:84

01/88

01/92

01/96

01:00

0

50

100

150

200

250

300

350

400

450

500

0

50

100

150

200

250

300

350

400

450

500

0

50

100

150

200

250

300

350

400

450

500

01/04

5

0.5

01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

0 −5

01/04

1

UNEM

1

EI

0 −1 0

01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

10

MOMT

1

DEI

0.5 0 −0.5 0

01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

(b) 12-Industry returns, 1960 : 2003, c = 4

δ0

0.2 0 −0.2 01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

0 −10

01/04

(a) 43-Industry returns, 1960 : 2003, c = 4

δ0

0 −1

01/04

0.05 0 −0.05 −0.1 07/90

01/04

01/92

01/94

01/96

01/98

01/00

01/02

01/04

01/92

01/94

01/96

01/98

01/00

01/02

01/04

01/92

01/94

01/96

01/98

01/00

01/02

01/04

01/92

01/94

01/96

01/98

01/00

01/02

01/04

01/92

01/94

01/96

01/98

01/00

01/02

01/04

5

UEI

EI

1 0 −5 01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

DEI

DEI

−5

01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

EWRET

4 2 0 −2

VWRET

VWRET

−2

0 01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

01/04

01/64

01/68

01/72

01/76

01/80

01/84

01/88

01/92

01/96

01/00

01/04

(c) Combined returns,1960 : 2003, c = max{T, K 2 }

4 2 0 −2 01/90

EWRET

01/64

2 1 0 −1 01/90

01/04

4 2

−1 01/90

01/04

5 0

0

2 0 −2 01/90

(d) Combined returns, 1990 : 2003,c = max{T, K 2 }

returns from the pricing of the cross section of returns. A factor might be ”significant” to explain the time series variation and its beta might be constant over time however the market might put a small reward to exposure to the underlying risk. This might be explained by the changes in the opportunities for investors to hedge against this risk. To gain some understanding of the links between the business cycle and the market premiums, we computed the sample average of premiums over phases of economic contraction and expansion. Three patterns emerged. First, the average premium over the months of economic recession is higher than the pooled sample average. Second, the volatility of the time series is higher in time of contraction compared to the pooled sample. Third, phases of expansion appear to have less volatility than in the full sample. These results can be seen in Tables 4,6,7 and, 5 for all returns categories, over the two samples and, for different configurations of the priors. For the purpose of inference, we report the Bayesian probability interval for the posterior distribution of the factor’s risk premium. We are also curious to check the distribution of these premiums and how it compares to the standard gaussian probability density. Figure 2 provides a histogram representation of the standardized values of the time-varying premiums for each of the risk factor j with γ j = 1. The plots are for the combined Industry and Size factors for over the period 1990 : 01 − 2003 : 12. We find similar distributional behavior for all the other configurations of priors and returns. We also show a density plot [t of the standard normal with mean zero and unit variance. The main message from this plot is that delta can be approximated by a normal distribution for inference purposes. It is worth mentioning though that these estimates are posterior means of iterates from the full posterior. These histograms do not

26

Selection of Risk Factors

Table 5: Estimates for Posterior Means of risk premiums for January and Expansion and Contraction periods for Factors Pervasive for the Fama and French Portfolios Returns. The b ML + s2 IN ; N = 93, V = (X 0 X)−1 and kmax = 21 location parameter for Wishart prior Φ = Σ β Panel A: Sample period: 1990 : 01 − 2003 : 12,T = 168

F actors

γi

γ bmode

JAN SMB UNEM U EI DEI VWRET EWRETD M OM T

0.93 0.046 0.488 0.98 0.80 0.49 1 1

1 0 1 1 1 1 1 1

F actors

γi

γ bmode

JAN MP EI M OM T V W RET EW RET

1 1 1 1 1 1

1 1 1 1 1 1

c = 4, RT M SE = 0.0541 δbEXP δbpm δbJAN pm

0.0136 -0.0009 0.0088 -0.0229 -0.0160 -0.0063 0.0222 0.1688

0.0235 4e-005 0.0004 0.0027 -0.0169 -0.0073 0.0091 -0.0075

pm

0.0006 -8e-005 -0.001 -0.0069 0.0030 0.0009 -0.0057 0.0032

CON δbpm

γi

0.0015 -4e-005 0.0003 -0.0028 -0.0029 1-0.0009 0.0147 0.0002

0 0 1 0.48 0.11 1 0

b cEB = 136.3,k = 4.60, RT M SE = 0.0538 CON EXP JAN δbpm δbpm δbpm δbpm γ bmode 0 0 1 1 0 1 0

-0.2493 -0.1090 -0.0296 0.0627 -

-0.0477 -0.0105 -0.0087 0.0712 -

-0.0622 -0.0324 -0.0066 -0.0265 -

-0.0021 -0.0349 -0.0199 0.0211 -0.0080 0.0637

0.0461 -0.0081 -0.0028 -0.0192 -0.0179 0.0420

0.0002 -0.0093 -0.0127 8e-005 0.0033 0.0449

-0.0028 -0.0021 -0.0006 -0.0044 -0.0005 0.0095

1 1

1 1

-0.1073 0.0904

0.0330 0.1136

-0.0369 0.0637

Excess Returns Predictability and Economic Fundamentals

In this section we will analyze the ”economic” composition of the posterior mean of the distribution of the risk premium α. We will consider the estimate of the likelihood of an economic variable being priced measured by γ i , the posterior mean of the distribution of the premium per unit exposure to the its risk, δbpm , the mispricing as a percentage of the total premium measured by δα0 × 100%. In addition we will consider look at the second moments of the posterior distribution and analyze the percentage of the systematic risk relative to the total risk, Sriski = (1 − V R1i ) × 100. • Decile Portfolios: The results on the factor selection in the pricing of the excess returns on the decile portfolios consistently show a very parsimonious specification in all the choices of the prior settings. For example, in the case of c = 4 and for the short subperiod of 1990 : 01 − 2003 : 12, the posterior first moments of the model specification show zero probabilities for all factors except for unemployment and industrial production variables. Monthly unanticipated changes in the industrial production index (MP ) have a unit posterior probability to be in the posterior mean model and the highest probability model. The posterior mean of the price of risk associated to MP is about 0.10%. The average mispricing measured by the posterior mean of the distribution of δ0 is equal to −0.33% or a 45%18 underpricing under the posterior mean model specification. The choice of the empirical based prior for the tuning parameter c in the g-prior for the loadings results in posterior model with only one risk factor. Book-to-market HML appears in the posterior mean and posterior mode of qγ with probability one. The premium per unit exposure to risk associated to 18

0.0121 0.0134 0.0020 0.0379 -

Panel B: Sample period: 1960 : 01 − 2003 : 12, T = 528 c = 4, RT M SE = 0.0372 b cEB = 468.8,k = 5, RT M SE = 0.0486 JAN EXP CON JAN EXP CON δbpm δbpm δbpm δbpm γi γ bmode δbpm δbpm δbpm δbpm

characterize the distribution of each factor premium at a fixed point in time but rather the distribution of over time of an average of iterates.

7.1.3

-

The average excess return on the decile portfolios for the period of 1990:01-2003:12 is equal to 0.73.

0.0190 0.0143

27

Ouysse& Kohn

Table 6: Estimates for Means of the iterates of risk premiums (percent per month) for the phases of economic Expansion and economic Contraction. Case of combined sample of Size b ML + s2 IN ; and Industry portfolio returns. The location parameter for Wishart prior Φ = Σ 2 N = 137, H β = cmax V β and kmax = 18, cmax = max{K , T }

γj δj δ EXP,j δ CON,j

Panel A: EI 1 0.0307 1.1404 0.0489 1.0282 0.1057 1.3055

Period 1960:01-2003:01, cmax = 528 DEI V W RET EW RET 1 1 1 0.0128 0.0292 0.1003 1.0638 1.0134 1.0433 0.0596 0.0736 0.0777 1.0103 0.9314 0.8238 0.0987 0.0873 0.1665 1.2619 1.2262 1.0144

Panel B: U EI 1 -0.0201 0.5585 -0.0213 0.1260 -0.0085 0.1491

Period 1990:01-2003:01 , cmax = 324 DEI V W RET EW RET 1 1 -0.0743 0.0061 0.0925 0.6498 0.8709 0.9356 -0.0147 -0.0135 0.0083 0.1641 0.2223 0.2347 -0.0113 0.0237 0.0056 0.1703 0.1960 0.1737

PK

λ0i,k δk,t . The values in the table shows time P (j) series averages of the iterates of the time varying risk premiums. For example, δ EXP,k = P 1 t t∈ζ δk,t , where ζ defines The APT pricing implications for excess return rit is : E(rit ) = δ0 +

k=1

t∈ζ

(j)

the months of economic expansion and δk,t is the iterate for the risk premium for factor k. In the table, the first row next to a given risk premium is the sample average, and the second reports the standard deviations.

the valuation effect is about 0.0329% per month, equivalent to 1.27% per year with similar average mispricing. The average size in the posterior model is about q γ = 2 and comparing the out-of-sample performance under the two priors results in qualitatively comparable mean prediction error (RTMSE ). From the description of the correlation structure of the factors, HML had a high correlation with MP (0.88) and a contemporaneous correlation of −0.21 to YP. It could be that both unanticipated changes in industrial production the the book-to-market carry information that proxies the same economic risk. Considering the longer sample period of 1960 : 01 − 2003 : 12, the risk factor MP appears again the posterior model with a negative and lower price for risk (−0.0142%). In addition, the return on the value weighted market index VWRET has a positive unit premium of 0.0710% per month. Increasing the sample period does not improve on the performance of the average posterior model. Indeed, the underpricing is more severe with an average posterior pricing error is −61% and the RTMSE is higher than in the short period case. This result is in line with the discussion about the effect of the sample size on the posterior mean of RTMSE and the posterior mean of b cEB in Table 2. An increase in the sample size affects the results through two channels. First it changes the relative importance of the prior via ηb which depends on the choice of the tuning parameter c, therefore affecting the model selection penalty. Secondly, it affects the fit of the regression. The net effect will depend on whether the structure of the pricing is time variant or not. If the risk factors are constant through time as well as their premiums, then more observations will improve identification. We finish our analysis of the Decile returns by considering the conditionally independent prior on the factor betas. The findings for V β = IK provide more evidence that economic risk factors are determining factors in the pricing of cross section of returns. In this case, changes in expected inflation and unanticipated changes in inflation have positive premiums with 0.0954% for EI and 0.0425% for UEI. The underpricing in the posterior model is about 35.6% of the realized premiums. • Industry Portfolios Table- 8 shows the estimation results for the Industry-45 portfolio returns for the case of data dependent g-prior with c = 4. The economic risk is more likely to be driven by expected inflation and changes in expected inflation, along with value/equally weighted portfolio return and the momentum factor. The premium on DEI is negative (0.54% per month) and equivalent to 6.7% annually. Chen et al. [14] argue that the negative sign could be explained by

28

Selection of Risk Factors

Table 7: Estimates for Posterior Means of risk premiums for January and Expansion and Contraction periods for Factors Pervasive for the combined size and industry portfolios. The b ML + s2 IN ; N = 137, H = cV and kmax = 18 location parameter for Wishart prior Φ = Σ β β F actors δ0 JAN MP EI UEI VWRET EWRET MOMT

δbk,pm 0.0027 0.0181 0.0139 0.0332 0.0302 0.0354 0.0458 0.0262

F actors

δbk,pm

δ0 EI VWRET EWRET

0.0081 -0.0614 -0.0599 0.0438

c = 4, RT M SE = 0.0720, T = 528 90%P I δk std(δ) δ CON [ 0.0029 ; 0.0027] 0.0039 0.0464 -0.0031 [-0.0320 ; 0.0282] -0.0014 0.1132 -0.0172 [-0.0319 ; 0.0226] -0.0047 0.0451 -0.0027 [-0.0045 ; 0.0242] 0.0080 0.1213 0.0043 [-0.0059 ; 0.0241] 0.0078 0.1381 0.0086 [ 0.0003 ; 0.0355] 0.0161 0.2309 0.0262 [ 0.0222 ; 0.0538] 0.0407 0.6622 -0.0309 [-0.0149 ; 0.0443] 0.0122 0.1413 0.0222 b cEB = 68.52, RT M SE = 0.1111, T = 168 90%P I δk std(δ) δ CON [ 0.0038 [-0.1464 [-0.1196 [ 0.0029

; ; ; ;

0.0107] 0.0774] 0.0358] 0.0782]

0.0071 -0.0412 -0.0374 0.0463

0.0440 1.167 1.2257 1.0818

0.0134 -0.3571 -0.3070 0.0724

std(δ CON ) 0.0631 0.1274 0.0538 0.1547 0.1695 0.2046 0.7696 0.1479 std(δ CON ) 0.0540 1.695 1.8961 0.9705

investors hedging against the adverse influence on assets that are fixed in nominal terms. On the other hand, EI earned a high positive premium for both subperiods and ranged from a monthly average premium of 1.14% for 1960 : 2003 to 2.77% for 1990 : 2003. Risa [42] estimated a 10 year inflation risk premium for the UK between 1983 and 1999. The premium was mostly around 2% with an initial level of 4% and a sharp drop to a slightly negative value in 1999. The momentum factor has positive effect with premium which ranging from 0.0119% per month for the 1960 − 2003 sample to 0.0297% per month for the 1990 − 2003 sample meaning that investors are rewarded for holding ”winners” stocks, i.e. those which performed well in the past 12 months. This result is in line with L’Her, Masmoudi & Suret [38]. They measure the momentum factor premium at 16.07% and find that the premium is larger in up markets and remains positive in down markets. Finally the monthly unanticipated changes in the industrial production index MP has a positive effect but much lower compared to the decile portfolios. For the recent period, the missing factor appears to have a negative premium. This means that investors are paying a premium for ensuring against all the unknown risk factors they cannot diversify away. The predictable variation in the (Bayesian) historical premium measured by the percentage of variation attributed to the (estimated) systematic risk relative to the Bayesian estimate of total risk are all within 15% with the exception of the financial industry portfolio (Fin) which has an estimated 30% of the risk due to the idiosyncratic component. • Fama & French Size portfolios Table 5 reports the post selection posterior mean of the distribution of the risk premium corresponding to factors with non-zero posterior mean probability. The results further reinforce the role of unanticipated changes in inflation. The posterior average premiums for unexpected inflation and changes in expected inflation are negative similar to what described for industry returns and to those of previous studies. The magnitudes are however much higher ranging from 1.60% per month for DEI to 2.29% for UEI. The posterior mean of the employment rate indicator variable is 0.48 with a probability one for UNEM factor to be in the highest probability model. The average premium is positive (0.88% per month) which reflects investors’ dislike for (surprise) increases in unemployment rate. MOMT factor requires yet again a positive and high premium reflecting the strong momentum effect in the size returns. • Combined Portfolio returns For the combined sample of Fama & French , and the industry portfolios, Table 6 shows that largely the premiums on the inflation factors are similar to those described for industry and/or size returns. The same is true for the sign of momentum factor with a lower

29

Ouysse& Kohn

Table 8: Estimates for Posterior Means for price of risk for Factors Pervasive for the Industry b ML + s2 IN and V = Portfolios Returns. The location parameter for Wishart prior Φ = Σ β (X 0 X)−1 ; N = 43 and kmax = 21

F actors M ark − Rf JAN CG U N EM MP EI DEI V W RET EW RET M OM T M ISS

Panel A: Sample period: 1960 : 01 − 2003 : 12, T = 528 c = 4, RT M SE = 0.0640 δbCON δbpm δbEXP γ γ bmode

Panel B: Sample period: 1990 : 01 − 2003 : 12,T = 168 c = 4,k = 8.41, RT M SE = 0.0659 δbCON δbpm δbEXP γ γ bmode

0 0.03

0 0

-0.00024

-0.0002

-0.0002

0.03 1 1 0 1 1 1 0

0 1 1 0 1 1 1 0

0.0014 0.0020 0.0114 -0.0019 -0.0360 0.01195 -

0.0010 0.0036 -5e-005 0.0007 -0.0243 0.0083 -

6.4e-005 0.0056 0.0027 0.004 0.0021 0.0297

0.871 0.032 0.354 0 1 1 1 1 0.161

pm

i

pm

pm

i

1 0 0 1 1 1 1 0

-0.0049 -0.0007 -0.0125 0.0277 -0.0054 -0.0150 0.0297 -0.0029

pm

2.9e-005 0.0001 0.0016

-0.0005 0.0005 0.0090

0.0075 -0.0072 -0.0105 -0.0032 -0.0004

0.0011 0.0045 0.0086 -0.0032 -0.0001

Table 9: Estimates for price of risk for Factors Pervasive for the Combined Industry and FF b ML + s2 IN . N = 137 size Portfolios Returns. The location parameter for Wishart prior Φ = Σ 0 −1 and kmax = 21, V β = (X X) F actors δ0 EI U EI DEI V W RET EW RET p 2 qQ e2 Q

Panel A: T = 168, cmax = 441, RT M SE = 0.1280 γi δbpm 95%P I 1 0.0079 [ 0.0053 ; 0.0118] 1 -0.0018 [-0.1119 ; 0.0554] 1 -0.0551 [-0.1482 ; 0.0015] 1 0.0019 [-0.0322 ; 0.0362] 1 0.1026 [ 0.0604 ; 0.1283] mean% std% 95%CI(%) 0.7202 0.1048 [ 0.5789 ; 0.8807]

Panel B: T = 528, cmax = 528, RT M SE = 0.1545 γi δbpm 95%P I 1 0.0048 [-0.0029 ; 0.0083] 1 -0.0098 [-0.0837 ; 0.1886] 1 -0.0153 [-0.0891 ; 0.1566] 1 0.0324 [ 0.0183 ; 0.0468] 1 0.1007 [ 0.0698 ; 0.1338] mean% std% 95%CI(%) 0.4467 0.0424 [ 0.3950 ; 0.5165]

0.1063

0.0685

0.0174

[ 0.0839 ; 0.1396]

0.0071

[0.0589 ; 0.0815]

premium when the industry returns are added. This sample returns however shows that other macroeconomic variables are potentially important in the pricing relationship. Results for the 1990 : 2003 further suggest that increases in unemployment rate require compensation in the post selection model with c = 4. The growth of nondurables consumption requires a negative risk premium (0.03% per month). This is in contrast to the higher and positive premium reported by [28] (0.32%) and the non significance result found by [14]. However, this result is in line with Bansal & Yaron [5] which found negative premium (about −0.4% and −1.2% for 12 and 36 months real bonds respectively) suggesting that real bonds offer a consumption insurance to investors. The size and book-to-market factors have 100% probability to be in the ”promising” pricing model. The sign on the average premium for SMB is negative suggesting a reversal of the size effect. The negative premium on the missing factor reinforces the investors’ dislike for unaccounted risk. For the period 1960 : 2003, the posterior model is much smaller with an average of 10 factors. Only the inflation risk factors, momentum and changes in industrial production appear in the pricing model.

30

8

Selection of Risk Factors

Conclusion

In this paper, we adapt recent advances in Bayesian analysis to an approximate factor model to investigate the role or otherwise of measured economic variables in the pricing of securities. The methodology provides a flexible framework to deal with both the severe small sample problems and the post model selection inference. We reinvestigate the identification of the factors and testing of the asset pricing theories simultaneously by treating the set of factors as an unobserved random variable. Post model selection inference about the APT implications and the pricing relationship are based on probability distributions that integrate the uncertainty about the set of factors in the returns’ data generating process. The MCMC sampling scheme we use allows for an efficient and fast sweep of the space of all possible combinations of subset of factors. In our priors about the factor structure, we treat economic factors and Fama and French size and book-to-market factors as equally likely to be in the ”true” subset of risk factors if in fact such a model does exist. The number of measured variables and statistical factors is endogenously determined and this uncertainty is integrated out in the posterior distribution of the risk premiums, factor betas and any functions of the parameters. We evaluate the posterior model pricing ability by measuring the deviation between the realized and the model implied systematic risk premium. In particular, we evaluate the viability of the Fama and French three risk factor in both the time series and the pricing of the returns on three categories of portfolios: Decile, Industry and Size returns. To understand the link between the market price and the changes in economic conditions, we estimate the time-varying reward investors demand for the exposure to each economic risk factor. For robustness purposes, we compare how the data change the priors beliefs about the market exposure to the economic risk into posterior distributions over a range of prior parameters and over two sample periods. Although there are some differences in the set of factors appearing in the posterior model depending on the time period and the prior setting, these were nested subset of each other. The findings are as follows. First, we find no robust evidence that Fama and French (FF) size, book-to-market and market portfolios are priced risk factors across the universe of returns. Indeed with exception to the decile portfolio returns where the valuation factor (HML) emerged as a priced risk factor with positive risk premium, the posterior probabilities of inclusion of these factors (individually or as a subset) in the pricing model are equal to zero.19 This overwhelming evidence lead us to assert that multiple betas from a multi-factor model absorb the role of size and book-to-market equity and that the pricing of securities can ultimately be determined by systematic economic risks instead of firm-specific variables. This finding is not at odd with the recent empirical evidence in the finance literature. After the long overwhelming empirical studies that supported the viability of the FF model, Cremers [19] uses a Bayesian framework along with a new inefficiency metric based on the maximum correlation between the market portfolio and any multifactor-efficient portfolio to show that neither FF factors nor the momentum factor improve the pricing performance relative to the CAPM. He goes on to argue that, ...the pricing performance of the FF factors is not robust to the choice of test factors. Specifically the good performance of the FF-model for the average, absolute pricing errors when we use the BM/size-sorted test portfolios is contradicted by its poor performance for the industry-sorted test portfolios. (Cremers, 2006, p.2993-2994) Second, the posterior mean model had at least one of the variables that measure changes in inflation appear to be pervasive for the pricing of the three group of portfolio returns over the two sample periods. Changes in expected inflation and unexpected inflation all had a unit probability to be in the posterior mean and highest probability model. In addition, monthly changes in the industrial production index and changes in the unemployment level are likely to be priced with Third, seasonality appears to be a characteristic of FF size portfolios only with a positive premium for the month of January. 19

The prior probability for inclusion of each variable is equal to 12 . This result was robust to an experiment where we dropped all the other economic factors and performed a factor selection with only FF factors. This resulted in a posterior mean model featuring only the value and equally weighted market indices.

Ouysse& Kohn

31

Finally, we have demonstrated the time-varying nature of the factor risk premium and shown that their dynamics are tightly linked to the economy. We found evidence of higher risk premium level and volatility during times of economic contraction and reverse behavior during expansion. These changes in the sign and magnitude of the economic factors’ risk premiums can be attributed to changes in the hedging capability of alternative securities with no significant risk premium. This article opens the door for further investigation of the pricing of securities. First, we have considered a cross-section of 137 portfolios. The original equilibrium APT applies to the ”universe” of traded assets and therefore, extension to individual assets is of interest. However, one needs to deal with the dimensionality of the covariance matrix. Second, we have considered only observed and measured variables as potential risk factors. Augmenting the model to allow for latent factors is a viable alternative to estimate the missing factor(s).

References [1] Ahmed, P. and Lockwood,L. J. (1998), Changes in Factor Betas and Risk Premiums Over Varying Market Conditions,The Financial Review, Vol. 33, pp. 149-168. [2] Anderson, T. W. and Amemiya, Y., (1988), The Asymptotic Normal Distribution of Estimators in Factor Analysis under General Conditions, Annals of Statistics, Vol. 16, 759-771. [3] Bai, J. (2003), Inferential Theory for Factor Models of Large Dimensions, Econometrica, Vol. 71, pp. 135-172. [4] Bai, J. and Ng, S. (2002), Determining the Number of Factors in Approximate Factor Models, Econometrica, Vol. 70, pp. 191-221. [5] Bansal, R. and Yaron, A., (2004), Risks for the Long Run: A Potential Resolution Of Asset Pricing Puzzles, Journal of Finance, Vol. 59, pp. 1481-1509. [6] Balduzi, P. and Robotti, C. (2007), Mimicking Portfolios, Economic Risk Premia, and Tests of Multi-beta Models, Federal Reserve Bank of Atlanta in its series Working Paper 2005-04. [7] Bartlett, M. (1957), A comment on D. V. Lindley’s Statistical Paradox. Biometrika Vol. 44, pp. 533-534. [8] Berger, J. O., Ghosh, J. K. and Mukhopadhyay, N. (2003), Approximations and Consistency of Bayes Factors as Model Dimension Grows. Journal of Statistical Planning and Inference, Vol. 112, pp. 241-258. [9] Brown, P. J., Vannucci, M. and Fearn, T. (1998), Multivariate Bayesian Variable Selection and Prediction, Journal of the Royal Statistical Society, series B, Vol. 60, Part 3, pp. 627-641. [10] Brown, P. J., Vannucci, M. and Fearn, T. (1999), The Choice of Variables in Multivariate Regression: A Non-conjugate Bayesian Decision Theory Approach. Biometrika Vol. 86, pp. 635-648. [11] Burmeister, E. and McElroy, B. M.,(1988), Joint Estimation of Factor Sensitivities and Risk Premia for the Arbitrage Pricing Theory. Journal of Finance, Vol. XLIII, No. 3, pp. 721-733. [12] Carlin, B. P. and Chib, S. (1995), Bayesian Model Choice Via Markov Chain Monte Carlo Methods. Journal of the Royal Statistical Society, Series B, Vol. 57, pp. 473-484. [13] Chamberlin, G. and Rothschild, M. (1983), Arbitrage, Factor Structure and Mean-Variance Analysis in Large Asset Markets, Econometrica 51, 1305-1324. [14] Chen, N. F., Roll, R. and Ross, S., A. (1986), Economic Forces and the Stock Market, Journal of Business Vol. 59, pp. 383-403. [15] Chipman H., George E. I. and McCulloch R. E. (2001), The Practical Implementation of Bayesian Model Selection, ims Lecture Notes- Monograph Series, Vol. 38, pp. 65-134. [16] Connor, G. (1984), A Unified Beta Pricing Theory, Journal of Economic Theory, Vol.34, pp. 13-31.

32

Selection of Risk Factors

[17] Connor, G., Korajczyk, R.A., (1988), Risk and Return in an Equilibrium APT: Application of a new test methodology, Journal of Financial Economics, Vol. 21, pp. 255-289. [18] Connor, G., Korajczyk, R.A., (1993). A test for the number of factors in an approximate factor model. Journal of Finance 48, 1263-1291. [19] Cremers M. K., j. (2006), Multifactor Efficiency and Bayesian Inference, Journal of Business, Vol. 79, no. 6, pp.2951-2998. [20] Dimson, E., Marsh, P. and Staunton, M. (2003), Global Evidence on the Equity Risk Premium,Journal of Applied Corporate Finance, Vol. 15, no. 4, pp.08-19. [21] Donoho, D. L. and Johnstone, I.M. (1994), Ideal Spatial Adaptation by Wavelet Shrinkage, Biometrika, Vol. 81, pp. 425-456. [22] Elton, E. J., Gruber, M. J. and Blake, C. R. (1995), Fundamental Economic Variables, Expected Returns, and Bond Fund Performance, The Journal of Finance, Vol. 50, No. 4, pp. 1229-1256. [23] Fama, E. F., and French, K. R. (1993), Common Risk Factors in the Returns on Stocks and Bonds, Journal of Financial Economics Vol. 33, No. 1, pp. 3-56. [24] Fama, E. F., and French, K. R. (1995), Size and Book to market Factors in Earnings and Returns, Journal of Financial Economics Vol. 50, No. 1, pp. 131-156. [25] Fama, E. F., and French, K. R. (1996), Multifactor explanationsof asset pricing anomalies, Journal of Financial Economics Vol. 51, No. 1, pp. 55-84. [26] Fama, E. and MacBeth, J. (1973), Risk, Return, and Equilibrium: Empirical Tests, Journal of Political Economy, Vol. 81, No. 3, pp. 607-636. [27] Fernandez, C., Ley, E. and Steel, M. (2001), Benchmark Priors for Bayesian Model Averaging, Journal of Econometrics Vol. 100, pp. 381-427. [28] Ferson, E. W., and Harvey, R. C. (1991), The Variation of Economic Risk Premiums, Journal of Political Economy, Vol. 99, No. 2, pp. 385-415. [29] Ferson, W., and Korajczyk, R. A., (1995), Do Arbitrage Pricing Models Explain the Predictability of Stock Returns?, Journal of Business, Vol. 68, No. 3, pp. 309-349. [30] Foster, D. P., and George, E. I. (1994), The Risk Inflation Criterion for Multiple Regression, The Annals of Statistics, Vol. 22, pp. 1947-1975. [31] George, E. I. and Foster, D. P. (2000), Calibration and Empirical BAyes variable selection, Biometrika Vol. 87, No. 4, pp. 731-747. [32] Geisser, S. and Cornfield, J. (1963), Posterior Distributions for Multivariate Normal Parameters, Journal of the Royal Statistical Society, Ser. B Statistical Methodology, Vol. 25, pp. 368-376. [33] George, E. and McCulloch, R. E. (1993), Variable Selection Via Gibbs Sampling. Journal of the American Statistical Association, Vol. 88, pp. 881-889. [34] Geweke, J. and Zhou, G., (1996), Measuring the Pricing Error of the Arbitrage Pricing Theory, The Review of Financial Studies, Vol.9, N0. 2, 557-587. [35] Haug, M. and Hirschey, M. (2006), The January Effect, Financial Analyst Journal, Vol. 62(5), pp. 78-88. [36] Kohn, R,, Smith, M. and Chan , D. (2001), Nonparametric Regression Using Linear Combinations of Basis Functions, Statistics and Computing, Vol. 11, pp. 313-322 [37] Kryzanowski, L. & To, M. C. (1983), General Factor Models and the Structure of Security Returns, Journal of Financial and Quantitative Analysis, Vol. 18, N0.1 pp. 31-52. [38] L’Her, J.F., Masmoudi,T. and Suret, J. M., (2004, Evidence to Support the Four-Factor Pricing Model from the Canadian Stock Market, Journal of International Financial Markets, Institutions & Money, Vol. 14, no. 4 pp. 313-328.

33

Ouysse& Kohn

[39] Nardari, F. and Scruggs, j. (2006), Linear Factor Models with Latent Factors, Multivariate Stochastic Volatility, and APT Pricing Restrictions, Journal of Financial and Quantitative Analysis, forthcoming. [40] Ouysse, R., (2006), Consistent Variable Selection In Large Panels When Factors are Observable, Journal of Multivariate Analysis, Vol. 97, pp. 946-984. [41] Liang F., Paulo R., Molina G., Clyde M. A. and Berger J., (2007), Mixtures of g-priors for Bayesian Variable Selection, Unpublished Manuscript. [42] Risa,, S. (2001), Nominal and Inflation Indexed Yields: Separating Expected Inflation and Inflation Risk Premia, Available at SSRN: http://ssrn.com/abstract=265588 or DOI: 10.2139/ssrn.265588 [43] Ross, S. A., (1976), The Arbitrage Theory of Capital Asset Pricing, Journal of Economic Theory, Vol. 13, pp. 341-360. [44] Rozeff, M. S., and Kinney, W. R., (1976), Capital Market Seasonality: The Case of Stock Returns, Journal of Financial Economics, Vol. 3, no. 4(October), pp. 379-402. [45] Smith, M. and Kohn, R. (1996), Nonparametric Regression using Bayesian Variable Selection. Journal of Econometrics, Vol. 75, pp. 317-343. [46] Zellner, A. (1986), Further Results on Bayesian Minimum Expected Loss (MELO) Estimates and Posterior Distributions for Structural Coefficients, in D. Slottje, Advances in Econometrics, Vol. 5, pp. 171-182.

9

Data Appendix

From Roll and Ross (1986), we construct the following economic factors: • Consumption growth CG: growth rate in real per capita consumption constructed by dividing the series of seasonally adjusted real personal consumption (excluding durables) by the population. The series are from F RED (Federal reserve bank of St. Louis). • Term structure of interest rate U T S: the series n a long term government bond and the lagged return on one month bills are from CRSP US Treasury and Inflation Indices. • Risk premium U RP : the return on low grade bonds (Moody’s seasonally adjusted Baa corporate bond yield) and a long term government bond are from CRSP. • Monthly growth of industrial production M P (t) and annual growth of industrial production Y P (t): the industrial data are from St. Louis Fed. • Expected inflation EI(t): constructed using Fama and Gibbons (1984). • Unexpected inflation U EI(t) = I(t) − E(I(t)|t − 1). • Change in expected inflation DEI(t) = E(I(t + 1)|t) − E(I(t|t − 1)). • Financial market indices: the return on value weighted including dividends V W RET D and V W RET X excluding dividends. Equally weighted return including /and excluding dividends EW RT D and EW RET X portfolios of N Y SE listed stocks. • The unemployment rate U N : Civilian unemployment rate percent, seasonally adjusted. Source: Bureau of Labor Statistics (F RED). 43 Industries.

Agric(1) Clths(9) Steel(17) Oil(25) P aper(33) RlEst(41)

F ood(2) M edEq(10) M ach(18) U til(26) Boxes(34) F in(42)

Beer(3) Drugs(11) ElcEq(19) T elcm(27) T rans(35) Other(43)

Smoke(4) Chems(12) Autos(20) P erSv(28) W hlsl(36)

T oys(5) Rubbr(13) Aero(21) BusSv(29) Rtail(37)

F un(6) T xtls(14) Ships(22) Comps(30) M eals(38)

Books(7) BldM t(15) M ines(23) Chips(31) Banks(39)

Hshld(8) Cnstr(16) Coal(24) abEq(32) Insur(40)

Selection of Risk Factors

34

JAN MARKET SMB HML CG UTS URP UNEM MP YP PPI EI UEI DEI VWRET EWRET MOMT MISS

X1 1.00 -0.05 -0.06 0.00 0.02 0.08 -0.09 -0.10 -0.07 0.05 0.14 0.14 0.16 0.16 0.12 -0.04 0.15

X2

1.00 -0.00 -0.09 0.14 0.06 0.13 0.05 0.06 0.35 -0.10 -0.10 -0.12 -0.12 -0.03 0.04 -0.04

X3

1.00 0.66 -0.17 -0.34 0.14 0.80 -0.21 -0.04 0.03 0.02 0.04 0.03 -0.00 -0.01 -0.00

X4

1.00 -0.05 -0.37 -0.01 0.31 -0.06 -0.07 0.10 0.09 0.11 0.10 0.04 -0.03 0.01

X5

1.00 0.40 0.03 -0.15 -0.15 0.20 -0.06 -0.06 -0.10 -0.10 -0.04 0.05 -0.07

X6

1.00 0.02 -0.14 -0.23 0.13 -0.11 -0.10 -0.14 -0.14 -0.11 0.02 -0.00

X7

1.00 0.26 0.32 0.10 -0.10 -0.11 -0.11 -0.12 0.03 0.00 0.03

X8

1.00 -0.25 -0.07 -0.01 -0.02 -0.01 -0.02 -0.04 0.00 0.02

X9

1.00 0.12 -0.03 -0.07 -0.00 -0.00 0.10 0.01 -0.00

X10

1.00 -0.12 -0.17 -0.19 -0.19 -0.04 0.09 -0.04

X11

1.00 0.99 0.93 0.93 0.15 -0.09 0.27

X12

1.00 0.93 0.93 0.15 -0.098 0.27

X13

1.00 0.99 0.32 -0.21 0.27

X14

1.00 0.32 -0.22 0.27

X15

1.00 -0.00 0.04

X16

1.00 -0.15

X17

1.00

X18

Table 10: Correlation matrix for the measured economic, financial and equity based potential risk factors.

1.00 -0.08 -0.00 -0.00 0.00 0.05 0.00 0.13 0.00 -0.01 -0.02 0.05 0.05 0.11 0.12 0.19 -0.18 -0.03

Suggest Documents