Consistent selection rules for the number of dynamic factors in approximate factor models

Jörg Breitung∗ University of Bonn

Uta Pigorsch† University of Mannheim

This version: January 2009

Abstract

Determining the number of dynamic factors in large panel data sets is an important issue for many economic applications. In this paper we develop two selection procedures that consistently estimate the number of dynamic factors in a dynamic factor model. The procedures are based on a canonical correlation analysis of the static factors obtained from a principal component analysis. Compared to other selection procedures, our approach has the advantage that it does not assume a finite order autoregressive representation of the common factors, and that it is invariant to a rotation of the factor space. Monte Carlo simulations show that the proposed selection rules outperform the existing ones based on principal components. The new selection procedures are also applied to the U.S. macroeconomic data panel used in Stock and Watson (2005).

∗ Address: University of Bonn, Institute of Econometrics, Adenauerallee 24-42, 53113 Bonn, Germany, phone: +49 (0)228 73 9201, fax: +49 (0)228 73 9189, email: [email protected]. † Address: University of Mannheim, Department of Economics, L7, 3-5, 68131 Mannheim, Germany, phone: +49 (0)621 181 1945, fax: +49 (0)621 181 1931, email: [email protected].

1 Introduction

In many economic applications it is very appealing to represent a large number of series by a small number of latent factors. In macroeconomics, for example, factor models have been used in business cycle analysis (see e.g. Forni and Reichlin, 1998 and Giannone, Reichlin, and Sala, 2006) and in the identification of common macroeconomic or policy shocks (see e.g. Favero, Marcellino, and Neglia, 2005, Forni, Lippi, and Reichlin, 2008 and Stock and Watson, 2005). Recently, it has also been shown that forecasts based on a small number of so-called diffusion indices, which summarize a huge number of candidate predictor variables, achieve smaller forecast errors than alternative techniques based on ARIMA models or small structural economic models (see e.g. Angelini, Henry, and Mestre, 2001, Brisson, Campbell, and Galbraith, 2003, Eickmeier and Ziegler, 2008, Marcellino, Stock, and Watson, 2003, and Stock and Watson, 1999, 2002a,b).

Knowing the correct number of common factors is crucial for all of these applications. Bai and Ng (2002) propose information criteria that can be used to consistently select the number of common factors in a static factor model with both N and T converging to infinity. The shortcoming of such a factor model, however, is the assumption of static factors. In view of the common fluctuations of many economic series, it seems more realistic to specify dynamic, e.g. autoregressive, factors that explain the comovement of the observed series. A leading example of such a dynamic factor model is considered in Stock and Watson (2002a). There, a large number of series is represented by a small number of dynamic factors whose lags may also enter the factor representation.
Applying the information criteria of Bai and Ng (2002) yields a consistent estimate of the number of common factors in this so-called static representation of the factors.¹ An important drawback of this approach, however, is that it cannot disentangle the dynamic factors from their lags and, thus, the estimated number of factors is equal to the sum of the number of dynamic factors and their lags in the factor representation. Forni, Hallin, Lippi, and Reichlin (2000), therefore, suggested informal criteria for determining the number of dynamic factors. Their approach is based on the computation of the variance explained by the common components estimated from the Stock-Watson procedure. The procedure is also applied in Favero et al. (2005) and Giannone, Reichlin, and Sala (2002).

¹ Pesaran (2006) suggested a methodology that only assumes an upper bound for the number of common factors. Thus, this approach sidesteps the problem of determining the number of common factors.

A weakly consistent selection rule for the number of dynamic (or “primitive”) factors (k) is suggested by Bai and Ng (2007). Their empirical procedure is based on the fact that fitting a vector autoregressive (VAR) model to the vector of r > k principal components yields an asymptotically singular covariance matrix of the residuals. Therefore, the r − k smallest eigenvalues of the residual covariance matrix converge to zero as N and T tend to infinity. Based on the eigenvalues of the residual covariance matrix, Bai and Ng (2007) derive two statistics and corresponding asymptotic bounds that allow for a consistent selection of the number of dynamic factors. A related test procedure is suggested by Amengual and Watson (2007) and Stock and Watson (2005). Their empirical procedure is based on the innovations of the vector of time series. Since the factor representation of the innovations involves k common components (instead of the r factors of the original time series), they suggest applying the information criteria of Bai and Ng (2002) for selecting the number of static factors to these innovations; the resulting number corresponds to the number of dynamic factors of the observed data.

In this paper two alternative specification criteria for the number of dynamic factors are proposed. Our approach is based on a canonical correlation analysis (CCA) between the current and past values of the r “static” (or reduced form) factors. The first selection procedure is based on the fact that lagged values of common factors are perfectly predictable conditional on the past of the factors and, therefore, the associated eigenvalues of the CCA tend to unity as the sample size tends to infinity, whereas the remaining eigenvalues converge to values smaller than one.
The second selection procedure exploits the fact that if a sufficient number of lags of the dynamic factors enter the static representation of the factors, then there exist k linear combinations that cannot be predicted based on the past of the factors and, therefore, the respective eigenvalues converge to zero. To test the hypothesis that k eigenvalues are equal to zero, we propose a likelihood ratio statistic.

The CCA procedure has several advantages. First, it can also be applied to the approximate (instead of a strict) factor model. Second, our first selection procedure is not based on a finite order VAR representation of the dynamic factors. Therefore, in contrast to the approach of Bai and Ng (2007) our selection procedure can also be applied if the common factors are generated by moving average processes. Third, the selection procedures based on the CCA are invariant to any rotation of the factor space. Finally, our Monte Carlo simulations suggest that the CCA criteria are less sensitive to nuisance parameters and generally outperform alternative selection procedures.

The rest of the paper is organized as follows. In Section 2 we introduce the factor model and review the existing selection procedures for the number of dynamic factors. Section 3 proposes two selection procedures based on canonical correlations. In Section 4 we compare the finite sample properties of the alternative selection methods and apply the new procedures to the dataset of Stock and Watson (2005). Section 5 concludes.

2 Determining the number of dynamic factors

Following Stock and Watson (1999, 2002a,b) we consider the (restricted) dynamic factor model of the form²

\[ x_t = A_0 f_t + \cdots + A_m f_{t-m} + u_t \tag{1} \]
\[ \phantom{x_t} \equiv \tilde{A} \tilde{F}_t + u_t, \tag{2} \]

where $x_t$ and $u_t$ are $N$-dimensional column vectors, $f_t$ is a $k \times 1$ vector with $k \ll N$, $\tilde{A} = [A_0, \ldots, A_m]$ and $\tilde{F}_t = [f_t', \ldots, f_{t-m}']'$. For some part of our analysis we assume that the vector of common dynamic factors possesses a VAR(p) representation

\[ f_t = \Gamma_1 f_{t-1} + \cdots + \Gamma_p f_{t-p} + \varepsilon_t, \tag{3} \]

where $\Gamma_1, \ldots, \Gamma_p$ and $\Sigma = E(\varepsilon_t \varepsilon_t')$ are $k \times k$ matrices. However, for our first selection criterion we do not assume a finite order VAR representation and just assume that $T^{-1} \sum_{t=1}^{T} f_t f_t' \overset{p}{\to} \Sigma_f$ is a positive definite matrix.

It is important to note that $\tilde{A}$ need not have full column rank. For example, a subset of the factors may not enter with all lags. In this case the respective columns of $\tilde{A}$ are zero. Let $r \le (m+1)k$ be the rank of the matrix $\tilde{A}$. Then there exists an $N \times r$ matrix $A$ such that

\[ \tilde{A} \tilde{F}_t = A F_t, \]

where $F_t = R \tilde{F}_t$ and $R$ is a nonsingular $r \times (m+1)k$ matrix. $F_t$ is called the vector of static factors.

² Note that the model is a restricted version of the general dynamic factor model developed in Forni et al. (2000) and Forni and Lippi (2001) since we assume that the factors enter with finite order lag polynomials. An identification procedure for the number of dynamic factors in a more general dynamic framework is proposed in Hallin and Liska (2007).
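The equivalence between the dynamic form (1) and the stacked form (2) can be checked numerically. The following is a minimal sketch, assuming NumPy and purely illustrative dimensions and white-noise factors (none of these choices come from the paper). It also illustrates the rank remark: setting one column of $A_1$ to zero lowers the rank of $\tilde{A}$ from $(m+1)k$ to $r = 3$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, k, m = 8, 50, 2, 1          # illustrative sizes (assumptions)

# Dynamic factors f_t (plain white noise here) and loadings A_0, ..., A_m.
f = rng.standard_normal((T, k))
A = [rng.standard_normal((N, k)) for _ in range(m + 1)]
u = 0.1 * rng.standard_normal((T, N))

# Equation (1): x_t = A_0 f_t + ... + A_m f_{t-m} + u_t for t = m, ..., T-1.
x_dynamic = np.array([
    sum(A[j] @ f[t - j] for j in range(m + 1)) + u[t]
    for t in range(m, T)
])

# Equation (2): stack the loadings and the current and lagged factors.
A_tilde = np.hstack(A)                                        # N x (m+1)k
F_tilde = np.hstack([f[m - j:T - j] for j in range(m + 1)])   # (T-m) x (m+1)k
x_static = F_tilde @ A_tilde.T + u[m:]

# Reduced-rank case: the second factor does not enter with its lag.
A_restricted = [A[0], A[1].copy()]
A_restricted[1][:, 1] = 0.0
r = np.linalg.matrix_rank(np.hstack(A_restricted))
```

Both constructions give the same data, so any estimator applied to $x_t$ only sees the static factors, which is why the dynamic factors have to be recovered in a second step.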

We assume that the idiosyncratic error $u_t$ is weakly correlated across series and time, whereas the common factors $f_t, \ldots, f_{t-p}$ give rise to a strong correlation among the series.³ To identify the number of dynamic factors (or primitive factors) $f_t$ from the static factors $F_t$, Favero et al. (2005) and Forni et al. (2000) inspect the fraction of variance explained by the common components. An important problem of this criterion is that the choice of the cut-off point is rather arbitrary. Therefore, Amengual and Watson (2007), Bai and Ng (2007) and Stock and Watson (2005) suggest consistent selection rules for the number of dynamic factors. In the following we briefly review these approaches.

For ease of exposition we assume for the moment that $\tilde{A}$ has full column rank such that $A = \tilde{A}$ and $F_t = \tilde{F}_t$. Furthermore, assume that $m \ge p$ so that the vector $F_t$ can be represented by a VAR(1) system given by

\[ F_t = C F_{t-1} + \eta_t, \tag{4} \]

where

\[ C = \begin{bmatrix} \Gamma_1 & \Gamma_2 & \cdots & \Gamma_{m-1} & \Gamma_m \\ I_k & 0 & \cdots & 0 & 0 \\ 0 & I_k & & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I_k & 0 \end{bmatrix}, \]

$\eta_t = [\varepsilon_t', 0]'$ and $\Gamma_j = 0$ for $j > p$. As shown by Bai and Ng (2002) and Stock and Watson (2002a), the Principal Components (PC) estimator $\hat{F}_t$ is a consistent estimator for $H F_t$, where $H$ is an appropriately chosen rotation matrix. Therefore, if $F_t$ is replaced by $\hat{F}_t$ the VAR becomes

\[ \hat{F}_t = C^* \hat{F}_{t-1} + e_t. \tag{5} \]

In this representation $C^* = H C H^{-1}$ and, for $N \to \infty$ and $T \to \infty$, the asymptotic covariance matrix of $e_t$ is $\Sigma_e = H \Sigma_\eta H'$, where $\Sigma_\eta = E(\eta_t \eta_t')$. Obviously, $\Sigma_e$ is of rank $k$ since $\mathrm{rk}(\Sigma_\eta) = k$. Thus, the number of dynamic factors can be obtained by determining the rank of $\Sigma_\eta$ (or $\Sigma_e$). Bai and Ng (2007), for example, propose a selection rule based on the ordered eigenvalues $\hat{c}_1 \ge \cdots \ge \hat{c}_r$ of the estimated residual covariance matrix

\[ \hat{\Sigma}_e = T^{-1} \sum_{t=2}^{T} \hat{F}_t \hat{F}_t' - T^{-1} \left( \sum_{t=2}^{T} \hat{F}_t \hat{F}_{t-1}' \right) \left( \sum_{t=2}^{T} \hat{F}_{t-1} \hat{F}_{t-1}' \right)^{-1} \left( \sum_{t=2}^{T} \hat{F}_{t-1} \hat{F}_t' \right). \]

³ These properties are formalized in the assumptions given in e.g. Bai and Ng (2002) and Stock and Watson (2002a).
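The rank deficiency of the residual covariance matrix can be seen in a small numerical sketch. It assumes NumPy, a single AR(1) dynamic factor ($k = 1$, $m = 1$, so $r = 2$), and, as a simplification that is not part of the paper's procedure, uses the true static factors instead of PC estimates so that the singularity is exact rather than asymptotic. The helper name `var1_residual_cov` is hypothetical.

```python
import numpy as np

def var1_residual_cov(F):
    """Residual covariance from regressing F_t on F_{t-1}; F is T x r."""
    F0, F1 = F[1:], F[:-1]
    T = F.shape[0]
    S00 = F0.T @ F0 / T
    S01 = F0.T @ F1 / T
    S11 = F1.T @ F1 / T
    return S00 - S01 @ np.linalg.solve(S11, S01.T)

rng = np.random.default_rng(1)
T, gamma = 500, 0.5               # illustrative values (assumptions)
eps = rng.standard_normal(T + 1) * np.sqrt(1 - gamma**2)
f = np.zeros(T + 1)
for t in range(1, T + 1):
    f[t] = gamma * f[t - 1] + eps[t]

# Static factors F_t = [f_t, f_{t-1}]'.
F = np.column_stack([f[1:], f[:-1]])

# Ordered eigenvalues c_1 >= c_2 of the residual covariance matrix.
c = np.sort(np.linalg.eigvalsh(var1_residual_cov(F)))[::-1]
```

The second component of $F_t$ is an exact linear function of $F_{t-1}$, so the smallest eigenvalue is zero up to rounding while the largest reflects the innovation variance. With estimated factors the small eigenvalues are only approximately zero, which is why bounds such as those of Bai and Ng (2007) are needed.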

They suggest two test statistics based on the eigenvalues of $\hat{\Sigma}_e$:

\[ \hat{D}_{1,k^*} = \frac{\hat{c}_{k^*+1}}{\sqrt{\sum_{j=1}^{r} \hat{c}_j^2}} \tag{6} \]

\[ \hat{D}_{2,k^*} = \left( \frac{\sum_{j=k^*+1}^{r} \hat{c}_j^2}{\sum_{j=1}^{r} \hat{c}_j^2} \right)^{1/2}. \tag{7} \]

Since $\hat{\Sigma}_e \overset{p}{\to} \Sigma_e$ as $N$ and $T$ tend to infinity, it follows that $\hat{D}_{j,k^*} \overset{p}{\to} 0$ for $k^* \ge k$ and $j = 1, 2$. Specifically, Bai and Ng (2007) show that the estimates of the number of dynamic factors

\[ \hat{k}_j = \min\{k^* : \hat{D}_{j,k^*} < m^* / \min(N^{2/5}, T^{2/5})\} \tag{8} \]

are (weakly) consistent in the sense that $\hat{k}_j \overset{p}{\to} k$ as $N, T \to \infty$. Based on Monte Carlo simulations, Bai and Ng (2007) recommend the value $m^* = 1$.

An alternative selection procedure is suggested by Amengual and Watson (2007) and Stock and Watson (2005). They assume a factor model where the idiosyncratic components possess the autoregressive representation

\[ \alpha_i(L) u_{it} = \nu_{it}, \qquad i = 1, \ldots, N, \tag{9} \]

with $\alpha_i(L) = 1 - \alpha_1 L - \cdots - \alpha_{q_i} L^{q_i}$, where $q_i$ is a variable-specific lag order. Consider the vector of innovations:

\[ \xi_t = x_t - E(x_t \mid x_{t-1}, x_{t-2}, \ldots) = A_0 \varepsilon_t + \nu_t, \tag{10} \]

where $\nu_t = [\nu_{1t}, \ldots, \nu_{Nt}]'$. It follows that the number of common factors $\varepsilon_t$ in (10) is identical to the number of dynamic factors $k$. Therefore, the selection criteria suggested by Bai and Ng (2002) can be used to determine $k$ from the factor model (10). To obtain an estimator of $\xi_t$, Amengual and Watson (2007) and Stock and Watson (2005) suggest a two-step procedure. First, the $r$ static factors in $F_t$ are estimated from $x_t$ by using the usual PC estimator. In a second step, an estimate of the element $\xi_{it}$ is obtained as the residual of a regression of $x_{it}$ on lags of $\hat{F}_t$ and $x_{it}$.⁴ The number of dynamic factors is estimated by applying the information criteria suggested by Bai and Ng (2002) to the estimated innovation vector $\hat{\xi}_t$.

⁴ Alternatively, a regression of $\hat{\alpha}_i(L) x_{it}$ on lags of $\tilde{F}_t = \hat{\alpha}_i(L) \hat{F}_t$ can be performed, where $\hat{\alpha}_i$ is obtained from an autoregression of the estimated idiosyncratic components.
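The statistics (6) and (7) and the rule (8) of Bai and Ng (2007) reviewed above depend only on the ordered eigenvalues of the residual covariance matrix, so they can be sketched in a few lines of plain Python. The function names are hypothetical, and returning $r$ when no candidate passes the bound is an assumption of this sketch, not something stated in the text.

```python
from math import sqrt

def d_statistics(c, k_star):
    """D1 and D2 of equations (6) and (7) for eigenvalues c in descending order."""
    total = sum(cj**2 for cj in c)
    d1 = sqrt(c[k_star]**2 / total)                       # c[k_star] is c_{k*+1}
    d2 = sqrt(sum(cj**2 for cj in c[k_star:]) / total)    # tail sum from k*+1 to r
    return d1, d2

def select_k(c, N, T, m_star=1.0, which=0):
    """Smallest k* whose statistic lies below the bound in (8).

    which=0 uses D1, which=1 uses D2. Falling back to r = len(c) when no
    candidate passes is an assumption of this sketch.
    """
    bound = m_star / min(N**0.4, T**0.4)
    for k_star in range(len(c)):
        if d_statistics(c, k_star)[which] < bound:
            return k_star
    return len(c)

# Stylised eigenvalues with two clearly non-zero values (made-up numbers).
c_example = [4.0, 3.0, 0.0, 0.0]
k_hat = select_k(c_example, N=100, T=100)
```

In this stylised input the third and fourth eigenvalues vanish, so both statistics are zero at $k^* = 2$ and the rule stops there.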

Jacobs and Otter (2008) suggest a test procedure for determining the number of factors in a strict factor model (with small $N$) under the assumption that the dynamic factors $f_t$ have a finite order moving average representation of order $q$ and the idiosyncratic components are white noise. In the first step, the lag order $q$ is selected by testing the hypothesis that all eigenvalues from a CCA of $x_t$ on $x_{t-q-1}$ are equal to zero. In a second step, an LR test is applied to determine the number of nonzero eigenvalues from a CCA of $x_t$ on $x_{t-q}$. If all factors have an MA($q$) representation, then the second CCA tends to indicate $k$ nonzero eigenvalues, which corresponds to the estimated number of dynamic factors. An obvious disadvantage of this test procedure is that it is based on a number of restrictive assumptions (e.g. that the MA orders of all common factors are identical and that the idiosyncratic components are white noise) that are difficult to verify in empirical applications. In the next section alternative selection procedures are suggested that are applicable to the dynamic factor model considered above.

3 Selection rules based on canonical correlations

An important drawback of the selection procedure suggested by Bai and Ng (2007) is that the statistics depend on the scaling of the factor estimates $\hat{F}_t$ in the sense that a transformation of the factors such as $\tilde{F}_t = Q \hat{F}_t$ affects the value of the test statistics $D_{1,k^*}$ and $D_{2,k^*}$. The test statistic of Bai and Ng (2007) is obtained from the eigenvalue problem $|c^* I_r - Q \hat{\Sigma}_e Q'| = 0$. Obviously, the resulting eigenvalues $c_j^*$, $j = 1, \ldots, r$, depend on the rotation matrix $Q$. Dividing the eigenvalues by the sum of eigenvalues does not solve the problem in general. This is an undesirable property as the normalization of the factor space is arbitrary.⁵ In what follows we suggest a test statistic that is invariant to a transformation of the form $\tilde{F}_t = Q \hat{F}_t$.

Consider first the case $m = 1$. In this case linear combinations of $f_t$ and $f_{t-1}$ may enter $F_t$. Our selection procedure is based on the generalized eigenvalues from

\[ |\hat{\lambda}_j^* \hat{S}_{00} - \hat{\Sigma}_e| = 0, \]

or, equivalently,

\[ |\hat{\lambda}_j \hat{S}_{00} - \hat{S}_{01} \hat{S}_{11}^{-1} \hat{S}_{01}'| = 0, \tag{11} \]

where $\hat{\lambda}_j = 1 - \hat{\lambda}_j^*$ and

\[ \hat{S}_{ij} = T^{-1} \sum_{t=2}^{T} \hat{F}_{t-i} \hat{F}_{t-j}'. \]

⁵ For example, the test statistic is different if the estimated factors are computed as $\hat{f}_t = V_r' x_t$ (as in Bai and Ng, 2007), where $V_r$ is the matrix of $r$ eigenvectors associated with the $r$ largest eigenvalues of $\hat{\Sigma}$, or whether it is normalized as $T^{-1} \sum_{t=1}^{T} \tilde{f}_t \tilde{f}_t' = I_r$ (as in Bai and Ng, 2002).
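The invariance of the generalized eigenvalues in (11) to a transformation $Q$, and the lack of invariance of the ordinary eigenvalues of the residual covariance matrix, can be illustrated numerically. The sketch below assumes NumPy; the simulated three-dimensional AR(1) series merely stands in for estimated factors, and the matrix `Q` and the helper names are arbitrary illustrative choices.

```python
import numpy as np

def moments(F):
    """Return S00 and S01 S11^{-1} S01' for one lag of the T x r array F."""
    F0, F1 = F[1:], F[:-1]
    T = F.shape[0]
    S00, S01, S11 = F0.T @ F0 / T, F0.T @ F1 / T, F1.T @ F1 / T
    return S00, S01 @ np.linalg.solve(S11, S01.T)

def cca_eigenvalues(F):
    """Generalized eigenvalues of problem (11), in descending order."""
    S00, M = moments(F)
    Linv = np.linalg.inv(np.linalg.cholesky(S00))   # S00 = L L'
    B = Linv @ M @ Linv.T                           # same eigenvalues as S00^{-1} M
    return np.sort(np.linalg.eigvalsh((B + B.T) / 2))[::-1]

def residual_cov_eigenvalues(F):
    """Ordinary eigenvalues of the residual covariance matrix S00 - M."""
    S00, M = moments(F)
    E = S00 - M
    return np.sort(np.linalg.eigvalsh((E + E.T) / 2))[::-1]

rng = np.random.default_rng(7)
T, r = 300, 3
F = np.zeros((T, r))
for t in range(1, T):
    F[t] = 0.5 * F[t - 1] + rng.standard_normal(r)

Q = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])      # arbitrary nonsingular transformation
F_rot = F @ Q.T                      # rows are (Q F_t)'

same = np.allclose(cca_eigenvalues(F), cca_eigenvalues(F_rot))
differ = not np.allclose(residual_cov_eigenvalues(F),
                         residual_cov_eigenvalues(F_rot))
```

The reason is that every moment matrix is transformed as $Q S Q'$, so the determinant in (11) only changes by the factor $|Q|^2$, whereas the ordinary eigenvalues of $Q \hat{\Sigma}_e Q'$ change with $Q$.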

An important advantage of using the eigenvalue problem (11) is that the eigenvalues are invariant to a rotation of the system. Therefore, selection criteria based on the eigenvalue $\hat{\lambda}_\kappa$ or the sum of eigenvalues $\hat{\lambda}_\kappa + \cdots + \hat{\lambda}_r$ are scale invariant.

For the more general case with some lag order $m \ge 1$ the following theorem considers the asymptotic properties of the eigenvalues $\tilde{\lambda}_1, \ldots, \tilde{\lambda}_r$ from the generalized eigenvalue problem

\[ |\tilde{\lambda}_j \tilde{S}_{00} - \tilde{S}_{01} \tilde{S}_{11}^{-1} \tilde{S}_{01}'| = 0, \tag{12} \]

where

\[ \tilde{S}_{00} = \sum_{t=m+1}^{T} \hat{F}_t \hat{F}_t', \qquad \tilde{S}_{01} = \sum_{t=m+1}^{T} \hat{F}_t \hat{G}_{t-1}', \qquad \tilde{S}_{11} = \sum_{t=m+1}^{T} \hat{G}_{t-1} \hat{G}_{t-1}' \]

and $\hat{G}_{t-1} = [\hat{F}_{t-1}', \ldots, \hat{F}_{t-m}']'$ is the vector of $m$ lags.

Theorem: Let $\hat{F}_t$ denote the PC estimator of the vector of static components $F_t$ in (1) and assume that the assumptions of Bai and Ng (2002) are fulfilled such that $T^{-1} \sum_{t=1}^{T} \|\hat{F}_t - H F_t\|^2 = O_p(C_{NT}^{-2})$, where $C_{NT} = \min(\sqrt{N}, \sqrt{T})$.

(i) If $f_t$ is a $k$-dimensional vector of dynamic factors, then

\[ (1 - \tilde{\lambda}_j) = O_p(C_{NT}^{-2}) \quad \text{for } j = 1, \ldots, r - k. \]

(ii) There exists a constant $M > 0$ such that as $N \to \infty$ and $T \to \infty$

\[ P(1 - \tilde{\lambda}_j > M) \to 1 \quad \text{for } j = r - k + 1, \ldots, r. \]

(iii) If $f_t$ has a VAR(p) representation as in (3) with $p \le m$ and $\mathrm{rk}(A) = (m+1)k$ then, as $T \to \infty$ and $N \to \infty$,

\[ T \sum_{j=r-k+1}^{r} \tilde{\lambda}_j \overset{d}{\to} \chi^2(k^2). \]

Proof: See appendix.

Remark A: The results (i) and (ii) are derived under the fairly weak conditions of an approximate factor model (e.g. Bai, 2003, Bai and Ng, 2002 and Stock and Watson, 2002a). They do not require restrictive assumptions on the dynamic process generating $f_t$ with the exception that $T^{-1} \sum_{t=1}^{T} f_t f_t' \overset{p}{\to} \Sigma_f$, where $\Sigma_f$ is a positive definite matrix. This assumption rules out that (some of) the factors are I(1) but allows for a wide range of stationary processes. For practical applications, the choice of the lag order $m$ is an important issue, as it ensures that all lags in the factor model are contained in the vector $\hat{G}_{t-1}$. So far, there does not exist a statistical criterion to choose the lag length $m$. In practice it seems reasonable to try out various values of $m$ to find the minimum lag length necessary to obtain stable results.

Remark B: The additional assumptions of a finite order VAR representation of the factors with $p \le m$ and $\mathrm{rk}(A) = (m+1)k$ are required for part (iii) of the theorem. These assumptions ensure that $p$ lags of the dynamic factors are contained in $\hat{F}_{t-1}$. If these additional assumptions are satisfied, a powerful selection procedure can be constructed (see below), since the contrast between the eigenvalues is maximal, i.e. $r - k$ eigenvalues converge to unity, whereas the remaining $k$ eigenvalues converge to zero. Unfortunately, in practice it is not known whether these conditions are fulfilled. Therefore, a reasonable selection strategy is to apply selection procedures based on (ii) and (iii). If both procedures find the same number of dynamic components one can be confident that this number is a good choice.

Remark C: The term $(1 - \tilde{\lambda}_j)$ for $j = 1, \ldots, r - k$ is stochastically bounded by a complicated function of $N$ and $T$. Following Bai (2003) and Bai and Ng (2002) the limiting properties are expressed by using the minimum function $C_{NT}$. For practical applications the use of this function suffers from an important drawback. The term $C_{NT}$ is identical for fairly small sample sizes (e.g. $N = 50$, $T = 50$) and for very large samples (such as $N = 1000$ and $T = 50$), although the estimation error is considerably smaller in the latter case. In order to take into account the relative sizes of the two sampling dimensions we employ an alternative rate $\tilde{C}_{NT}$ with the property

\[ \frac{\tilde{C}_{NT}}{C_{NT}} \to \kappa \quad \text{with } 0 < \kappa < \infty. \]

Specifically, our preferred selection criterion employs

\[ \tilde{C}_{NT}^{-2} = \frac{a}{N} + \frac{a}{T}. \]

This function has the property that $a \le C_{NT}^2 / \tilde{C}_{NT}^2 \le 2a$ and, therefore, the limiting result presented in part (i) of the theorem can be alternatively represented as

\[ (1 - \tilde{\lambda}_j) = O_p(\tilde{C}_{NT}^{-2}) \quad \text{for } j = 1, \ldots, r - k. \tag{13} \]
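The point of Remark C can be made concrete with a few lines of plain Python. This is a minimal sketch with hypothetical function names; the value $a = 20$ is the one the paper reports using in its simulations, and the two sample sizes are those quoted in the remark.

```python
from math import sqrt

def c_nt(N, T):
    """Minimum rate C_NT = min(sqrt(N), sqrt(T))."""
    return min(sqrt(N), sqrt(T))

def c_tilde_nt(N, T, a=20.0):
    """Modified rate defined through C_tilde^{-2} = a/N + a/T."""
    return (a / N + a / T) ** -0.5

def rate_ratio(N, T, a=20.0):
    """Ratio C_NT^2 / C_tilde_NT^2, which lies between a and 2a."""
    return c_nt(N, T) ** 2 / c_tilde_nt(N, T, a) ** 2

for N, T in [(50, 50), (1000, 50)]:
    print(N, T, c_nt(N, T), c_tilde_nt(N, T), rate_ratio(N, T))
```

The minimum rate is the same for both designs because it only looks at the smaller dimension, while the modified rate increases when $N$ grows from 50 to 1000. The ratio equals $a(1 + \min(N,T)/\max(N,T))$, which is where the bounds $a$ and $2a$ come from.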

Based on the theorem we can construct two selection rules for the number of dynamic factors. Using the modified rate $\tilde{C}_{NT}$ of Remark C the first selection criterion is constructed as

\[ \xi(k^*) = \tilde{C}_{NT}^{2-\delta} \sum_{j=1}^{r-k^*} (1 - \tilde{\lambda}_j), \tag{14} \]

where $0 < \delta < 2$. The theorem implies that under the hypothesis that $k^*$ is equal to the correct number of factors $k$, the largest $r - k^*$ eigenvalues $\tilde{\lambda}_1, \ldots, \tilde{\lambda}_{r-k^*}$ tend to unity and, thus, $\xi(k^*)$ tends to zero. Under the assumption that $k^* < k$, the eigenvalue $\tilde{\lambda}_{r-k^*}$ has a probability limit smaller than one and, therefore, the statistic $\xi(k^*)$ tends to infinity. Define $\xi(r) = \infty$. The number of dynamic factors can be estimated consistently by the largest number $k^*$ in the sequence $k^* = r - 1, r - 2, \ldots, 0$, where the statistic $\xi(k^*)$ is larger than some fixed threshold level $\tau$. Thus,

\[ \hat{k} = \max\{k^* : \xi(k^*) > \tau\}. \tag{15} \]
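Given the ordered eigenvalues, the statistic in (14) is a scaled partial sum and can be sketched in plain Python. The function name and the eigenvalues in the example are made up for illustration; the defaults $a = 20$ and $\delta = 0.5$ are the values the paper reports for its simulations. The sketch only evaluates the statistic and leaves the thresholding step to the reader.

```python
def xi_statistic(lams, k_star, N, T, a=20.0, delta=0.5):
    """xi(k*) of equation (14); lams holds the eigenvalues in descending order."""
    r = len(lams)
    c_tilde_sq = 1.0 / (a / N + a / T)              # C_tilde_NT^2
    scale = c_tilde_sq ** ((2.0 - delta) / 2.0)     # C_tilde_NT^(2 - delta)
    return scale * sum(1.0 - lam for lam in lams[: r - k_star])

# Stylised case with r = 4 and k = 2: two eigenvalues equal to one and
# two clearly below one (illustrative numbers, not from the paper).
lams_example = [1.0, 1.0, 0.3, 0.1]
xi_values = {k_star: xi_statistic(lams_example, k_star, N=100, T=100)
             for k_star in range(4)}
```

In this stylised input the statistic is exactly zero for $k^* \ge 2$ and strictly positive for $k^* < 2$, mirroring the limiting behaviour described above.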

In our Monte Carlo simulations we found that $a = 20$, $\delta = 0.5$ and $\tau = 1$ generally perform well.

If the conditions of part (iii) of the Theorem are satisfied, the test statistic

\[ LR(k^*) = T \sum_{j=r-k^*+1}^{r} \tilde{\lambda}_j \tag{16} \]

can be used to test the null hypothesis $H_0: k^* = k$. This test statistic is approximately equivalent to the likelihood ratio statistic for the hypothesis that the $k^*$ smallest eigenvalues of the canonical correlation analysis between $\hat{F}_t$ and $\hat{G}_{t-1}$ are zero (cf. Tiao and Tsay, 1989). Accordingly, the number of dynamic factors can be selected by testing the sequence of hypotheses $k^* = r - 1, r - 2, \ldots, 1$. For example, if the test rejects for $k^* \ge 3$ but cannot reject for $k^* < 3$, then the maintained number of dynamic factors is $\hat{k} = 2$. In other words, the estimated number of dynamic factors is the largest value of $k^*$ that does not lead to a rejection of the hypothesis.

Since it is well known that a selection procedure based on tests with a fixed significance level is not consistent, as the error probability for selecting a smaller number of dynamic factors does not vanish, we construct an information criterion as

\[ IC(k^*) = \sum_{j=r-k^*+1}^{r} \tilde{\lambda}_j + k^* \frac{c(T)}{T}, \tag{17} \]

where $c(T)$ is a penalty function. Using part (ii) of the Theorem it is not difficult to show that $IC(k)$ is a weakly consistent selection criterion whenever $c(T) \to \infty$ and $c(T)/T \to 0$ as $T \to \infty$. Consider the difference

\[ IC(k^*) - IC(k^* + 1) = \tilde{\lambda}_{r-k^*} - c(T)/T. \]

It follows that $k^* + 1$ is selected if $T \tilde{\lambda}_{r-k^*} > c(T)$. If $k \ge k^*$ then $T \tilde{\lambda}_{r-k^*}$ is $O_p(1)$, whereas $T \tilde{\lambda}_{r-k^*}$ is $O_p(T)$ if $k < k^*$. Therefore, a consistent selection rule requires $c(T) \to \infty$ and $c(T)/T \to 0$ as $T \to \infty$.
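The sequential use of the statistic (16) can be sketched in plain Python. The sketch assumes a 5% significance level and compares $LR(k^*)$ with the standard $\chi^2$ critical values for $k^{*2}$ degrees of freedom, as suggested by part (iii) of the theorem. The function names, the returned `None` when every hypothesis is rejected, and the eigenvalues in the example are illustrative assumptions.

```python
# Standard 5% critical values of the chi-square distribution,
# keyed by degrees of freedom k*^2 for k* = 1, 2, 3.
CHI2_95 = {1: 3.841, 4: 9.488, 9: 16.919}

def lr_statistic(lams, k_star, T):
    """LR(k*) of equation (16): T times the sum of the k* smallest eigenvalues."""
    return T * sum(lams[len(lams) - k_star:])

def select_by_lr(lams, T, crit=CHI2_95):
    """Largest k* in r-1, ..., 1 whose hypothesis is not rejected.

    Returning None when every hypothesis is rejected is an assumption of
    this sketch; the text does not discuss that case.
    """
    r = len(lams)
    for k_star in range(r - 1, 0, -1):
        if lr_statistic(lams, k_star, T) <= crit[k_star ** 2]:
            return k_star
    return None

# Made-up eigenvalues (descending) with two values close to zero.
lams_example = [0.99, 0.98, 0.03, 0.01]
k_hat = select_by_lr(lams_example, T=100)
```

With these inputs the hypothesis is rejected at $k^* = 3$ but not at $k^* = 2$, which reproduces the pattern of the worked example in the text.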

4 Finite sample properties

In the following we analyze the finite sample properties of the alternative selection procedures and present an empirical application of the new methods to the dataset used in Stock and Watson (2005).

4.1 Simulation study

To investigate the performance of the proposed procedures to select the number of dynamic factors, some Monte Carlo experiments were conducted. The data are generated based on the model

\[ x_t = A_0 f_t + A_1 f_{t-1} + u_t, \tag{18} \]

where the components of the $k$-dimensional vector $f_t$ are independent Gaussian AR(1) processes, i.e. $f_{it} = \gamma_i f_{i,t-1} + \varepsilon_{it}$ for $i = 1, \ldots, k$ with $\varepsilon_{it} \overset{i.i.d.}{\sim} N(0, 1 - \gamma_i^2)$. The elements of the $N \times k$ matrices $A_0$ and $A_1$ are i.i.d. standard normal. The idiosyncratic errors are generated as $u_t \overset{i.i.d.}{\sim} N(0, 4\psi I_N)$ with $\psi$ controlling the signal-to-noise ratio. We also tried out alternative models but the general conclusions remain the same. All results are based on 10,000 replications of the model.

To investigate the ability of the selection procedures to identify the correct number of dynamic factors $k$, we simulate data according to the above model with $k = 2$ and the following two sets of autocorrelation coefficients: $\gamma_1 = \gamma_2 = 0.5$ and $\gamma_1 = 0.2$, $\gamma_2 = 0.8$. We further set $\psi = 2.5$, $1$, and $0.5$, corresponding to low, medium and high signal-to-noise ratios. The performance of the newly proposed procedures is compared to the selection criteria of Amengual and Watson (2007), Stock and Watson (2005) and Bai and Ng (2007). Note that for ease of exposition we only report results for the criteria that have been shown to perform best in previous Monte Carlo simulations. We therefore consider the information criterion of Amengual and Watson (2007) and Stock and Watson (2005)

\[ SW_{P2} = \ln V(k^*) + k^* \, \frac{N + T}{NT} \ln \min\{N, T\} \tag{19} \]

with $V(k^*)$ denoting the sum of squared residuals (divided by $NT$) of the factor model for the innovation vector $\xi_t$ as given in (10), if $k^*$ factors are estimated. This resembles the $IC_{P2}$ criterion suggested by Bai and Ng (2002) for the selection of the number of static factors from the dataset. We further report results for the selection rule of Bai and Ng (2007) based on the $D_1$ statistic, see equations (6) and (8). Following Bai and Ng (2007) we set $m^* = 1$.

To concentrate on the properties of the various selection procedures, we assume that the number of reduced form factors $r$ is known. In practice, the information criteria suggested by Bai and Ng (2002) can be used to obtain a consistent estimate of $r$. We further base the construction of the $D_1$ statistic on a VAR(1) model for the factors and treat the number of lags of $F_t$ and $x_{it}$ in the regressions of the selection procedure of Amengual and Watson (2007) and Stock and Watson (2005) as given. The computation of the new selection rules (14) and (16) is based on one lag in the generalized eigenvalue problem, i.e. $m = 1$. For the selection criterion $\xi(k^*)$ we further set $a = 20$, $\delta = 0.5$ and $\tau = 1$, as these values have been found to perform well in a wide range of Monte Carlo simulations.

Tables 1 and 2 present the performance of the different selection criteria for the two sets of autoregressive parameter values. Reported are the frequencies (in percent) of choosing the correct number of dynamic factors. The results show that our selection procedures based on canonical correlations outperform the existing ones, especially in small samples. Moreover, these new procedures identify the correct number of dynamic factors with a probability converging to one and to 95% for the $\xi(k^*)$ criterion and the $LR(k^*)$ test, respectively. In the tables we further present the success rates of the various criteria for different values of $\psi$. Interestingly, the new selection criteria are quite robust, while the selection procedures based on principal components seem to be very sensitive to the signal-to-noise ratio. In particular, for low signal-to-noise ratios these procedures, especially the $SW_{P2}$ criterion, have difficulties in determining the correct number of factors. This becomes even more pronounced in the presence of more persistent factors (see Table 2, which presents the success rates for the parameter set $\gamma_1 = 0.2$ and $\gamma_2 = 0.8$). The performance of the new selection procedures, in contrast, also seems to be robust to changes in the factor persistence.

As mentioned previously, the LR test is only a powerful selection criterion if $\mathrm{rk}(A) = (m+1)k$. As this assumption is difficult to verify in practice, we also investigate the finite sample performance of the procedure when this assumption is violated. In particular, we consider the above factor model with $k = 2$ and $A_1$ having a zero second column, which implies that $r = 3$. Table 3 presents the corresponding success rates of the different selection procedures for $\gamma_1 = \gamma_2 = 0.5$. As expected, the $LR(k^*)$ test is not able to identify the correct number of dynamic factors, while the previous conclusions for the alternative criteria also hold for this data generating process. As the selection procedures based on canonical correlations are both very powerful if $\mathrm{rk}(A) = (m+1)k$, we suggest computing both criteria in empirical applications. If both come to similar conclusions, then this indicates that the assumption is not violated.
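One replication of the simulation design can be sketched as follows. This is a minimal sketch assuming NumPy, not the authors' code: the burn-in length, the seed, the sample sizes, the absence of demeaning or standardisation before the principal components step, and all function names are assumptions made for illustration.

```python
import numpy as np

def simulate(N, T, gammas, psi, rng):
    """Draw x_t = A0 f_t + A1 f_{t-1} + u_t as in equation (18)."""
    g = np.asarray(gammas, dtype=float)
    k = g.size
    burn = 100                       # burn-in length is an assumption
    f = np.zeros((T + 1 + burn, k))
    eps = rng.standard_normal(f.shape) * np.sqrt(1 - g**2)
    for t in range(1, f.shape[0]):
        f[t] = g * f[t - 1] + eps[t]
    f = f[burn:]                     # T + 1 rows: f_0, ..., f_T
    A0 = rng.standard_normal((N, k))
    A1 = rng.standard_normal((N, k))
    u = rng.standard_normal((T, N)) * np.sqrt(4 * psi)
    return f[1:] @ A0.T + f[:-1] @ A1.T + u

def pc_factors(x, r):
    """Principal components estimate of r static factors with F'F/T = I."""
    T = x.shape[0]
    _, vecs = np.linalg.eigh(x @ x.T)           # eigenvalues in ascending order
    return np.sqrt(T) * vecs[:, ::-1][:, :r]    # eigenvectors of the r largest

def cca_eigenvalues(F, m=1):
    """Eigenvalues of problem (12): current factors against m stacked lags."""
    T = F.shape[0]
    F0 = F[m:]
    G = np.hstack([F[m - j:T - j] for j in range(1, m + 1)])
    S00, S01, S11 = F0.T @ F0, F0.T @ G, G.T @ G
    M = S01 @ np.linalg.solve(S11, S01.T)
    Linv = np.linalg.inv(np.linalg.cholesky(S00))
    B = Linv @ M @ Linv.T
    return np.sort(np.linalg.eigvalsh((B + B.T) / 2))[::-1]

rng = np.random.default_rng(42)
x = simulate(N=50, T=100, gammas=[0.5, 0.5], psi=1.0, rng=rng)
F_hat = pc_factors(x, r=4)           # r = (m + 1) k = 4 treated as known
lams = cca_eigenvalues(F_hat, m=1)
```

The resulting eigenvalues are the inputs to the statistics $\xi(k^*)$ and $LR(k^*)$; repeating the draw many times and recording how often each rule returns $k = 2$ would give success frequencies of the kind reported in the tables.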

4.2 Empirical application

In the following we apply our selection procedures to a large U.S. macroeconomic dataset constructed by Stock and Watson (2005).⁶ The data comprise 132 monthly macroeconomic time series ranging from January 1960 to December 2003, i.e. $T = 528$. For our empirical application we perform the same data transformations as in Stock and Watson (2005). The dataset has been widely used to test for the number of static and dynamic factors driving the U.S. economy. Bai and Ng (2007) and Stock and Watson (2005), for example, select seven static factors based on the information criteria of Bai and Ng (2002). Given these seven factors, their selection procedures indicate four and seven dynamic factors, respectively. Using the $IC_{P2}$ information criterion of Bai and Ng (2002) we also find seven static factors, to which we apply our selection procedures. Table 4 shows the statistic $\xi(k^*)$ and the $LR(k^*)$ test statistics using $m = 1$ in the generalized eigenvalue problem (12). The first selection criterion $\xi(k^*)$ selects $\hat{k} = 4$ dynamic factors, while the $LR(k^*)$ test indicates one dynamic factor. The observation that the two criteria choose a different number of dynamic factors suggests that the rank conditions of part (iii) of the theorem are not fulfilled and we thus rely in the following on the $\xi(k^*)$ criterion.

For the accuracy of the $\xi(k^*)$ criterion, we need to ensure that all lags in the factor model are also contained in the vector $\hat{G}_{t-1}$. We therefore also consider larger values of $m$. The results for $m$ ranging from 2 to 4 are presented in Table 5. Interestingly, the selected number of dynamic factors remains at $\hat{k} = 4$ for all lags, which indicates that a lag order of one is sufficient. We therefore conclude that there are four dynamic factors that drive the U.S. economy, a result that is consistent with the findings of Bai and Ng (2007).

⁶ The dataset is kindly provided at Mark Watson’s homepage: http://www.princeton.edu/~mwatson/.

5 Conclusion

In this paper two identification procedures are suggested for estimating the number of dynamic factors (or “structural factors”) from a set of static factors (or “reduced form factors”) that are obtained from a principal component analysis of a large number of observed series, see Stock and Watson (2002a,b). While the procedures of Bai and Ng (2007), Amengual and Watson (2007) and Stock and Watson (2005) use the method of principal components to determine the number of dynamic factors, our approach is based on a canonical correlation analysis (CCA) of the static factors. An important advantage of our approach is that it is invariant to the normalization of the factor space and does not assume a finite order VAR model for the vector of common factors. Furthermore, our Monte Carlo simulations indicate that the new criteria generally outperform the alternative selection criteria and are much more reliable, especially in the presence of low signal-to-noise ratios and/or persistent factors. Applying the new criteria to the dataset used in Stock and Watson (2005) indicates that seven static factors can be represented by four dynamic factors and three lags, which corroborates the earlier findings of Bai and Ng (2007).


Appendix A: Proof of the Theorem

We first state the following lemma:

Lemma A.1: Under Assumptions A–F of Bai (2003) it holds for any fixed $j$ that

(a)
\[ T^{-1} \sum_{t=j+1}^{T} (\hat{F}_t - F_t) F_{t-j}' = O_p(C_{NT}^{-2}), \qquad T^{-1} \sum_{t=j+1}^{T} (\hat{F}_t - F_t) \hat{F}_{t-j}' = O_p(C_{NT}^{-2}) \]

(b)
\[ T^{-1} \sum_{t=j+1}^{T} \hat{F}_t \hat{F}_{t-j}' = T^{-1} \sum_{t=j+1}^{T} F_t F_{t-j}' + O_p(C_{NT}^{-2}) \]

Proof: The proof follows closely the proof of Lemma B.2 and Lemma B.3 of Bai (2003) and can be found in Breitung and Tenhofen (2008, Lemma A.1). □

To prove part (i) of the theorem, let $F = [F_{m+1}, \ldots, F_T]'$, $\hat{F} = [\hat{F}_{m+1}, \ldots, \hat{F}_T]'$, $G = [G_m, \ldots, G_{T-1}]'$, $\hat{G} = [\hat{G}_m, \ldots, \hat{G}_{T-1}]'$ and $G_{t-1} = [F_{t-1}', \ldots, F_{t-m}']'$. From Lemma A.1 it follows that

\[ T^{-1} \hat{F}' \hat{F} = T^{-1} F' F + O_p(C_{NT}^{-2}), \]
\[ T^{-1} \hat{F}' \hat{G} = T^{-1} F' G + O_p(C_{NT}^{-2}), \]
\[ T^{-1} \hat{G}' \hat{G} = T^{-1} G' G + O_p(C_{NT}^{-2}) \]

and, therefore, $\tilde{\lambda}_j = \lambda_j^0 + O_p(C_{NT}^{-2})$, where $\lambda_j^0$ denotes the eigenvalue from the eigenvalue problem

\[ |\lambda_j^0 F' F - F' G (G' G)^{-1} G' F| = 0. \tag{20} \]

The eigenvalues of (20) can be written as

\[ \lambda_j^0 = \frac{v_j' F' G (G' G)^{-1} G' F v_j}{v_j' F' F v_j}. \]
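The ratio representation makes it easy to see when an eigenvalue equals one: if a combination $v'F_t$ lies in the span of $G_{t-1}$, projecting on $G$ leaves it unchanged and numerator and denominator coincide. The sketch below assumes NumPy and uses an arbitrary scalar white-noise series with $F_t = [f_t, f_{t-1}]'$ and $m = 1$; these choices are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200
f = rng.standard_normal(T + 2)                  # any scalar series works here

# F_t = [f_t, f_{t-1}]' and G_{t-1} = F_{t-1} = [f_{t-1}, f_{t-2}]'.
F = np.column_stack([f[2:], f[1:-1]])
G = np.column_stack([f[1:-1], f[:-2]])

# Projection of F on the column space of G: G (G'G)^{-1} G'F.
P_F = G @ np.linalg.solve(G.T @ G, G.T @ F)

def ratio(v):
    """Value of v'F'G(G'G)^{-1}G'Fv / v'F'Fv for a given direction v."""
    return (v @ F.T @ P_F @ v) / (v @ F.T @ F @ v)

ratio_lagged = ratio(np.array([0.0, 1.0]))      # picks f_{t-1}, contained in G_{t-1}
ratio_current = ratio(np.array([1.0, 0.0]))     # picks f_t, unpredictable here
```

The direction that selects the lagged factor gives a ratio of one up to rounding, while the direction that selects the current white-noise factor gives a value close to zero.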

If $F_t$ contains $r - k$ lagged values up to the maximal lag $m$ that is used to construct $G_{t-1}$, then there exist $r - k$ linearly independent combinations of the form $v_j' F_t = w_j' G_{t-1}$, where $w_j$ is some $mr \times 1$ vector. Therefore, the $r - k$ largest eigenvalues $\lambda_1^0, \dots, \lambda_{r-k}^0$ are equal to one. This implies that for $j = 1, \dots, r - k$ we have $(1 - \tilde{\lambda}_j) = O_p(C_{NT}^{-2})$.

To prove part (ii) we first note that if there exist $r$ dynamic factors, then the covariance matrix of the projection residual $\eta_t = F_t - E(F_t | G_{t-1})$ is positive definite. Hence,

$$1 - \lambda_j^0 = \frac{v_j' F'F v_j - v_j' F'G(G'G)^{-1}G'F v_j}{v_j' F'F v_j}$$

and by the weak law of large numbers there exists a constant $M > 0$ such that $\lim_{T \to \infty} P(1 - \lambda_j^0 > M) = 1$ for $j = r - k + 1, \dots, r$. Finally, if $N \to \infty$ and $T \to \infty$, then $P(1 - \tilde{\lambda}_j > M - O_p(C_{NT}^{-2}))$ will converge to unity as well for $j = r - k + 1, \dots, r$.

Under the conditions stated in part (iii) of the theorem, all lags of the VAR representation of $F_t$ enter $G_{t-1}$ and, thus, there exist $k$ vectors $v_j$, $j = r - k + 1, \dots, r$, such that $v_j' F_t = w_j' u_t$ is white noise. The following lemma presents the limiting distribution of the sum of eigenvalues in this situation.

Lemma A.2. Let $y_t = [z_t', y_{2t}']'$ and $x_t = [z_t', x_{2t}']'$, where $x_{2t}$ and $y_{2t}$ are $m \times 1$ vectors and $z_t$ is an $n \times 1$ vector. Furthermore, $E(x_t) = E(y_t) = 0$ and $E(x_t x_t') = \Sigma_x$, $E(y_t y_t') = \Sigma_y$. Assume that there exist $m$ linear combinations $[w_1' y_t, \dots, w_m' y_t]' \equiv W' y_t$ with $E(W' y_t | x_t) = 0$ and $Var(W' y_t | x_t) = \Sigma_m$, where $\Sigma_m$ is positive definite. Then, as $T \to \infty$,

$$T \sum_{i=n+1}^{n+m} \mu_i + o_p(1) \xrightarrow{d} \chi^2(m^2),$$

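where $\mu_i$ denotes the eigenvalues (in descending order) of

$$|\mu S_{yy} - S_{yx} S_{xx}^{-1} S_{xy}| = 0,$$

and

$$S_{ab} = T^{-1} \sum_{t=1}^{T} a_t b_t', \qquad a, b \in \{x, y\}.$$

Proof: Let $\tilde{x}_{2t}$ ($\tilde{y}_{2t}$) denote the projection residuals of $x_{2t}$ ($y_{2t}$) on $z_t$, and

$$\tilde{x}_t = \begin{bmatrix} z_t \\ \tilde{x}_{2t} \end{bmatrix}, \qquad \tilde{y}_t = \begin{bmatrix} z_t \\ \tilde{y}_{2t} \end{bmatrix}.$$

Accordingly we define

$$\begin{aligned}
S_{\tilde{x}\tilde{x}} &= \frac{1}{T} \sum_{t=1}^{T} \tilde{x}_t \tilde{x}_t' = \begin{bmatrix} S_{zz} & 0 \\ 0 & S_{\tilde{x}_2 \tilde{x}_2} \end{bmatrix}, \\
S_{\tilde{y}\tilde{y}} &= \frac{1}{T} \sum_{t=1}^{T} \tilde{y}_t \tilde{y}_t' = \begin{bmatrix} S_{zz} & 0 \\ 0 & S_{\tilde{y}_2 \tilde{y}_2} \end{bmatrix}, \\
S_{\tilde{x}\tilde{y}} &= \frac{1}{T} \sum_{t=1}^{T} \tilde{x}_t \tilde{y}_t' = \begin{bmatrix} S_{zz} & 0 \\ 0 & S_{\tilde{x}_2 \tilde{y}_2} \end{bmatrix}.
\end{aligned} \tag{21}$$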

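Since $z_t$ enters both $x_t$ and $y_t$, the first $n$ eigenvalues are unity and the corresponding eigenvectors are $[v_1, \dots, v_n] = V = [I_n, 0]'$. The remaining matrix of $m$ eigenvectors can be normalized such that $W_T = [w_{T1}, \dots, w_{Tm}]' = [0, I_m]'$. The eigenvalues $\mu_{n+1}, \dots, \mu_{n+m}$ can be written as

$$\begin{aligned}
\mu_{n+j} &= w_{Tj}' S_{\tilde{y}\tilde{y}}^{-1/2} S_{\tilde{x}\tilde{y}}' S_{\tilde{x}\tilde{x}}^{-1} S_{\tilde{x}\tilde{y}} S_{\tilde{y}\tilde{y}}^{-1/2} w_{Tj} \\
&= e_j' S_{\tilde{y}_2 \tilde{y}_2}^{-1/2} S_{\tilde{x}_2 \tilde{y}_2}' S_{\tilde{x}_2 \tilde{x}_2}^{-1} S_{\tilde{x}_2 \tilde{y}_2} S_{\tilde{y}_2 \tilde{y}_2}^{-1/2} e_j,
\end{aligned}$$

where $e_j$ is the $j$-th column of $I_m$ and

$$\sum_{i=n+1}^{n+m} \mu_i = \mathrm{tr}\left\{ S_{\tilde{y}_2 \tilde{y}_2}^{-1/2} S_{\tilde{x}_2 \tilde{y}_2}' S_{\tilde{x}_2 \tilde{x}_2}^{-1} S_{\tilde{x}_2 \tilde{y}_2} S_{\tilde{y}_2 \tilde{y}_2}^{-1/2} \right\}.$$

Using $S_{\tilde{y}_2 \tilde{y}_2} \xrightarrow{p} \Sigma_{\tilde{y}_2 \tilde{y}_2}$, $S_{\tilde{x}_2 \tilde{x}_2} \xrightarrow{p} \Sigma_{\tilde{x}_2 \tilde{x}_2}$ and $\sqrt{T}\, \mathrm{vec}(\Sigma_{\tilde{y}_2 \tilde{y}_2}^{-1/2} S_{\tilde{x}_2 \tilde{y}_2}' \Sigma_{\tilde{x}_2 \tilde{x}_2}^{-1/2}) \xrightarrow{d} N(0, I_{m^2})$ it follows that $T \sum_{j=n+1}^{n+m} \mu_j$ has an asymptotic $\chi^2$ limiting distribution with $m^2$ degrees of freedom. $\square$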
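The statement of Lemma A.2 can be checked by simulation. The sketch below is an illustrative experiment under assumptions of our own choosing, not the authors' code: $z_t$, $x_{2t}$ and $y_{2t}$ are drawn as independent standard normal vectors, so that $W' y_t = y_{2t}$ satisfies the conditions of the lemma with $\Sigma_m = I_m$, and we set $n = m = 2$, $T = 400$ and 500 replications. The function names are hypothetical.

```python
import numpy as np

def smallest_cca_sum(x, y, m):
    """T times the sum of the m smallest roots of |mu*Syy - Syx Sxx^{-1} Sxy| = 0."""
    T = x.shape[0]
    Sxx = x.T @ x / T
    Syy = y.T @ y / T
    Sxy = x.T @ y / T
    # Equivalent standard problem: Syy^{-1} Syx Sxx^{-1} Sxy w = mu w
    M = np.linalg.solve(Syy, Sxy.T @ np.linalg.solve(Sxx, Sxy))
    mu = np.sort(np.linalg.eigvals(M).real)   # ascending order
    return T * mu[:m].sum()

def simulate(T, n, m, rng):
    z = rng.standard_normal((T, n))    # common block entering both x_t and y_t
    x2 = rng.standard_normal((T, m))
    y2 = rng.standard_normal((T, m))   # independent of x_t, so E(W'y_t | x_t) = 0
    return np.hstack([z, x2]), np.hstack([z, y2])

rng = np.random.default_rng(1)
T, n, m, reps = 400, 2, 2, 500
stats = np.array([smallest_cca_sum(*simulate(T, n, m, rng), m) for _ in range(reps)])
print("simulated mean:", stats.mean(), " chi2(m^2) mean:", m ** 2)
```

Since a $\chi^2(m^2)$ variable has mean $m^2 = 4$, the average of the simulated statistics should lie close to 4 if the lemma holds in this design.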

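Under the conditions of part (iii) of the theorem we can find rotations $Q_1 F_t = [z_t', x_t']'$ and $Q_2 G_{t-1} = [z_t', y_t']'$ that fulfill the conditions of Lemma A.2. Therefore, the sum of the $k$ smallest eigenvalues has a $\chi^2$ limiting distribution with $k^2$ degrees of freedom. Let

$$\begin{bmatrix} z_t^* \\ x_t^* \end{bmatrix} = Q_1 \widehat{F}_t \qquad \text{and} \qquad \begin{bmatrix} z_t^* \\ y_t^* \end{bmatrix} = Q_2 \widehat{G}_{t-1}$$

and denote by $\tilde{x}_t^*$ and $\tilde{y}_t^*$ the respective residuals from a projection on $z_t^*$. Then, by using the results in part (i) of the theorem,

$$\begin{aligned}
T^{-1/2} \sum_{t=m+1}^{T} \tilde{x}_t^* \tilde{y}_t^{*\prime} &= T^{-1/2} \sum_{t=m+1}^{T} \tilde{x}_t \tilde{y}_t' + O_p(\sqrt{T}/C_{NT}^{2}), \\
T^{-1} \sum_{t=m+1}^{T} \tilde{x}_t^* \tilde{x}_t^{*\prime} &= T^{-1} \sum_{t=m+1}^{T} \tilde{x}_t \tilde{x}_t' + O_p(1/C_{NT}^{2}), \\
T^{-1} \sum_{t=m+1}^{T} \tilde{y}_t^* \tilde{y}_t^{*\prime} &= T^{-1} \sum_{t=m+1}^{T} \tilde{y}_t \tilde{y}_t' + O_p(1/C_{NT}^{2})
\end{aligned}$$

and, therefore,

$$T \sum_{j=r-k+1}^{r} \tilde{\lambda}_j = T \sum_{j=r-k+1}^{r} \mu_j + O_p(\sqrt{T}/C_{NT}^{2}),$$

where the eigenvalues $\mu_j$ are defined as in Lemma A.2 with $m = k$. It follows that $T \sum_{j=r-k+1}^{r} \tilde{\lambda}_j$ has a $\chi^2$ limiting distribution with $k^2$ degrees of freedom.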

References

Amengual, D. and Watson, M. (2007), “Consistent Estimation of the Number of Dynamic Factors in a Large N and T Panel,” Journal of Business & Economic Statistics, 25, 91–96.

Angelini, E., Henry, J., and Mestre, R. (2001), “Diffusion Index-based Inflation Forecasts for the Euro Area,” ECB Working Paper 61.

Bai, J. (2003), “Inferential Theory for Factor Models of Large Dimensions,” Econometrica, 71, 135–172.

Bai, J. and Ng, S. (2002), “Determining the Number of Factors in Approximate Factor Models,” Econometrica, 70, 191–221.

Bai, J. and Ng, S. (2007), “Determining the Number of Primitive Shocks in Factor Models,” Journal of Business & Economic Statistics, 25, 52–60.

Breitung, J. and Tenhofen, J. (2008), “GLS Estimation of Dynamic Factor Models,” Working paper, http://www.ect.uni-bonn.de/mitarbeiter/breitung/pcgls.pdf.

Brisson, M., Campbell, B., and Galbraith, J. (2003), “Forecasting Some Low-predictability Time Series Using Diffusion Indices,” Journal of Forecasting, 22, 515–531.

Eickmeier, S. and Ziegler, C. (2008), “How Successful Are Dynamic Factor Models at Forecasting Output and Inflation? A Meta-Analytic Approach,” Journal of Forecasting, 27, 237–265.

Favero, C., Marcellino, M., and Neglia, F. (2005), “Principal Components at Work: The Empirical Analysis of Monetary Policy with Large Data Sets,” Journal of Applied Econometrics, 20, 603–620.

Forni, M., Hallin, M., Lippi, M., and Reichlin, L. (2000), “The Generalized Dynamic Factor Model: Identification and Estimation,” The Review of Economics and Statistics, 82, 540–554.

Forni, M. and Lippi, M. (2001), “The Generalized Factor Model: Representation Theory,” Econometric Theory, 17, 1113–1141.

Forni, M., Lippi, M., and Reichlin, L. (2008), “Opening the Black Box: Structural Factor Models versus Structural VARs,” Econometric Theory, forthcoming.

Forni, M. and Reichlin, L. (1998), “Let's Get Real: A Factor-Analytical Approach to Disaggregated Business Cycle Dynamics,” Review of Economic Studies, 65, 453–473.

Giannone, D., Reichlin, L., and Sala, L. (2002), “Tracking Greenspan: Systematic and Unsystematic Monetary Policy Revisited,” CEPR Discussion Paper 3550.

Giannone, D., Reichlin, L., and Sala, L. (2006), “VARs, Factor Models and the Empirical Validation of Equilibrium Business Cycle Models,” Journal of Econometrics, 132, 257–279.

Hallin, M. and Liska, R. (2007), “Determining the Number of Dynamic Factors in the General Dynamic Factor Model,” Journal of the American Statistical Association, 102, 603–617.

Jacobs, J. and Otter, P. (2008), “Determining the Number of Factors and Lag Order in Dynamic Factor Models: A Minimum Entropy Approach,” Econometric Reviews, 27, 398–427.

Marcellino, M., Stock, J., and Watson, M. (2003), “Macroeconomic Forecasting in the Euro Area: Country Specific versus Area-Wide Information,” European Economic Review, 47, 1–18.

Pesaran, M. H. (2006), “Estimation and Inference in Large Heterogeneous Panels with a Multifactor Error Structure,” Econometrica, 74, 967–1012.

Stock, J. and Watson, M. (1999), “Forecasting Inflation,” Journal of Monetary Economics, 44, 293–335.

Stock, J. and Watson, M. (2002a), “Forecasting Using Principal Components from a Large Number of Predictors,” Journal of the American Statistical Association, 97, 1167–79.

Stock, J. and Watson, M. (2002b), “Macroeconomic Forecasting Using Diffusion Indexes,” Journal of Business & Economic Statistics, 20, 147–162.

Stock, J. and Watson, M. (2005), “Implications of Dynamic Factor Models for VAR Analysis,” NBER Working Paper 11467.

Tiao, G. and Tsay, R. (1989), “Model Specification in Multivariate Time Series,” Journal of the Royal Statistical Society, Series B, 51, 157–213.

Table 1: Rates of success of model selection criteria for r = 4, k = 2 and γ1 = γ2 = 0.5

  N  SWP 2   D1      ξ       LR      |  SWP 2   D1      ξ       LR

ψ = 2.5
     T = 50                          |  T = 100
 20  0.0021  0.0081  0.5345  0.3708  |  0.0024  0.0057  0.2034  0.8338
 50  0.0107  0.0007  0.5336  0.8697  |  0.0085  0.0155  0.3078  0.9464
100  0.0096  0.3104  0.9125  0.9399  |  0.0537  0.1190  0.8569  0.9497
150  0.0367  0.7794  0.9903  0.9436  |  0.3292  0.7072  0.9908  0.9469
200  0.0648  0.9540  0.9987  0.9382  |  0.6074  0.9704  0.9999  0.9435
     T = 150                         |  T = 200
 20  0.0026  0.0076  0.1262  0.9339  |  0.0036  0.0105  0.0997  0.9423
 50  0.0314  0.0174  0.1840  0.9456  |  0.0622  0.0220  0.1074  0.9506
100  0.3353  0.1649  0.7423  0.9479  |  0.6269  0.1894  0.6025  0.9501
150  0.6603  0.5049  0.9791  0.9454  |  0.9377  0.5918  0.9591  0.9443
200  0.9327  0.9406  0.9995  0.9433  |  0.9900  0.8653  0.9991  0.9482

ψ = 1
     T = 50                          |  T = 100
 20  0.2772  0.3531  0.9863  0.8658  |  0.4644  0.4586  0.9818  0.9510
 50  0.4876  0.6840  0.9990  0.9402  |  0.9167  0.8390  1.0000  0.9455
100  0.9091  0.9963  1.0000  0.9371  |  0.9982  0.9955  1.0000  0.9470
150  0.9789  0.9999  1.0000  0.9410  |  1.0000  1.0000  1.0000  0.9476
200  0.9907  1.0000  1.0000  0.9389  |  1.0000  1.0000  1.0000  0.9426
     T = 150                         |  T = 200
 20  0.5367  0.5029  0.9833  0.9509  |  0.9345  0.9174  0.9846  0.9459
 50  0.9832  0.8879  1.0000  0.9472  |  1.0000  0.9997  1.0000  0.9487
100  1.0000  0.9991  1.0000  0.9460  |  1.0000  1.0000  1.0000  0.9478
150  1.0000  1.0000  1.0000  0.9440  |  1.0000  1.0000  1.0000  0.9426
200  1.0000  1.0000  1.0000  0.9440  |  1.0000  1.0000  1.0000  0.9486

ψ = 0.5
     T = 50                          |  T = 100
 20  0.7948  0.8430  0.8816  0.9388  |  0.9028  0.8950  1.0000  0.9477
 50  0.9832  0.9899  0.9992  0.9377  |  1.0000  0.9990  1.0000  0.9452
100  0.9999  1.0000  1.0000  0.9341  |  1.0000  1.0000  1.0000  0.9464
150  1.0000  1.0000  1.0000  0.9402  |  1.0000  1.0000  1.0000  0.9466
200  1.0000  1.0000  1.0000  0.9373  |  1.0000  1.0000  1.0000  0.9430
     T = 150                         |  T = 200
 20  0.9216  0.9127  1.0000  0.9469  |  0.9461  0.9234  1.0000  0.9459
 50  1.0000  0.9991  1.0000  0.9469  |  1.0000  0.9997  1.0000  0.9488
100  1.0000  1.0000  1.0000  0.9457  |  1.0000  1.0000  1.0000  0.9481
150  1.0000  1.0000  1.0000  0.9446  |  1.0000  1.0000  1.0000  0.9425
200  1.0000  1.0000  1.0000  0.9437  |  1.0000  1.0000  1.0000  0.9468

Table 2: Rates of success of model selection criteria for r = 4, k = 2 and γ1 = 0.2, γ2 = 0.8

  N  SWP 2   D1      ξ       LR      |  SWP 2   D1      ξ       LR

ψ = 2.5
     T = 50                          |  T = 100
 20  0.0005  0.0118  0.6295  0.3687  |  0.0003  0.0049  0.4347  0.8511
 50  0.0000  0.0029  0.4348  0.7852  |  0.0000  0.0035  0.2959  0.9485
100  0.0000  0.1224  0.6504  0.9196  |  0.0001  0.0300  0.5567  0.9475
150  0.0000  0.4455  0.8192  0.9334  |  0.0003  0.3868  0.8326  0.9451
200  0.0000  0.7207  0.9102  0.9419  |  0.0025  0.8306  0.9515  0.9485
     T = 150                         |  T = 200
 20  0.0004  0.0051  0.3933  0.9372  |  0.0003  0.0072  0.3854  0.9464
 50  0.0002  0.0070  0.2370  0.9484  |  0.0003  0.0072  0.1941  0.9478
100  0.0005  0.0603  0.5297  0.9452  |  0.0031  0.0929  0.4883  0.9477
150  0.0008  0.2468  0.8328  0.9470  |  0.0137  0.3587  0.8144  0.9502
200  0.0126  0.7882  0.9590  0.9457  |  0.0323  0.6887  0.9574  0.9467

ψ = 1
     T = 50                          |  T = 100
 20  0.1064  0.2935  0.9747  0.8043  |  0.1527  0.3793  0.9810  0.9441
 50  0.0659  0.4725  0.9784  0.9410  |  0.3071  0.7301  0.9939  0.9453
100  0.2950  0.9780  0.9982  0.9432  |  0.5732  0.9808  1.0000  0.9443
150  0.4498  0.9993  0.9995  0.9370  |  0.8849  1.0000  1.0000  0.9438
200  0.5500  1.0000  1.0000  0.9404  |  0.9645  1.0000  1.0000  0.9476
     T = 150                         |  T = 200
 20  0.1892  0.4440  0.9894  0.9505  |  0.2026  0.4698  0.9927  0.9467
 50  0.4946  0.8166  0.9983  0.9473  |  0.6212  0.8563  0.9990  0.9481
100  0.8975  0.9944  1.0000  0.9451  |  0.9742  0.9978  1.0000  0.9482
150  0.9800  0.9999  1.0000  0.9451  |  0.9992  1.0000  1.0000  0.9444
200  0.9981  1.0000  1.0000  0.9453  |  0.9999  1.0000  1.0000  0.9474

ψ = 0.5
     T = 50                          |  T = 100
 20  0.5597  0.7894  0.9375  0.9241  |  0.7324  0.8589  0.9996  0.9450
 50  0.7240  0.9726  0.9997  0.9394  |  0.9681  0.9956  1.0000  0.9458
100  0.9596  0.9999  1.0000  0.9395  |  0.9992  1.0000  1.0000  0.9460
150  0.9596  0.9999  1.0000  0.9333  |  0.9992  1.0000  1.0000  0.9408
200  0.9959  1.0000  1.0000  0.9380  |  1.0000  1.0000  1.0000  0.9453
     T = 150                         |  T = 200
 20  0.7937  0.8842  1.0000  0.9488  |  0.8202  0.8912  1.0000  0.9461
 50  0.9935  0.9967  1.0000  0.9489  |  0.9983  0.9988  1.0000  0.9449
100  1.0000  1.0000  1.0000  0.9432  |  1.0000  1.0000  1.0000  0.9483
150  1.0000  1.0000  1.0000  0.9452  |  1.0000  1.0000  1.0000  0.9440
200  1.0000  1.0000  1.0000  0.9446  |  1.0000  1.0000  1.0000  0.9469

Table 3: Rates of success of model selection criteria for r = 3, k = 2 and γ1 = γ2 = 0.5

  N  SWP 2   D1      ξ       LR      |  SWP 2   D1      ξ       LR

ψ = 2.5
     T = 50                          |  T = 100
 20  0.0009  0.0357  0.9396  0.8660  |  0.0014  0.0264  0.9988  0.5416
 50  0.0001  0.0729  0.9943  0.6179  |  0.0075  0.0741  0.9986  0.1727
100  0.0096  0.5861  0.9949  0.4696  |  0.0516  0.3151  0.9999  0.0721
150  0.0311  0.9088  0.9962  0.4042  |  0.3085  0.8579  1.0000  0.0523
200  0.0550  0.9856  0.9952  0.3751  |  0.5999  0.9910  1.0000  0.0375
     T = 150                         |  T = 200
 20  0.0030  0.0268  0.9999  0.3068  |  0.0021  0.0235  1.0000  0.1616
 50  0.0298  0.0806  0.9990  0.0317  |  0.0598  0.0839  0.9998  0.0049
100  0.3278  0.3718  1.0000  0.0075  |  0.6280  0.4123  1.0000  0.0004
150  0.6676  0.7267  1.0000  0.0038  |  0.9357  0.7802  1.0000  0.0003
200  0.9322  0.9767  1.0000  0.0036  |  0.9907  0.9480  1.0000  0.0000

ψ = 1
     T = 50                          |  T = 100
 20  0.2607  0.5445  0.3914  0.6147  |  0.4303  0.6249  0.7579  0.1923
 50  0.4577  0.8509  0.8707  0.4213  |  0.9172  0.9358  0.9990  0.0600
100  0.9068  0.9989  0.9499  0.3531  |  0.9977  0.9990  1.0000  0.0364
150  0.9763  1.0000  0.9624  0.3273  |  1.0000  1.0000  1.0000  0.0309
200  0.9903  1.0000  0.9715  0.3228  |  1.0000  1.0000  1.0000  0.0249
     T = 150                         |  T = 200
 20  0.5161  0.6702  0.8870  0.0581  |  0.5640  0.6824  0.9402  0.0132
 50  0.9811  0.9544  1.0000  0.0044  |  0.9944  0.9638  1.0000  0.0004
100  1.0000  0.9997  1.0000  0.0019  |  1.0000  1.0000  1.0000  0.0000
150  1.0000  1.0000  1.0000  0.0021  |  1.0000  1.0000  1.0000  0.0000
200  1.0000  1.0000  1.0000  0.0011  |  1.0000  1.0000  1.0000  0.0000

ψ = 0.5
     T = 50                          |  T = 100
 20  0.8072  0.9279  0.0502  0.4641  |  0.9370  0.9566  0.1733  0.0908
 50  0.9847  0.9977  0.6433  0.3536  |  1.0000  0.9997  0.9878  0.0372
100  0.9998  1.0000  0.8827  0.3187  |  1.0000  1.0000  0.9999  0.0272
150  1.0000  1.0000  0.9316  0.3050  |  1.0000  1.0000  1.0000  0.0262
200  1.0000  1.0000  0.9505  0.3064  |  1.0000  1.0000  1.0000  0.0216
     T = 150                         |  T = 200
 20  0.9607  0.9631  0.2989  0.0142  |  0.9675  0.9693  0.3726  0.0015
 50  1.0000  0.9997  0.9997  0.0018  |  1.0000  0.9998  1.0000  0.0001
100  1.0000  1.0000  1.0000  0.0009  |  1.0000  1.0000  1.0000  0.0000
150  1.0000  1.0000  1.0000  0.0018  |  1.0000  1.0000  1.0000  0.0000
200  1.0000  1.0000  1.0000  0.0008  |  1.0000  1.0000  1.0000  0.0000

Table 4: Selection criteria for the number of dynamic factors in the U.S. economy

k*   λ̃        ξ(k*)     LR(k*)    crit. val.
1    0.0422   3.0625    1.5•      3.8415
2    0.1706   2.1048    23.7      9.4877
3    0.2237   1.2754    113.5     16.9190
4    0.7577   0.4991•   231.1     26.2962
5    0.8596   0.2567    629.6     37.6525
6    0.8837   0.1163    1081.8    50.9985

Reported are the selection criteria ξ(k*) and LR(k*) for different values of k* using m = 1. Note that the critical value of the ξ(k*) statistic is $\tilde{C}_{NT}^{-2+\delta} = 0.2873$. The first column reports the eigenvalues of (12) with m = 1, the last column reports the critical values of the LR-test. • indicates those criteria that select the number of dynamic factors.
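The entries of Table 4 can be cross-checked with a few lines of Python. The sketch below is a consistency check on the reported numbers and rests on two readings that are inferred from the figures rather than stated in the table note: that ξ(k*) equals the sum of (1 − λ̃) over rows k*, ..., 6, and that the critical values are 0.95 quantiles of a χ² distribution with (k*)² degrees of freedom. The helper `chi2_cdf` is a hypothetical name and uses the standard series for the regularized lower incomplete gamma function.

```python
import math

def chi2_cdf(x, df, terms=400):
    """Chi-square CDF via the series P(a, h) = e^{-h} h^a * sum_n h^n / Gamma(a + n + 1)."""
    a, h = df / 2.0, x / 2.0
    total = 0.0
    for n in range(terms):
        total += math.exp(-h + (a + n) * math.log(h) - math.lgamma(a + n + 1.0))
    return total

# Values transcribed from Table 4 (rows k* = 1, ..., 6)
lam = [0.0422, 0.1706, 0.2237, 0.7577, 0.8596, 0.8837]
xi = [3.0625, 2.1048, 1.2754, 0.4991, 0.2567, 0.1163]
crit = [3.8415, 9.4877, 16.9190, 26.2962, 37.6525, 50.9985]

# Assumed reading: xi(k*) is the sum of (1 - lambda) over rows k*, ..., 6
xi_check = [sum(1.0 - l for l in lam[i:]) for i in range(len(lam))]

# Assumed reading: critical values are quantiles of chi2 with (k*)^2 degrees of freedom
levels = [chi2_cdf(c, (i + 1) ** 2) for i, c in enumerate(crit)]

for i in range(len(lam)):
    print(i + 1, round(xi_check[i], 4), xi[i], round(levels[i], 4))
```

If both readings are right, the recomputed sums should match the ξ(k*) column up to rounding of the reported eigenvalues, and each probability in the last printed column should be close to 0.95.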

Table 5: Selection criteria ξ(k*) based on different values of m

k*   m = 2     m = 3     m = 4
0    3.7484    3.6428    3.5221
1    2.8051    2.7212    2.6137
2    1.8827    1.8203    1.7450
3    1.1119    1.0731    1.0506
4    0.4218•   0.3980•   0.3883•
5    0.2244    0.2151    0.2095
6    0.1047    0.0981    0.0942

Reported are the selection criteria ξ(k*) for different values of k* and different lag lengths m. The critical value of the ξ(k*) statistic is $\tilde{C}_{NT}^{-2+\delta} = 0.2873$. • indicates those criteria that select the number of dynamic factors.
