Removing Biases in Computed Returns Lawrence Fisher Department of Finance and Economics Rutgers Business School Rutgers University 111 Washington Street Newark, NJ 07102 Daniel G. Weaver* Department of Finance and Economics Rutgers Business School Rutgers University 94 Rockafeller Road Piscataway, NJ 08854-8054 Phone: (732)-445-5644 Fax: (732) 445-2333
[email protected] Gwendolyn Webb Bert W. Wasserman Department of Economics and Finance Baruch College, Zicklin School of Business One Bernard Baruch Way, Box B10-225 New York, New York 10010
[email protected] *Contact Author June 15, 2009 JEL classifications: G10, G12, C43 Keywords: Unbiased market index, Bias in computed returns, Jensen's inequality, Asset pricing, Index construction We thank Marshall Blume, Ivan Brick, Stephen Brown, Douglas Jones, Jay Ritter, Scott Linn, Michael Pagano, David C. Porter, Robert Stambaugh, Yusif Simaan, and David Whitcomb for their comments on earlier versions of this study. Fisher and Weaver thank the Whitcomb Center for Research in Financial Services for research support. Fisher also thanks the donors of the First Fidelity Bank Research Professorship of Finance.
Electronic copy available at: http://ssrn.com/abstract=1420314
Removing Biases in Computed Returns Abstract This paper presents a straightforward method for asymptotically removing the well-known upward bias in observed returns of equally-weighted portfolios. Our method removes all of the bias due to any random transient errors such as bid-ask bounce and allows for the estimation of short horizon returns. We apply our method to the CRSP equally-weighted monthly return indexes for the NYSE, Amex, and NASDAQ and show that the bias is cumulative. In particular, a NASDAQ index (with a base of 100 in 1973) grows to the level of 17,975 by 2006, but nearly half of the increase is due to cumulative bias. We also conduct a simulation in which we generate true prices and set spreads according to a discrete pricing grid. True prices are then not necessarily at the midpoint of the spread. In the simulation we compare our method to calculating returns based on observed closing quote midpoints and find that the returns from our method are statistically indistinguishable from the (simulated) true returns. While the mid-quote method results in an improvement over using closing transaction prices, it still results in a statistically significant amount of upward bias. We demonstrate that applying our methodology results in a reversal of the relative performance of NASDAQ stocks versus NYSE stocks over a 25-year window.
1. Introduction
It is well known that observed returns of single-security prices are upward biased and that the bias is caused by errors in quoted prices. 1 Frequently re-balanced, equally-weighted indexes (or portfolios) are especially prone to the bias, which is known to be cumulative (Conrad and Kaul (1993)). However, equal weighting is the method used in virtually all event studies and is the preferred method of forming an index when there are disproportionate market capitalization weights among representative stocks. 2,3 Also, since the bias is cumulative, it can potentially affect the relative rankings of stock performance, especially over longer periods of time. Blume and Stambaugh (1983) show that a buy-and-hold portfolio contains a diversification effect that removes the bias after the first period, as the number of stocks increases. 4 Previous implementations of their methodology have focused on long investment horizons. 5 Alternative methodologies for estimating unbiased returns that are applicable to short investment horizons are shown to have undesirable properties. 6 In this paper, we present an alternative derivation of Blume and Stambaugh’s buy-and-hold methodology. This allows for the estimation of unbiased short horizon equally-weighted returns which can then be used in event studies, as well as for the construction of frequently rebalanced indexes. The intuition behind our methodology is straightforward and is similar to that for deriving implied future single-period spot rates from two-period rates (assuming that the liquidity premium is zero). The implied future spot rate for t-1 to t is solved by dividing the square of one plus the spot rate from t-2 to t by one plus the spot rate from t-2 to t-1.
1 See Macaulay (1938) and Fisher (1966), among others.
2 A notable exception is Fama, Fisher, Jensen, and Roll (1969), who use continuously compounded returns.
3 For example, the explanation of the weighting method for the Dow Jones Turkey Equal Weighted 15 Index from the company’s web site (http://www.djindexes.com/mdsidx/?event=showTurkey15) is that “The index includes the largest stocks traded on the Istanbul Stock Exchange, and is equal weighted to limit the influence of the biggest companies on overall index performance.” Also in 2005, NASDAQ began constructing an equally-weighted version (rebalanced quarterly) of several of their indexes including the NASDAQ 100. See also Hamza, Kortas, L’Her, and Roberge (2006) for a discussion of the efficacy of different index weighting methods for emerging markets.
4 In Blume and Stambaugh (1983) a buy-and-hold portfolio sets equal weights in a portfolio at the beginning of the period and no rebalancing is done before the end of the multi-period investment horizon. In contrast, a rebalanced portfolio is rebalanced each period. See Roll (1983) for a comparison of rebalanced and buy-and-hold portfolio returns.
5 For example, Blume and Stambaugh (1983) examine one-year investment horizons and Conrad and Kaul (1990) examine three-year investment horizons.
6 Blume and Stambaugh (1983) and Bessembinder and Kalcheva (2007) note that while short horizon continuously-compounded rates of return contain no bias, they also possess certain properties that limit their use in many tests.
In a similar
manner, in this paper we show that dividing a two-period average portfolio price relative (one plus the observed return) ending at time t, by a one-period average portfolio price relative that ends at time t-1 results in an unbiased estimate of the true one-period price relative (hence return) ending at time t, as long as errors in t, t-1, and t-2 are independent. The average bias inherent in observed returns is due to pricing errors at the beginning of the holding period. We show, as others have, that by invoking the law of large numbers the expected bias in observed prices at time t is zero, leaving only the bias in observed prices at the beginning of the period. Our contribution to the literature is to provide a method for removing the remaining bias due to any random transient errors (not limited to bid-ask bounce as in previous studies.) The only assumptions that we need to make are that (1) transient errors in successive observed prices are independent, and (2) all observed prices are finite and greater than zero. In our methodology, both the numerator and denominator start at the same time. Because they have the same amount of bias, it cancels out, leaving asymptotically unbiased estimates of true returns. In addition to discrete pricing errors, a growing literature (e.g., Hou and Moskowitz (2005)) finds significant lagged responses to new information. We examine the efficacy of our methodology in the context of lagged adjustment and find that as long as prices at time t have fully adjusted to the information available at time t-1, then we still produce an unbiased estimate of true returns. Given that Hou and Moskowitz find that some stocks take up to four weeks to adjust to new information, our method is probably adequate for estimating true monthly returns of U.S. stocks, but not shorter periods. One method that has been proposed for avoiding the bias in observed returns is to use the mid-quote. 
However, that method assumes that the true price is the mid-point of the spread, which in turn assumes both continuous pricing, and no other departures from equilibrium. This then naturally leads to the simplifying assumption of a binomial distribution of errors, as assumed by Blume and Stambaugh (1983), among others. However, true prices may not be at the midpoint of the spread because of the discrete pricing grid. We first show that our methodology does not depend on the distribution of error terms. Then we employ a simulation to compare the accuracy of our methodology to using the mid-quote to calculate returns. In our simulation, equilibrium spreads are applied to generated true prices and the observed bid and ask prices are then determined using a discrete pricing grid. We find that taking the mid-quote removes between one and two thirds of the bias in observed returns, but that our methodology
removes all but random errors. This suggests that taking the mid-quote is an insufficient remedy for removing bias in observed returns. We then compare our methodology for constructing an index to the naïve observed return approach. We replicate the return on the CRSP Equally-Weighted Index for the New York Stock Exchange, American Stock Exchange, and Nasdaq. 7 We compute monthly returns and resulting index levels, setting December 31, 1973 equal to 100. Since observed returns are upward biased, the indexes are cumulative with respect to the bias. Comparing our unbiased equally-weighted index to the CRSP index for each of the three markets reveals the extent of this cumulative effect. For example, we find that over the 33 years from 1973 to 2006, the ending level of the CRSP Nasdaq index is over 90% higher than our unbiased Nasdaq index (17,976 vs. 9,418). The indexes for the other two markets show similar, but smaller, cumulative bias. When we compare markets, we find that the overall performance of Nasdaq stocks as measured by the CRSP index is about 54 percent higher than that of NYSE stocks in the 1973-2006 period. However, after we remove the bias, we find that Nasdaq stocks perform on average no higher than stocks on the NYSE. This clearly shows the importance of our methodology in calculating equally-weighted stock indexes, in asset class horse-races. We examine the average size of the bias in observed monthly returns for five-year time periods, as well as other periods, for each of the three markets. For NYSE stocks, the bias ranges from an average of 2.27 basis points (hereafter b.p.) for the last half of the 1950s, to almost 50 b.p. during the first half of the 1930s. The NYSE has the lowest average level of bias in observed monthly returns, followed by the Amex, and then NASDAQ. During the period of our study, previous authors have found NASDAQ spreads to be wider than those on the NYSE. 
This underscores the applicability of our methodology for constructing indexes in markets with larger transient errors (e.g., wider spreads), such as emerging markets. Finally, we show that although the bias in January returns is larger than in other months for all three markets, virtually all calendar months have substantial upward bias in observed returns. Our methodology is of interest to academics who routinely construct equally-weighted portfolios, as well as to practitioners who construct indexes for the purpose of evaluating market performance.
7 The CRSP Equally-Weighted Index methodology is first developed by Cohen and Fitch (1966).
The rest of this paper is organized as follows. The next section reviews relevant literature. Section 3 develops a model of price adjustment that allows for random transient errors
and lagged adjustment to information. In Sections 4 and 5, we present results of our simulation comparison tests, and then our measurement of the level of bias in CRSP return indexes. Section 6 concludes and discusses several areas for future research.
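The cumulative nature of the bias can be illustrated with simple arithmetic on the two Nasdaq index levels quoted above. The sketch below is only an illustration: it assumes a constant multiplicative bias per month and a 396-month horizon (December 1973 to December 2006), and the function name `implied_monthly_bias` is hypothetical rather than part of the paper's method.

```python
# Illustrative arithmetic only. The two index levels come from the text;
# the 396-month horizon and the constant-bias simplification are assumptions.
crsp_level = 17_976.0      # CRSP equally-weighted Nasdaq index level in 2006
unbiased_level = 9_418.0   # bias-corrected Nasdaq index level in 2006
months = 33 * 12           # December 1973 to December 2006


def implied_monthly_bias(biased: float, unbiased: float, n_periods: int) -> float:
    """Constant per-period bias b such that biased / unbiased == (1 + b) ** n_periods."""
    return (biased / unbiased) ** (1.0 / n_periods) - 1.0


bias = implied_monthly_bias(crsp_level, unbiased_level, months)
print(f"ratio of index levels: {crsp_level / unbiased_level:.3f}")
print(f"implied average bias: {bias * 1e4:.1f} basis points per month")
```

Under these assumptions, a bias of well under one quarter of one percent per month is enough to nearly double the index level over the sample period.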
2. Literature Review
It is well known that the existence of significant transaction costs leads to imperfect and non-instantaneous adjustment of reported prices to new information. This leads to security price behavior that is inconsistent with that expected in an efficient market. For example, Fisher (1966) observes that frequently rebalanced indexes for equally-weighted portfolios seem to outperform buy-and-hold portfolios. 8 In particular, Fisher compares n-period observed returns, $Q_n$, to chained equally-weighted observed monthly returns, $\hat r_t$, and finds:
$$Q_n < \left[\prod_{t=1}^{n} (1 + \hat r_t)\right] - 1 \tag{1}$$
Fisher considers whether this inequality reflects bias due to infrequent trading and the propensity of databases to use non-synchronous closing prices. He conjectures that the lagged response to new information, as well as random errors, could lead to observed link relatives (one plus return) $(1 + \hat r_t) = \hat P_t / \hat P_{t-1}$ being upward biased. Let $\hat r_t$ be the fully-adjusted observed return of a security, and let $\varepsilon_t$ be the error in the observed price, $\hat P_t$, of the security at time t. Then
$$(1 + \hat r_t) = \frac{1 + \varepsilon_t}{1 + \varepsilon_{t-1}} (1 + r_t) \tag{2}$$
where $r_t$ is the true (unbiased) return at time t. If $E(\varepsilon_t) = E(\varepsilon_{t-1}) = 0$ and the errors are independent, then by Jensen’s inequality the expected value of the ratio on the right-hand side of Equation (2) is greater than 1.0. Hence, observed returns are biased upward relative to true returns. Fisher (1966) conjectures that differential lagged response to market-wide information will result in negative serial correlation in the residuals from a market model regression. However, he further conjectures that the amount of error introduced by the return generation process is very small.
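The Jensen's inequality argument behind Equation (2) can be checked numerically. The following is a minimal sketch, assuming a two-point error that equals plus or minus a hypothetical `eps` with equal probability and is independent across periods; the function name `expected_price_relative` is illustrative only.

```python
from itertools import product


def expected_price_relative(eps: float, true_return: float = 0.0) -> float:
    """Mean of (1 + e_t) / (1 + e_{t-1}) * (1 + r) over the four equally
    likely combinations of e_t and e_{t-1} drawn from {+eps, -eps}."""
    outcomes = [
        (1.0 + e_t) / (1.0 + e_prev) * (1.0 + true_return)
        for e_t, e_prev in product((eps, -eps), repeat=2)
    ]
    return sum(outcomes) / len(outcomes)


eps = 0.01  # hypothetical 1% relative pricing error
bias = expected_price_relative(eps) - 1.0
print(f"expected observed return when the true return is zero: {bias:.6%}")
print(f"eps squared for comparison: {eps ** 2:.6%}")
```

Although each error has mean zero, the average observed price relative exceeds one, and the excess is close to the square of the error.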
8 This upward bias in equally-weighted returns is first observed by Macaulay (1938, pp. 149-154) in his analysis of railroad stocks. Macaulay concludes that the bias, which he calls mathematical drift, is larger than can be caused by chance, and is much larger than the bias observed for a value-weighted index of the same stocks.
Blume and Stambaugh (1983) extend the work of Fisher (1966) by modeling bias as a function of bid-ask spread. 9 The spread supposedly compensates liquidity providers for providing liquidity and creates a friction in the market. They assume that a security sells either at the bid price, $\hat P_b$, or the ask price, $\hat P_a$, and that the security's true price, $P$, is at the midpoint of the spread. Then the observed price contains a relative error equal to
$$\varepsilon = \pm \frac{\hat P_a - \hat P_b}{2P} \tag{3}$$
Blume and Stambaugh (1983) show that a relative error of $\varepsilon$ causes upward bias equal to $\varepsilon^2/(1 + \varepsilon^2)$, which is shown to be approximately equal to $\varepsilon^2$. Further, they suggest that the bias can be approximated by the variance of the previous period’s error terms, $\sigma^2(\varepsilon_{t-1})$. They show that bid-ask spread is a problem only in equally-weighted portfolios. 10 They also show that equally-weighted portfolios will contain the average amount of bias $\sigma^2(\varepsilon_{t-1})$ across firms, which can be significant. Although the returns on buy-and-hold portfolios contain less bias than rebalanced portfolios, they are still biased upward relative to true returns. 11 It is clear from Equation (3) that wider spreads will cause larger errors in observed prices. Blume and Stambaugh (1983) conjecture that, since smaller firms typically have wider spreads, the small-firm effect (Reinganum (1982) and Keim (1983)) may be due to bid-ask bias. To test this hypothesis, they form ten portfolios based on year-end firm values. They then calculate the following year’s average daily return based on: (1) a daily rebalancing strategy and (2) a one-year buy-and-hold strategy. The overall difference between the smallest and largest firm size portfolios is only half as large for the buy-and-hold strategy relative to the rebalanced strategy. They conclude that at least part of the small firm effect is due to bias in returns induced by relatively wider bid-ask spreads. Roll (1984) exploits the fact that observed prices differ from true prices due to bid-ask
9 Although Blume and Stambaugh (1983) consider other possible pricing errors (see their Section 2.4), they assert that the bias from them is negligible.
10 Equally weighting portfolios is the method most commonly used in event studies.
11 Buy-and-hold portfolios reduce the bias because the weights used after the first period have a negative correlation with subsequent observed returns which offsets the upward bias. It then follows that other portfolio weighting methods will also reduce the bias if the weights are similarly negatively correlated with observed returns. Bessembinder and Kalcheva (2007) show that the method presented in this paper is one such method.
spreads to develop a method for estimating the “effective” bid-ask spread. 12 His method is based on the transactional model of Niederhoffer and Osborne (1966), as well as the models of Cootner (1962) and Samuelson (1965). Niederhoffer and Osborne show that the market-making process (which causes the bid-ask spread) results in negative serial dependence in observed trade prices. Roll extends previous work by showing that the bid-ask spread results in negative serial covariance in observed returns. He shows that if true price equals $(P_B + P_A)/2$ and true returns are serially uncorrelated, then
$$\mathrm{cov}_{ret} = -4 \ast ES^2 \tag{4}$$
where covret is the serial covariance of observed returns, and ES is the effective spread as a function of price. Roll estimates the average effective spread for NYSE and Amex stocks based on daily holding periods to be 0.017. For five-day holding periods, the average effective spread is estimated to be 0.30. He proves that in an efficient market the serial covariance of true returns is zero. He also argues that the measurement of effective bid-ask spread is independent of the return interval. Therefore, since the effective bid-ask spread is not the cause of the difference between one- and five-day intervals, he concludes that the NYSE and Amex are not as informationally efficient as previously thought. Fama (1991) also concludes that stocks do not immediately adjust to new information. Therefore, observed prices may contain errors other than those induced by bid-ask spreads. The other errors arise from a combination of nonsynchronous trading, discrete pricing, and slow adjustment to information. 13 Although both Blume and Stambaugh (1983) and Roll (1984) show that observed returns are upward biased, to date no studies have proposed a method for removing the bias that allows for frequent rebalancing. Conrad and Kaul (1993) suggest that studies examining long-term returns use annual (or longer) buy-and-hold returns. Canina, Michaely, Thaler, and Womack (1998) agree with Conrad and Kaul (1993) and suggest that researchers construct long holding-
12 While quoted bid-ask spread is defined as $\hat P_a - \hat P_b$, effective spread takes into account the fact that trades can occur at prices other than the posted bid and ask. Effective spread measures the distance between the midpoint of the spread and trade prices. Mathematically it can be expressed as $2\left[\hat P_t - \frac{\hat P_a + \hat P_b}{2}\right]$, where $\hat P_t$ is the observed trade price.
13 Discrete pricing may cause errors if the amount of expected price adjustment to information is less than the minimum tick size on a market. In addition, discrete pricing may cause true prices to deviate from the mid-point of bid-ask spread.
period buy-and-hold portfolios when they are concerned with long-run performance. 14 However, an annual holding period is not suitable for event studies, which typically examine returns in the days following an event. In a recent paper, Bessembinder and Kalcheva (2007) examine the impact of bid-ask bounce errors on asset pricing tests. They suggest that the upward bias imparted by bid-ask bounce results in noisy beta estimates and downward bias in the estimated premium for beta risk. They examine several potential ways of removing the bias in observed returns, including the one presented in this paper. One of their methods applies if the true price is equal to the quote midpoint. They first find proportional spreads for each security as in Roll (1984) as $\hat P_t - \frac{\hat P_a + \hat P_b}{2}$, then find the variance of these biases for a time series of observations for a firm and subtract the obtained variance from observed returns. Bessembinder and Kalcheva (2007) also examine two potential methods of removing bias that do not require knowing the bid-ask spread. The first of these is to use continuously-compounded returns. They show that the mean of observed continuously-compounded returns is equal to the mean of true continuously-compounded returns. However, as they note, there are several problems associated with using these returns. For example, Fisher (1966) illustrates that chained short horizon continuously-compounded returns are downward biased relative to long holding period returns by about the same amount. In addition, Ferson and Korajczyk (1995) argue that continuously-compounded returns are inappropriate for tests of asset pricing models. The last correction Bessembinder and Kalcheva (2007) examine is the method presented in this paper, as well as in Weaver (1991). 15 They find that our method effectively removes the upward bias in observed returns. In the next section, we present our method for producing asymptotically unbiased equally-weighted indexes.
3. The Model
As mentioned above, observed returns may contain biases due to random transient errors (e.g., bid-ask spread), non-synchronous trading, discrete pricing, and/or slow adjustment to
14 Fisher and Lorie (1964, 1968, and 1977) do just that. However, they find that initial year returns are almost always higher than second and subsequent year returns.
15 Multiplying each stock’s “return weight” by one plus the stock’s observed return and cumulating over stocks yields our Equation (11).
information. 16 Here we develop a model of price adjustment that takes these factors into account. We then develop an asymptotically unbiased method that removes random transient errors. In subsequent sections, we use this method to estimate the bias inherent in the monthly CRSP Equally-Weighted Index and to examine the length of the price adjustment process. Let the observed price of security i at time t, $\hat P_{i,t}$, be equal to 17:
$$\hat P_{i,t} = \left[(1 - a_{i,t}) P_{i,t-1} + a_{i,t} P_{i,t}\right] \ast (1 + e_{i,t}) \tag{5}$$
where:
$a_{i,t}$ = the adjustment coefficient which shows the extent to which a price has adjusted to information released since t-1. If there is no lag, then $a_{i,t} = 1$.
$e_{i,t}$ = a random transient error, expressed as a fraction of $P_{i,t}$.
Equation (5) implies that two different types of errors cause departures from true prices. The first is an independent random transient error, $e$. This class of error can either be a byproduct of the return generating process (e.g., bid-ask spreads) or independent transient errors in prices. The second type is a lagged response process. There is considerable empirical support for lagged response to information. For example, Hou and Moskowitz (2005) find that for some stocks, responses can take up to four weeks. The process of removing the errors in returns of equally-weighted portfolios is more easily understood if the effect of each type of error is first examined separately. Therefore, we examine the random transient type error ($e_{i,t}$) first for the case where errors are completely corrected by the next observation, and then we take up the lagged response case.
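The observed-price model can be written as a one-line function. This is a sketch under the assumption that Equation (5) takes the form in which the transient error multiplies the blended price, as the equation is restated in Section 3.2; the function name `observed_price` and the numerical inputs are hypothetical.

```python
def observed_price(p_prev: float, p_now: float, a: float, e: float) -> float:
    """Blend of last period's and this period's true price, weighted by the
    adjustment coefficient a, then scaled by one plus the transient error e."""
    return ((1.0 - a) * p_prev + a * p_now) * (1.0 + e)


# Hypothetical example: the true price moves from 100 to 110.
fully_adjusted = observed_price(100.0, 110.0, a=1.0, e=0.0)  # no lag, no error
stale = observed_price(100.0, 110.0, a=0.0, e=0.0)           # price has not adjusted
noisy = observed_price(100.0, 110.0, a=1.0, e=-0.005)        # adjusted, 0.5% below true
print(fully_adjusted, stale, noisy)
```

Setting `a` to one isolates the random transient error examined in Section 3.1, while setting `e` to zero isolates the lagged response examined in Section 3.2.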
3.1 Random Transient Errors
In the absence of a lagged adjustment process ($a_{i,t} = 1$), observed prices are equal to true prices on average since $E(e_{i,t}) = 0$. However, returns are upward biased since
$$[1 + E(\hat r_{it})] = E\left[\frac{\hat P_{it}}{\hat P_{i,t-1}}\right] = E\left[\frac{P_{it}(1 + e_{it})}{P_{i,t-1}(1 + e_{i,t-1})}\right] = [1 + E(r_{it})] \ast E\left[\frac{1 + e_{it}}{1 + e_{i,t-1}}\right] \tag{6}$$
and from Jensen’s inequality we know that $E\left[\frac{1 + e_{i,t}}{1 + e_{i,t-1}}\right] > 1$.
16 Additional examples of random transient errors include errors in observed stock prices caused by incorrect order entry or transaction recording.
17 Observed variables will be indicated with a hat and true values of variables will have no notation.
Blume and Stambaugh (1983)
show that by invoking the law of large numbers, defining $R_t$ as the return on a large portfolio (or index) at time t, and noting that $E(e_{i,t}) = 0$, Equation (6) can be rewritten as
$$[1 + E(\hat R_t)] \approx [1 + E(R_t)] \ast E\left[\frac{1}{1 + e_{t-1}}\right] \tag{7}$$
Therefore, the bias inherent in observed portfolio returns is the result of random transient errors in observed prices at the beginning of the holding period. Blume and Stambaugh further show that Equation (6) can be approximated using a Taylor series as 18
$$[1 + E(\hat r_{i,t})] \approx [1 + E(r_{i,t})] \ast \{1 + \sigma^2(e_{i,t-1})\} \tag{8}$$
and for the return on a frequently rebalanced equally-weighted portfolio, Equation (7) becomes 19
$$[1 + E(\hat R_t)] \approx [1 + E(R_t)] \ast \{1 + \sigma^2(e_{i,t-1})\} \tag{9}$$
Thus far, we have dealt with single holding period returns. To denote a multiple period return or price, we employ leading subscripts to indicate the beginning of the holding period and a trailing subscript to denote its end. For example, a two holding-period true return on a portfolio observed at time t would be ${}_2R_t$. Next, we define a one holding-period true portfolio price relative, $W_t$, as $(1 + R_t)$ and a two holding-period true portfolio price relative as ${}_2W_t = (1 + {}_2R_t)$. 20 In the absence of any errors (both random transient and lagged adjustment), it is clear that the expectation of a two-period portfolio price relative is equal to the product of the expectations of two sequential portfolio price relatives (although the distributions are different). Thus,
$$E({}_2W_t) = E(W_{t-1}) \ast E(W_t) \tag{10}$$
Then 21
$$E(W_t) = E\left[\frac{{}_2W_t}{W_{t-1}}\right] \tag{11}$$
18 As shown in Blume and Stambaugh (1983) footnote 6.
19 Blume and Stambaugh (1983), and some others, implicitly assume continuous pricing so that the distribution of error terms is binomial. That is, the true price of a stock is the midpoint of the spread. Given the reality of tick-induced discrete pricing, true price is not necessarily at the midpoint of the spread. Therefore a log normal distribution of error terms is more representative. In Appendix 1 we show that the bias arising from a log normally distributed error term is equal to that of a binomially distributed error.
20 Subtracting 1 from a wealth relative yields the return on an index or portfolio.
21 Assuming no lagged adjustment is equivalent to assuming no serial covariance, thus the product of expectations is equal to the expectation of the product.
We note that the observed price relative on a two holding-period portfolio ending at time t is
$${}_2\hat W_t = [1 + E({}_2\hat R_t)] \approx [1 + E({}_2R_t)] \{1 + \sigma^2(e_{i,t-2})\} \tag{12}$$
In addition, the observed wealth relative on a one holding-period portfolio ending at time t-1 is
$$\hat W_{t-1} = [1 + E(\hat R_{t-1})] \approx [1 + E(R_{t-1})] \{1 + \sigma^2(e_{i,t-2})\} \tag{13}$$
The expectation of Equation (11) asymptotically becomes 22
$$E\left[\frac{{}_2\hat W_t}{\hat W_{t-1}}\right] = \frac{[1 + E({}_2R_t)] \ast [1 + \sigma^2(e_{i,t-2})]}{[1 + E(R_{t-1})] \ast [1 + \sigma^2(e_{i,t-2})]} = \frac{1 + E({}_2R_t)}{1 + E(R_{t-1})} = 1 + E(R_t) \tag{14}$$
The intuition for this is straightforward. Recall that the bias inherent in observed returns is due to the variance of error terms at the beginning of the holding period that are corrected by the end of the holding period. In Equation (14), since the two-period wealth relative in the numerator and the one-period wealth relative in the denominator both begin in the same period, they both contain the same average bias. Therefore, the biases cancel out and the asymptotic result is approximately the expected one-period portfolio wealth relative for time t.
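The cancellation in Equation (14) can be illustrated with a small Monte Carlo experiment. The sketch below makes several simplifying assumptions that are not taken from the paper: every stock has the same true price relatives (`w_prev` and `w_now`), transient errors are independent lognormal draws with mean zero, and there is no lagged adjustment. The function name `simulate` and all parameter values are hypothetical.

```python
import math
import random


def simulate(n_stocks: int = 200_000, sigma: float = 0.05,
             w_prev: float = 1.02, w_now: float = 1.01, seed: int = 0):
    """Return (naive, ratio) estimates of the true price relative w_now."""
    rng = random.Random(seed)

    def noise() -> float:
        # One plus the transient error: lognormal with mean one, so E(e) = 0.
        return math.exp(rng.gauss(0.0, sigma) - 0.5 * sigma ** 2)

    sum_naive = sum_two = sum_one = 0.0
    for _ in range(n_stocks):
        p0 = 1.0            # true price at t-2
        p1 = p0 * w_prev    # true price at t-1
        p2 = p1 * w_now     # true price at t
        o0, o1, o2 = p0 * noise(), p1 * noise(), p2 * noise()  # observed prices
        sum_naive += o2 / o1  # one-period relative starting at t-1
        sum_two += o2 / o0    # two-period relative starting at t-2
        sum_one += o1 / o0    # one-period relative starting at t-2
    naive = sum_naive / n_stocks
    ratio = (sum_two / n_stocks) / (sum_one / n_stocks)
    return naive, ratio


naive, ratio = simulate()
print(f"true relative 1.0100, naive {naive:.4f}, ratio method {ratio:.4f}")
```

The naive average of observed one-period relatives overstates the true relative by roughly the error variance, while the ratio of the two averages that share a starting date recovers it up to sampling noise.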
3.2 Lagged Response
Transient errors are not the only variables to have an impact on observed prices. There is also the possibility that prices do not adjust instantaneously or simultaneously. In this section, we analyze the effect of a lagged response on observed returns. Recall Equation (5):
$$\hat P_{i,t} = \left[(1 - a_{i,t}) P_{i,t-1} + a_{i,t} P_{i,t}\right] \ast (1 + e_{i,t}) \tag{5}$$
For simplicity we assume that when prices adjust to new information, they do so fully. Accordingly, $a_{i,t}$ is a Kronecker delta with the value of 1 for prices that have fully adjusted, and 0 otherwise. Next, assume a “true” price generating process
$$P_{i,t} = \left[P_{i,t-1} + \beta_i R_t P_{i,t-1}\right] \ast \{1 + u_{i,t}\} \tag{15}$$
where $R_t$ is the true return on the market index at time t and $u_{i,t}$ is a random non-systematic component for period t. For tractability, we assume that all betas are equal to 1. Then by defining the index link relative as $W_t = (1 + R_t)$, Equation (15) becomes
$$P_{i,t} = P_{i,t-1} (W_t) \ast \{1 + u_{i,t}\} \tag{16}$$
22 Since $\{1 + \sigma^2(e_{i,t-2})\}$ is in both the numerator and denominator, Jensen’s inequality does not apply.
Note that $\hat P_{i,t-1}$ is also subject to non-adjustment to new information so that
$$\hat P_{i,t-1} = \left[(1 - a_{i,t-1}) P_{i,t-2} + a_{i,t-1} P_{i,t-1}\right] \ast (1 + e_{i,t-1})$$
Define θ as the probability that a = 0 and α as (1 – θ). Then by noting that $\sum u = 0$ we can combine Equations (5) and (16) to obtain the aggregate observed index relative, $\hat W_t$, as 23
$$\hat W_t = [1 + \sigma^2(e_{t-1})] (\alpha + \theta W_{t-1}) (\theta + \alpha W_t) \tag{17}$$
This can be rewritten in return form as
$$\hat W_t = [1 + \sigma^2(e_{t-1})] (1 + \theta R_{t-1}) (1 + \alpha R_t) \tag{18}$$
Similarly, we show in Appendix 3 that in the presence of slow adjustment to new information, the ratio of a two-period index relative to a one-period index relative ending at time t-1, as in Equation (14), is
$$E\left[\frac{{}_2\hat W_t}{\hat W_{t-1}}\right] = \frac{(1 + R_{t-1})(1 + \alpha R_t)}{(1 + \alpha R_{t-1})} \tag{19}$$
If prices fully adjust by time t for information released at time t-1 (i.e., α = 1), then
$$E\left[\frac{{}_2\hat W_t}{\hat W_{t-1}}\right] = 1 + E(R_t) \tag{20}$$
Therefore, if prices fully adjust to new information within a month, our method provides an unbiased estimate of true index returns at time t. Given that Hou and Moskowitz (2005) find that there is a lagged response of up to four weeks for some U.S. stocks, it is reasonable to conclude that Equation (20) provides an unbiased estimate of monthly index returns. However, it is also clear that it may not be adequate for estimating unbiased returns for periods of less than one month.
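The sensitivity of the estimator to incomplete adjustment can be seen by evaluating the right-hand side of Equation (19) for several values of α. The sketch below is illustrative only; the function name `ratio_under_lag` and the returns of 2% and 1% are hypothetical inputs.

```python
def ratio_under_lag(r_prev: float, r_now: float, alpha: float) -> float:
    """Right-hand side of Equation (19) for true index returns r_prev (at t-1)
    and r_now (at t), with a fraction alpha of prices fully adjusted."""
    return (1.0 + r_prev) * (1.0 + alpha * r_now) / (1.0 + alpha * r_prev)


r_prev, r_now = 0.02, 0.01  # hypothetical true index returns at t-1 and t
for alpha in (1.0, 0.9, 0.5):
    estimate = ratio_under_lag(r_prev, r_now, alpha)
    print(f"alpha = {alpha:.1f}: estimate {estimate:.6f}, true {1.0 + r_now:.6f}")
```

With α equal to one the expression collapses to one plus the true return at time t, as in Equation (20); for smaller α part of the previous period's return leaks into the estimate.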
4. Simulation Results
To examine how our index construction method performs relative to other bias correction methods, we perform simulations based on the characteristics of observed historical NYSE prices and the CRSP Equally-Weighted Index return (including all distributions). Although stock pricing errors can occur due to bid-ask spreads (or other random transient errors), slow adjustment to information, and non-trading, we only examine the first type of error. To create more realistic simulation data, we base our simulated prices and market returns on the actual
23 The conditional expected index relative table is provided in Appendix 2.
distributions of monthly prices and index returns for the period January 1926 to December 1996. During this period, the tick size is $0.125 for all stocks priced over $1. We find that the distribution of NYSE prices (over $1) during the $0.125 tick period can be well approximated by a gamma distribution with shape parameter of 1.757, mean of $28.55, and standard deviation of $21.54. During this period, returns on the CRSP Equally-Weighted Index are approximately log normally distributed with a mean of 1.024% per month and standard deviation of 6.68%. For each run of the simulation, we first randomly generate Month 1 prices for 1,000 stocks based on the distribution of observed prices. These prices are deemed “true” prices. We next assume that true spreads for each stock are a percentage of true price (separately 0.5%, 1.0%, 2.0%, and 5%). Since minimum tick sizes result in discrete pricing, we next determine the observed bid and ask prices based on rounding the true spread to the next highest discrete spread. This is done by rounding down to the tick just below the simulated true price less one half the simulated true spread, then adding on the rounded spread to find the simulated observed ask. For example, a simulated true price of $21.15 with an assumed true spread of 1% ($0.2115) would have an observed bid of $21.00 and an ask of $21.25. We assume that observed closing prices are either at the bid or the ask with equal probability. To generate Month 2 simulated true prices we assume that the true return for our stocks follows the market model with an α of zero, or
$$R_{it} = \alpha_i + \beta_i R_{Mt} + \varepsilon \tag{21}$$
where $R_{it}$ is the “true” return on stock i at time t and $R_{Mt}$ is the return on a simulated equally-weighted index. $\beta_i$ is the beta of stock i, drawn from a log normal distribution with mean and standard deviation of one. We assume that residuals $\varepsilon$ are equal to 10% of the market price times a normally distributed random number with mean zero and a standard deviation of 1. The month’s “true” prices, $P_2$, are determined as $P_1(1 + R_i)$. The true spread is then determined as well as the observed discrete spread and observed closing price using the method described above. The same procedure is performed to simulate Month 3 values. We simulate three months since the unbiased method requires at least three months of observed prices to calculate an unbiased one-month index return, while other (biased) methods only require two months of data (i.e., one month returns based on observed returns or quote mid-points). For each simulation run, we compute the true equally-weighted index return for a universe of 1,000 stocks and compare that return to the biased observed index return as well as to two potential methods for removing the bias:
• Computing returns based on the midpoint of the closing observed spreads
• Our proposed unbiased method of portfolio return estimation.
For observed returns and the quote-midpoint method, we employ observed prices in Months 2 and 3 to calculate the index return for Month 3. All three months are used to calculate the unbiased portfolio return. We run the simulation 1,000 times and compute average errors across simulations. The results are summarized in Table 1. As mentioned earlier, we examine true spread widths of 0.5%, 1.0%, 2.0%, and 5% of true price separately.

As expected, we find that observed returns contain upward bias, and this bias increases with spread width. The average error is statistically significant at acceptable levels. Turning to the proposed methods for creating an unbiased equally-weighted index, we find that basing returns on the midpoint of observed closing bid-ask spreads reduces the bias by about one third for spreads of up to 2% of price, and by two thirds for spreads of 5%. Recall that in our simulations, the true price is not the midpoint of the spread. All of the remaining errors using the midpoint of observed quotes are still statistically significant. Consistent with our earlier proof, the unbiased portfolio return exhibits statistically insignificant errors on average for spreads of less than 10% of price. We compare the unbiased portfolio returns to actual observed index returns for several markets and periods in the next section.
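The simulation steps described in this section can be sketched in code. The following is a minimal illustration under stated assumptions, not the program used to produce Table 1. All function names are hypothetical. The market return is drawn from a normal rather than a log normal distribution, the residual is taken to be 10% times a standard normal draw, true prices are floored at $1, the bid is placed on the tick at or below the true price less half the true spread, and errors are reported as each method's return minus the true return.

```python
import math
import random

TICK = 0.125  # $1/8 tick size assumed for the simulated period


def observed_quotes(true_price, spread_pct, tick=TICK):
    """Return (bid, ask) on the tick grid for a true price and proportional spread."""
    true_spread = true_price * spread_pct
    # Round the true spread up to the next discrete spread (at least one tick).
    n_ticks = max(1, math.ceil(true_spread / tick - 1e-12))
    # Bid: tick at or below the true price less half the true spread (assumption).
    bid = math.floor((true_price - true_spread / 2.0) / tick + 1e-12) * tick
    return bid, bid + n_ticks * tick


def ew_relative(end_prices, start_prices):
    """Equally weighted average of price relatives (one plus the portfolio return)."""
    return sum(e / s for e, s in zip(end_prices, start_prices)) / len(start_prices)


def ratio_method_return(p1, p2, p3):
    """Month 3 return: two-period relative over one-period relative, both based at Month 1."""
    return ew_relative(p3, p1) / ew_relative(p2, p1) - 1.0


def one_run(spread_pct, n_stocks=1000, rng=None):
    """Simulate three months and return each method's error (method minus true return)."""
    rng = rng or random.Random()
    shape, mean_price = 1.757, 28.55
    sigma_beta = math.sqrt(math.log(2.0))  # log normal beta with mean 1 and sd 1
    prices = []
    while len(prices) < n_stocks:  # gamma-distributed prices above $1
        p = rng.gammavariate(shape, mean_price / shape)
        if p > 1.0:
            prices.append(p)
    betas = [rng.lognormvariate(-0.5 * sigma_beta ** 2, sigma_beta) for _ in prices]
    true, close, mid = [], [], []
    for month in range(3):
        if month > 0:
            r_m = rng.gauss(0.01024, 0.0668)  # assumption: normal market return
            prices = [max(1.0, p * (1.0 + b * r_m + 0.10 * rng.gauss(0.0, 1.0)))
                      for p, b in zip(prices, betas)]
        quotes = [observed_quotes(p, spread_pct) for p in prices]
        true.append(prices)
        close.append([rng.choice(q) for q in quotes])  # close at bid or ask, 50/50
        mid.append([(b + a) / 2.0 for b, a in quotes])
    true_ret = ew_relative(true[2], true[1]) - 1.0
    return {
        "observed": ew_relative(close[2], close[1]) - 1.0 - true_ret,
        "midpoint": ew_relative(mid[2], mid[1]) - 1.0 - true_ret,
        "ratio": ratio_method_return(*close) - true_ret,
    }


if __name__ == "__main__":
    rng = random.Random(1)
    runs = [one_run(0.02, rng=rng) for _ in range(50)]
    for name in ("observed", "midpoint", "ratio"):
        print(name, sum(r[name] for r in runs) / len(runs))
```

Calling `observed_quotes(21.15, 0.01)` reproduces the worked example in the text, a bid of $21.00 and an ask of $21.25. Because the distributional choices above are simplifications, the magnitudes printed by this sketch should not be expected to match Table 1.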
5. CRSP Equally-Weighted Index Bias Results

In the previous sections of this paper, we have noted that the observed returns of an equally-weighted portfolio, such as the CRSP Equally-Weighted Index, are upward biased. We also presented a method for asymptotically removing the bias. Since the bias in observed returns is positive, it will be cumulative. In this section, we compare the observed CRSP Equally-Weighted Index to an index constructed using our methodology and examine the bias and its cumulative impact.

We begin by constructing the price relative for our index, i.e., ${}_2\hat{W}_t / \hat{W}_{t-1}$, for all stocks contained in the CRSP Equally-Weighted Index using monthly returns. We separately examine NYSE, Amex, and NASDAQ stocks through 2006. 24

24 As a check, we first reconstruct the CRSP Equally-Weighted Index, so that we can find possible differences in the databases.

We define the bias as $R_{CRSP} - R_{Unbiased}$,
where the former is the return on the monthly CRSP Index and the latter the return on the unbiased index. In order to examine comparative cumulative effects, we set the index value equal to 100 in December 1973 for each market. (NASDAQ stocks are included in CRSP beginning in December 1972.) Each month, the previous month’s index value is multiplied by the applicable index relative for the month. Table 2 contains the December levels of the resulting series. It reveals the significant cumulative upward bias inherent in an index constructed using observed equally-weighted returns. For stocks listed on the NYSE, the December 2006 index value based on observed returns is over 20% greater than that of our unbiased index (11,638 vs. 9,664), starting from December 1973. 25 For Amex and NASDAQ stocks the differences are more dramatic. The ending index value based on observed returns for NASDAQ stocks is over 90% larger than that for the unbiased index.

We next compare the overall returns of the NYSE and Nasdaq markets. The biased (CRSP) index shows that between 1973 and 2006 the index level of the Nasdaq market reaches 17,975.6, fully 54 percent higher than the index level of 11,638.3 for NYSE stocks. This is consistent with the general perception that smaller stocks, more predominant on the Nasdaq market, tend to show higher returns than larger stocks. However, when we compare the levels of the unbiased indexes, the ranking changes dramatically: the index level for Nasdaq is 9,418.2, about 2.5% lower than the 9,663.5 level for NYSE stocks. This result indicates that accounting for bias in index returns over long periods of time can change rankings of relative stock performance.

We report average bias levels in observed monthly index returns by five-year (and several longer) periods in Table 3. Examining the results for NYSE stocks reveals that the average monthly bias ranges from 2.27 b.p. for the last half of the 1950s to 49.21 b.p. for the first half of the 1930s. Through 1996, all of the NYSE sub-periods exhibit statistically significant bias at acceptable levels. Although the periods starting in 1996 exhibit upward bias, it is not statistically significant. In 1997, U.S. markets cut their tick size from $1/8 to $1/16, and further to $0.01 in 2001. These reductions are associated with a narrowing of spreads on the NYSE. 26 It follows that if spreads narrow, the bias inherent in observed returns also declines.

25 If the index value for December 1926 is set to 100, then the unbiased NYSE index would have a value of 361,016 by 2002, while the CRSP NYSE index would reach a value of 836,852. This further illustrates the impact of bias in the construction of long series of equally-weighted indexes.

26 See Jones and Lipson (2001) and Bessembinder (2003), among others.
The bias inherent in observed returns based on indexes of Amex and NASDAQ stocks reveals averages over three times larger than those found for NYSE stocks. In particular, we find that the average bias for Amex (NASDAQ) observed index returns is about 15 (16) b.p. a month over the period April 1973 through December 2006. All of the average biases are statistically different from zero at acceptable levels. This suggests that all time periods are subject to significant upward bias in computed index returns. In addition, the facts that Amex and NASDAQ stocks are typically smaller and have wider spreads indicate that indexes for this type of stock (e.g., emerging markets indexes) will benefit most from the index construction method presented in this paper. Blume and Stambaugh (1983) find that one-half of the small firm effect is due to upward bias in observed returns. Accordingly, we disaggregate our data by time period and calendar month and show the bias by calendar month in Table 4. It is apparent that January’s bias is much larger than any other month’s bias for every market and sub-period. However, there is statistically significant upward bias for observed returns in most other calendar months for each market and sub-period. Therefore, this is not a January anomaly.
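The cumulative comparisons in this section can be checked with a few lines of arithmetic. The sketch below uses the December 2006 levels from Table 2 and the 4/73 to 12/06 average monthly biases from Table 3. The function names are hypothetical, and the last function is only a rough approximation that treats the average monthly bias as a constant multiplicative wedge over the 396 months from December 1973 to December 2006.

```python
def compound(base, relatives):
    """Chain an index: each level is the previous level times the month's index relative."""
    levels, level = [], base
    for w in relatives:
        level *= w
        levels.append(level)
    return levels


# December 2006 index levels from Table 2 (December 1973 = 100): (unbiased, CRSP).
TABLE2_2006 = {
    "NYSE": (9663.5, 11638.3),
    "Amex": (9212.3, 16739.5),
    "NASDAQ": (9418.2, 17975.6),
}


def cumulative_overstatement(unbiased, crsp):
    """Fraction by which the observed (CRSP) index level exceeds the unbiased level."""
    return crsp / unbiased - 1.0


def approx_overstatement(monthly_bias_bp, months):
    """Rough approximation: a constant monthly bias compounded over the given months."""
    return (1.0 + monthly_bias_bp / 10_000.0) ** months - 1.0


if __name__ == "__main__":
    for market, (unbiased, crsp) in TABLE2_2006.items():
        print(market, round(cumulative_overstatement(unbiased, crsp), 4))
    # Average monthly biases for 4/73 - 12/06 from Table 3, in basis points.
    print("NYSE approx", round(approx_overstatement(4.66, 396), 4))
    print("NASDAQ approx", round(approx_overstatement(16.45, 396), 4))
```

Under these assumptions, a bias of only a few basis points per month is enough to account for the order of magnitude of the gaps between the two index series in Table 2.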
6. Conclusions and Future Research

It is well known that market microstructure noise induces upward bias in individual returns and in equally-weighted portfolios. Noise can be due to simple error, bid-ask bounce, discrete pricing, or slow adjustment of prices to new information. In this paper, we address how to remove the bias in large equally-weighted portfolios such as the CRSP Equally-Weighted Index.

We develop a model of price formation that corrects for transient pricing errors and slow adjustment to information. We separately examine the two types of errors just mentioned. We first show that the bias due to transient errors, consistent with previous studies, is due to errors in price at the beginning of the holding period. We then show that the transient error bias in observed portfolio returns can be asymptotically removed by taking the ratio of a two-period wealth relative (i.e., one plus the portfolio return) starting at time t-2 to a one-period wealth relative also starting at time t-2. Since the bias in each relative is due to the average transient pricing error at time t-2, and both the numerator and denominator of the ratio contain the same bias, the biases cancel out, leaving an asymptotically unbiased portfolio return.

We also examine the impact of slow adjustment of prices to new information on the return on a large portfolio. We show that as long as the slow adjustment pricing error is corrected by the end of the holding period, our method will not be impacted and will still result in an asymptotically unbiased portfolio return. Given that previous studies have found that prices adjust to new information within a month, the method presented here will yield unbiased estimates of large portfolio returns for monthly or longer holding periods.

We test our method using a simulation that allows for errors in price due to bid-ask bounce, a coarse pricing grid, and random errors. We compare portfolio returns generated using: observed returns; the ratio method presented in this paper; quote midpoints; and continuously compounded returns. The simulation is performed 1,000 times for 1,000-stock portfolios. We find that the bias in observed returns increases linearly with proportional spread width. We also find that our method results in the smallest proportional bias. Assuming that our model is representative, this suggests that taking the quote midpoint will not remove all of the bias in observed returns due to a discrete pricing grid.

We argue in the paper that the bias in observed returns, since it is always positive, is cumulative in indexes. To determine the extent of the cumulative bias, we examine the commonly used CRSP Equally-Weighted Index over the period 1926 through 2006. We create indexes equal to 100 on December 31, 1973 and then compare the CRSP index return to our ratio method for each of the three markets covered by CRSP. We find that the bias is indeed cumulative and results in large index errors over time. For example, upward bias in observed NASDAQ returns results in a cumulative error of 90% in the ending index value on December 31, 2006. After the bias is removed, the overall equally-weighted return on Nasdaq stocks is found to be slightly lower than that of NYSE stocks.
Examining five-year sub-periods, we find that the bias in observed monthly returns for NYSE stocks ranges from 2.27 b.p. during the last half of the 1950s to over 49 b.p. for the first half of the 1930s. During the period 1973-2006 (when all three markets are covered by CRSP), we find that NYSE stock indexes contain the smallest monthly bias (4.66 b.p.) while NASDAQ stock indexes contain the largest (16.45 b.p.). Finally, we find that although January contains the largest amount of bias, virtually all months and markets contain statistically significant amounts of bias.

Our findings suggest that the ratio method presented here should be used to estimate the return on all equally-weighted indexes (or large portfolios) measured over a monthly holding period in the United States. While this paper presents a good start to solving the problem of upward bias in observed returns, there is much work yet to be done. For example, we show that as long as prices adjust to new (independent) information by the next price observation, our method is adequate in removing bias. Given that other studies have shown that some stocks take up to four weeks to fully adjust to new information, it is not clear how our method will perform for holding periods of less than a month. Another area for future work is the relationship between the number of stocks in a portfolio and the amount of bias removed: our method removes bias asymptotically, but we do not examine how the amount removed varies with portfolio size.

The cumulative bias in equally-weighted returns over long periods of time is substantial and can vary significantly from one market to another. For these reasons, it can affect relative rankings of return performance across markets, and may have significant implications for risk/return comparisons of stock performance. Evaluations of stocks in thinly traded and smaller markets, such as those in emerging markets, may be especially sensitive to this bias. In addition, given that equal weighting is the method used in virtually all event studies, it is clear that most studies report returns that are upward biased. If transient errors increase around events, then “abnormal” returns may merely reflect increased transient errors. If this is the case, then the conclusions of many event studies need to be revisited.
References

Aitchison, J., and J.A.C. Brown, 1957, The Lognormal Distribution, With Special Reference to its Uses in Economics, University Press, Cambridge, England.

Bessembinder, H., 2003, “Trade Execution Costs and Market Quality after Decimalization,” Journal of Financial & Quantitative Analysis, Vol. 38, pp. 747-777.

Bessembinder, H., and I. Kalcheva, 2007, “Liquidity Biases in Asset Pricing Tests,” Working Paper, David Eccles School of Business, University of Utah.

Blume, M. E., and R. F. Stambaugh, 1983, “Biases in Computed Returns: An Application to the Size Effect,” Journal of Financial Economics, Vol. 12, pp. 387-404.

Canina, L., R. Michaely, R. Thaler, and K. Womack, 1998, “Caveat Compounder: A Warning about Using the Daily CRSP Equal-Weighted Index to Compute Long-Run Excess Returns,” Journal of Finance, Vol. 53, No. 1, pp. 403-416.

Cohen, K. J., and B. P. Fitch, 1966, “The Average Investment Performance Index,” Management Science, Vol. 12, No. 6, Series B, Managerial, pp. B195-B215.

Conrad, J., and G. Kaul, 1993, “Long-Term Market Overreaction or Biases in Computed Returns,” Journal of Finance, Vol. 48, No. 1, pp. 39-63.

Cootner, P.H., 1962, “Stock Prices: Random Versus Systematic Changes,” Industrial Management Review, Vol. 3, pp. 24-45.

Fama, E. F., 1991, “Efficient Capital Markets: II,” Journal of Finance, Vol. 46, pp. 1575-1617.

Fama, E.F., L. Fisher, M. C. Jensen, and R. Roll, 1969, “The Adjustment of Stock Prices to New Information,” International Economic Review, Vol. 10, No. 1, pp. 1-21.

Ferson, W. E., and R. A. Korajczyk, 1995, “Do Arbitrage Pricing Models Explain the Predictability of Stock Returns?,” Journal of Business, Vol. 68, pp. 309-349.

Fisher, L., 1966, “Some New Stock-Market Indexes,” Journal of Business, Vol. 39, pp. 191-225.

Fisher, L., and J.H. Lorie, 1964, “Rates of Return on Investments in Common Stock,” Journal of Business, Vol. 37, pp. 1-21.

Fisher, L., and J.H. Lorie, 1968, “Rates of Return on Investments in Common Stock: The Year-by-Year Record, 1926-65,” Journal of Business, Vol. 41, pp. 291-316.

Fisher, L., and J.H. Lorie, 1977, A Half Century of Returns on Stocks and Bonds: Rates of Return on Investments in Common Stocks and on U.S. Treasury Securities, 1926-1976, University of Chicago Graduate School of Business, Chicago.

Hamza, O., M. Kortas, J.-F. L'Her, and M. Roberge, 2006, “International Equity Portfolios: Selecting the Right Benchmark for Emerging Markets,” Emerging Markets Review, Vol. 7, pp. 111-128.

Hou, K. W., and T. Moskowitz, 2005, “Market Frictions, Price Delay, and the Cross-Section of Expected Returns,” Review of Financial Studies, Vol. 18, No. 3, pp. 981-1020.

Jones, C. M., and M. L. Lipson, 2001, “Sixteenths: Direct Evidence on Institutional Execution Costs,” Journal of Financial Economics, Vol. 59, pp. 253-278.

Keim, D., 1983, “Size Related Anomalies and Stock Market Seasonality: Further Empirical Evidence,” Journal of Financial Economics, Vol. 12, pp. 13-32.

Macaulay, F.R., 1938, Some Theoretical Problems Suggested by the Movements of Interest Rates, Bond Yields, and Stock Prices in the United States since 1856, National Bureau of Economic Research, New York.

Moore, A.B., 1964, “Some Characteristics of Changes in Common Stock Prices,” in The Random Character of Stock Market Prices, P.H. Cootner, editor, The Massachusetts Institute of Technology, Cambridge, pp. 139-161.

Niederhoffer, V., and M.F.M. Osborne, 1966, “Market Making and Reversal on the Stock Exchange,” Journal of the American Statistical Association, Vol. 61, pp. 897-916.

Reinganum, M.R., 1982, “A Direct Test of Roll’s Conjecture on the Firm Size Effect,” Journal of Finance, Vol. 37, pp. 27-35.

Roll, R., 1983, “On Computing Mean Returns and the Small Firm Premium,” Journal of Financial Economics, Vol. 12, pp. 371-386.

Roll, R., 1984, “A Simple Measure of the Effective Bid-Ask Spread in an Efficient Market,” Journal of Finance, Vol. 39, pp. 1127-1139.

Samuelson, P.A., 1965, “Proof that Properly Anticipated Prices Fluctuate Randomly,” Industrial Management Review, Vol. 6, pp. 41-49.

Weaver, D.G., 1991, “Sources of Short-Term Errors in the Relative Prices of Common Stocks,” Ph.D. Dissertation, Rutgers University.
Appendix 1
Proof That a Log Normally Distributed Error Term Is Approximately Equal to a Binomially Distributed Error Term

Let $Y_{it} = \log_e(1 + e_{it})$. If $(1 + e_{it})$ is log normally distributed, $Y_{it}$ is normally distributed with mean $\mu_{it}$ and variance $\sigma^2(Y_{it})$. From Aitchison and Brown (1957, pages 8-10),

$$E(1 + e_{i,t-1}) = \exp\left[\mu_{it} + 0.5\,\sigma^2(Y_{it})\right] \qquad (A1)$$

and

$$E\left[\frac{1}{1 + e_{i,t-1}}\right] = \exp\left[-\mu_{it} + 0.5\,\sigma^2(Y_{it})\right] \qquad (A2)$$

By assumption $E(1 + e_{it}) = 1$; then, since $\exp(0) = 1$, it must be that

$$\mu_{it} + 0.5\,\sigma^2(Y_{it}) = 0 \qquad (A3)$$

Solving for $\mu_{it}$ and substituting the result into Equation (A2) yields

$$E\left[\frac{1}{1 + e_{i,t-1}}\right] = \exp\left[\sigma^2(Y_{it})\right] \qquad (A4)$$

Aitchison and Brown (1957, Eq. 2.9) state that the exponential function of the variance of the transformed distribution (i.e., the right-hand side of Equation (A4)) is equal to one plus the squared coefficient of variation of the parent, $(1 + e_{it})$, distribution, or $e^{\sigma^2} = 1 + \eta^2$. Therefore, Equation (A4) can be rewritten as

$$E\left[\frac{1}{1 + e_{i,t-1}}\right] = 1 + \frac{\sigma^2(1 + e_{i,t-1})}{\left[E(1 + e_{i,t-1})\right]^2} \qquad (A5)$$

Since $E(1 + e_{i,t-1}) = 1$ and $\sigma^2(1 + e_{it}) = \sigma^2(e_{i,t-1})$, Equation (A5) becomes

$$E\left[\frac{1}{1 + e_{i,t-1}}\right] = 1 + \sigma^2(e_{i,t-1}) \qquad (A6)$$
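The result in Equation (A6) can be checked numerically. The sketch below is illustrative only and its function names are hypothetical. It evaluates the log normal closed forms behind (A1) through (A6), compares them with a seeded Monte Carlo estimate, and computes the corresponding quantity for a two-point (bid-ask bounce) error of plus or minus h with equal probability, for which $E[1/(1+e)] = 1/(1-h^2)$ and $\sigma^2(e) = h^2$.

```python
import math
import random


def lognormal_check(sigma_y):
    """Return (E[1+e], E[1/(1+e)], var(e)) for log normal (1+e) with E[1+e] = 1."""
    mu_y = -0.5 * sigma_y ** 2                        # (A3)
    mean = math.exp(mu_y + 0.5 * sigma_y ** 2)        # (A1), equals 1
    inv_mean = math.exp(-mu_y + 0.5 * sigma_y ** 2)   # (A2), equals exp(sigma_y^2)
    var = (math.exp(sigma_y ** 2) - 1.0) * mean ** 2  # variance of a log normal
    return mean, inv_mean, var


def monte_carlo_inverse_mean(sigma_y, n=200_000, seed=0):
    """Simulated E[1/(1+e)] for the same log normal error."""
    rng = random.Random(seed)
    mu_y = -0.5 * sigma_y ** 2
    return sum(1.0 / rng.lognormvariate(mu_y, sigma_y) for _ in range(n)) / n


def bounce_inverse_mean(h):
    """E[1/(1+e)] when e is +h or -h with equal probability (variance h^2)."""
    return 0.5 * (1.0 / (1.0 + h) + 1.0 / (1.0 - h))


if __name__ == "__main__":
    mean, inv_mean, var = lognormal_check(0.05)
    print(mean, inv_mean, 1.0 + var)                # (A6): the last two agree
    print(monte_carlo_inverse_mean(0.05))           # close to exp(0.05 ** 2)
    print(bounce_inverse_mean(0.01), 1.0 + 0.01 ** 2)
```

For small errors, both the log normal and the two-point cases give an expected inverse price relative of approximately one plus the error variance, which is the sense in which the two error models are treated as equivalent here.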
Appendix 2
Conditional One-Period Expected Index Relatives Assuming Lagged Adjustments to New Information

Recall Equation (5) from the text:

$$\hat{P}_{i,t} = \left[(1 - a_{it})P_{i,t-1} + a_{it}P_{it}\right](1 + e_{it}) \qquad (5)$$

where $a_{i,t}$ = adjustment coefficient which shows the extent to which prices have adjusted to information released since t-1. If there is no lag process, $a_{i,t} = 1$. For tractability we assume either that all prices fully adjust or that they do not adjust at all.

| $a_{i,t-1}$ | $a_{i,t}$ | $\hat{P}_{i,t-1}$ | $\hat{P}_{i,t}$ | Conditional $E(\hat{W}_t)$ |
|---|---|---|---|---|
| 0 | 0 | $P_{i,t-2}(1 + e_{i,t-1})$ | $P_{i,t-1}(1 + e_{it})$ | $W_{t-1}(1 + \sigma^2(e_{t-1}))$ |
| 0 | 1 | $P_{i,t-2}(1 + e_{i,t-1})$ | $P_{it}(1 + e_{it})$ | $W_{t-1}W_t(1 + \sigma^2(e_{t-1}))$ |
| 1 | 0 | $P_{i,t-1}(1 + e_{i,t-1})$ | $P_{i,t-1}(1 + e_{it})$ | $(1 + \sigma^2(e_{t-1}))$ |
| 1 | 1 | $P_{i,t-1}(1 + e_{i,t-1})$ | $P_{it}(1 + e_{it})$ | $W_t(1 + \sigma^2(e_{t-1}))$ |

Define θ as the probability that a = 0 and α as (1 – θ); then

$$E(\hat{W}_t) \approx (1 + \sigma^2(e_{t-1}))\left[\theta^2 W_{t-1} + \alpha\theta W_{t-1}W_t + \alpha\theta + \alpha^2 W_t\right] \qquad (A6)$$

$$\approx (1 + \sigma^2(e_{t-1}))(\alpha + \theta W_{t-1})(\theta + \alpha W_t) \qquad (A7)$$

and since W = (1 + R), where R is the return on the true aggregate portfolio, and α = 1 – θ, Equation (A7) can be rewritten as

$$E(\hat{W}_t) \approx (1 + \sigma^2(e_{t-1}))\left[1 - \theta + \theta(1 + R_{t-1})\right]\left[\theta + (1 - \theta)(1 + R_t)\right] \qquad (A8)$$

or

$$E(\hat{W}_t) \approx (1 + \sigma^2(e_{t-1}))(1 + \theta R_{t-1})(1 + \alpha R_t) \qquad (18)$$
Appendix 3
The Ratio of a Two-period Expected Index Relative to a One-period Expected Index Relative Assuming Lagged Adjustment to New Information

Following the methodology of Appendix 2, finding the expected one-period index relative ending at time t-1 yields the following conditional table.

| $a_{i,t-2}$ | $a_{i,t-1}$ | $\hat{P}_{i,t-2}$ | $\hat{P}_{i,t-1}$ | Conditional $E(\hat{W}_{t-1})$ |
|---|---|---|---|---|
| 0 | 0 | $P_{i,t-3}(1 + e_{i,t-2})$ | $P_{i,t-2}(1 + e_{i,t-1})$ | $W_{t-2}(1 + \sigma^2(e_{t-2}))$ |
| 0 | 1 | $P_{i,t-3}(1 + e_{i,t-2})$ | $P_{i,t-1}(1 + e_{i,t-1})$ | $W_{t-2}W_{t-1}(1 + \sigma^2(e_{t-2}))$ |
| 1 | 0 | $P_{i,t-2}(1 + e_{i,t-2})$ | $P_{i,t-2}(1 + e_{i,t-1})$ | $(1 + \sigma^2(e_{t-2}))$ |
| 1 | 1 | $P_{i,t-2}(1 + e_{i,t-2})$ | $P_{i,t-1}(1 + e_{i,t-1})$ | $W_{t-1}(1 + \sigma^2(e_{t-2}))$ |

Also as in Appendix 2, define θ as the probability that a = 0 and α as (1 – θ). Then

$$E(\hat{W}_{t-1}) \approx (1 + \sigma^2(e_{t-2}))\left[\theta^2 W_{t-2} + \alpha\theta W_{t-2}W_{t-1} + \alpha\theta + \alpha^2 W_{t-1}\right] \qquad (A9)$$

$$\approx (1 + \sigma^2(e_{t-2}))(\alpha + \theta W_{t-2})(\theta + \alpha W_{t-1}) \qquad (A10)$$

and once again, since W = (1 + R), where R is the return on the true aggregate portfolio, and α = 1 – θ, Equation (A10) can be rewritten as

$$E(\hat{W}_{t-1}) \approx (1 + \sigma^2(e_{t-2}))\left[1 - \theta + \theta(1 + R_{t-2})\right]\left[\theta + (1 - \theta)(1 + R_{t-1})\right] \qquad (A11)$$

or

$$E(\hat{W}_{t-1}) \approx (1 + \sigma^2(e_{t-2}))(1 + \theta R_{t-2})(1 + \alpha R_{t-1}) \qquad (A12)$$
Similarly, the following table gives the two-period aggregate wealth relative.

| $a_{i,t-2}$ | $a_{it}$ | $\hat{P}_{i,t-2}$ | $\hat{P}_{it}$ | Conditional $E({}_2\hat{W}_t)$ |
|---|---|---|---|---|
| 0 | 0 | $P_{i,t-3}(1 + e_{i,t-2})$ | $P_{i,t-1}(1 + e_{it})$ | $W_{t-2}W_{t-1}(1 + \sigma^2(e_{t-2}))$ |
| 0 | 1 | $P_{i,t-3}(1 + e_{i,t-2})$ | $P_{it}(1 + e_{it})$ | $W_{t-2}W_{t-1}W_t(1 + \sigma^2(e_{t-2}))$ |
| 1 | 0 | $P_{i,t-2}(1 + e_{i,t-2})$ | $P_{i,t-1}(1 + e_{it})$ | $W_{t-1}(1 + \sigma^2(e_{t-2}))$ |
| 1 | 1 | $P_{i,t-2}(1 + e_{i,t-2})$ | $P_{it}(1 + e_{it})$ | $W_{t-1}W_t(1 + \sigma^2(e_{t-2}))$ |
Defining θ as the probability that a = 0 and α as (1 – θ), then

$$E({}_2\hat{W}_t) \approx (1 + \sigma^2(e_{t-2}))\left[\theta^2 W_{t-2}W_{t-1} + \alpha\theta W_{t-2}W_{t-1}W_t + \alpha\theta W_{t-1} + \alpha^2 W_{t-1}W_t\right] \qquad (A13)$$

and in return form

$$E({}_2\hat{W}_t) \approx (1 + \sigma^2(e_{t-2}))\left[(1 + R_{t-1})(1 + \theta R_{t-2})(1 + \alpha R_t)\right] \qquad (A14)$$

Then the ratio of (A14) to (A12) is

$$\frac{E({}_2\hat{W}_t)}{E(\hat{W}_{t-1})} = \frac{(1 + \sigma^2(e_{t-2}))\left[(1 + R_{t-1})(1 + \theta R_{t-2})(1 + \alpha R_t)\right]}{(1 + \sigma^2(e_{t-2}))(1 + \theta R_{t-2})(1 + \alpha R_{t-1})} \qquad (A15)$$

Since $1 + \sigma^2(e_{i,t-2})$ is in the numerator and denominator, Jensen’s inequality does not apply; thus

$$E\left(\frac{{}_2\hat{W}_t}{\hat{W}_{t-1}}\right) \approx \frac{(1 + R_{t-1})(1 + \alpha R_t)}{(1 + \alpha R_{t-1})} \qquad (A16)$$

Therefore, in the presence of a lagged adjustment process longer than the holding period under consideration, our method will remove all of the random transient error bias, but not all of the lagged adjustment process bias.
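The algebra in Appendices 2 and 3 can be verified by enumerating the four adjustment cases in each conditional table and weighting them by their probabilities. The sketch below is illustrative and its function names are hypothetical. It omits the common factor $(1 + \sigma^2(e_{t-2}))$, which cancels in the ratio, and it computes the ratio of expectations as in Equation (A15).

```python
from itertools import product


def expected_value(theta, case_value):
    """Probability-weighted sum over the four (a_first, a_second) adjustment cases."""
    prob = {0: theta, 1: 1.0 - theta}  # theta = P(a = 0), alpha = 1 - theta
    return sum(prob[a0] * prob[a1] * case_value[(a0, a1)]
               for a0, a1 in product((0, 1), repeat=2))


def one_period_relative(theta, r_tm2, r_tm1):
    """Enumerated E(W_hat_{t-1}) without the variance factor (first table)."""
    w2, w1 = 1.0 + r_tm2, 1.0 + r_tm1
    cases = {(0, 0): w2, (0, 1): w2 * w1, (1, 0): 1.0, (1, 1): w1}
    return expected_value(theta, cases)


def two_period_relative(theta, r_tm2, r_tm1, r_t):
    """Enumerated E(2W_hat_t) without the variance factor (second table)."""
    w2, w1, w0 = 1.0 + r_tm2, 1.0 + r_tm1, 1.0 + r_t
    cases = {(0, 0): w2 * w1, (0, 1): w2 * w1 * w0, (1, 0): w1, (1, 1): w1 * w0}
    return expected_value(theta, cases)


def ratio_closed_form(theta, r_tm1, r_t):
    """Right-hand side of (A16)."""
    alpha = 1.0 - theta
    return (1.0 + r_tm1) * (1.0 + alpha * r_t) / (1.0 + alpha * r_tm1)


if __name__ == "__main__":
    theta, r_tm2, r_tm1, r_t = 0.3, 0.02, -0.01, 0.03
    ratio = (two_period_relative(theta, r_tm2, r_tm1, r_t)
             / one_period_relative(theta, r_tm2, r_tm1))
    print(ratio, ratio_closed_form(theta, r_tm1, r_t))
```

With θ = 0 (all prices fully adjusted), the ratio reduces to one plus the true return for period t, which is the case in which the method is asymptotically unbiased.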
Table 1
Comparison of Proposed Alternative Methods of Removing Bias

This table reports the average errors found for observed returns and two proposed methods of removing bias in index returns. “True” monthly returns are generated by simulating the price distribution for stocks listed on the NYSE and applying a market model with a zero alpha and a mean return equal to the CRSP Equally-Weighted Index return over the period January 1926 to December 1996 of 1.024% and a standard deviation of 6.68%. Simulated prices are drawn from a gamma distribution with a shape parameter of 1.757, a mean of $28.55, and a standard deviation of $21.54. For each run of the simulation we first randomly generate period (month) 1 prices for 1,000 stocks based on the distribution of observed prices. We next assume that true spreads for each stock are a percentage of true price (separately 0.5%, 1.0%, 2.0%, and 5%). Observed bid and ask prices are found by rounding the true spread to the next highest discrete spread. Observed closing prices are assumed to be either at the bid or at the ask with equal probability. Second and subsequent month “true” prices are generated by applying the market model of simulated returns with the beta of each stock drawn from a log normal distribution with mean one. We assume that residuals are equal to 10% of the market return times a normally distributed random number with mean zero and standard deviation of 1. The month’s “true” prices, $P_2$, are determined as $P_1(1 + R_i)$; the true spread is then determined, as well as the observed discrete spread and observed closing price, using the method described above. The same procedure is performed for simulation month 3. Three months are simulated since the unbiased method requires at least three months of observed prices to calculate an unbiased one-month index return, while other methods only require two. For each simulation run, we compute the true equally-weighted index return for our simulated universe of 1,000 stocks and compare that return to the biased observed index return as well as to two potential methods for removing the bias:

• Computing returns based on the midpoint of the closing observed spreads
• The unbiased method of index construction.

Reported is the average error (true return minus each method’s return) for each return estimation method based on over 1,000 simulations. Standard errors are shown in parentheses, and statistical significance, based on a t-test, is indicated by asterisks.

Average Error, by Spread as a Percentage of Price

| Method | 0.5% | 1% | 2% | 5% |
|---|---|---|---|---|
| Observed | 0.009%*** (0.0010) | 0.014%*** (0.0015) | 0.028%*** (0.0028) | 0.102%*** (0.0069) |
| Quote Midpoint | 0.006%*** (0.0006) | 0.009%*** (0.0009) | 0.016%*** (0.0020) | 0.033%*** (0.0054) |
| Unbiased | -0.003% (0.0142) | 0.006% (0.0146) | 0.022% (0.0160) | -0.000% (0.0143) |

***, **, * denote significance at the 0.01, 0.05, and 0.10 levels, respectively.
Table 2
Values of the Unbiased and CRSP Equally-Weighted Indexes, December 1926-2006

This table compares end-of-year index values, based on compounded monthly returns, for the traditional CRSP Equally-Weighted Index and the unbiased index. Index values for the NYSE, Amex, and NASDAQ are shown separately. Since CRSP’s coverage of NASDAQ stocks begins in December 1972, all (end-of-year) index values are set to 100 as of December 1973.

| Year | NYSE Unbiased Index | NYSE CRSP EWRETD | Amex Unbiased Index | Amex CRSP EWRETD | NASDAQ Unbiased Index | NASDAQ CRSP EWRETD |
|---|---|---|---|---|---|---|
| 1926 | 1.1 | 0.6 | . | . | . | . |
| 1927 | 1.5 | 0.8 | . | . | . | . |
| 1928 | 2.1 | 1.2 | . | . | . | . |
| 1929 | 1.4 | 0.8 | . | . | . | . |
| 1930 | 0.9 | 0.5 | . | . | . | . |
| 1931 | 0.4 | 0.3 | . | . | . | . |
| 1932 | 0.4 | 0.3 | . | . | . | . |
| 1933 | 1.0 | 0.7 | . | . | . | . |
| 1934 | 1.1 | 0.9 | . | . | . | . |
| 1935 | 1.8 | 1.4 | . | . | . | . |
| 1936 | 2.7 | 2.1 | . | . | . | . |
| 1937 | 1.4 | 1.1 | . | . | . | . |
| 1938 | 1.9 | 1.5 | . | . | . | . |
| 1939 | 1.9 | 1.6 | . | . | . | . |
| 1940 | 1.7 | 1.5 | . | . | . | . |
| 1941 | 1.5 | 1.4 | . | . | . | . |
| 1942 | 2.0 | 1.8 | . | . | . | . |
| 1943 | 3.3 | 2.9 | . | . | . | . |
| 1944 | 4.6 | 4.1 | . | . | . | . |
| 1945 | 7.4 | 6.7 | . | . | . | . |
| 1946 | 6.6 | 6.0 | . | . | . | . |
| 1947 | 6.6 | 6.0 | . | . | . | . |
| 1948 | 6.4 | 5.9 | . | . | . | . |
| 1949 | 7.8 | 7.2 | . | . | . | . |
| 1950 | 10.6 | 9.8 | . | . | . | . |
| 1951 | 12.2 | 11.4 | . | . | . | . |
| 1952 | 13.4 | 12.5 | . | . | . | . |
| 1953 | 13.0 | 12.1 | . | . | . | . |
| 1954 | 20.3 | 19.0 | . | . | . | . |
| 1955 | 24.3 | 22.9 | . | . | . | . |
| 1956 | 25.9 | 24.5 | . | . | . | . |
| 1957 | 22.3 | 21.0 | . | . | . | . |
| 1958 | 35.3 | 33.5 | . | . | . | . |
| 1959 | 40.7 | 38.8 | . | . | . | . |
| 1960 | 40.0 | 38.2 | . | . | . | . |
Table 2 (continued)
Values of the Unbiased and CRSP Equally-Weighted Indexes, 1926-2006

| Year | NYSE Unbiased Index | NYSE CRSP EWRETD | Amex Unbiased Index | Amex CRSP EWRETD | NASDAQ Unbiased Index | NASDAQ CRSP EWRETD |
|---|---|---|---|---|---|---|
| 1961 | 51.5 | 49.4 | . | . | . | . |
| 1962 | 44.7 | 43.1 | 46.2 | 40.1 | . | . |
| 1963 | 52.9 | 51.1 | 50.9 | 44.9 | . | . |
| 1964 | 62.4 | 60.4 | 59.3 | 53.1 | . | . |
| 1965 | 80.1 | 77.8 | 85.5 | 77.2 | . | . |
| 1966 | 74.0 | 72.2 | 79.8 | 72.8 | . | . |
| 1967 | 110.8 | 108.5 | 174.5 | 161.2 | . | . |
| 1968 | 144.1 | 141.3 | 272.2 | 254.1 | . | . |
| 1969 | 114.5 | 112.5 | 186.1 | 173.8 | . | . |
| 1970 | 110.2 | 109.2 | 144.9 | 138.3 | . | . |
| 1971 | 131.0 | 130.4 | 173.0 | 167.4 | . | . |
| 1972 | 141.6 | 141.5 | 174.4 | 170.4 | . | . |
| 1973 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| 1974 | 72.4 | 73.5 | 70.4 | 73.0 | 71.8 | 74.3 |
| 1975 | 113.8 | 118.9 | 115.6 | 127.0 | 110.3 | 119.2 |
| 1976 | 164.9 | 173.1 | 175.3 | 196.9 | 159.8 | 176.7 |
| 1977 | 180.0 | 189.5 | 214.9 | 245.6 | 210.1 | 235.5 |
| 1978 | 203.5 | 216.2 | 272.5 | 319.6 | 274.8 | 313.8 |
| 1979 | 274.3 | 292.6 | 397.3 | 472.2 | 387.0 | 448.5 |
| 1980 | 357.1 | 382.8 | 547.6 | 663.5 | 585.7 | 679.7 |
| 1981 | 376.9 | 405.4 | 552.2 | 676.1 | 555.8 | 653.2 |
| 1982 | 486.5 | 525.2 | 694.2 | 861.5 | 663.1 | 796.1 |
| 1983 | 644.3 | 700.6 | 970.4 | 1,226.4 | 893.1 | 1,098.2 |
| 1984 | 645.3 | 706.3 | 862.6 | 1,098.1 | 739.2 | 923.1 |
| 1985 | 835.8 | 917.3 | 1,030.5 | 1,321.8 | 906.5 | 1,151.1 |
| 1986 | 951.7 | 1,047.0 | 1,092.7 | 1,423.9 | 946.0 | 1,215.5 |
| 1987 | 922.2 | 1,017.0 | 968.8 | 1,275.6 | 839.3 | 1,093.4 |
| 1988 | 1,119.3 | 1,243.3 | 1,126.0 | 1,504.6 | 960.7 | 1,285.9 |
| 1989 | 1,304.4 | 1,451.6 | 1,288.1 | 1,726.1 | 1,039.5 | 1,405.3 |
| 1990 | 1,066.3 | 1,195.2 | 933.1 | 1,276.7 | 790.6 | 1,091.3 |
| 1991 | 1,461.8 | 1,661.2 | 1,266.1 | 1,795.9 | 1,224.8 | 1,742.1 |
| 1992 | 1,707.3 | 1,960.6 | 1,553.0 | 2,279.4 | 1,565.0 | 2,278.4 |
| 1993 | 2,033.5 | 2,340.5 | 2,053.4 | 3,027.2 | 1,975.4 | 2,951.1 |
| 1994 | 1,950.1 | 2,249.9 | 1,986.2 | 2,989.5 | 1,820.8 | 2,761.7 |
| 1995 | 2,404.6 | 2,778.3 | 2,479.4 | 3,777.4 | 2,402.8 | 3,708.3 |
| 1996 | 2,881.4 | 3,335.1 | 2,972.9 | 4,623.9 | 2,757.7 | 4,299.2 |
| 1997 | 3,583.6 | 4,154.8 | 3,607.1 | 5,667.6 | 3,181.8 | 5,042.1 |
| 1998 | 3,466.3 | 4,048.5 | 3,131.7 | 5,105.9 | 2,953.9 | 4,927.9 |
| 1999 | 3,621.2 | 4,237.4 | 3,723.3 | 6,117.8 | 4,584.9 | 7,680.4 |
| 2000 | 3,972.5 | 4,669.3 | 3,467.3 | 5,712.1 | 3,569.3 | 5,884.3 |
| 2001 | 4,399.3 | 5,264.3 | 3,951.2 | 6,949.4 | 4,115.0 | 7,441.8 |
| 2002 | 4,090.3 | 4,920.1 | 3,699.9 | 6,587.4 | 3,380.7 | 6,327.7 |
| 2003 | 5,973.6 | 7,167.3 | 6,221.1 | 11,145.7 | 6,606.3 | 12,377.2 |
| 2004 | 7,260.7 | 8,744.2 | 7,436.1 | 13,358.9 | 7,967.2 | 15,068.5 |
| 2005 | 7,946.6 | 9,559.5 | 7,847.1 | 14,197.9 | 8,122.6 | 15,427.3 |
| 2006 | 9,663.5 | 11,638.3 | 9,212.3 | 16,739.5 | 9,418.2 | 17,975.6 |
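Table 3
Bias Estimates for Various Periods

This table lists the average bias in the monthly return on the traditional CRSP Equally-Weighted Index. Bias is defined as $R_{CRSP} - R_{Unbiased}$, where the former is the return on the monthly CRSP index and the latter the return on the unbiased index. Biases are expressed in basis points (1 b.p. = 0.0001 or 0.01%). Standard errors are expressed as a percentage. Biases in indexes for NYSE, Amex, and NASDAQ stocks are examined separately. For each period and market, we list the number of observations and the mean bias, as well as the standard error and t-statistic. Panel A lists five-year sub-periods (except for the first, which is 57 months long). Panel B lists the results for sub-periods conforming to CRSP start dates for the three markets. Tests of significance are based on two-tailed tests.

A. 5 Year Periods

NYSE

| Period | N | Mean Bias | se | t statistic | se of Transient Errors |
|---|---|---|---|---|---|
| 4/26 – 12/30 | 57 | 7.12 | 2.48 | 2.87*** | 2.67 |
| 1/31 – 12/35 | 60 | 49.21 | 8.38 | 5.87*** | 7.02 |
| 1/36 – 12/40 | 60 | 18.91 | 4.51 | 4.20*** | 4.35 |
| 1/41 – 12/45 | 60 | 8.99 | 4.95 | 1.82* | 3.00 |
| 1/46 – 12/50 | 60 | 3.46 | 0.86 | 4.01*** | 1.86 |
| 1/51 – 12/55 | 60 | 2.72 | 0.46 | 5.91*** | 1.65 |
| 1/56 – 12/60 | 60 | 2.27 | 0.79 | 2.88*** | 1.51 |
| 1/61 – 12/65 | 60 | 2.72 | 0.69 | 3.95*** | 1.65 |
| 1/66 – 12/70 | 60 | 3.35 | 1.17 | 2.86*** | 1.83 |
| 1/71 – 12/75 | 60 | 8.95 | 2.76 | 3.25*** | 2.99 |
| 1/76 – 12/80 | 60 | 4.24 | 1.40 | 3.02*** | 2.06 |
| 1/81 – 12/85 | 60 | 3.92 | 1.12 | 3.51*** | 1.98 |
| 1/86 – 12/90 | 60 | 3.53 | 1.68 | 2.10** | 1.88 |
| 1/91 – 12/95 | 60 | 5.13 | 2.10 | 2.44** | 2.26 |
| 1/96 – 12/00 | 60 | 2.88 | 1.96 | 1.47 | 1.70 |
| 1/01 – 12/06 | 72 | 3.38 | 2.26 | 1.49 | 1.84 |

Amex

| Period | N | Mean Bias | se | t statistic | se of Transient Errors |
|---|---|---|---|---|---|
| 4/26 – 12/30 | . | . | . | . | . |
| 1/31 – 12/35 | . | . | . | . | . |
| 1/36 – 12/40 | . | . | . | . | . |
| 1/41 – 12/45 | . | . | . | . | . |
| 1/46 – 12/50 | . | . | . | . | . |
| 1/51 – 12/55 | . | . | . | . | . |
| 1/56 – 12/60 | . | . | . | . | . |
| 1/61 – 12/65 | . | . | . | . | . |
| 1/66 – 12/70 | 60 | 9.18 | 2.41 | 3.81*** | 3.03 |
| 1/71 – 12/75 | 60 | 23.35 | 4.12 | 5.66*** | 4.83 |
| 1/76 – 12/80 | 60 | 16.66 | 1.94 | 8.57*** | 4.08 |
| 1/81 – 12/85 | 60 | 9.44 | 1.77 | 5.35*** | 3.07 |
| 1/86 – 12/90 | 60 | 10.79 | 2.19 | 4.94*** | 3.28 |
| 1/91 – 12/95 | 60 | 18.1 | 4.65 | 3.89*** | 4.25 |
| 1/96 – 12/00 | 60 | 12.81 | 6.62 | 1.93* | 3.58 |
| 1/01 – 12/06 | 72 | 13.86 | 5.74 | 2.41** | 3.72 |

NASDAQ

| Period | N | Mean Bias | se | t statistic | se of Transient Errors |
|---|---|---|---|---|---|
| 4/26 – 12/30 | . | . | . | . | . |
| 1/31 – 12/35 | . | . | . | . | . |
| 1/36 – 12/40 | . | . | . | . | . |
| 1/41 – 12/45 | . | . | . | . | . |
| 1/46 – 12/50 | . | . | . | . | . |
| 1/51 – 12/55 | . | . | . | . | . |
| 1/56 – 12/60 | . | . | . | . | . |
| 1/61 – 12/65 | . | . | . | . | . |
| 1/66 – 12/70 | . | . | . | . | . |
| 1/71 – 12/75 | 33 | 28.67 | 5.15 | 5.57*** | 5.35 |
| 1/76 – 12/80 | 60 | 11.68 | 2.47 | 4.73*** | 3.42 |
| 1/81 – 12/85 | 60 | 14.98 | 2.13 | 7.04*** | 3.87 |
| 1/86 – 12/90 | 60 | 13.75 | 2.71 | 5.07*** | 1.32 |
| 1/91 – 12/95 | 60 | 18.68 | 2.99 | 6.25*** | 4.32 |
| 1/96 – 12/00 | 60 | 11.32 | 9.84 | 1.15 | 3.36 |
| 1/01 – 12/06 | 72 | 20.71 | 9.47 | 2.19*** | 4.55 |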
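Table 3 (continued)
Bias Estimates for Various Periods

B. Longer Periods

NYSE

| Period | N | Mean Bias | se | t statistic | se of Transient Errors |
|---|---|---|---|---|---|
| 4/26 – 10/62 | 439 | 12.76 | 1.67 | 7.63*** | 3.57 |
| 11/62 – 12/06 | 530 | 4.27 | 0.63 | 6.78*** | 2.07 |
| 4/26 – 3/73 | 564 | 10.6 | 1.32 | 8.03*** | 3.26 |
| 4/73 – 12/06 | 405 | 4.66 | 0.80 | 5.84*** | 2.16 |
| 4/26 – 12/06 | 969 | 8.12 | 0.84 | 9.63*** | 2.85 |

Amex

| Period | N | Mean Bias | se | t statistic | se of Transient Errors |
|---|---|---|---|---|---|
| 4/26 – 10/62 | . | . | . | . | . |
| 11/62 – 12/06 | 530 | 14.08 | 1.39 | 10.15*** | 3.75 |
| 4/26 – 3/73 | . | . | . | . | . |
| 4/73 – 12/06 | 405 | 15.34 | 1.76 | 8.72*** | 3.92 |
| 4/26 – 12/06 | . | . | . | . | . |

NASDAQ

| Period | N | Mean Bias | se | t statistic | se of Transient Errors |
|---|---|---|---|---|---|
| 4/26 – 10/62 | . | . | . | . | . |
| 11/62 – 12/06 | . | . | . | . | . |
| 4/26 – 3/73 | . | . | . | . | . |
| 4/73 – 12/06 | 405 | 16.45 | 2.39 | 6.88*** | 4.06 |
| 4/26 – 12/06 | . | . | . | . | . |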
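The columns of Table 3 appear to be linked by two simple relations, which the following sketch checks against the April 1973 to December 2006 rows. This is an illustration under assumptions, with hypothetical function names: the t statistic is taken to be the mean bias divided by its standard error, and, following Equation (A6), the mean bias is taken to approximate the variance of the transient errors, so that the reported standard deviation of transient errors (in percent) is the square root of the mean bias expressed as a fraction.

```python
import math

# (market, mean bias in b.p., se, reported t statistic, reported se of transient errors in %)
ROWS_1973_2006 = [
    ("NYSE", 4.66, 0.80, 5.84, 2.16),
    ("Amex", 15.34, 1.76, 8.72, 3.92),
    ("NASDAQ", 16.45, 2.39, 6.88, 4.06),
]


def implied_t(mean_bias_bp, se_bp):
    """t statistic implied by the mean bias and its standard error."""
    return mean_bias_bp / se_bp


def implied_transient_sd_pct(mean_bias_bp):
    """Standard deviation of transient errors (in %) if bias equals sigma^2(e)."""
    variance = mean_bias_bp / 10_000.0  # convert basis points to a fraction
    return 100.0 * math.sqrt(variance)


if __name__ == "__main__":
    for market, bias, se, t_rep, sd_rep in ROWS_1973_2006:
        print(market, round(implied_t(bias, se), 2), t_rep,
              round(implied_transient_sd_pct(bias), 2), sd_rep)
```

Read this way, an average monthly bias of 16.45 b.p. for NASDAQ corresponds to transient pricing errors with a standard deviation of roughly 4% of price.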
Table 4
Bias Estimates by Calendar Month

This table lists the average bias in the monthly return on the traditional CRSP Equally-Weighted Index by calendar month. Bias is defined as $R_{CRSP} - R_{Unbiased}$, where the former is the return on the monthly CRSP index and the latter the return on the unbiased index. Biases are expressed in basis points (1 b.p. = 0.0001 or 0.01%). The standard error is expressed as a percentage of the true price at the beginning of the month. Biases in indexes for NYSE, Amex, and NASDAQ stocks are examined separately. For each period and market, we list the number of observations used and the mean bias, as well as the standard error and t-statistic. To allow for comparisons across markets, we partition the data into four time spans to take into account the differing start periods for CRSP indexes from the NYSE, Amex, and NASDAQ. Panel B compares the NYSE and Amex, while Panel D compares all three markets. Panels A and C list calendar month biases for the NYSE from the start of CRSP data until the beginning of CRSP data for the Amex and NASDAQ, respectively. Tests of significance are based on two-tailed tests.
Table 4 (continued) Bias Estimates for Calendar Months

A. April 1926 – October 1962

NYSE
Month         N   Mean Bias      se   t statistic   se of Transient Errors
January      36       37.56    9.15   4.10***       6.13
February     36        5.59    7.19   0.78          2.36
March        36       13.14    4.27   3.08***       3.62
April        37        9.75    3.66   2.67**        3.12
May          37        9.29    4.01   2.32**        3.05
June         37       14.41    7.35   1.96*         3.8
July         37       14.86    4.22   3.52***       3.85
August       37       12.11    3.48   3.48***       3.48
September    37       11.48    6.73   1.71*         3.39
October      37       12.66    8.00   1.58          3.56
November     36        7.59    3.24   2.34**        2.75
December     36        4.83    2.10   2.30**        2.20

Amex and Nasdaq: all entries are "." for this period.

B. November 1962 – December 2006

NYSE
Month         N   Mean Bias      se   t statistic   se of Transient Errors
January      44       19.12    4.41   4.34***       4.37
February     44        2.95    2.52   1.17          1.72
March        44        5.82    1.85   3.14***       2.41
April        44        1.28    1.13   1.13          1.13
May          44        1.38    1.09   1.27          1.18
June         44        2.00    1.08   1.85*         1.42
July         44        3.44    1.40   2.45**        1.85
August       44        2.55    1.41   1.81*         1.60
September    44        1.60    1.65   0.97          1.26
October      44        4.95    2.20   2.25**        2.22
November     45        1.89    1.95   0.97          1.37
December     45        4.30    2.03   2.12**        2.07

Amex
Month         N   Mean Bias      se   t statistic   se of Transient Errors
January      44       50.65    9.15   5.54***       7.12
February     44       18.76    6.16   3.05***       4.33
March        44       14.05    4.20   3.34***       3.75
April        44       10.48    3.45   3.04***       3.24
May          44        5.70    3.14   1.82*         2.39
June         44        9.09    2.83   3.22***       3.01
July         44       12.23    2.59   4.72***       3.50
August       44        8.27    2.50   3.31***       2.88
September    44       10.87    4.38   2.48**        3.30
October      44       12.9     3.14   4.11***       3.59
November     45        8.73    3.56   2.45**        2.96
December     45        7.52    4.81   1.56          2.74

Nasdaq: all entries are "." for this period.
C. April 1926 – March 1973

NYSE
Month         N   Mean Bias      se   t statistic   se of Transient Errors
January      47       30.47    7.26   4.20**        5.52
February     47        4.19    5.52   0.76          2.05
March        47       10.82    3.32   3.26***       3.29
April        47        7.69    2.96   2.60**        2.77
May          47        7.44    3.22   2.31**        2.73
June         47       11.65    5.84   2.00**        3.41
July         47       11.98    3.43   3.49***       3.46
August       47       10.50    2.81   3.74***       3.24
September    47        9.53    5.31   1.79*         3.09
October      47       10.69    6.36   1.68*         3.27
November     47        7.00    2.57   2.72***       2.65
December     47        5.22    1.71   3.05***       2.29

Amex and Nasdaq: all entries are "." for this period.

D. April 1973 – December 2006

NYSE
Month         N   Mean Bias      se   t statistic   se of Transient Errors
January      33       23.08    5.69   4.06***       4.80
February     33        4.06    3.29   1.24          2.02
March        33        6.68    2.46   2.72**        2.58
April        34        1.62    1.33   1.22          1.27
May          34        1.62    1.28   1.27          1.27
June         34        2.17    1.30   1.67          1.47
July         34        4.06    1.75   2.33**        2.02
August       34        1.96    1.70   1.15          1.40
September    34        1.38    2.13   0.65          1.18
October      34        5.41    2.61   2.07**        2.32
November     34        0.86    2.38   0.36          0.93
December     34        3.59    2.56   1.40          1.90

Amex
Month         N   Mean Bias      se   t statistic   se of Transient Errors
January      33       57.95   11.78   4.92***       7.61
February     33       21.85    8.08   2.70***       4.67
March        33       15.48    5.55   2.79***       3.93
April        34       13.42    4.17   3.22***       3.66
May          34        5.27    3.93   1.34          2.30
June         34        8.97    3.57   2.51**        2.99
July         34       12.69    3.13   4.05***       3.56
August       34        9.31    3.10   3.01***       3.05
September    34       12.24    5.53   2.21**        3.50
October      34       12.33    3.60   3.43***       3.51
November     34        8.06    4.51   1.79*         2.84
December     34        7.94    6.13   1.29          2.82

Nasdaq
Month         N   Mean Bias      se   t statistic   se of Transient Errors
January      33       58.16   15.85   3.67***       7.63
February     33       16.91   12.62   1.34          4.11
March        33       17.89   10.53   1.70*         4.23
April        34       15.20    5.44   2.79***       3.90
May          34        5.93    5.43   1.09          2.43
June         34        9.19    4.21   2.18**        3.03
July         34       15.15    3.90   3.89***       3.89
August       34       13.32    3.29   4.05***       3.65
September    34       11.58    3.85   3.00***       3.40
October      34       17.14    7.01   2.44**        4.14
November     34        5.44    5.81   0.94          2.33
December     34       12.75    8.96   1.42          3.57

***, **, * denote significance at the 0.01, 0.05, and 0.10 levels, respectively.
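As a reading aid for Table 4, the following minimal Python sketch shows one way the per-month columns could be computed from two return series. It is an illustration under stated assumptions, not the authors' code. The function names `bias_by_calendar_month` and `implied_transient_se_pct` and the sample returns are hypothetical. The sketch assumes the t statistic is the mean bias divided by its standard error, and that the "se of Transient Errors" column is the square root of the mean bias (in basis points), read as a percentage. Both assumptions are inferred from the tabulated values (for example, 57.95 basis points corresponds to roughly 7.61 percent) rather than stated in the table.

```python
import math
from collections import defaultdict


def bias_by_calendar_month(months, r_crsp, r_unbiased):
    """Summarize the bias R_CRSP - R_Unbiased by calendar month.

    months     : calendar month (1-12) of each observation
    r_crsp     : returns on the traditional index, as decimals
    r_unbiased : returns on the unbiased index, as decimals
    Returns {month: (n, mean_bp, se_bp, t_stat)}.
    """
    groups = defaultdict(list)
    for m, rc, ru in zip(months, r_crsp, r_unbiased):
        groups[m].append((rc - ru) * 10_000.0)  # bias in basis points

    summary = {}
    for m, biases in sorted(groups.items()):
        n = len(biases)
        if n < 2:
            continue  # a standard error needs at least two observations
        mean_bp = sum(biases) / n
        var = sum((b - mean_bp) ** 2 for b in biases) / (n - 1)  # sample variance
        se_bp = math.sqrt(var / n)  # standard error of the mean
        # Assumption: t statistic = mean / se (test of zero mean bias).
        t_stat = mean_bp / se_bp if se_bp > 0 else float("nan")
        summary[m] = (n, mean_bp, se_bp, t_stat)
    return summary


def implied_transient_se_pct(mean_bias_bp):
    """Assumption: bias equals the variance of the transient error,
    so the error's standard deviation in percent is sqrt(bias in b.p.)."""
    return math.sqrt(mean_bias_bp) if mean_bias_bp >= 0 else float("nan")


# Made-up returns for illustration only (not CRSP data).
months = [1, 1, 1, 2, 2]
r_crsp = [0.012, 0.005, -0.003, 0.020, 0.010]
r_unbiased = [0.010, 0.004, -0.006, 0.019, 0.010]
result = bias_by_calendar_month(months, r_crsp, r_unbiased)
for month, (n, mean_bp, se_bp, t_stat) in result.items():
    print(month, n, round(mean_bp, 2), round(se_bp, 2), round(t_stat, 2))
```

Grouping by calendar month before averaging mirrors the layout of the panels, where each row pools all observations of one month across the years in the period.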