Journal of Financial Markets 40 (2018) 23–39
Contents lists available at ScienceDirect
Journal of Financial Markets journal homepage: www.elsevier.com/locate/finmar
A natural experiment for efficient markets: Information quality and influential agents
Brian M. Mills a,*, Steven Salaga b
a University of Florida, Department of Tourism, Recreation, & Sport Management, P.O. Box 118208, Gainesville, FL 32608, USA
b University of Georgia, Department of Kinesiology, 330 River Road, Athens, GA 30602, USA
article info
Article history: Received 19 March 2018; Revised 9 July 2018; Accepted 10 July 2018; Available online 17 July 2018
JEL: G14, L83, Z2
Keywords: Market efficiency; Betting markets; Behavior; Baseball
abstract
We test the integration of repeated decision making of influential agents in asset prices. Our approach exploits a natural experiment in the Major League Baseball (MLB) betting market, where umpire assignments are revealed only for certain games, using over 2.5 million decisions made by these officials. Estimations reveal only partial adjustment to information related to umpire behavioral heterogeneity. We show this is exploitable by informed bettors, providing advantages over more salient, lower quality information. These results suggest that underlying information on influential individual decision making can serve as a high quality indicator of asset values due to persistence across time. © 2018 Elsevier B.V. All rights reserved.
1. Introduction
The extent to which prices reflect all information about an asset is of central interest to understanding the efficiency of markets. According to theory, information will be fully integrated into asset prices as soon as it is available, removing the ability for traders to make gains without asymmetries in information. The sports betting market is regularly used to study this efficient market hypothesis (EMH) (Fama, 1970), where betting lines and information about game outcomes are easily accessible (Levitt, 2004; Sauer et al., 1998). However, the observational nature of betting markets can sometimes limit clean identification of the effects of new information on betting lines, although there tends to be less leakage than is found in financial markets (Brunnermeier, 2005). A few exceptions exploit clear exogeneity in various characteristics of the game (Croxson and Reade, 2014; Larsen et al., 2008). We use a unique quasi-randomized intervention in the context of Major League Baseball (MLB) umpire scheduling to identify changes to prices as they relate to measurable behavioral tendencies of influential individuals when news of their assignment to specific games breaks. More specifically, it is well-established that MLB game officials (umpires) can have significant influence over the outcome of games through ball-strike calls, particularly related to total runs scored (Mills, 2017a). Further, there is considerable individual heterogeneity in umpire ball-strike calls that may result in measurable differences to the expected total runs scored, depending on which umpire is assigned to home plate (Mills, 2017b). Most importantly to our inquiry, umpire scheduling is exogenous
* Corresponding author. E-mail addresses:
[email protected]fl.edu (B.M. Mills),
[email protected] (S. Salaga). https://doi.org/10.1016/j.finmar.2018.07.002 1386-4181/© 2018 Elsevier B.V. All rights reserved.
B.M. Mills and S. Salaga / Journal of Financial Markets 40 (2018) 23–39
to team game scheduling and quasi-random, with the assignment of individual umpires unknown prior to the first game of a regular season series. Upon the start of the first game, however, the umpire rotation is perfectly known for the remaining games of the series. This process defines our identification strategy for estimating the effect of information release on the efficiency of the totals market (total runs scored) across these subsets of games. We test information integration within the totals market using data on past umpire behavioral tendencies (aggregated decisions on thousands of ball-strike calls by each umpire) and pair this with the exogenous umpire assignment to games. Our estimates show that although the totals market adjusts to information about umpire assignments in the correct direction when this information is released, likely through asymmetric betting volume by bettors, it does not fully integrate changes in expected offensive output attributable to detailed information about umpire strike zones. There is therefore evidence for statistical inefficiency based upon this detailed behavioral information. Using simple betting rules, we show that prices in the market are also economically inefficient, generating a return on investment as high as 9.75% per bet. We presume this inefficiency arises from the relatively high cost of data acquisition and processing, despite its public availability. We also compare the use of aggregated decision making data against umpire own-lagged contest-level outcomes, essentially the lagged true value of all prior bets, often prominently provided on expert betting websites. We find these more salient macro-level game outcomes to be a poor predictor of future game outcomes. This highlights the value inherent in the behavioral measurement of micro-level decision making information.
Nevertheless, we still find evidence that bettors tend to place asymmetric bets consistent with expecting to receive a market advantage from knowledge of more salient macro-level game outcomes. This is in contrast to the returns generated by a bettor who utilizes information on the decision making tendencies of influential actors. With the use of underlying behavioral measurement, future expectations may be more precisely estimated due to the persistence of these tendencies. This is particularly important if measurements of the behavior of individuals with substantial influence are obtainable only by certain experts, or if higher quality, less salient information is costly to obtain. We suggest that as the technological capacity to collect data on individual behavior grows, and acquisition costs decrease for certain investors, questions are likely to arise related to the accessibility and confidentiality of these data and the role that they can play in gaining market advantages among those facing lower access costs (or access to proprietary data).
Our paper proceeds as follows. In the next section, we review the literature on the efficiency of markets and note the important gains in understanding the EMH that have been made with the use of sports betting market data. In Section 3, we fully describe the exogenous scheduling policy and release of information in MLB. In Section 4, we describe our empirical approach, and in Section 5 we discuss the results. In Section 6 we present the results of a simple betting strategy that returns profits acting only on umpire decision making behavior. We conclude our paper in Section 7.
2. The efficient market hypothesis
The EMH is most commonly associated with Fama (1970) and the propensity for stock market prices to internalize information immediately upon release.
Researchers investigating the EMH have attempted to find information shocks that change prices immediately to identify inefficiencies that experts can exploit to generate profits from trading. In characterizing EMH inquiry, there are three general ways in which efficiency can be classified: weak form, semi-strong form, and strong form. In weak form efficiency, the current price of an asset includes all information about past prices of the asset, and therefore profits cannot be made from observing only past prices. In semi-strong form efficiency, when new underlying information is released publicly, prices update fully and immediately to reflect the new information. Strong form efficiency is the most stringent classification, under which a price inherently includes all information about the asset, including private information. Researchers have attempted to address both weak form and semi-strong form efficiency in various settings, with varying levels of success in identifying long-term inefficiency (Fama, 1998). One common area in which the EMH is tested is the stock market, where economists have tested efficiency at the release of company-specific financial information (Ball and Brown, 1968; Fama et al., 1969), macroeconomic information (Hardouvelis, 1987), tangible, intangible, or uncertain information (Brown et al., 1988; Daniel and Titman, 2006), and how the arrival of this information interacts with the time horizons of investors (Vives, 1995). Other work has established the relevance of new macroeconomic information in exchange rates, such as the money supply, inflation, and interest rates (Dornbusch, 1980; Hakkio and Pearce, 1987).
Behavioral inquiry has been particularly fruitful in the context of EMH, identifying overreaction or under-reaction to certain types of new market information, or continued price drift and delayed response to this information (Bernard and Thomas, 1989; De Bondt and Thaler, 1985; Jackson and Johnson, 2006; Michaely et al., 1995; Poteshman, 2001; Stein, 1989). However, identification of the effect of information release or breaking news in these markets is often difficult due to uncertainty over the moment of its release and the possibility of information leakage to certain market participants. Further, public availability of information itself is not necessarily sufficient in generating large price movements toward efficiency. For example, prior work has shown that the cost of obtaining information, in combination with an investor’s risk preferences, can play a role in investor willingness to incorporate that information into an investment strategy (Ho and Michaely, 1988). Similarly, limited attention can result in ignoring important information if it is not easily accessible, despite being public (Della Vigna and Pollet, 2009; Hirshleifer and Teoh, 2003; Hirshleifer et al., 2013, 2011; Peng and Xiong, 2006). In addition, the salience of news has been shown to play a role in the propensity for investors to react (Huberman and Regev, 2001; Klibanoff et al., 1998).
To address uncertainty over the timing of the release of information, economists have turned to betting markets and prediction markets, particularly those related to sports (Croxson and Reade, 2014; Paul et al., 2014b; Sauer et al., 1998). These allow for a clearer and cleaner test of efficiency. Most importantly, sports betting markets are particularly useful in that, unlike most financial markets, the true value of a bet is eventually revealed: once the bet is placed at a given price, and the game is over, realizations of payoffs are straightforward (Gray and Gray, 1997). Within the literature on the efficiency of sports betting markets, economists have identified explicit behavioral biases among bettors and addressed a number of other drivers of relatively small market inefficiencies. These include atmospheric conditions (Paul et al., 2014b), racial biases (Igan et al., 2015; Larsen et al., 2008), the role of expert bettors in moving a betting line toward fundamental values (Gandar et al., 2000), sentiment biases (Braun and Kvasnicka, 2013; Feddersen et al., 2017), favorite-longshot biases (Snowberg and Wolfers, 2010), specific player absences (Dare et al., 2015), clustering (Brown and Yang, 2016), and bettor belief in the hot hand (Paul et al., 2014a), among others. Related work shows that various types of betting lines on a single game may provide additional information to uninformed bettors (Berkowitz et al., 2015). The release of news is only pivotal in some of these studies, such as those evaluating player absences. However, Croxson and Reade (2014) recently used exogenous and sudden news to identify market efficiency in the soccer betting market, finding that bettors in prediction markets react swiftly and fully to a scoring event.
Despite this result, given that wagering responses are likely to be muted in the face of less salient information, such as data that takes nontrivial costs to obtain for the lay bettor, we propose that information acquisition costs are likely to slow moves toward full efficiency in the aggregate. Specifically, we extend this work to further understand the propensity for influential agents to affect betting outcomes, and whether these effects are fully and immediately integrated into the MLB totals betting market when assignments are released. As Sauer (2005) calls for an expansion of the efficiency paradigm, this work provides unique evidence of the strength of the growing availability of underlying behavioral data in valuing assets.
3. The structure of umpire scheduling
MLB umpires are endowed with various tasks on the field, the most important of which is calling pitches balls or strikes when the batter does not swing. Conditional on location, calling more balls gives an advantage to the batter, putting them a step closer to reaching first base and increasing scoring expectations. More strike calls result in an advantage to the pitcher, putting the batter a step closer to striking out, and decreasing scoring expectations. Each game, umpires make approximately 150 of these decisions. Despite the relatively precise monitoring of each call and league rules clearly defining what constitutes a ball or a strike as it relates to pitch location, umpires are given rather wide discretion as to how they interpret this location. In other words, there remains considerable heterogeneity in the accuracy of umpire ball-strike calls across the league (Mills, 2017b). In some cases, individual umpires are consistently favorable to batters, while others are quite favorable to pitchers. This bias can influence the expected total score of each game they work behind the plate (Mills, 2017a).
We note that both umpire data collection and processing are likely to be of considerable interest to those speculating in the MLB wagering market. In 2016, Genius Sports estimated the annual size of the MLB betting market to be $55 billion (Purdum, 2016). This market comprises wagers on sides (betting on a specific team) and runs totals, with a slightly higher volume of betting found for the former. However, the totals market is considered a sharper market with more advanced betting activity and less casual wagering activity. While an exact monetary breakdown is not publicly available, estimates place totals wagers at approximately a quarter of the market, or $13.75 billion annually (Solar, 2017). The MLB regular season schedule consists of a collection of three- to four-game series between two teams on consecutive days to complete its 2430-game schedule. During this regular season, each of the 30 league teams plays 162 games: 81 at their home park and 81 as a visitor in another team’s park. Each regular season series is played in a single location, and teams rotate to play another team at the close of each. Unlike many international leagues (e.g., English Premier League), schedules for MLB teams are unbalanced, with a larger number of games played against divisional opponents in competition for designated playoff slots. All teams have both consecutive home stands that last longer than a single series, as well as consecutive road series against various teams at different locations. Although the full schedule for MLB teams is known ahead of time, umpire schedules are not publicly available and are relatively complex due to additional restrictions by the league. In the interest of neutrality by its game officials, MLB requires that umpire crews do not receive overexposure to any single team, with expectations that each umpire crew, consisting of four umpires, travels to each of the 30 ballparks at least once per season.
There are also travel distance constraints that make the scheduling of umpires rather difficult.1 Therefore, umpires do not have a true home base in the way that individual teams have home markets, and must travel more often than teams. Although complex methods are used to create the schedule each year for MLB umpires (Trick et al., 2011; Trick and Yildiz, 2012), and this schedule is largely set at the start of the season, regular season series assignments of umpire crews are not publicly released. Given the complexity with which scheduling is performed, the ability to predict future umpire rotations using back-engineering seems unlikely. Further, members of the umpire crew work different positions during each game, rotating
1 There are other constraints about the required days between a crew working games at the same park, among others. See Trick et al. (2011) and Trick and Yildiz (2012) for more details on scheduling requirements.
clockwise between home plate, third base, second base, and first base. However, the position of each umpire at the start of the series is not necessarily given by the previous series rotation. This introduces an additional layer of unpredictability in umpire home plate assignment, in addition to the unknown crew assignment to begin a series, at least until just before the start of the first game. Once the umpire crew and positional assignment is known in Game 1, the rotation continues in an essentially perfectly predictable clockwise fashion through Game 2, Game 3, and Game 4 of the series. This quasi-random scheduling process provides identification for our test of efficiency of the MLB totals market with respect to umpiring, given that its release is exogenous to other information specific to expected total runs. Betting generally opens approximately 24 h prior to a game, and betting for subsequent games usually opens after the start of the most recent game. Thus, information on umpires is specific to the opening of the betting market for each game. Our central interest relates to the position in which umpires call balls and strikes, and therefore have the most influence on game outcomes: home plate. We use the designation of Game 1 and Games 2 through 4 in a regular season series as our control and treatment group, respectively, for the release of information. In Game 1 of any series, the home plate umpire is unknown to the public until essentially game start time, when betting on this contest closes. Therefore, information on umpire offensive favorability should not be integrable into the betting market. However, in Game 2 and beyond, the rotation of umpires is perfectly known, and this information can be used by oddsmakers to set the total line and by bettors in assessing the correct side for which to wager. We use this information about umpire assignment to test for what we classify as semi-strong form and weak form efficiency. 
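The rotation and the control/treatment split described above can be illustrated with a minimal sketch. It assumes that each umpire advances one position per game in the order the text lists (home plate, third base, second base, first base, then back to home plate); the function names and the placeholder umpire labels are hypothetical and are not taken from the paper's data.

```python
# Order in which an umpire moves from game to game (assumed from the text:
# home plate -> third base -> second base -> first base -> home plate).
ROTATION = ["home", "third", "second", "first"]


def home_plate_schedule(game1_positions, n_games):
    """Return the home plate umpire for each game of a series.

    game1_positions maps a position name to the umpire working it in Game 1.
    Each umpire advances one step along ROTATION per game, so the umpire
    at first base in Game 1 works home plate in Game 2.
    """
    schedule = []
    for game_index in range(n_games):
        # The umpire now at home plate started game_index steps before home.
        start_pos = ROTATION[(-game_index) % len(ROTATION)]
        schedule.append(game1_positions[start_pos])
    return schedule


def information_group(game_number):
    """Game 1 is the control group; Games 2 and later are the treatment group."""
    return "control" if game_number == 1 else "treatment"


# Hypothetical four-person crew and its Game 1 positions.
crew = {"home": "Ump A", "third": "Ump B", "second": "Ump C", "first": "Ump D"}
for number, umpire in enumerate(home_plate_schedule(crew, 4), start=1):
    print(number, umpire, information_group(number))
```

Under this assumption, only the Game 1 home plate assignment is unknown while betting is open; every later assignment follows mechanically from the Game 1 positions.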
We begin by testing the propensity for the totals market to adjust to less salient public information about specific umpire behavior: individually aggregated decisions (costly, micro-level) on ball-strike calls. We then test the propensity for the totals market to adjust to recent information about umpire-specific betting outcomes in the form of total runs scored and runs relative to expectation in past contests (salient, macro-level).
4. Data & estimation procedure
4.1. Underlying strike zone behavior
To measure umpire behavior, we first gathered information on umpire strike zones from pitch-by-pitch locational data at Baseball Savant for the 2008 through 2014 regular seasons.2 These data are more commonly known as PITCHf/x, and have been used in past work to identify biases among umpires (Chen et al., 2016; Green and Daniels, 2017; Kim and King, 2014; Mills, 2014; Parsons et al., 2011; Tainsky et al., 2015). The data include information on every pitch thrown from 2008 through 2014 (approximately 4.96 million pitches), and a projected location through which that pitch crosses a two-dimensional plane at the front of home plate. Both vertical and horizontal coordinates are included, as well as batter and pitcher handedness. We restrict our analysis only to those pitches which require judgement by the umpire: called balls and called strikes. Any pitches at which the batter swings were removed from the data. This left about 2.54 million total umpire-called pitches from which to estimate heterogeneity in individual umpires’ propensity to call strikes. Although the MLB rulebook explicitly states the definition of this strike zone,3 and all umpires presumably have interest in calling it correctly, previous work has shown that there is considerable heterogeneity in the rate at which different umpires call balls and strikes, even after holding location constant (Mills, 2017a, 2017b).
Our initial task was to estimate a model of the probability of a called strike, conditional on its location within or near the strike zone. This baseline estimate allows us to measure the extent to which individual umpires are more favorable to batters (more balls, higher scoring) or pitchers (more strikes, lower scoring), relative to the rest of the league’s umpires. Previous work has shown that there is a non-linear relationship between the horizontal and vertical location of a pitch and the probability it is called a strike (Mills, 2014), and we therefore estimate a pooled semi-parametric generalized additive model (Wood, 2003, 2006) of the called strike zone for a specified lagged time window. The generalized additive model (GAM) is specified as:
$$ g(\mu_n) = f_b(X_n, Y_n) + \sum_{j=1}^{12} \gamma_j Z_n + \varepsilon_n. \tag{1} $$
The term $\mu_n$ is the mean of a binary variable, $I_S(c_n)$, that is equal to one if the pitch, $n$, was called a strike, and zero otherwise, with $g(\cdot)$ representing the logit link function for the binomial response. The sum $\sum_{j=1}^{12} \gamma_j Z_n$ identifies the linear additive effects ($\gamma_j$) of each of the 12 possible ball-strike count scenarios ($Z_n$), as these have been shown to substantially impact the likelihood of a strike call even after controlling for pitch location (Green and Daniels, 2017; Mills, 2014; Moskowitz and Wertheim, 2011). The unknown smooth function $f_b(X_n, Y_n)$ is estimated jointly for vertical and horizontal location, indexed by $X_n$ and $Y_n$, respectively, with separate right- and left-handed batter surfaces indexed by $b$. $\varepsilon_n$ is an error term. We estimate the model with the restricted
2 Accessed via www.baseballsavant.mlb.com. We stop prior to the 2015 season, as there is strong evidence suggesting that changes to the baseball from late 2015 through 2017 resulted in large unexpected structural changes in run scoring across the league. 3 While the definition requires an adjustment to the size of the zone based on batter height, we assume that the average height of batters across umpires is distributed at random such that our estimates of strike zone size are unbiased from game to game.
maximum likelihood method of Wood et al. (2015) to estimate the appropriate smoothness of thin-plate regression splines without the restrictive assumptions of structural polynomial regression models. Individual models were estimated for each game date using all umpire-called data over the previous 90 days of game play to establish a pooled strike zone for the entire league over this period. For example, if a game is played on July 1, 2014, then all pitches thrown from April 2, 2014 through June 30, 2014 would be included in the estimation. We estimate these rolling lagged models for two reasons. First, it ensures that the well-documented time-dependent changes in the strike zone are not adversely affecting our individual estimates (Mills, 2017b), and any strike zone evolution at the league level happens gradually in our model. Secondly, in this procedure, we avoid any ex-post creation of a strike zone by using data after the date being analyzed. Clearly, this means there is considerable overlap in the sample used for each model to create a rolling average strike zone from day to day. As our data span seven seasons of about 184 days each, we estimate approximately 1200 individual models at the daily level.4 Once each model is estimated, we use it to calculate $\varepsilon_n$ for each pitch in the data set, with each error attributed to a specific umpire, $i$, as $\varepsilon_{in}$. We subsequently aggregate the pitch-level deviations at the umpire level over this same lagged time period as follows:
$$ \delta_{ig} = \frac{1}{N_{ig}} \sum_{n=1}^{N_{ig}} \varepsilon_{in} \tag{2} $$
In equation (2), $\delta_{ig}$ represents the average deviation from the pooled strike zone over the 90 days prior to game $g$, and $N_{ig}$ is the total number of called pitches for umpire $i$ over this same period. $\varepsilon_{in} = I_S(c_n) - g(\mu_n)$ is the observation-level error term from equation (1), where $I_S(c_n) \in \{0, 1\}$ is an indicator function as before, and $g(\mu_n)$ is the expected strike call probability from the pooled semi-parametric strike zone estimation for observation $n$. We restrict our use of this measure to umpires that have worked at least five games behind the plate over the past 90 days.5 Fig. 1 visualizes the strike zone to clarify the usefulness of this estimation procedure to arrive at $\delta_{ig}$. The top right panel presents the strike zone of former MLB umpire Lance Barksdale6 for right-handed batters for the entirety of 2008, while the top left panel presents the pooled strike zone model that includes data on all umpires in 2008. Most importantly, the bottom panel presents the differenced strike zone, or the location-conditional pooled probability of a called strike subtracted from the same location-conditional probability of a Lance Barksdale called strike. From these visuals, it is easy to see that other umpires are much more likely to call strikes on the outside corner (darker colored in the bottom panel), while Barksdale is much more likely to call strikes high and inside (lighter colored in the bottom panel). Our measure numerically aggregates these differences at the umpire level for all pitches called over the 90-day rolling time period, essentially creating a variable that represents the aggregate deviation in the bottom panel of Fig. 1 for each umpire-game in the data. This is effectively a measurement of the relative size of the umpire-specific strike zone.
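The aggregation in equation (2) can be sketched in a few lines, assuming the pooled strike probability for each pitch has already been produced by a separately fitted strike zone model. The record layout (keys such as `p_strike` and `game_id`) and the function names are hypothetical; the 90-day window and five-game minimum follow the text.

```python
from datetime import date, timedelta

WINDOW_DAYS = 90  # lag window described in the text
MIN_GAMES = 5     # minimum home plate games within the window


def window_bounds(game_date):
    """First and last dates of the 90-day window ending the day before game_date."""
    return game_date - timedelta(days=WINDOW_DAYS), game_date - timedelta(days=1)


def umpire_deviation(pitches, umpire, game_date):
    """Average of (called strike indicator - pooled strike probability).

    pitches is a list of dicts with hypothetical keys 'umpire', 'date',
    'game_id', 'called_strike' (0 or 1) and 'p_strike' (probability from
    the pooled model). Returns None if the umpire worked too few games.
    """
    start, end = window_bounds(game_date)
    rows = [p for p in pitches
            if p["umpire"] == umpire and start <= p["date"] <= end]
    if len({p["game_id"] for p in rows}) < MIN_GAMES:
        return None
    residuals = [p["called_strike"] - p["p_strike"] for p in rows]
    return sum(residuals) / len(residuals)


# The window for a July 1, 2014 game runs from April 2 through June 30, 2014.
print(window_bounds(date(2014, 7, 1)))
```

A positive value indicates an umpire who calls more strikes than the pooled model expects at the same locations, and a negative value indicates fewer.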
As an exhibition of variability in the measure across umpires and across time, the rolling $\delta_{ig}$ for two umpires over the sample period, one of whom is clearly more favorable to batters (Tim McClelland), and another who is more favorable to pitchers (Bill Miller), is plotted in Fig. 2. We also test two more salient, macro-level game outcome measures related to umpires’ home plate assignments, referred to as umpire own-lagged values. Specifically, we test the extent to which lagged total runs scored, and runs scored relative to the posted over/under total runs line, are predictive of upcoming game totals in which individual umpires work. For example, if an umpire has historically worked games where a large number of runs are scored, and this information is useful in predicting total runs in an upcoming game, an odds maker or bettor should be able to integrate this into the price of betting the totals line. However, if these past results are not related to future outcomes, then bettors responding to this type of information could make biased decisions or engage in noise trading type behavior. This type of information is commonly provided to the public on popular sports betting websites and is therefore easily accessible to most bettors.7 The first measure used is a 90-day rolling average of the actual combined runs scored in games prior to game $g$ during which umpire $i$ worked behind home plate. We denote this measure as $A_{ig}$, with TotalScore defined as the actual total combined runs scored by both teams in a given contest. It is clear that $A_{ig}$ is a particularly crude measure, given its variability related to the offensive quality of the competing teams in the umpire’s assignments over the past 90 days. We therefore seek to establish an umpire’s game outcomes relative to the runs expectation given by the past betting market. These expectations take the form of the opening and closing totals for the respective games, $G_i$, within the past 90 days.
We calculate an outcome measure, $\Delta C_{ig}$ or $\Delta O_{ig}$, of this deviation from the totals line expectation as:
$$ \Delta C_{ig} = \frac{1}{G_i} \sum_{g=1}^{G_i} \left( \mathit{TotalScore}_{ig} - \mathit{ClosingTotal}_{ig} \right) \tag{3} $$
4 Because reliable data started at the beginning of the 2008 season, we start our evaluation of the strike zone 90 days after the start of that season, reducing the total number of models estimated. 5 For robustness, we used a 10 game cutoff and no minimum games worked cutoff in alternative models. The results of the estimations were not substantively changed by changing the cutoff, and we chose five games to avoid inclusion of umpires with very little information about recent strike zone tendencies. 6 Barksdale was known to be one of the more accurate strike callers in the league before retiring (Davis and Lopez, 2015). 7 Some of the most common websites including this information include Covers.com, DonBest.com, Docsports.com, and SportsBettingStats.com.
Fig. 1. Strike zone visualization. Visualization of pooled 2008 strike zone for right-handed batters (top-left), umpire Lance Barksdale’s individual 2008 strike zone for right-handed batters (top-right), and the difference in strike probability between Barksdale and the average umpire (bottom). The dashed box approximates the rulebook strike zone based on the average height of MLB batters. Darker colors in the top panels represent low probabilities of a strike, while light colors represent high strike probability. In the bottom panel, light areas indicate that Barksdale is much more likely to call a strike in the given location than the average 2008 MLB umpire, while dark areas represent locations in which the average 2008 MLB umpire is much more likely to call a strike than Barksdale. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
$$ \Delta O_{ig} = \frac{1}{G_i} \sum_{g=1}^{G_i} \left( \mathit{TotalScore}_{ig} - \mathit{OpeningTotal}_{ig} \right). \tag{4} $$
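As a minimal sketch of equations (3) and (4), together with the rolling average of total runs, the three umpire own-lagged measures can be computed from one umpire's home plate games in the lag window. The function name, the dictionary keys, and the two example games are hypothetical illustrations rather than the authors' code or data.

```python
def lagged_outcome_measures(games):
    """Return (A, delta_C, delta_O) for one umpire's games in the lag window.

    games is a list of dicts with hypothetical keys 'total_score',
    'opening_total' and 'closing_total', one dict per home plate
    assignment in the previous 90 days.
    """
    n = len(games)
    if n == 0:
        raise ValueError("no games in the lag window")
    # Rolling average of actual combined runs scored.
    a = sum(g["total_score"] for g in games) / n
    # Average runs relative to the closing and opening over/under lines.
    delta_c = sum(g["total_score"] - g["closing_total"] for g in games) / n
    delta_o = sum(g["total_score"] - g["opening_total"] for g in games) / n
    return a, delta_c, delta_o


example = [
    {"total_score": 10, "opening_total": 8.5, "closing_total": 8.0},
    {"total_score": 9, "opening_total": 8.5, "closing_total": 9.0},
]
print(lagged_outcome_measures(example))
```

Positive values of the two line-relative measures indicate that games worked by the umpire have recently finished over the posted total on average.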
OpeningTotal is the initial over/under betting line set by odds makers, which approximates the expected total number of runs to be scored in a given contest. ClosingTotal is the final over/under betting line available in the betting market prior to contest start.
4.2. Betting lines and game data
We integrate our daily umpire strike zone measures with information on umpire assignments and game outcomes from play-by-play files at Retrosheet8 alongside totals and betting line data from Sports Insights, consisting of all regular season
8 Accessed at www.retrosheet.org.
Fig. 2. Strike zone persistence and favorability. This figure exhibits the 90-day rolling $\delta_{ig}$ (2008–2014) for Tim McClelland (bottom) and Bill Miller (top) with 95% confidence intervals (dashed lines). If $\delta_{ig} > 0$, the umpire is more likely than average to call a pitch a strike, controlling for its location (upper half of figure). If $\delta_{ig} < 0$, the umpire is less likely than average to call a pitch a strike, controlling for its location (lower half of figure). McClelland is notably more favorable to batters than the average umpire, while Miller is much more favorable to pitchers. Each umpire’s directional relationship relative to the average umpire is steady across our sampling period, revealing considerable persistence in the measure.
games from 2008 through 2014.9 We remove games that are within the first 90 days of the 2008 season to ensure that 𝛿 ig is consistently estimated across all models with 90 days of information integrated.10 Further, there were 196 games in which the home plate umpire was not identified and 5 games missing listed closing totals lines. We also removed games that were part of a doubleheader, as it was unclear when information about umpires was released. For example, the first game of a series may be originally scheduled for a Friday night, but rescheduled as Game 1 of a doubleheader on Saturday due to a rainout. However, rainouts sometimes happen after umpires are known. Further, depending on when bettors are allowed to place bets, the time window of information availability may be particularly heterogeneous depending on the timing of the rescheduled game. Between 2008 and 2014, there were 174 scheduled double headers, resulting in the removal of 348 games. Our final number of game observations is 14,578. It is useful to provide a basic background on line movement and the odds attached to over/under betting lines, as it is relevant to our identification strategy. Standard over/under totals bets are associated with 10% vigorish, or in other words, a ten-cent line of −110. This means that a bettor must risk $110 to win $100. As liability increases on one side of a bet, odds makers may keep the over/under total static but increase the vigorish on that side of the bet to make it more expensive for bettors to wager on that side.11 Once the liability on a given side reaches a certain threshold, odds makers may make the decision to move the over/under total (for example, a move from 8.5 to 8.0 if bettors are wagering on the under). Odds makers may reset the vigorish at −110 on the newly set over/under total or may set the vigorish at a price of their choosing. 
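The pricing arithmetic of a −110 line can be made concrete with a short sketch. The conversion from American odds is standard; the function names and the 55-45 record in the example are hypothetical and are not results from the paper.

```python
def profit_per_unit_risked(american_odds):
    """Net winnings per 1 unit risked when the bet wins."""
    if american_odds < 0:
        # Negative odds: risk |odds| to win 100 (e.g. risk 110 to win 100).
        return 100.0 / -american_odds
    # Positive odds: risk 100 to win the quoted amount.
    return american_odds / 100.0


def break_even_probability(american_odds):
    """Win probability at which the expected profit of a bet is zero."""
    payout = profit_per_unit_risked(american_odds)
    # Solve p * payout - (1 - p) = 0 for p.
    return 1.0 / (1.0 + payout)


def roi_per_bet(wins, losses, american_odds=-110):
    """Average profit per unit risked over settled bets (pushes excluded)."""
    n = wins + losses
    if n == 0:
        raise ValueError("no settled bets")
    return (wins * profit_per_unit_risked(american_odds) - losses) / n


print(round(break_even_probability(-110), 4))  # 0.5238
print(round(roi_per_bet(55, 45), 4))           # 0.05
```

At −110 a bettor must therefore win more than about 52.4% of wagers to profit, and a more expensive line such as −115 raises that threshold further.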
We therefore generally assume line movement takes place when bets are placed more heavily than expected on one side of the line, with bookmakers adjusting the line accordingly in the direction of bets placed.12

Summary statistics for all variables are presented in Table 1. Our data are approximately evenly distributed across all seven seasons from 2008 through 2014, with the exception of games missing for the first 90 days of 2008. Of all games, approximately 32.5% are the first game in a regular season series, indicating that the majority of series span three games. The number of unique home plate umpires varies slightly from year to year, with 106 umpires in total working home plate in the data set at some point from 2008 through 2014. The average TotalScore in the data is 8.67 runs, while the average OpeningTotal and ClosingTotal are 8.30 and 8.26, respectively. The fact that the ClosingTotal is less than the OpeningTotal suggests that, on average, odds makers adjust the over/under line down during the wagering period. Actual game scores also have a long right tail, which does not appear in over/under lines, biasing the average estimate of TotalScore upward, relative to betting lines. The 𝛿ig measure
9 Accessed at www.sportsinsights.com.
10 We note that the 90 day choice is somewhat arbitrary, but generally provided at least 10 games worked by individual umpires to be included in each measurement. We attempted to use other standards for recent tendencies, including the last 15 and last 30 games umpired, and found nearly identical results. If any umpire worked fewer than five games over the last 90 days, the game was removed from consideration.
11 Odds makers also routinely set the opening line with higher vigorish (for example, −115) on one side of a bet.
12 In baseball, it is also quite common to change the vigorish, rather than the over/under, due to smaller variations in scoring than other sports. However, data on both opening and closing vigorish were not available in our data. We therefore focus on the over/under for estimation purposes, but return to the vigorish when evaluating the economic significance of our models.
Table 1
Summary statistics for all variables.

Var.            Obs.      Mean (%)    S.D.     Min       Max
2008            785       5.38        –        –         –
2009            2,301     15.78       –        –         –
2010            2,334     16.01       –        –         –
2011            2,250     15.43       –        –         –
2012            2,312     15.86       –        –         –
2013            2,311     15.85       –        –         –
2014            2,285     15.67       –        –         –
Game 1          4,738     32.50       –        –         –
Other Games     9,840     67.50       –        –         –
TotalScore      14,578    8.67        4.37     1.0       36.0
OpeningTotal    14,578    8.30        1.01     4.5       14.5
ClosingTotal    14,578    8.26        1.04     5.5       14.5
LineMove        14,578    −0.34       0.36     −3.0      3.0
|LineMove|      14,578    0.21        0.29     0.0       3.0
Aig             14,578    8.74        1.27     4.8       15.2
ΔCig            14,578    0.43        1.16     −3.9      6.7
ΔOig            14,578    0.39        1.17     −3.7      6.8
𝛿ig             14,578    0.00        0.015    −0.067    0.069
This table presents the descriptive statistics for each of our variables. We include the number of observations (Obs.), mean value (Mean) or percentage of observations taking the given value (%), standard deviation (S.D.), and minimum and maximum values (Min and Max, respectively). Aig is the 90 day rolling average runs scored for umpire i when working the plate, while ΔCig and ΔOig represent the average deviation in runs from the closing and opening totals lines, respectively, for umpire i over the past 90 days. Our central variable of interest, 𝛿 ig , is the strike rate deviation measure for these same lagged time periods, generated from the strike zone model.
ranges from −0.067 to 0.069 across all years in the data set. We note that this implies a range from −6.7 to 6.9 percentage points relative to the league average strike rate, conditional on location.

4.3. Estimation procedure

Our estimation procedure consists of four separate stages. We first establish the importance of controlling for known scoring biases such as the effects of lower air density at high altitudes specific to Colorado (Bahill et al., 2009; Paul et al., 2014b) and changes to total offense across seasons (Mills, 2017a). Therefore, we begin with a simple regression model using each of the OpeningTotal, ClosingTotal, and TotalScore as a dependent variable predicted by yearly dummy effects and a dummy variable for home games being played in Colorado. This ordinary least squares regression takes the form:

\[ y_g = \alpha + \tau_g + \theta_g + v_g. \qquad (5) \]
In equation (5), yg takes the value of one of OpeningTotal, ClosingTotal, or TotalScore for game g. The parameters 𝛼, 𝜏g, and 𝜃g represent an intercept term, yearly effects, and Colorado atmospheric effect, respectively, and vg ∼ N(0, 𝜎²). All estimations include standard errors robust to heteroskedasticity. If markets are fully efficient with respect to these outcomes, we would expect general equality of 𝜏g and 𝜃g across models including betting lines (OpeningTotal and ClosingTotal) and models including the actual total scores (TotalScore). However, in the event that these coefficients differ from one another, they should be included in later models in which we test for umpire-specific efficiency.13

Once we establish whether markets fully adjust to changes across years or in Colorado, we move to our second stage of analysis. In this stage, we first confirm that our measure of strike zone size, 𝛿ig, influences TotalScore at the game level using a regression model as follows:

\[ y_g = \alpha + \tau_g + \theta_g + \beta\delta_{ig} + v_g. \qquad (6) \]
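As a rough illustration of how a model of the form in equation (6) can be fit, the following sketch estimates the coefficients by ordinary least squares on synthetic data. Everything here is an assumption made for illustration: the data are simulated rather than taken from the paper, a single year dummy stands in for the full set of yearly effects, the noise is far smaller than in real game scores, and the heteroskedasticity-robust standard errors used in the paper are not computed.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic games with hypothetical "true" parameters (not the paper's data).
rng = random.Random(0)
X, y = [], []
for g in range(2000):
    later_year = g % 2                   # stand-in for the yearly dummies
    colorado = 1 if g % 25 == 0 else 0   # home game in Colorado
    delta = rng.gauss(0.0, 0.015)        # umpire strike-rate deviation
    X.append([1.0, later_year, colorado, delta])
    y.append(9.5 - 1.0 * later_year + 2.3 * colorado - 9.0 * delta
             + rng.gauss(0.0, 0.1))

alpha_hat, tau_hat, theta_hat, beta_hat = ols(X, y)
print(round(beta_hat, 2))  # estimate of the coefficient on delta
```

In this setup the estimate on delta should land close to the simulated value of −9.0; with real game scores the noise is much larger, which is why the precision of the estimate matters for the comparisons across dependent variables.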
Equation (6) is essentially identical to equation (5), but with the addition of 𝛿ig as an explanatory variable. Here, we are interested in estimating the parameter 𝛽 as the influence of relative umpire strike zone size, 𝛿ig, on the expected total number of runs scored in game g, worked by umpire i. We also estimate identical models with OpeningTotal and ClosingTotal as dependent variables to evaluate whether the umpire’s behavioral tendencies with respect to the strike zone are used to set over/under lines. In a perfectly efficient market, we would expect that the umpire tendencies measured by 𝛿ig would be the same positive
13 Although we do not expect there to be substantial correlation between recent individual umpire zone size and games played in Colorado or across seasons, we proceed with this test due to the clear role that these could play in influencing total scores, as shown in past work (Mills, 2017a; Paul et al., 2014b).
magnitude across all three regression estimations. We also estimate identical models using the own-lagged game outcome variables ΔOig and ΔCig in place of 𝛿ig as a comparison of the usefulness of these macro-level variables in estimating expected TotalScore and to establish whether they are integrated into the OpeningTotal or ClosingTotal for an upcoming game.

Our third stage then estimates identical regression models for TotalScore, ClosingTotal, and OpeningTotal with the sample split between games identified as the first game in a regular season series (when the umpire is unknown), and other games in the series (when the umpire is known). This provides the crux of our identification of the release of information and the efficiency of the market to this information, when available. If markets are efficient, then we would expect to see a positive coefficient in both regression estimations that include only games classified as the second, third or fourth game of a regular season series. As this information is not available for Game 1 in a series, then there is no expectation that markets could absorb this into the totals line (price). If markets are inefficient, the coefficient estimate, 𝛽̂, would be either non-significant or of lower magnitude than that found for our previous regression estimation with TotalScore as the dependent variable.

Our final estimation procedure uses a newly calculated dependent variable, LineMove, in a regression model identical to those used above, with three separate samples: all games, only the first game of each series, and all other series games. LineMove represents the degree of movement in the over/under total runs line from opening until closing. This estimation therefore tests the possibility that bookmakers do not integrate umpire information fully into their OpeningTotal, but bettors place bets consistent with the information provided in 𝛿ig (Aig, ΔOig, or ΔCig), therefore moving the line closer to the efficient price by the time the ClosingTotal is established. A value of zero represents no movement in the over/under line (e.g., the over/under line opens at 8.0 and closes at 8.0). A value of 0.5 represents the over/under line increasing by one half of a run (e.g., a move from 8.0 to 8.5) while a value of −0.5 represents the over/under line decreasing by one half of a run (e.g., a move from 8.5 to 8.0). The average absolute value of LineMove from opening to closing is 0.21 runs, with LineMove itself ranging from −3.0 to 3.0.14

5. Results & discussion

5.1. Seasonal and atmospheric controls
We begin by presenting our results related to controls for yearly changes and atmospheric effects for totals lines and actual scoring outcomes. Column 1 of Table 2 presents the true changes to average TotalScore across time and when games are played in Colorado, where air density is notoriously low. Columns 2 and 3 present yearly and Colorado effects for OpeningTotal and ClosingTotal, respectively. A fully efficient totals market would exhibit equality between the coefficients in column 1 and those in each of columns 2 and 3. However, there is some evidence that true scoring experienced a larger yearly decrease than the totals market managed to incorporate. This is not surprising: there are well-documented changes to the size of the strike zone during this time, with little public indication that there would be a time-persistent increase in its size (Mills, 2017a). In particular, from 2009 through 2014, both OpeningTotal and ClosingTotal are overestimated on average by 0.23–0.40 runs relative to the year dummy coefficients in the TotalScore regressions. Wald statistics in column 4 of Table 2 test the statistical differences in coefficient magnitude across models in column 1 and column 3. For simplicity, we report the Wald statistic only for the comparison in coefficients between ClosingTotal and TotalScore. These reveal that differences in coefficient estimates between the totals market regression (ClosingTotal) and the actual score regressions (TotalScore) are statistically significant at the 5% level for 2013, and the 10% level for 2009, 2010, and 2014. We therefore proceed by including yearly effects in all subsequent tests of efficiency with respect to 𝛿ig.15
Further, consistent with past work (Bahill et al., 2009; Paul et al., 2014b), the coefficient estimate (𝜃̂g) on the Colorado indicator variable is nearly a full run lower in the totals market as compared to the actual score, and statistically significant at the 1% level. The size of this effect indicates that simply betting overs for games played at home by the Colorado Rockies may be a profitable betting strategy. Although it seems unlikely that this bias is strongly correlated with our umpire assignments, we nevertheless preserve this control variable in all models due to the rather large size of the effect of lower air density most specific to Colorado.16

5.2. General integration of umpire information

We next confirm that our measure of umpire strike zone tendency, 𝛿ig, indeed predicts changes in TotalScore. These results are presented in column 1 of Table 3. We compare these results with our own-lagged outcome measures, Aig, ΔCig, and ΔOig, in Table 4.17 Because 𝛿ig increases with the size of the strike zone, we expect that the sign on 𝛽 will be negative: as the strike
14 Line moves equal to or larger than 1.5 are likely to take place due to events such as pitching changes, in which case most bets are canceled on the previous line. We therefore estimated our models only on games in which |LineMove| < 1.5 and found nearly identical results (only 59 observations were removed).
15 We note that it is possible that totals earlier (later) in the year may be less (more) efficient if umpires mostly change their behavior in the off-season, and markets adjust slowly to that change.
16 We estimated models with fixed effects for locations of all games, and found Colorado to stand out with respect to magnitude and statistical significance. These alternative models are available upon request.
17 For consistency, we estimate the effect of ΔCig only on TotalScore and ClosingTotal while testing ΔOig only on TotalScore and OpeningTotal. Aig is used as an explanatory variable for all three potential outcome variables in our regression models.
Table 2
Totals, offensive decline, & atmosphere.

            (1)                  (2)                  (3)                  (4)
            TotalScore           OpeningTotal         ClosingTotal         𝛽 Wald 𝜒²
Constant    9.524∗∗∗ (0.171)     8.886∗∗∗ (0.036)     8.850∗∗∗ (0.038)     –
2009        −0.326∗ (0.194)      −0.024 (0.040)       −0.010 (0.042)       2.73∗
2010        −0.824∗∗∗ (0.194)    −0.480∗∗∗ (0.041)    −0.487∗∗∗ (0.043)    3.15∗
2011        −1.072∗∗∗ (0.193)    −0.822∗∗∗ (0.041)    −0.800∗∗∗ (0.043)    2.06
2012        −0.947∗∗∗ (0.193)    −0.716∗∗∗ (0.041)    −0.725∗∗∗ (0.043)    1.38
2013        −1.272∗∗∗ (0.192)    −0.872∗∗∗ (0.039)    −0.885∗∗∗ (0.041)    4.22∗∗
2014        −1.490∗∗∗ (0.192)    −1.105∗∗∗ (0.040)    −1.127∗∗∗ (0.041)    3.75∗
Colorado    2.311∗∗∗ (0.231)     1.349∗∗∗ (0.043)     1.465∗∗∗ (0.043)     13.73∗∗∗
N           14,578               14,578               14,578
R²          0.018                0.191                0.194
Columns 1, 2, and 3 present the results of preliminary fixed effects estimates for each dependent variable, TotalScore, OpeningTotal, and ClosingTotal, respectively. A test of the equality of coefficients between column 1 and column 3 is presented in column 4. Standard errors are in parentheses. ∗∗∗, ∗∗, and ∗ refer to statistical significance at the 99%, 95%, and 90% levels, respectively.
zone size increases, run scoring should decrease. However, for the other three measures, a positive coefficient would indicate that odds makers or bettors are using this information to set lines or make bets, respectively. We find a statistically significant effect of 𝛿ig on TotalScore in the expected direction, exhibited in Table 3. The magnitude of this coefficient indicates that a one standard deviation increase in 𝛿ig is associated with a decrease in TotalScore of 0.13 runs per game. Alternatively, switching the home plate assignment from the most batter-favorable umpire-game observation to the most pitcher-favorable umpire-game observation would be expected to decrease run scoring by nearly 1.2 runs. This confirms

Table 3
𝛿ig effects on totals lines & actual score.

                   (1)                  (2)                  (3)
                   TotalScore           OpeningTotal         ClosingTotal
Constant           9.546∗∗∗ (0.171)     8.885∗∗∗ (0.036)     8.853∗∗∗ (0.038)
2009               −0.360∗ (0.195)      −0.023 (0.040)       −0.015 (0.042)
2010               −0.828∗∗∗ (0.193)    −0.480∗∗∗ (0.041)    −0.488∗∗∗ (0.043)
2011               −1.077∗∗∗ (0.193)    −0.822∗∗∗ (0.041)    −0.801∗∗∗ (0.043)
2012               −0.948∗∗∗ (0.193)    −0.716∗∗∗ (0.041)    −0.725∗∗∗ (0.043)
2013               −1.294∗∗∗ (0.191)    −0.872∗∗∗ (0.039)    −0.888∗∗∗ (0.041)
2014               −1.490∗∗∗ (0.190)    −1.105∗∗∗ (0.040)    −1.127∗∗∗ (0.041)
Colorado           2.310∗∗∗ (0.232)     1.349∗∗∗ (0.043)     1.465∗∗∗ (0.043)
𝛿ig                −9.084∗∗∗ (2.473)    0.105 (0.522)        −1.144∗∗ (0.537)
N                  14,578               14,578               14,578
R²                 0.019                0.191                0.194
L-R Test           13.66∗∗∗             0.04                 4.67∗∗
𝛽 Wald 𝜒² Total    –                    14.27∗∗∗             10.72∗∗∗
𝛽 Wald 𝜒² Open     –                    –                    38.30∗∗∗
Estimates for the effects of umpire favorability, 𝛿 ig , on TotalScore, OpeningTotal, and ClosingTotal, respectively, along with the fixed effects from Table 2, are presented in columns 1, 2, and 3. Likelihood ratio tests (L-R Test) are presented for each model to test the improvement in fit by adding 𝛿 ig to the model, along with Wald tests evaluating the equality of coefficients on 𝛿 ig across columns. Standard errors are in parentheses. ∗∗∗, ∗∗, and ∗ refer to statistical significance at the 99%, 95%, and 90% levels, respectively.
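The effect sizes quoted in the text for Table 3 can be approximately reproduced from the rounded values reported in Tables 1 and 3. The sketch below is only a back-of-the-envelope check; because it uses the rounded standard deviation and range of 𝛿ig, small differences from the figures in the text are to be expected.

```python
# Rounded inputs as reported in the tables.
BETA_TOTALSCORE = -9.084               # Table 3, column 1, coefficient on delta
SD_DELTA = 0.015                       # Table 1, S.D. of delta
DELTA_MIN, DELTA_MAX = -0.067, 0.069   # Table 1, Min and Max of delta

# Change in expected TotalScore for a one standard deviation increase in delta.
one_sd_effect = BETA_TOTALSCORE * SD_DELTA

# Change when moving from the most batter-favorable to the most
# pitcher-favorable umpire-game observation.
full_range_effect = BETA_TOTALSCORE * (DELTA_MAX - DELTA_MIN)

print(round(one_sd_effect, 3))      # about -0.136 runs
print(round(full_range_effect, 3))  # about -1.235 runs
```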
Table 4
Effects of Aig, ΔOig, and ΔCig on totals lines & actual score.

            (1)                 (2)                 (3)                 (4)                 (5)                 (6)                 (7)
            TotalScore          TotalScore          TotalScore          OpeningTotal        OpeningTotal        ClosingTotal        ClosingTotal
Constant    9.300∗∗∗ (0.334)    9.508∗∗∗ (0.172)    9.507∗∗∗ (0.172)    8.804∗∗∗ (0.070)    8.881∗∗∗ (0.036)    8.70∗∗∗ (0.073)     8.84∗∗∗ (0.038)
Aig         0.024 (0.030)       –                   –                   0.009 (0.006)       –                   0.0160∗∗ (0.007)    –
ΔOig        –                   0.027 (0.031)       –                   –                   0.007 (0.006)       –                   –
ΔCig        –                   –                   0.027 (0.031)       –                   –                   –                   0.014∗∗ (0.007)
N           14,578              14,578              14,578              14,578              14,578              14,578              14,578
R²          0.018               0.018               0.018               0.191               0.191               0.194               0.194
All models include yearly fixed effects and a control variable for games played in Colorado. We test the usefulness of our lagged average actual score for umpire i in addition to the lagged average deviation from the opening and closing total score for umpire i (Aig , ΔOig , and ΔCig , respectively). Estimated effects of these variables on actual game scores (TotalScore) are presented in columns 1–3. Regressions presented in columns 4–5 test the integration of this salient information into OpeningTotal lines set by oddsmakers, while regressions presented in columns 6–7 test integration into ClosingTotal lines, presumably through asymmetric betting. Standard errors are in parentheses. ∗∗∗, ∗∗, and ∗ refer to statistical significance at the 99%, 95%, and 90% levels, respectively.
that the umpires’ strike zone decision making tendencies alone have a particularly strong influence with respect to MLB game outcomes. EMH would therefore predict that this information is fully integrated into each of the OpeningTotal and ClosingTotal, resulting in a 𝛽̂ coefficient estimate that is approximately equal across our subsequent regression tests.

However, our data tell a divergent story both with respect to the effect of 𝛿ig on the totals market, and the way in which this information contributes to OpeningTotal or ClosingTotal, respectively. The estimate 𝛽̂ in column 2 of Table 3 shows that OpeningTotal does not reflect any of the information held within the 𝛿ig measure. Alternatively, the estimate of 𝛽̂ in column 3 of Table 3 reveals that ClosingTotal does integrate some of this information, but not to the full extent that would be implied by the coefficient magnitude in the TotalScore regression. Wald tests confirm that 𝛽 values in each of the dependent variable regressions are all statistically different from one another. Table 3 also presents the results of the likelihood-ratio test of equation (6) against equation (5) from Table 2, indicating that 𝛿ig adds information to both the TotalScore and ClosingTotal models, but not the OpeningTotal estimation. We further confirm differences in 𝛽̂ across columns with Wald tests in the bottom two rows of Table 3.

The implications of these findings are twofold. First, it appears that the totals market does not integrate umpire information efficiently when setting opening lines, nor does it fully adjust the line by the close of betting. Second, the differences in OpeningTotal and ClosingTotal provide evidence for betting volume impacting the over/under line between when betting opens and when it closes in a direction consistent with umpire assignment.

In Table 4, we find no statistically significant impact of any of the three more salient measures of umpire favorability on TotalScore. This indicates that there is little to no relationship between actual total scoring and lagged macro-level scoring outcomes tied to umpire assignment in prior games. Further, we find no effect of either Aig or ΔOig on the OpeningTotal, indicating that odds makers are (correctly) not using this information to set totals lines in any substantive way. However, there is a statistically significant and positive relationship between the ClosingTotal and each of Aig and ΔCig. The effect for ClosingTotal implies that bettors are making bets consistent with the information provided in umpire own-lagged outcome values, despite their lack of statistical significance in outcomes for upcoming games. Given this result, the possibility exists that bettors are overreacting to this more salient information when placing wagers. These coefficients, however, are only about one-half to two-thirds the magnitude of the non-significant coefficients estimated in the TotalScore regressions.

5.3. Test of efficiency at information release
As previously noted, the importance of knowing when scheduling information about umpires is released is key to our identification of the ability for prices, in this case totals lines, to integrate that information when it is available. Because information about umpire assignment is not released for all games, our estimates of the integration of 𝛿 ig are likely to be downward biased by including all games in the estimation sample. Further, it is unclear whether our result is related to unobserved factors associated with umpire assignment, or if the integration is specific to the information release. We therefore separate Game 1 and Games 2 through 4 into two samples and estimate separate models for each subsample of games. In Game 1, umpire assignments are unknown, and therefore the totals market would not be expected to integrate any of the information revealed in 𝛿 ig , either at opening or at closing. However, for Games 2 through 4 in a series, the perfectly predictable umpire rotation allows for the betting market to integrate this information into either or both of OpeningTotal and ClosingTotal. Table 5 presents the estimation results from this identification strategy. We begin with a falsification test. Columns 3 and 5 present results of a test of the integration of 𝛿 ig into OpeningTotal and ClosingTotal for Game 1 of a regular season series, when
Table 5
Effects of 𝛿ig on totals lines by series game number.

             (1)                 (2)                   (3)                 (4)                 (5)                 (6)
             TotalScore          TotalScore            OpeningTotal        OpeningTotal        ClosingTotal        ClosingTotal
Game #       = 1                 > 1                   = 1                 > 1                 = 1                 > 1
Constant     9.515∗∗∗ (0.320)    9.561∗∗∗ (0.201)      8.826∗∗∗ (0.062)    8.914∗∗∗ (0.044)    8.820∗∗∗ (0.066)    8.869∗∗∗ (0.046)
𝛿ig          −6.463 (4.459)      −10.311∗∗∗ (2.966)    −0.202 (0.896)      0.235 (0.642)       −0.757 (0.924)      −1.347∗∗ (0.660)
N            4,738               9,840                 4,738               9,840               4,738               9,840
R²           0.014               0.022                 0.193               0.196               0.188               0.193
𝛽 Wald 𝜒²    –                   0.52                  –                   0.16                –                   0.27
We split our estimations from Table 3 for games in which the home plate umpire is unknown at betting open (Game# = 1; columns 1, 3, and 5), and for those games in which the umpire is known (Game# > 1; columns 2, 4, and 6). All models include fixed effects for year and games played in Colorado. Differences in these coefficients for each measure are tested using a Wald test. Standard errors are in parentheses. ∗∗∗, ∗∗, and ∗ refer to statistical significance at the 99%, 95%, and 90% levels, respectively.
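One common way to test whether a coefficient differs across two independent subsamples is a one-degree-of-freedom Wald statistic: the squared difference in estimates divided by the sum of their squared standard errors. The sketch below applies that formula to the 𝛿ig estimates in Table 5. The independence of the Game 1 and later-game estimates is an assumption, and the paper does not state the exact procedure used; under this assumption the results agree with the bottom row of Table 5 after rounding.

```python
def wald_chi2(b1: float, se1: float, b2: float, se2: float) -> float:
    """Wald chi-square (1 d.f.) for equality of two independent estimates."""
    return (b1 - b2) ** 2 / (se1 ** 2 + se2 ** 2)


# (estimate, standard error) on delta for Game 1 and for later games, Table 5.
table5 = {
    "TotalScore":   ((-6.463, 4.459), (-10.311, 2.966)),
    "OpeningTotal": ((-0.202, 0.896), (0.235, 0.642)),
    "ClosingTotal": ((-0.757, 0.924), (-1.347, 0.660)),
}

results = {
    name: round(wald_chi2(b1, se1, b2, se2), 2)
    for name, ((b1, se1), (b2, se2)) in table5.items()
}
print(results)  # {'TotalScore': 0.52, 'OpeningTotal': 0.16, 'ClosingTotal': 0.27}
```

None of these values approaches the 10% critical value of roughly 2.71 for a chi-square with one degree of freedom, in line with the lack of significance reported in the table.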
Table 6
Effects of Aig, ΔOig, and ΔCig on totals lines by series game number.

            (1)                 (2)                 (3)                 (4)                 (5)                 (6)                 (7)                 (8)
            OpeningTotal        OpeningTotal        OpeningTotal        OpeningTotal        ClosingTotal        ClosingTotal        ClosingTotal        ClosingTotal
Game #      = 1                 > 1                 = 1                 > 1                 = 1                 > 1                 = 1                 > 1
Constant    8.856∗∗∗ (0.121)    8.777∗∗∗ (0.086)    8.829∗∗∗ (0.063)    8.906∗∗∗ (0.044)    8.836∗∗∗ (0.126)    8.630∗∗∗ (0.089)    8.821∗∗∗ (0.066)    8.850∗∗∗ (0.046)
Aig         −0.003 (0.011)      0.014∗ (0.008)      –                   –                   −0.002 (0.011)      0.025∗∗∗ (0.008)    –                   –
ΔOig        –                   –                   −0.007 (0.011)      0.014∗ (0.008)      –                   –                   –                   –
ΔCig        –                   –                   –                   –                   –                   –                   −0.004 (0.012)      0.023∗∗∗ (0.008)
N           4,738               9,840               4,738               9,840               4,738               9,840               4,738               9,840
R²          0.188               0.194               0.188               0.194               0.193               0.196               0.193               0.196
We present results from identical estimations to Table 5 using the salient measures of umpire favorability, Aig (columns 1, 2, 5, and 6), ΔOig (columns 3 and 4), and ΔCig (columns 7 and 8). All models include yearly fixed effects and a control variable for games played in Colorado. Standard errors are in parentheses. ∗∗∗, ∗∗, and ∗ refer to statistical significance at the 99%, 95%, and 90% levels, respectively.
umpire assignment information is not available. As expected, we find no evidence that the totals market integrates umpire strike zone information for these games. Columns 4 and 6 of Table 5 present our results of a direct test of efficiency in the face of the release of this information, revealing mixed results. First, there is no evidence that odds makers initially set OpeningTotal lines in a way that includes any information provided by 𝛿ig. However, as in the previous section, ClosingTotal lines show a statistically significant effect of 𝛿ig, indicating some integration of umpire information and a downward bias of the estimated relevance of this information for ClosingTotal in the pooled estimation from Table 3. However, we note that despite the nearly 0.60 run difference in the 𝛽̂ coefficients for the ClosingTotal estimations, this difference in magnitude is not statistically significant.18

The salient umpire variables show similar asymmetry across the first game in a series and all other series games (Table 6). For the OpeningTotal, there is a small effect, statistically significant only at the 10% level, which indicates that odds makers may act upon this own-lagged macro-level umpire outcome information when it is available after the first game of the series. As with the pooled regressions for the ClosingTotal, both Aig and ΔCig show statistically significant effects, but only for later series games. Again, this indicates that the market acts upon information capturing historical umpire outcomes, but only when that information is available. This is consistent with expectations, providing support that the release of information about umpire assignments is relatively clean for later series games.19 As before, it seems clear, however, that this information is not especially useful in predicting actual total expected runs, based on the pooled TotalScore estimations with each explanatory variable in Table 4.

Of course, the result from our 𝛿ig estimation is somewhat unsatisfying. Not only do ClosingTotal lines not fully integrate 𝛿ig into their values, there is evidence that odds makers’ OpeningTotal lines are completely unaffected by umpire strike zones even when that information is readily available. Given this, we assume that the drift of totals lines is particularly impacted by the ease of accessibility of information, allowing for a subset of sophisticated bettors to place wagers on a favorable side of the OpeningTotal based on umpire information. Access to this type of repeated decision-making behavior therefore provides an advantage to experts who may be able to exploit the lack of integration of 𝛿ig into the OpeningTotal set by odds makers. But our
18 Statistical differences in the umpire strike zone deviation coefficient estimate, 𝛽̂, across models are presented in the last row in Table 5.
19 Indeed, the coefficient estimates for our measures are extremely small and in the opposite direction as would be expected if odds makers or bettors were acting upon this information.
Table 7
Effects of 𝛿ig on line movement.

             (1)                  (2)                  (3)
             LineMove             LineMove             LineMove
Game #       All                  = 1                  > 1
Constant     −0.032∗∗∗ (0.012)    −0.006∗∗∗ (0.022)    −0.045∗∗∗ (0.015)
2009         0.009 (0.015)        −0.015 (0.026)       0.020 (0.018)
2010         −0.007 (0.014)       −0.029 (0.025)       0.003 (0.018)
2011         0.021 (0.014)        0.005 (0.025)        0.029∗ (0.017)
2012         −0.008 (0.014)       −0.036 (0.025)       0.005 (0.018)
2013         −0.016 (0.015)       −0.030 (0.025)       −0.009 (0.018)
2014         −0.022 (0.015)       −0.033 (0.026)       −0.017 (0.018)
Colorado     0.116∗∗∗ (0.020)     0.158∗∗∗ (0.034)     0.095∗∗∗ (0.024)
𝛿ig          −1.249∗∗∗ (0.202)    −0.555 (0.352)       −1.582∗∗∗ (0.246)
N            14,578               14,578               14,578
R²           0.008                0.009                0.008
𝛽 Wald 𝜒²    –                    –                    5.73∗∗
We estimate the change in the line from opening to closing with LineMove as it relates to 𝛿ig for all games (column 1), Game 1 only (column 2), and all other series games only (column 3). Identical estimations using Aig, ΔOig, and ΔCig are available upon request from the authors. Standard errors are in parentheses. ∗∗∗, ∗∗, and ∗ refer to statistical significance at the 99%, 95%, and 90% levels, respectively.
estimates suggest that only certain individuals have been able to capitalize on this inefficiency, such that lines do not fully drift to expectations indicated in the TotalScore estimations. Further, the lack of statistical difference in Table 5 between the Game 1 coefficient and the coefficient for Games 2 and later, due to rather large standard errors, may indicate some information leakage to bettors regarding umpire assignments shortly before totals lines close for betting.

We therefore focus on changes to the totals line from opening to closing, LineMove, to more directly establish how lines move across these two game types and establish any substantive differences in the magnitude of drift across game types. The LineMove estimations directly evaluate the difference-in-differences (DD) by game type and totals type, and can be directly calculated from the 𝛽 values in Table 5. This calculation is equivalent to $\beta_{\text{ClosingTotal},G1} - \beta_{\text{OpeningTotal},G1}$ and $\beta_{\text{ClosingTotal},G2+} - \beta_{\text{OpeningTotal},G2+}$ for Game 1 and Games 2 through 4, respectively, and a test of the differences in these coefficients defines the DD effect.
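The calculation described above can be carried out directly with the 𝛿ig coefficients reported in Table 5. The short sketch below is illustrative only (the variable names are not from the paper); the two drift terms it produces coincide with the 𝛿ig coefficients reported for Game 1 and for later games in Table 7.

```python
# Coefficients on delta from Table 5, by dependent variable and game type.
beta = {
    ("OpeningTotal", "game1"): -0.202,
    ("OpeningTotal", "later"): 0.235,
    ("ClosingTotal", "game1"): -0.757,
    ("ClosingTotal", "later"): -1.347,
}

# Drift from opening to closing attributable to delta, by game type.
drift_game1 = beta[("ClosingTotal", "game1")] - beta[("OpeningTotal", "game1")]
drift_later = beta[("ClosingTotal", "later")] - beta[("OpeningTotal", "later")]

# Difference-in-differences: extra drift when the umpire is known.
dd_effect = drift_later - drift_game1

print(round(drift_game1, 3))  # -0.555, Table 7 column 2
print(round(drift_later, 3))  # -1.582, Table 7 column 3
print(round(dd_effect, 3))    # -1.027
```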
5.4. Totals line drift and umpire tendencies

This final test identifies whether the drift from OpeningTotal to ClosingTotal moves in the direction predicted by 𝛿ig for a given game, and whether this varies by series game number, as defined by the DD exposition in the previous section.20 We estimate this DD by using LineMove as our dependent variable to simplify the analysis and directly interpret 𝛽 as the DD effect of 𝛿ig with respect to drift from OpeningTotal to ClosingTotal. These results are presented in Table 7. Column 1 in this table presents a pooled effect of 𝛿ig on LineMove for all games, similarly downward biased as in previous pooled estimations due to the inclusion of Game 1 observations, where this information should not be available. Columns 2 and 3 in Table 7 present the effect of 𝛿ig for the Game 1 sample and the Games 2 through 4 sample, respectively. For Game 1, there is no statistically significant effect of 𝛿ig on the drift in totals lines from wagers between opening and closing of betting. However, there is a statistically significant effect for Games 2 through 4. The difference in the magnitude of these coefficients is statistically significant, as indicated by the Wald 𝜒² statistic presented in the bottom row of Table 7.

As in the previous section, we propose that this drift is likely due to a small number of expert bettors with access to micro-level behavioral information who place wagers on one side of the OpeningTotal, which moves the line slightly to avoid overexposure by odds makers. However, as the drift is substantially different from the expected effect on actual scoring (TotalScore), it appears that the volume of bettors acting on this information is limited. Given this result, we test the propensity for this information to provide a market advantage to those bettors with the capability to make simple bets based on this additional information.
20 We estimated identical regressions using Aig, ΔCig, and ΔOig as before. The results of these estimations are available upon request from the authors.
6. Economic efficiency

6.1. A simple betting strategy

Although our models find statistical inefficiency with respect to umpire assignment, this does not imply that betting based on this information alone would be profitable in the totals market. In other words, if the size of our effect is not large enough, then a bet is not actionable. We therefore exhibit a simple betting strategy in which we place bets on a limited number of games chosen based only on umpire favorability in their ball-strike calls. Further, the vigorish attached to the bet, or fee paid to the book maker in the case that the bet loses, can remove any inefficiency in the probabilistic outcome of bets placed. While the standard vigorish for a bet across a variety of sports is −110 (bet $110 to win $100), this is often not the norm in MLB totals wagering. Due to the relatively low number of total runs scored, book makers are less likely to move the actual betting line (e.g., over or under 8 total runs scored), and instead more regularly alter the price of betting on a specific side. Depending on the magnitude of betting volume and degree of bookmaker liability, this subsequently makes certain bets more expensive (e.g., −130 requires risking $130 to win $100), while other bets may have prices that return more than the amount wagered (e.g., +130 requires risking $100 to win $130). Therefore, the evaluation of profitability from any statistical inefficiency is slightly more involved and requires knowing the price attached to each betting line.21

Our evaluation requires directly simulating a betting strategy on existing games according to their respective betting lines and prices. We illustrate both aggressive and conservative estimates of our returns by demonstrating how the approach fares when betting against either the OpeningTotal or the ClosingTotal.
Betting against the OpeningTotal presumably provides the highest expected returns, consistent with the finding that the market moves betting lines in the correct direction with respect to umpire favorability. Betting at the close of the market represents a more conservative return for our approach, as the market has at least partially adjusted against any betting lines that may have initially been mispriced.

For the ClosingTotal evaluation, we use the given betting lines and prices from the Sports Insights data, allowing a direct calculation of returns under our betting rule(s). However, our OpeningTotal data do not include the accompanying opening price. We therefore simulate returns against three different common prices, all of which likely put our estimates at a disadvantage relative to actual prices. Specifically, we evaluate our returns against the OpeningTotal using prices −110, −115, and −120 for both sides of all bets. This assumes that payouts are consistently favorable to the odds maker with respect to price (i.e., bet $110 to win $100, rather than bet $100 to win $110). We therefore propose that any returns estimated under these conditions are conservative. For example, in our sample, approximately 39.4% of prices for betting the over on the ClosingTotal are positive, while the average non-positive price is −110.74. On under bets, approximately 30.9% of prices are positive, with an average non-positive price of −112.03.²² Under the assumption that prices are similarly distributed for OpeningTotal, the estimates using −115 and −120 for all bets are likely to greatly underestimate our returns.
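To make the role of the assumed prices concrete, the following is a minimal sketch (not part of the original analysis) of the win probability at which a bet at a given American-style price has zero expected profit, ignoring pushes. The function name `break_even_probability` is hypothetical, and the calculation is the standard break-even identity rather than anything estimated in this paper.

```python
def break_even_probability(price: int) -> float:
    """Win probability at which a bet at an American-style price breaks even.

    Assumption (illustrative only): pushes are ignored, a negative price means
    risking |price| to win 100, and a positive price means risking 100 to win price.
    """
    if price == 0:
        raise ValueError("price must be non-zero")
    if price < 0:
        risk, win = float(-price), 100.0
    else:
        risk, win = 100.0, float(price)
    # Zero expected profit: p * win = (1 - p) * risk
    return risk / (risk + win)


if __name__ == "__main__":
    for price in (-110, -115, -120, 130):
        print(price, round(break_even_probability(price), 4))
```

Under this sketch, a flat −120 price requires winning roughly 54.5% of non-push bets to break even, which is why the −115 and −120 assumptions are demanding relative to −110.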
We use a simple betting rule in which we place bets on games from 2009 through 2014, restricted to games that are not the first in a series, if an umpire’s 𝛿 ig is below the 5th percentile or above the 95th percentile of the distribution of 𝛿 ig over the past 180 days (approximately one full season).²³ If 𝛿 ig is below the 5th percentile of this distribution (much less likely to call strikes than average), we place a bet that the TotalScore will exceed the ClosingTotal or the OpeningTotal (over bet). Alternatively, if 𝛿 ig is above the 95th percentile of this distribution (much more likely to call strikes than average), we place a bet that the TotalScore will be less than the ClosingTotal or the OpeningTotal (under bet). Using this rule, we identify 777 favorable bets based only on 𝛿 ig; 322 bets are identified as favorable on the over, while 455 bets are identified as favorable on the under. If our bet matches the outcome, we consider this a winning bet, and if the outcome is opposite our bet, we consider this a losing wager. If the TotalScore is exactly equal to the ClosingTotal or the OpeningTotal, this is considered a push, and our entire bet is returned, as is standard practice in the betting market.

In betting against the ClosingTotal, we also attempt to reduce uncertainty in our bets by conditioning a wager based on the volatility of 𝛿 ig for an individual umpire over the last 90 days. Here, we only place a bet if the umpire working the game has a standard deviation of 𝛿 ig less than the median standard deviation of 𝛿 ig over the past 180 days of observations (𝜎_𝛿ig < 𝜎_p50). In other words, we only place a wager if the umpire’s individual volatility entering the game is in the lower half of the distribution of volatility of all umpires. This strategy results in fewer total bets (397) due to the more conservative conditions for the wagering choice, with a larger proportion of bets being placed on the under.
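The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `choose_bet` and all argument names are hypothetical, the 𝛿 ig values and umpire standard deviations are assumed to be computed already, and NumPy's default (linear interpolation) percentile is used because the text does not say how percentiles were computed.

```python
import numpy as np


def choose_bet(delta, trailing_deltas, umpire_sd=None, trailing_sds=None):
    """Return 'over', 'under', or None for one game.

    delta           : the assigned umpire's strike-calling tendency entering the game
    trailing_deltas : tendencies observed over the trailing window (e.g., 180 days)
    umpire_sd       : optional volatility of this umpire's tendency
    trailing_sds    : optional volatilities of all umpires over the trailing window
    All names are hypothetical; this is only a sketch of the rule in the text.
    """
    lower, upper = np.percentile(trailing_deltas, [5, 95])

    # Optional "spread" condition: bet only on umpires whose volatility is
    # below the median volatility across umpires.
    if umpire_sd is not None and trailing_sds is not None:
        if not umpire_sd < np.median(trailing_sds):
            return None

    if delta < lower:   # calls far fewer strikes than average -> more runs expected
        return "over"
    if delta > upper:   # calls far more strikes than average -> fewer runs expected
        return "under"
    return None


if __name__ == "__main__":
    window = np.linspace(-1.0, 1.0, 201)  # made-up trailing distribution
    print(choose_bet(-0.95, window))      # over
    print(choose_bet(0.95, window))       # under
    print(choose_bet(0.0, window))        # None
```

Passing `umpire_sd` and `trailing_sds` switches on the more conservative level-and-spread condition; omitting them applies the level condition alone.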
For all wagers, we use a flat baseline bet of $100, adjusted by the price if it is negative (we bet $110 for a price of −110, $115 for a price of −115, and so on). In the event of a winning bet, we either collect the payout established by the betting price plus our original bet if that price is positive, or collect our original bet plus $100 if the price is negative. For example, for a winning bet with a price of +110, we would receive our original bet plus $110 in winnings, or $100 + $110 = $210. In this case, our net payout is $110. Alternatively, if the price on a winning bet was −110, we would place a $110 bet to win $100, and receive $110 + $100 = $210. In this case, our net payout is $100. In either scenario, in the event of a push, our original bet amount is refunded and we receive $100 or $110, respectively. In the case of a loss, where the price is +110, we lose our original wager of $100. Similarly, in a losing wager with a price of −110, we would lose our original bet of $110.
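The staking and payout conventions in the preceding paragraph can be written out as a small sketch. The function names `stake` and `net_payout` are hypothetical, and the code simply restates the worked examples above: a flat $100 baseline, scaled up when the price is negative.

```python
def stake(price: int, base: float = 100.0) -> float:
    """Amount risked: the base bet, scaled up when the price is negative."""
    return base * (-price) / 100.0 if price < 0 else base


def net_payout(price: int, outcome: str, base: float = 100.0) -> float:
    """Net result of one bet, excluding the returned stake.

    outcome is 'win', 'loss', or 'push' (hypothetical labels for this sketch).
    """
    if outcome == "push":
        return 0.0                      # the stake is refunded in full
    if outcome == "win":
        # Positive price: win `price` per 100 risked; negative price: win the base amount.
        return base * price / 100.0 if price > 0 else base
    if outcome == "loss":
        return -stake(price, base)      # lose whatever was risked
    raise ValueError(f"unknown outcome: {outcome!r}")


if __name__ == "__main__":
    for price in (110, -110):
        print(price, stake(price), [net_payout(price, o) for o in ("win", "push", "loss")])
```

For a price of +110 this gives a stake of $100 and net results of $110, $0, and −$100 for a win, push, and loss; for −110 it gives a stake of $110 and net results of $100, $0, and −$110.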
²¹ We note, however, that maximum profitability for the odds maker does not necessarily require an even book (Levitt, 2004; Paul and Weinbach, 2008).
²² We note that bets with positive prices tend to be smaller in absolute value, at +106.5 and +105.6 for over bets and under bets, respectively.
²³ We begin our test starting in July of 2009, since this is the first full season of games allowing the creation of a 180 day lagged distribution of 𝛿 ig.
Table 8
Results from a simple betting strategy.

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Line | OpeningTotal | OpeningTotal | OpeningTotal | ClosingTotal | ClosingTotal |
| Betting Rule | Level | Level | Level | Level | Level & Spread |
| Price Assumption | −110 Line | −115 Line | −120 Line | Actual | Actual |
| Average Bet | $110.00 | $115.00 | $120.00 | $108.38 | $108.40 |
| Max Bet | $110.00 | $115.00 | $120.00 | $139.00 | $139.00 |
| Min Bet | $110.00 | $115.00 | $120.00 | $100.00 | $100.00 |
| Total Bets | 777 | 777 | 777 | 777 | 397 |
| Over Bets | 322 | 322 | 322 | 322 | 132 |
| Under Bets | 455 | 455 | 455 | 455 | 265 |
| Over Success | 158 | 158 | 158 | 155 | 64 |
| Under Success | 258 | 258 | 258 | 248 | 149 |
| Pushes | 39 | 39 | 39 | 49 | 21 |
| Success Rate | 56.4% | 56.4% | 56.4% | 55.4% | 56.6% |
| Under Success Rate | 60.0% | 60.0% | 60.0% | 58.5% | 59.8% |
| Over Success Rate | 51.3% | 51.3% | 51.3% | 51.0% | 50.4% |
| Avg. Net Payout | $8.37 | $6.19 | $4.01 | $8.00 | $10.57 |
| Avg. Under Payout | $16.00 | $14.00 | $12.00 | $14.38 | $16.87 |
| Avg. Over Payout | -$2.27 | -$4.71 | -$7.14 | -$0.89 | -$1.77 |

Columns 1, 2, and 3 present results from betting on the OpeningTotal with the assumption of a flat −110, −115, or −120 line for all bets, respectively, using the level condition for a bet (i.e., a value of 𝛿 ig below the 5th or above the 95th percentile). Columns 4 and 5 present ClosingTotal betting simulation results using actual closing prices for the bet. Results presented in column 4 use the same betting rule as columns 1–3. Results presented in column 5 use the condition that the umpire’s 𝛿 ig volatility was below the median (50th percentile) over the last 180 days.
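As an arithmetic check on the flat-price columns (1–3) of Table 8, the sketch below recomputes the overall success rate and average net payout from the reported counts (777 bets, 416 wins, 39 pushes). The function name `flat_price_summary` is hypothetical. It assumes that averages are taken over non-push bets; the text does not state that denominator for payouts, but it is the one that reproduces the reported figures.

```python
def flat_price_summary(wins: int, pushes: int, total_bets: int, price: int):
    """Success rate and average net payout when every bet carries the same negative price.

    Assumptions (illustrative): each win returns $100 net, each loss costs |price|,
    pushes return the stake, and averages are taken over non-push bets.
    """
    if price >= 0:
        raise ValueError("this sketch only covers negative (flat) prices")
    risk = float(-price)
    decided = total_bets - pushes          # bets that were not pushes
    losses = decided - wins
    net_total = wins * 100.0 - losses * risk
    return wins / decided, net_total / decided


if __name__ == "__main__":
    for price in (-110, -115, -120):
        rate, avg = flat_price_summary(wins=416, pushes=39, total_bets=777, price=price)
        print(price, f"{rate:.1%}", f"${avg:.2f}")
```

With the counts from Table 8, this reproduces the 56.4% success rate and the average net payouts of $8.37, $6.19, and $4.01 for the −110, −115, and −120 assumptions.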
6.2. Results of betting simulation

Columns 1, 2, and 3 in Table 8 present outcomes betting against the OpeningTotal. The simulation in each column assumes a price of −110, −115, or −120, respectively, for each side of a bet. For example, in column 3, betting on either the over or under carries a price of −120 and would require risking $120 to win $100. Of the bets against the OpeningTotal, 416 were wins and 39 were pushes. After removing the pushes, this results in a success rate of 56.4%. The average net payout is $8.37 per bet assuming a bet price of −110, which decreases as the take of the bookmaker increases to −115 and −120, respectively, on losing bets. However, even under the strong assumption that each side of the bet is priced at −120, this approach still results in an average net payout of $4.01 per bet.

We also break out our success rates and payouts when betting on the over or the under. It is clear that profits from this strategy come largely from bets on the under, with an average net payout of $12.00 to $16.00 per bet, and a success rate against the OpeningTotal of 60.0% on under bets. For over bets, there is a loss between $2.27 and $7.14 per bet, substantially reducing the overall return. The dichotomy in these returns indicates that our strategy of identifying umpire strike zone tendencies is more favorable to the under. It is well known that bettors are more likely to place bets on the over, and this likely results in more favorable lines and prices on under bets. For example, within our data, the TotalScore exceeds the OpeningTotal only 46.2% of the time, and is less than the OpeningTotal 49.0% of the time (4.7% of games result in a push).

For the ClosingTotal bets, the average wager was $108.38, lower than assumed for any of the OpeningTotal evaluations (column 4).
Despite these more favorable prices, the overall net payouts per bet ($8.00) and success rates (55.4%) were lower against the ClosingTotal than in our most favorable OpeningTotal assumption. Using actual prices does, however, improve upon the unfavorable OpeningTotal price assumptions of −120 and −115.

Lastly, we evaluate our more conservative bets, conditioning on both the level and spread of umpire 𝛿 ig, against the ClosingTotal and actual closing prices, presented in column 5 of Table 8. With this betting rule, our overall success rate increases to 56.6% and the average payout is $10.57 per bet, or about 9.75% of the average wager. As before, these gains come largely from betting on the under, with a 15.6% return on under bets.

Given the results of our simplistic betting strategy, it appears that particularly informed bettors could act on information about umpires in later games in a series. However, it is important to note that our analysis ended after the 2014 season, and adjustments to this type of decision-making data may have taken place since, particularly with a proliferation of the accessibility of the data from MLB itself and associated cost reductions in obtaining the necessary information. Nevertheless, it seems clear that inefficiencies in the market have existed, and that odds makers did not immediately integrate this information into their totals market prices when it became available during the time of our data set.

7. Summary & conclusions

Our inquiry addresses whether data capturing repeated micro-level decision making of influential individuals is integrated into asset pricing markets. We use a unique natural experiment to identify the timing of information release and test whether this information is implicit in MLB over/under betting lines set by odds makers for games in which the information is available.
Estimates reveal that while there is statistically significant drift toward efficiency in the totals market with respect to this information, economic inefficiencies remain to be taken advantage of by informed bettors. The partial movement of the line in the correct direction is consistent with previous work (Gandar et al., 2000; Smith et al., 2009) in which sophisticated betting behavior tends to provide additional information on top of what odds makers provide at market open. However, this result is in contrast to Croxson and Reade (2014), who find immediate and full adjustment when new information is released, albeit in a different sport (soccer) and different type of market (prediction markets) than addressed here. We also find that aggregated data capturing micro-level decision-making provides advantages in estimating future outcomes relative to aggregated data capturing more salient lagged outcomes tied to the decision making agent. We presume that the small movement in the line that exists is driven by asymmetry in volume, but note that limited betting volume on some games could be preventing lines from moving to full efficiency (Akbas et al., 2016).

We highlight that the setting for our inquiry is somewhat specific, where MLB umpires have a substantial amount of control over game outcomes. Despite this uniqueness, the proliferation of data on individual decisions may also have important effects on valuing assets and labor productivity beyond the sports world. For example, recent work on surgeon performance found that evaluation of underlying surgical behaviors and skills was more predictive of future complications than own-lagged surgical outcomes (Birkmeyer et al., 2013).
Persistence of information in various markets and games has also recently been addressed theoretically by Peski and Toikka (2017), and others have found that under- and overreaction depend on system stability and signal precision (Massey and Wu, 2005). Similar results may hold for other markets, where influential agents make various decisions, or regularly behave in persistent, measurable ways, that may be exploited through accessible individual data. Future work would be well-served to explore the importance of underlying persistence in behavior as a useful measure of asset value and future expected returns, and the role of reduced costs of obtaining this information in aiding expert advantages.

Statement of competing interests

The authors did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Acknowledgements

We would like to thank Charlie Brown, Stefan Szymanski, Thomas Peeters, Jason Winfree, Rodney Fort, Daniel Stone, Tony Krautmann, Brian Soebbing, Rodney Paul, Michael Lopez, conference participants at the Western Economic Association meetings, and seminar participants at the University of Michigan and Bowdoin College for helpful comments on past versions of this work.

Appendix A. Supplementary data

Supplementary data related to this article can be found at https://doi.org/10.1016/j.finmar.2018.07.002.

References

Akbas, F., Armstrong, W.J., Sorescu, S., Subrahmanyam, A., 2016. Capital market efficiency and arbitrage efficacy. J. Financ. Quant. Anal. 51, 387–413.
Bahill, T.A., Baldwin, D.G., Ramberg, J.S., 2009. Effects of altitude and atmospheric conditions on the flight of a baseball. Int. J. Sports Sci. Eng. 3, 109–128.
Ball, R., Brown, P., 1968. An empirical evaluation of accounting income numbers. J. Account. Res. 6, 159–178.
Berkowitz, J.P., Depken, C.A., Gandar, J.M., 2015. Information and accuracy in pricing: evidence from the NCAA men’s basketball betting market. J. Financ. Market. 25, 16–32.
Bernard, V.L., Thomas, J.K., 1989. Post-earnings-announcement drift: delayed price response or risk premium? J. Account. Res. 27, 1–36.
Birkmeyer, J.D., Finks, J.F., O’Reilly, A., Oerline, M., Carlin, A.M., Nunn, A.R., Dimick, J., Banerjee, M., Birkmeyer, N.J.O., 2013. Surgical skill and complication rates after bariatric surgery. N. Engl. J. Med. 369, 1434–1442.
Braun, S., Kvasnicka, M., 2013. National sentiment and economic behaviour: evidence from online betting on European football. J. Sports Econ. 14, 45–64.
Brown, A., Yang, F., 2016. Limited cognition and clustered asset prices: evidence from betting markets. J. Financ. Market. 29, 27–46.
Brown, K.C., Harlow, W.V., Tinic, S.M., 1988. Risk aversion, uncertain information, and market efficiency. J. Financ. Econ. 22, 355–385.
Brunnermeier, M.K., 2005. Information leakage and market efficiency. Rev. Financ. Stud. 18, 417–457.
Chen, D.L., Moskowitz, T.J., Shue, K., 2016. Decision making under the gambler’s fallacy: evidence from asylum judges, loan officers, and baseball umpires. Q. J. Econ. 131, 1181–1242.
Croxson, K., Reade, J.J., 2014. Information and efficiency: goal arrival in soccer betting. Econ. J. 124, 62–91.
Daniel, K., Titman, S., 2006. Market reactions to tangible and intangible information. J. Financ. 61, 1605–1643.
Dare, W.H., Dennis, S.A., Paul, R.J., 2015. Player absence and betting lines in the NBA. Financ. Res. Lett. 13, 130–136.
Davis, N., Lopez, M., 2015. Umpires Are Less Blind than They Used to Be. FiveThirtyEight, https://fivethirtyeight.com/features/umpires-are-less-blind-than-they-used-to-be/.
De Bondt, W.F.M., Thaler, R., 1985. Does the stock market overreact? J. Financ. 40, 793–805.
Della Vigna, S., Pollet, J.M., 2009. Investor inattention and Friday earnings announcements. J. Financ. 64, 709–749.
Dornbusch, R., 1980. Exchange rate economics: where do we stand? Brookings Pap. Econ. Activ. 1, 143–185.
Fama, E.F., 1970. Efficient capital markets: a review of theory and empirical work. J. Financ. 25, 383–417.
Fama, E.F., 1998. Market efficiency, long-term returns, and behavioral finance. J. Financ. Econ. 49, 283–306.
Fama, E.F., Fisher, L., Jensen, M.C., Roll, R., 1969. The adjustment of stock prices to new information. Int. Econ. Rev. 10, 1–21.
Feddersen, A., Humphreys, B.R., Soebbing, B.P., 2017. Sentiment bias and asset prices: evidence from sports betting markets and social media. Econ. Inq. 55, 1119–1129.
Gandar, J.M., Zuber, R.A., Dare, W.H., 2000. The search for informed traders in the totals betting market for National Basketball Association games. J. Sports Econ. 1, 177–186.
Gray, P.K., Gray, S.F., 1997. Testing market efficiency: evidence from the NFL sports betting market. J. Financ. 52, 1725–1737.
Green, E.A., Daniels, D.P., 2017. Bayesian Instinct. Social Science Research Network Working Paper, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2916929.
Hakkio, C.S., Pearce, D.K., 1987. The reaction of exchange rates to economic news. Econ. Inq. 23, 621–636.
Hardouvelis, G.A., 1987. Macroeconomic information and stock prices. Econ. Inq. 39, 131–140.
Hirshleifer, D., Hsu, P.H., Li, D., 2013. Innovative efficiency and stock returns. J. Financ. Econ. 107, 632–654.
Hirshleifer, D., Lim, S.S., Teoh, S.H., 2011. Limited investor attention and stock market misreactions to accounting information. Rev. Asset Pricing Stud. 1, 35–73.
Hirshleifer, D., Teoh, S.H., 2003. Limited attention, information disclosure, and financial reporting. J. Account. Econ. 36, 337–386.
Ho, T.S.Y., Michaely, R., 1988. Information quality and market efficiency. J. Financ. Quant. Anal. 23, 53–70.
Huberman, G., Regev, T., 2001. Contagious speculation and a cure for cancer: a nonevent that made stock prices soar. J. Financ. 56, 387–396.
Igan, D., Pinheiro, M., Smith, J., 2015. A study of a market anomaly: “white men can’t jump,” but would you bet on it? J. Econ. Behav. Organ. 113, 13–25.
Jackson, A., Johnson, T., 2006. Unifying underreaction anomalies. J. Bus. 79, 75–114.
Kim, J.W., King, B.G., 2014. Seeing stars: Matthew effects and status bias in Major League Baseball umpiring. Manag. Sci. 60, 2619–2644.
Klibanoff, P., Lamont, O., Wizman, T.A., 1998. Investor reaction to salient news in closed-end country funds. J. Financ. 53, 673–699.
Larsen, T., Price, J., Wolfers, J., 2008. Racial bias in the NBA: implications in the betting market. J. Quant. Anal. Sports 4, 1–19.
Levitt, S.D., 2004. Why are gambling markets organised so differently from financial markets? Econ. J. 114, 223–246.
Massey, C., Wu, G., 2005. Detecting regime shifts: the causes of under- and overreaction. Manag. Sci. 51, 932–947.
Michaely, R., Thaler, R.H., Womack, K.L., 1995. Price reactions to dividend initiations and omissions: overreaction or drift? J. Financ. 50, 573–608.
Mills, B.M., 2014. Social pressure at the plate: inequality aversion, status, and mere exposure. Manag. Decis. Econ. 35, 387–403.
Mills, B.M., 2017a. Policy changes in Major League Baseball: improved agent behavior and ancillary productivity outcomes. Econ. Inq. 55, 1104–1118.
Mills, B.M., 2017b. Technological innovations in monitoring and evaluation: evidence of performance impacts among Major League Baseball umpires. Lab. Econ. 46, 189–199.
Moskowitz, T.J., Wertheim, L.J., 2011. Scorecasting: the Hidden Influences behind How Sports Are Played and Games Are Won. Crown Archetype.
Parsons, C.A., Sulaeman, J., Yates, M.C., Hamermesh, D.S., 2011. Strike three: discrimination, incentives, and evaluation. Am. Econ. Rev. 101, 1410–1435.
Paul, R.J., Humphreys, B., Weinbach, A.P., 2014a. Bettor belief in the hot hand: evidence from detailed betting data in the NFL. J. Sports Econ. 15, 636–649.
Paul, R.J., Weinbach, A.P., 2008. Price setting in the NBA gambling market: tests of the Levitt model of sportsbook behavior. Int. J. Sport Financ. 3, 137–145.
Paul, R.J., Weinbach, A.P., Weinbach, C., 2014b. The impact of atmospheric conditions on the baseball totals market. Int. J. Sport Financ. 9, 249–260.
Peng, L., Xiong, W., 2006. Investor attention, overconfidence, and category learning. J. Financ. Econ. 80, 563–602.
Peski, M., Toikka, J., 2017. Value of persistent information. Econometrica 85, 1921–1948.
Poteshman, A.M., 2001. Underreaction, overreaction, and increasing misreaction to information in the options market. J. Financ. 51, 851–876.
Purdum, D., 2016. A Record $1 Billion Bet on MLB in 2016. ESPN, http://www.espn.com/chalk/story/_/id/18220934/mlb-record-1-billion-was-bet-major-league-baseball-nevada-2016.
Sauer, R.D., 2005. The state of research on markets for sports betting and suggested future directions. J. Econ. Financ. 29, 416–425.
Sauer, R.D., Brajer, V., Ferris, S.P., Marr, M.W., 1998. Hold your bets: another look at the efficiency of the gambling market for National Football League games. J. Polit. Econ. 96, 206–213.
Smith, M.A., Paton, D., Williams, L.V., 2009. Do bookmakers possess superior skills to bettors in predicting outcomes? J. Econ. Behav. Organ. 71, 539–549.
Snowberg, E., Wolfers, J., 2010. Explaining the favorite-long shot bias: is it risk-love or misperceptions? J. Polit. Econ. 118, 723–746.
Solar, D., 2017. Should You Fade the Public when Betting MLB Totals? Sports Insights, https://www.sportsinsights.com/blog/should-you-fade-the-public-when-betting-mlb-totals/.
Stein, J., 1989. Overreactions in the options market. J. Financ. 44, 1011–1023.
Tainsky, S., Mills, B.M., Winfree, J., 2015. An examination of potential discrimination among MLB umpires. J. Sports Econ. 16, 353–374.
Trick, M.A., Yildiz, H., 2012. Locally optimized crossover for the traveling umpire problem. Eur. J. Oper. Res. 216, 286–292.
Trick, M.A., Yildiz, H., Yunes, T., 2011. Scheduling Major League Baseball umpires and the traveling umpire problem. Interfaces 42, 232–244.
Vives, X., 1995. Short-term investment and the informational efficiency of the market. Rev. Financ. Stud. 8, 125–160.
Wood, S.N., 2003. Thin-plate regression splines. J. Roy. Stat. Soc. 65, 95–114.
Wood, S.N., 2006. Generalized Additive Models: an Introduction with R. Chapman and Hall/CRC.
Wood, S.N., Goude, Y., Shaw, S., 2015. Generalized additive models for large data sets. Appl. Stat. Ser. C 64, 139–155.