The Impact of Delay Announcements on Hospital Network Coordination and Waiting Times

Jing Dong Northwestern University, [email protected]

Elad Yom-Tov Microsoft Research, [email protected]

Galit B. Yom-Tov Technion—Israel Institute of Technology, [email protected]

We investigate the impact of delay announcements on the coordination within hospital networks using a combination of empirical observations and numerical experiments. We show that patients take delay information into account when choosing emergency service providers and that such information can help increase coordination in the network, leading to improvements in network performance as measured by Emergency Department wait times. Our numerical results indicate that the level of coordination that can be achieved is limited by the patients’ sensitivity to waiting, the load of the system, the heterogeneity among hospitals, and, importantly, the method hospitals use to estimate delays. We show that delay estimators that are based on historical averages may cause oscillation in the system and lead to higher average waiting times when patients are sensitive to delay. We provide empirical evidence which suggests that such oscillations occur in hospital networks in the US.

Key words: Delay Announcements, Emergency Department, Queueing Network Coordination, Join the Shortest Queue, Pooling, Cost of Waiting

1. Introduction

Delay announcements, commonplace in service systems, can be used to influence quality perceptions and customer sentiment towards the service provider. In addition, such announcements can affect customer choices, with follow-on effects on actual system operations. In consequence, delay announcements have over recent years attracted the attention of the operations research and management communities, with research streams dedicated to both understanding the impact of delay announcements and developing methods to support them. Thus far, most research in this area has concentrated on call center announcements, where delay information has been shown to influence (for example) customer abandonment (Mandelbaum and Zeltyn 2013, Yu et al. 2014). In recent years, a growing number of hospitals have begun posting their Emergency Department (ED) waiting times on websites, billboards, and smartphone apps (see for example Figure 1(a)), a trend that evidence suggests is welcomed by consumers. As can be seen in Figure 1(b), the volume
of Google search engine queries for “hospital wait time” and “ER [Emergency Room] wait time” (trends.google.com) has been rising steadily over the past four years. Yet it is unclear whether, and how, such information actually affects patients’ choices—and their downstream effects on hospitals’ performance. Although patients’ primary consideration in selecting an ED is generally its timely provision of treatment, other factors have bearing as well, including the reputation of the hospital, its expertise, limitations imposed by patients’ medical insurance plans, and recommendations by the primary physician (Marco et al. 2012). Given the effort required to provide waiting time information for hospital ED services, a number of questions demand to be addressed. First, do customers actually want and use this information? Second, is the proportion of people who seek such information large enough to have an operational impact on the healthcare system, and on hospital networks in particular? Third, do hospitals provide the right information to help achieve coordination (resource pooling) in the network?

Figure 1: ED wait time Internet query trends and real-time delay announcements. (a) Real-time delay announcement web page of emergency departments in north Texas, USA. (b) Google Trends query volume regarding ED wait times.

Here we examine the effect of ED delay information (i.e., delay announcements or waiting time announcements) on patients’ choice between hospital emergency departments, using a combination of empirical analysis and numerical experiments. We base our analysis on two primary sources of data: real ED delay announcements from more than 200 hospitals in the US over a period of three months; and anonymized queries made to the Bing search engine by people seeking ED delay information during that period. The empirical data provide objective evidence for the use of delay information, including public interest in such delay announcements and the influence of such announcements on the delays themselves.

Drawing on insights from queueing theory, we then study the operational influence of delay announcements on correlations between different hospitals’ waiting times and on future delays. We use a stylized simulation model, calibrated with real data, to investigate how system characteristics such as patients’ sensitivity to waiting, load, and different delay estimators influence the phenomena observed in the data. Throughout the paper we refer to the correlation between two hospitals’ (or EDs’) waiting times as synchronization. Two EDs with correlated waiting times are referred to as synchronized EDs.

1.1. Scientific Background

Delay announcements have a measurable influence on customer satisfaction (Carmon and Kahneman 1996, Larson 1987). However, such announcements can vary widely in their specificity, from vague information on the current load to relatively precise details about the customer’s location in the queue or their expected waiting time. The effects of these messages differ. Munichor and Rafaeli (2007) showed that, in a call center environment, informing customers about their location in the queue results in lower abandonment rates and higher customer satisfaction compared to other waiting time fillers, such as music or apologies. Allon et al. (2011) developed a game theoretic model, based on a strategic service provider and a strategic customer, which provides a theoretical basis for determining how vague or specific delay announcements should be. One of the challenges in implementing detailed delay announcements is producing a credible estimation of the delay. In a series of papers, Ibrahim and Whitt proposed several delay estimators, based on queueing theory, for customers joining a multi-server service system. They considered queuing systems with a time-homogeneous, as well as time-varying, arrival process (Ibrahim and Whitt 2009, 2011). Their proposed estimators are based on a real-time history of the queue, including last-to-enter-service (LES) information (i.e., the delay experienced by the last customer entering service) and head-of-line (HOL) data (the total delay experienced by the customer currently at the head of the line). These estimators perform well in reality, as was shown in Senderovich et al. (2014). Senderovich et al. (2014) used queue mining techniques to solve the on-line delay prediction problem, validating the theory-based queueing predictors with real data. Estimating delays in EDs is substantially more difficult than in call centers because of the inherent complexity and transient nature of these systems. 
ED patients do not wait in a single queue, but instead undergo a process that involves multiple resources (physicians, nurses, labs, etc.) and generally takes 3 to 6 hours to complete. The complexity is even greater when we consider that patients’ arrival rate is time-varying, that patients are prioritized according to severity, and that any given patient’s route is unknown ahead of time. Plambeck et al. (2014) developed a forecasting method for estimating ED delays based on a combination of queueing and machine learning methods. None of the hospitals from which we drew our data use such sophisticated models. Instead,
they publish historic average waiting times using a 4-hour moving average, a measure which has become the convention in US hospitals. As suggested above with respect to call centers, delay announcements influence not only customer satisfaction but also customer waiting costs and, in response, customer actions (Yu et al. 2014). Announcing the expected delay as customers enter the system, especially during heavily loaded periods, may cause customers to balk (leave the system upon arrival) or abandon after a short time (Mandelbaum and Zeltyn 2013, Yu et al. 2014). Delay announcements may thus serve as a means to reduce the load on a service system, in that customers who abandon shorten the waiting time for those customers behind them in the queue. This feedback may make it difficult to analyze the system’s steady-state performance. Armony et al. (2009) explored the effect of delay announcements in an M/GI/s + GI queue, proposing two approximations for the steady-state performance of such systems. Ibrahim et al. (2013) developed delay estimators that took these feedback effects into account. Given the potential influence of delay announcements on customer behavior, they can be used as an operational tool. For example, delay announcements are used in call centers to help customers choose their time of service via a call-back option (Armony and Maglaras 2004). In theme parks, delay announcements can enhance resource allocation by helping customers choose preferred queues (Kostami and Ward 2009). The above-mentioned research investigated the impact of delay announcements on the company which provides the information. However, it is also important to investigate the impact of such announcements on social welfare in a network setting, as is the case when several EDs are located in the same area. In such settings, announcements by one service provider may impact demand at other providers. 
Moreover, in the case of EDs, service providers are not only in competition, but also have incentives to cooperate. On the one hand, hospitals want to attract patients, and EDs are considered the ‘gateway’ to a range of hospital services. On the other hand, the expensive nature of ED services limits their resources and capacity, leading to occasional high congestion and long waiting times, with the potential for diminished quality of care (Chalfin et al. 2007) and increased mortality (Bennidor and Israelit 2015). At such times, therefore, the hospital has an incentive to ease some of the load by reducing the arrival rate. Some hospitals attempt to share the load through ambulance diversions. However, ambulance diversions are inefficient for this purpose because the patients being diverted in this way are those most in need of urgent medical care. What hospitals want is to influence the behavior of those least in need of immediate treatment: the non-acute patient population, which can account for up to 90% of ED visits (Plambeck et al. 2014). Delay information, unlike ambulance diversions, can be used when the ED is crowded to encourage those non-acute patients to respond to long wait times by choosing a different ED or delaying their visit. Assuming patients take such information into account, announcements are thus expected to smooth the demand for hospital services throughout the day and to balance patient loads between nearby hospitals.

Figure 2 compares the simulated sample paths of two systems, each with two hospitals. Figure 2a shows the wait times of two uncorrelated hospitals, where patients randomly choose which hospital to attend. Figure 2b shows the same for two correlated (fully synchronized) hospitals, where patients always choose the hospital with the shorter waiting time. We base our analysis on the observation that in the latter, the wait times of the two hospitals are synchronized. Therefore, in this paper, we identify connections between delay announcements and correlations between workloads at geographically proximate EDs.

Figure 2: Unsynchronized vs. synchronized systems. Sample paths of the waiting times W1(t) and W2(t) (minutes) against t (hours). (a) Two independent (uncorrelated) queues. (b) Two fully synchronized (correlated) queues.

From a theoretical point of view, hospitals are akin to independent agents, meaning that they do not have a social planner that balances the load between them. However, even some amount of coordination, for example a system that enables customers to always Join the Shortest Queue (JSQ), improves the efficiency of the network to almost the level of a fully pooled system (Foley and McDonald 2001). This policy is advantageous even if only a small fraction of customers follow the JSQ policy (Turner 2000). Hence, one of the operational rationales for providing delay announcements is that they may improve the efficiency of the hospital network even if not all patients behave strategically or are sensitive to waiting.

This paper uses Internet search engine logs to explore correlations between waiting time observations and people’s choice of emergency departments. Search engine queries have been shown to reflect people’s activities in the physical world as well as the virtual one. For example, Ofran et al. (2012) found a high correlation between the number of searches for specific types of cancer and their population incidence. Similarly, a high correlation was observed between the number of searches for certain medications and the number of prescriptions written for those drugs (Yom-Tov and Gabrilovich 2013). Additionally, search engine logs have been used to monitor influenza activity (Polgreen et al. 2008), to examine the association between online exposure to underweight celebrities and the development of eating disorders (Yom-Tov and boyd 2014), and to discover potential adverse effects of certain medicines (Yom-Tov and Gabrilovich 2013). In the context of emergency departments, aggregated search logs were used to predict ED visits (Ekström et al. 2014).

1.2. Goals and main contributions

We make a number of contributions. Specifically, our empirical results show that:
• Patients explore delay information provided by hospitals, and do so at a growing rate.
• The number of customers who take delay information into account when selecting ED providers is sufficient to have a pooling effect on the hospital network, as evidenced by the significant effect of the number of delay-reporting hospitals per unit area and the number of queries about them in explaining synchronization levels in hospital networks. The distance between hospitals also influences synchronization: the greater the distance between two hospitals, the lower their effect on one another.

Our numerical results show that:
• Synchronization between two hospitals is influenced by how sensitive customers are to delay, the load of the system, the scale (size) of the system, and the hospitals’ heterogeneity in terms of customer preferences and size.
• If the appropriate method is used for delay estimation (and loads are fairly balanced), the social welfare of patients in this network increases. That is, delay announcements will reduce average waiting times for all hospitals in the network.
• The accuracy of the waiting time estimator, as well as the lag of the estimator in reflecting true waiting times, have a profound impact on the effectiveness of delay announcements. We find that the commonly used method of a 4-hour moving average could be problematic: it can cause large oscillations of the load between hospitals when customers are very sensitive to delay. Our empirical analysis supports this finding and suggests that small differences in wait times between geographically proximate hospitals translate, on average, to large future differences and vice versa.
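
The lag built into a moving-average estimator can be illustrated with a minimal sketch. It assumes, purely for illustration, that the reported wait is the plain mean of the last 16 samples taken every 15 minutes (a 4-hour window); the function name `moving_average_reports` and the step-change scenario are hypothetical and are not taken from the hospitals' actual reporting systems.

```python
from collections import deque


def moving_average_reports(true_waits, window):
    """Return, for each sample, the mean of the last `window` true waits."""
    recent = deque(maxlen=window)  # sliding window of the latest samples
    total = 0.0
    reports = []
    for wait in true_waits:
        if len(recent) == window:
            total -= recent[0]  # value the deque is about to drop
        recent.append(wait)
        total += wait
        reports.append(total / len(recent))
    return reports


# Hypothetical scenario: the true wait is 10 minutes for 4 hours,
# then jumps to 40 minutes and stays there (one sample every 15 minutes).
SAMPLES_PER_HOUR = 4
true_waits = [10.0] * (4 * SAMPLES_PER_HOUR) + [40.0] * (5 * SAMPLES_PER_HOUR)
reports = moving_average_reports(true_waits, window=4 * SAMPLES_PER_HOUR)

one_hour_after_jump = reports[4 * SAMPLES_PER_HOUR + 3]
four_hours_after_jump = reports[4 * SAMPLES_PER_HOUR + 15]
print(one_hour_after_jump, four_hours_after_jump)
```

In this toy scenario the report still reads 17.5 minutes an hour after the true wait has become 40 minutes, and it only catches up once a full window has passed. Patients who react to the report are therefore reacting to stale information.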

2. Empirical study of announcement impact

In this section, we draw on the data described below for objective indicators that patients indeed use delay information in choosing emergency service providers, and that this has a profound influence on network coordination. We also explore how sensitive patients are to differences between waiting times at different hospitals.

2.1. Data sources used in this research

First, we identified 211 US hospitals which published their waiting times using RSS feeds as of March 2013. We collected these waiting times every 5 minutes by polling the hospitals’ RSS feeds between March and June 2013 (inclusive). The waiting times refer to the time expected to elapse from when the patient enters the ED to when he/she is first seen by a medical provider. No real-time information is provided on the patient’s length of stay or classification according to severity. All the hospitals included in the research use the same estimation method, namely, a moving average over a 4-hour time window. All the hospitals explain on their websites that the reported wait times do not apply to urgent patients, as these patients are prioritized according to their medical condition. Hence, hospitals urge patients to ignore this information if they face an immediate threat to their lives. From our data it appears that this information is updated every 15 minutes.

To estimate the total number of hospitals in each area we obtained a list of 36,438 hospitals using the Bing local search application.

Finally, we extracted all queries made using the Bing search engine between March and June 2013 (inclusive) which resulted in a visit to the web page of one or more hospitals on our list. We assume that all visitors to these pages noticed the wait times, which are prominently displayed. Each query contained an anonymized user identifier, a time stamp, user location details (GPS information for mobile users and zip-code information for other users), the query text, and the pages which were clicked as a result of the query.

2.2. Results

As a preliminary step, we clustered the 211 hospitals into three groups according to their wait times, using the K-means algorithm. The resulting clusters partition the hospitals into low wait, medium wait, and high wait times. Figure 3 shows the average wait times for the three groups over a 3-week period, at 5-minute resolution. The figure thus demonstrates daily patterns as well as the variation in waiting times among the three groups. In most of the empirical analyses, we want to exclude synchronization that is due to diurnal patterns. Hence, for each hospital, we removed the hourly and daily waiting time trends from the observed waiting times in the following manner. First, we computed the average wait times for each hospital at each hour of the day and each day of the week. Then, for each hospital, a linear predictor was trained to predict waiting times using the appropriate hourly and daily waiting times, and the predicted waiting times were removed from the observed waiting times. We refer to the resulting detrended waiting times as Residual Waiting Times (RWT). The average ratio between the weight of the hourly trend and the weight of the daily trend was 0.999, indicating that the terms had almost identical effects. Together, these two variables explained 85.0% of the variance of the original waiting times.
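
The detrending step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the linear predictor has an intercept plus one weight for the hour-of-day average and one for the day-of-week average, it fits by ordinary least squares, and it runs on synthetic data. The helper names (`group_means`, `solve_linear`, `residual_waiting_times`) are hypothetical.

```python
import math
import random
from collections import defaultdict
from datetime import datetime, timedelta


def group_means(keys, values):
    """Mean of `values` within each distinct key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, value in zip(keys, values):
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual_waiting_times(timestamps, waits):
    """Detrend one hospital's waits; returns (residuals, regression weights)."""
    hours = [ts.hour for ts in timestamps]
    days = [ts.weekday() for ts in timestamps]
    hour_mean = group_means(hours, waits)
    day_mean = group_means(days, waits)
    # Predictors (assumed form): intercept, hour-of-day average, day-of-week average.
    rows = [(1.0, hour_mean[h], day_mean[d]) for h, d in zip(hours, days)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, waits)) for i in range(3)]
    weights = solve_linear(xtx, xty)  # least squares via the normal equations
    residuals = [y - sum(w * x for w, x in zip(weights, r)) for r, y in zip(rows, waits)]
    return residuals, weights


# Synthetic hourly data for four weeks (NOT the data used in the paper).
rng = random.Random(1)
start = datetime(2013, 3, 4)
timestamps = [start + timedelta(hours=k) for k in range(24 * 28)]
waits = [
    30 + 10 * math.sin(2 * math.pi * ts.hour / 24)
    + (5 if ts.weekday() >= 5 else 0) + rng.gauss(0, 2)
    for ts in timestamps
]
rwt, weights = residual_waiting_times(timestamps, waits)
```

The residuals `rwt` play the role of the RWT for a single hospital; correlating the residual series of two hospitals then measures synchronization net of the shared daily and weekly cycles.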

Figure 3: Average waiting times (wait time [min] against date, 12/05 to 26/05) for the three groups of hospitals (low, medium, and high wait times) at 5-minute resolution.

Figure 4 plots the RWT of two hospital pairs. As the figure suggests, some ED pairs are more synchronized than others. The first factor to influence synchronization is distance: we expect geographically close hospitals to be more synchronized than those which are geographically distant. Indeed, the distance between hospitals is negatively correlated with the level of synchronization between them (Spearman, ρ = −0.138, P < 10^−5).

Observation 1. Distance influences synchronization, such that synchronization is stronger for closer hospitals. The greater the distance between two hospitals, the lower the correlation between their wait times.

This geographical synchronization could be attributed to several factors, including information provided to ambulance services. What we seek to understand is whether announcements of anticipated delays for the general population contribute to this synchronization, and to what extent.

Figure 4: Waiting times announced by different EDs (wait time [min] against date, 12/05 to 26/05). (a) Highly synchronized EDs (32 km apart). (b) Poorly synchronized EDs (1257 km apart).

We note in passing that wealth does not seem to be associated with more hospitals publishing their wait times. For each of the 36,438 hospitals identified in the Bing local search application, we used IRS data to extract the 2012 adjusted gross income for the closest zip code. Using these data we found that in locations where wait times are published, the median income is the same to within 0.5% (P = 0.017, ranksum test).

2.2.1. Do patients use delay information when selecting or deciding to use ED services?

As noted above, the past several years have seen steady growth in the number of people searching online for delay information. To investigate whether and how patients use this delay information, and to understand its influence on synchronization, we model the level of synchronization between clusters of geographically close hospitals (less than 20 km apart) using five variables. To define ‘geographically close hospitals’, we clustered the hospitals which published wait times according to their geographic location by performing Agglomerative Hierarchical Clustering (Duda et al. 2001) with closest-link aggregation until no hospitals were found within 20 km. This resulted in 46 clusters. Then, we trained a regression tree (see Figure 5) to predict the average correlation of RWT within a cluster and modeled each cluster using the following variables:
1. Number of EDs reporting wait times per square km within a cluster.
2. Number of hospitals per square km (whether reporting wait times or not) within a cluster.
3. Number of queries made to the Bing search engine about the reporting hospitals within the cluster.
4. Average wait times of the hospitals within the cluster.
5. Number of pediatric EDs within the cluster (some EDs do not cater to children, or do so only during certain hours).

Applying the leave-one-out cross-validation method to estimate the performance of the model, we found that the Spearman correlation between predicted and actual average RWT correlations was ρ = 0.397 (P = 0.006), indicating that the variables provide good predictive power for the dependent variable, i.e., the synchronization level as measured by RWT correlations.

Observation 2. Customers seem to take delay information into account in their decision making. As is evident from Figure 5, the number of delay-reporting hospitals per unit area and the number of queries about waiting times are significant indicators of hospital network synchronization.
The higher the number of reporting hospitals per unit area, and the greater the number of queries, the higher the correlations observed. The existence of a pediatric ward is also associated with greater synchronization.

Figure 5: Regression tree for predicting in-cluster RWT correlations. Decision nodes show the splitting variable and the average correlation at the node (in parentheses). Leaf nodes show the average correlation at the node.

2.2.2. How sensitive are patients to waiting time differences?

In this section, we investigate the sensitivity of patients to waiting times. In many customer decision models (e.g., Anderson et al. (1996)), this sensitivity is represented by a cost-of-waiting factor. Naturally, a patient whose
condition is more severe has a higher cost of waiting. Note that the most urgent patients are usually transported by ambulance to the nearest ED and are prioritized in the queue, so the delays reported on the website do not apply to them. Therefore, we examine here the impact of delay announcements on the non-urgent population, a population which may check waiting times before choosing an ED. If the cost of waiting is high, even small differences in wait times between geographically proximate hospitals will influence the patient’s choice. The patient’s choice, influenced by the cost of waiting, will then further influence the gap between wait times at two geographically close hospitals, and the current gap may influence future gaps. Therefore, we explore here properties of the wait-time difference process. This process measures the difference between the waiting times of two proximate (or distant) hospitals, as we define below. We computed the average future difference in waiting times between adjacent hospitals as a function of the current difference. Let Wi(T) denote the waiting time of hospital i at time T. First, we calculated the difference in waiting times ∆T = Wi(T) − Wj(T) at each time instance T between pairs of close (far) hospitals separated by 20 km or less (100–250 km). We further calculated ΩT = min(Wi(T), Wj(T)), which is the lower of the two hospitals’ waiting times. Then, we computed
the average difference between each pair at time T + τ , where τ = [1, 2, 3, 4, 5] hours, and stratified this by ∆T at 5-minute increments, that is, 0 < ∆T ≤ 5 up to 15 < ∆T ≤ 20 minutes. The results of this calculation are shown in Figure 6. Figure 6(a) demonstrates several effects. First of all, smaller initial differences in waiting times are associated with larger future differences, and vice versa. While this is expected for general stationary processes, we observe differences in the magnitude of convergence between pairs of close and more distant hospitals. Specifically, within 5 hours, all initial wait time gaps converge towards similar values. However, hospital pairs which are close to each other converge to a wait time of approximately 10.5 minutes, compared to approximately 12 minutes for pairs which are farther away. As a result, for all initial gaps, the gap after 5 hours is smaller for closer hospitals compared to more distant ones. In accordance with our results in the previous section, we may attribute this difference to coordination between the closer hospitals. More precisely, we hypothesize that these observations reflect the cost of waiting coupled with the method by which waiting times are given, i.e., as an average of the past 4 hours of wait times. A linear cost of waiting (such as the one we use in Section 3) implies that if the difference in wait times is small, patients choose hospitals randomly, and consequently the difference tends to grow. If the difference is large, patients will tend to choose the less loaded ED and the difference will shrink. As we will show in Section 3.3, as the cost of waiting increases, the gaps close faster. In addition, we will show in Section 3.2 that the use of 4-hour moving average estimators may cause extra oscillation in waiting time differences. Figure 6(b) shows that future time differences are also dependent on actual waiting times for both EDs at time T , not only on the difference between them. 
The figure shows ∆T+τ for τ = 5 for close hospitals as a function of two parameters, namely ∆T and ΩT. As the figure demonstrates, when the initial difference is small (< 10 min), after 5 hours it increases (above the dotted line), whereas when the initial difference is large (> 10 min), it decreases. We also note that different absolute values for the same difference in wait times (i.e., points along a vertical line) result in different values for ∆T+τ. The higher the initial wait ΩT, the greater the gap after 5 hours. We interpret this finding as a manifestation of the Weber-Fechner law (Ross and Murray 1996), which states that resolution of perception diminishes for stimuli of greater magnitude.

We attempted to model the future difference between wait times of adjacent hospital pairs using the current difference and the number of queries made about both hospitals in the current time period. That is, given the difference in wait times ∆T between two hospitals at time T and the number of queries made about these hospitals, NqT, between T and T + 1 hours, we predict ∆T+τ/∆T, where τ = [1, 2, 3, 4], using a linear regression model:

    \Delta_{T+\tau} / \Delta_T = w_1 \cdot \Delta_T + w_2 \cdot Nq_T + w_3

Figure 6: Average future wait times. (a) Average future wait times as a function of the difference between the wait times of pairs of hospitals. Bold lines denote hospital pairs within 20 km of each other. Thin lines correspond to hospitals 100–250 km apart. (b) Average ∆T+τ for τ = 5 as a function of the difference between the wait times at T, for four levels of ΩT in close hospitals (< 20 km). Dotted diagonal line indicates ∆T = ∆T+τ.

We then analyzed the correlation between the model parameters and coefficients using ∆T values of between 0 and 20 minutes at 2-minute intervals, that is, 0 ≤ ∆T < 2, 2 ≤ ∆T < 4 (time units in minutes), etc. Specifically, the R² between the middle of each ∆T range (e.g., 1 min for the 0–2 min range) and the weight in the regression model of the actual ∆T (w1) is 0.38, while the R² between the minimal ∆T and the weight of the queries (w2) is 0.36. The correlation coefficient was positive for the former and negative for the latter. Thus, the higher the level of ∆T, the higher will be the ratio ∆T+τ/∆T. This is to be expected, since larger gaps are more difficult to close than smaller ones. More importantly, the more queries about hospital wait times are seen, the smaller ∆T+τ/∆T will be at T + τ. In other words, the greater the number of people searching for wait time information, the smaller the eventual gap. This suggests strategic behavior by patients when choosing EDs.
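
The stratified gap calculation used in this section can be sketched as follows. The sketch is illustrative only: it assumes the gap is taken in absolute value, that both series are sampled on the same regular grid, and that the lag is expressed in samples (for example, 12 samples per hour at 5-minute resolution). The function name `future_gap_by_bin` and the toy series are hypothetical.

```python
import math
from collections import defaultdict


def future_gap_by_bin(w_i, w_j, lag, bin_width=5.0, max_gap=20.0):
    """Average |W_i - W_j| at T + lag, grouped by the size of the gap at T.

    Bin (lo, hi] collects the time points whose current gap satisfies
    lo < gap <= hi, mirroring the strata 0 < gap <= 5, ..., 15 < gap <= 20.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for t in range(len(w_i) - lag):
        gap = abs(w_i[t] - w_j[t])
        if gap == 0 or gap > max_gap:
            continue  # outside the stratified range 0 < gap <= max_gap
        k = math.ceil(gap / bin_width)
        key = ((k - 1) * bin_width, k * bin_width)
        sums[key] += abs(w_i[t + lag] - w_j[t + lag])
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Toy series in minutes (NOT real hospital data); lag is one sample.
w_i = [12.0, 30.0, 12.0, 30.0]
w_j = [10.0, 10.0, 10.0, 10.0]
gaps = future_gap_by_bin(w_i, w_j, lag=1)
print(gaps)
```

Applied to each close or distant hospital pair and averaged over pairs, a table of this kind yields one curve per initial-gap stratum for every value of τ.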

3. Simulation analysis of queueing models

To improve our understanding of the empirical results, we use simulation models to test how factors such as patients’ sensitivity to delay, load, and network symmetry may affect the level of synchronization that can be achieved between two hospitals. We also analyze the impact of different waiting time estimation methods (delay estimators).

There has been a significant volume of work analyzing the JSQ strategy, where customers join the shortest of several parallel queues upon arrival. Most results in this area are established for networks of single-server queues in the heavy-traffic asymptotic regime. The main result of relevance to our discussion is that we only need a small fraction of customers to act strategically (choose the shortest queue) to achieve a high level of resource pooling (state-space collapse in the limit) (Reiman 1984, Turner 2000). In this study, we consider a more complex system: two multi-server queues operating in parallel, where customers choose which queue (server pool) to join through a rational autonomous decision-making approach that takes the delay factor into account. Specifically, we use a Multinomial Logit Model (MNL) (see Anderson et al. (1996, §2.6)) where the utility for being seen in hospital i with reported delay ri is ui(ri) = βi − αri. Here βi is a hospital-dependent parameter that reflects differences between hospitals in terms of their service quality, ED capacity (staffing), etc., which may affect how customers perceive the “value” of the service, and α measures the “cost” of delay. The probability of choosing hospital 1 is

    p_1(r_1, r_2) = \frac{\exp(\beta_1 - \alpha r_1)}{\exp(\beta_1 - \alpha r_1) + \exp(\beta_2 - \alpha r_2)},

where r1 and r2 are the reported waiting times of hospital 1 and hospital 2, respectively. Dividing the numerator and the denominator by exp(β1 − αr1), we have

    p_1(r_1, r_2) = \frac{1}{1 + \exp((\beta_2 - \beta_1) - \alpha (r_2 - r_1))}.

When the difference between the reported waiting times of the two hospitals is small, patients will choose the less loaded one with slightly greater probability (taking β1 = β2). When the difference between the reported waiting times is large, patients will almost certainly (with probability close to 1) choose the less loaded one. For the same waiting time difference, the larger the sensitivity parameter α, the more likely the patient will choose the less loaded hospital. We extend the previous JSQ literature in three directions:
a) We introduce a choice model to capture the phenomenon that people are relatively insensitive to small differences in waiting times, but are more sensitive to larger gaps. In our model, we assume everyone gets to choose which queue to join, but they only begin to act strategically when they see a relatively large difference in reported waiting times. Accordingly, we analyze how the sensitivity of customers to delay (the value of α) affects the synchronization of the system. The choice model also allows the inclusion of preferences which are not related to delays at each ED (by varying βi), thus allowing more heterogeneity between hospitals.
b) As we are only conducting numerical experiments, our study offers more flexibility in terms of allowing multiple servers for each queue, different system sizes, and general assumptions on interarrival and service time distributions. We also gain more insights into the dynamics of small-scale systems.
c) We analyze how different delay announcements affect the synchronization of the system. This is motivated by the fact that most hospitals report a moving average of historical waiting times instead of the true waiting time or the current queue length. We are also able to distinguish between the effect of estimator accuracy and the delay of the estimator.
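
To make the choice model concrete, the rearranged MNL probability can be evaluated numerically. The sketch below is illustrative: the function name `choice_prob_hospital_1`, the value α = 0.1 per minute, and the example waiting times are assumptions chosen for the demonstration, not calibrated values.

```python
import math


def choice_prob_hospital_1(r1, r2, alpha, beta1=0.0, beta2=0.0):
    """P(choose hospital 1) under the MNL model with utilities beta_i - alpha * r_i."""
    z = (beta2 - beta1) - alpha * (r2 - r1)  # exponent in the rearranged formula
    if z >= 0:
        e = math.exp(-z)  # evaluate via exp(-z) to avoid overflow for large z
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


# Hospital 1 reports 20 minutes; hospital 2 reports 20 minutes plus a gap.
for gap in (0, 5, 30):
    p1 = choice_prob_hospital_1(r1=20, r2=20 + gap, alpha=0.1)
    print(f"gap={gap} min -> p1={p1:.3f}")
```

With these assumed values, a zero gap gives an even split, a 5-minute gap moves the probability of choosing the shorter wait to roughly 0.62, and a 30-minute gap moves it to roughly 0.95, which matches the qualitative behavior described above.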


In what follows, we start with an idealistic model in which each system is able to report its true waiting time. Note that if, in addition, α = ∞, then every arriving customer joins the queue with the shortest waiting time and we achieve complete resource pooling, i.e., the two queues act as a fully pooled one. We analyze how the sensitivity parameter α, the offered load (i.e., the offered load of each system when α = 0), the system scale, and asymmetries in patients’ preferences and system sizes affect synchronization and system performance (Section 3.1). We then investigate the effect of other waiting time announcements that differ in their accuracy (Section 3.2). Last, we check how the sensitivity parameter affects the gaps in waiting times between two queues (Section 3.3).

The simulation model allows full flexibility in terms of inter-arrival time and service time distributions. For simplicity, we consider the classical Markovian setting in which arrivals follow a Poisson process with rate λ and service times are exponentially distributed with rate µ. We denote by Wi(t) the true waiting time (delay) of queue i at time t, by Ri(t) the reported delay of queue i at time t, and by ni the number of servers in queue i, for i = 1, 2. We measure the system’s synchronization by the correlation between waiting times as seen by arriving customers.
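A minimal sketch of such a simulation, for the symmetric case with exact announcements, is given below. It is not the authors' code: the function names, the seed, the run length, and the absence of a warm-up period are illustrative assumptions, and the default parameters (25 servers per hospital, ρ = 0.9, mean service time of 108 minutes) anticipate the calibration described in this section. Each hospital is represented by a heap of server-free times, which yields the waiting time of a first-come-first-served multi-server queue without tracking individual patients.

```python
import heapq
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def simulate(alpha, n=25, rho=0.9, mean_service=108.0, arrivals=100_000, seed=1):
    """Two M/M/n FCFS hospitals; arrivals see true waits and choose by a symmetric logit.

    Returns (Cor(W1, W2) as seen by arrivals, average wait experienced in minutes).
    """
    rng = random.Random(seed)
    mu = 1.0 / mean_service
    lam = rho * 2 * n * mu        # total arrival rate, so rho = lam / ((n1 + n2) * mu)
    # free[i]: min-heap of the times at which each server of hospital i becomes free
    free = [[0.0] * n, [0.0] * n]
    t = 0.0
    seen1, seen2, experienced = [], [], []
    for _ in range(arrivals):
        t += rng.expovariate(lam)  # Poisson arrivals
        # True waiting time at each hospital: time until its earliest server frees up
        w = [max(0.0, free[0][0] - t), max(0.0, free[1][0] - t)]
        z = max(-700.0, min(700.0, -alpha * (w[1] - w[0])))  # clamp to avoid overflow
        p1 = 1.0 / (1.0 + math.exp(z))
        i = 0 if rng.random() < p1 else 1
        # The patient takes the earliest-free server of the chosen hospital
        heapq.heapreplace(free[i], t + w[i] + rng.expovariate(mu))
        seen1.append(w[0])
        seen2.append(w[1])
        experienced.append(w[i])
    return pearson(seen1, seen2), sum(experienced) / len(experienced)


if __name__ == "__main__":
    for alpha in (0.0, 0.02, 0.1):
        corr, mean_wait = simulate(alpha)
        print(f"alpha={alpha:.2f}  Cor(W1,W2)={corr:.2f}  E[W]={mean_wait:.1f} min")
```

Setting `alpha=0.0` splits arrivals evenly at random, so the two hospitals behave as independent queues; larger values of `alpha` route more patients to the hospital with the shorter true wait.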

Model Calibration: We choose the parameter values according to data from the Medicare website and the data collected above. Figure 7 summarizes data on the average length of stay (LOS) of patients who are discharged after ED treatment and the average ED waiting time (both collected from the Medicare website) and the size of the ED as measured by the number of beds (based on data from 24 hospitals which publish this information on their websites). We set the mean service time (1/µ) to be 108 minutes, which is the national average LOS of patients who are discharged after ED treatment minus the national average waiting time for these patients (134 − 26 = 108). We note that according to the Medicare website, the average LOS of more severely ill or injured patients, who need to be hospitalized after ED treatment, is 274 minutes, which is much longer than the LOS we use. We concern ourselves with less urgent cases because we think these are the patients who will use delay information to choose which hospital to visit. We set the number of servers (beds) to be in the range 10–40. Finally, we set the offered load of the system at 0.85–0.95, under which most waiting times are in the range of 10–30 minutes. This range is typical for EDs according to our data and the Medicare website.

3.1. Ideal model

In an ideal world, each hospital will be able to provide patients with the true virtual waiting time. We divide the analysis into two cases: symmetric and non-symmetric. In the symmetric case, we assume that both hospitals (queues) have the same number of servers and service time distributions, and are of the same quality. Hence the system is symmetric in terms of hospital capacity (n), load (ρ), and patient preferences (β). In the non-symmetric case we consider hospitals with either different preference parameters (β1 ≠ β2) or different capacities (n1 ≠ n2).

[Figure 7: Box plot for average LOS, number of beds, and average waiting time for different hospitals. Panels: (a) Average LOS (minutes); (b) Number of beds; (c) Average waiting time (minutes).]

3.1.1. Symmetric case: The impact of the cost of waiting (α), offered load and system scale

In this case, we assume β1 = β2, µ1 = µ2 = µ and n1 = n2. Then,

$$p_1(r_1, r_2) = \frac{1}{1 + \exp\big(-\alpha (r_2 - r_1)\big)}.$$

We analyze how the sensitivity parameter, α, the total offered load, ρ = λ/((n1 + n2)µ), and the system scale, n1 and n2, affect synchronization between the two hospitals, measured by the correlation between the two waiting times as seen by arrivals, and system performance, measured by average waiting times as seen by arrivals. Our numerical experiments lead to the following observations:

Observation 3. Sensitivity to wait times increases synchronization, and greater synchronization leads to better performance for both queues. For the same sensitivity level and system scale, load increases synchronization. For the same sensitivity level and offered load, smaller-scale systems gain greater synchronization.

We next demonstrate Observation 3 through some numerical examples. Figure 8 plots synchronization as a function of α for different values of offered load ρ. We observe that synchronization increases with α. As previous research on the JSQ policy implies, the increase in synchronization is not linear: even a small value of α leads to a significant increase in synchronization. This suggests that even if customers are only sensitive to very large gaps between waiting times, we still achieve a high degree of synchronization. In addition, for fixed α, the more loaded the hospitals, the more synchronization we gain when patients choose strategically based on reported delays.

[Figure 8: Synchronization level, Cor(W1, W2), as a function of α for different offered loads (n1 = n2 = 25, upper: ρ = 0.95, middle: ρ = 0.9, lower: ρ = 0.85).]

Synchronization leads to better load balancing and resource pooling, and therefore to better performance as measured by expected waiting times. Figure 9 plots the expected waiting time as a function of α for different values of offered load ρ. We observe that the expected waiting time decreases with α, and most of the improvement is achieved with small values of α. At all offered load levels, the expected waiting time can be reduced by more than half if some patients act strategically. This is consistent with the queueing literature on resource pooling (Tsitsiklis and Xu 2012).

[Figure 9: Expected waiting time, E[W] (in minutes), as a function of α for different offered loads (n1 = n2 = 25). Panels: (a) ρ = 0.85; (b) ρ = 0.9; (c) ρ = 0.95.]

Figure 10 plots synchronization levels as a function of α for different values of system scale, n1 and n2. We observe that for the same values of α and ρ, the smaller system (n1 = n2 = 10) achieves greater synchronization compared to the larger system (n1 = n2 = 40). The intuition behind this is that large systems are relatively well-balanced when acting alone, while small systems benefit more from the load-balancing effect when pooled with another system. More specifically, for an M/M/n queue, the conditional waiting time, W | W > 0, follows an exponential distribution with rate nµ(1 − ρ) (Whitt 1992). Thus, for the same offered load ρ, the larger the system scale n, the smaller the mean and variance of the conditional waiting times. We are more likely to see large differences in waiting times between two unsynchronized small hospitals than between two large ones. Therefore, when customers act strategically, we gain more benefit from the load-balancing effect for small systems.
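To make the scale effect concrete, the sketch below evaluates the M/M/n formulas for a small and a large hospital at the same offered load. It combines the exponential conditional wait with rate nµ(1 − ρ) cited above with the standard Erlang C delay probability, which is a textbook result and is not stated in the text. The mean service time of 108 minutes and ρ = 0.9 follow the calibration; the function names are illustrative assumptions.

```python
def erlang_c(n, rho):
    """P(W > 0) in an M/M/n queue with per-server utilization rho < 1."""
    a = n * rho                      # offered load in Erlangs
    b = 1.0                          # Erlang B blocking probability with 0 servers
    for k in range(1, n + 1):
        b = a * b / (k + a * b)      # numerically stable Erlang B recursion
    return b / (1.0 - rho * (1.0 - b))


def wait_summary(n, rho, mean_service=108.0):
    """Return (E[W | W > 0], E[W]) in minutes for an M/M/n queue."""
    rate = n * (1.0 - rho) / mean_service    # n * mu * (1 - rho), per minute
    cond_mean = 1.0 / rate                   # mean of the exponential conditional wait
    return cond_mean, erlang_c(n, rho) * cond_mean


if __name__ == "__main__":
    for n in (10, 40):
        cond_mean, mean_wait = wait_summary(n, rho=0.9)
        print(f"n={n}: E[W|W>0]={cond_mean:.1f} min, E[W]={mean_wait:.1f} min")
```

At ρ = 0.9 the conditional mean wait is 108/(10 × 0.1) = 108 minutes for n = 10 but only 108/(40 × 0.1) = 27 minutes for n = 40. Because the conditional wait is exponential, its standard deviation equals its mean, so an unsynchronized small hospital shows both longer and more variable waits.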

[Figure 10: Synchronization level, Cor(W1, W2), as a function of α for different hospital sizes (upper: ρ = 0.95, middle: ρ = 0.9, lower: ρ = 0.85). Panels: (a) n1 = n2 = 10; (b) n1 = n2 = 40.]

3.1.2. Non-symmetric case: The impact of patient preferences and ED size

Here we consider two non-symmetric cases: hospitals that differ in terms of patients’ preferences (βi) and hospitals that differ in the size (capacity) of their ED, expressed by a difference in ni for i = 1, 2. The delay announcements themselves are still exact.

We start by assuming that the EDs are the same size but that patients have a predetermined preference for the second hospital (i.e., β2 > β1), whether due to (for example) differences in the quality of care provided or constraints imposed by insurance providers that drive a higher proportion of the nearby population to a specific facility. As a result, when α = 0, p1(r1, r2) = exp(β1)/(exp(β1) + exp(β2)) < 0.5. Hence, hospital 2 has a higher offered load than hospital 1 to start with. Expressed mathematically, if ρ1 = λp1(r1, r2)/(n1µ) and ρ2 = λp2(r1, r2)/(n2µ), then ρ2 > ρ1. We also notice that since

$$p_1(r_1, r_2) = \frac{1}{1 + \exp\big((\beta_2 - \beta_1) - \alpha (r_2 - r_1)\big)},$$

we can measure the heterogeneity in preferences by |β2 − β1|. Again, our numerical experiments lead to the following important observations:

Observation 4. Preference heterogeneity reduces synchronization. Synchronization always leads to better performance of the more preferred (more loaded) queue, while performance of the less preferred (less loaded) queue may rise or fall depending on the level of heterogeneity between them.


Figures 11 and 12 provide a numerical example in this setting. We define system A to have β1 = 1 and β2 = 1.1 and system B to have β1 = 1 and β2 = 2. Both systems have the same total load (ρ = λ/(µ(n1 + n2)) = 0.9), but the partition of the load between the hospitals differs. In system A, when α = 0,

$$\rho_1 = \lambda \, \frac{\exp(\beta_1)}{\exp(\beta_1) + \exp(\beta_2)} \, \frac{1}{n_1 \mu} \approx 0.855 \quad \text{and} \quad \rho_2 = \lambda \, \frac{\exp(\beta_2)}{\exp(\beta_1) + \exp(\beta_2)} \, \frac{1}{n_2 \mu} \approx 0.945.$$

Both EW1 and EW2 decrease with α. In system B, when α = 0, ρ2 > 1 (i.e., hospital 2 is unstable when acting alone), so we start the plot of expected waiting times from α = 0.01. We observe that, as in the symmetric case, synchronization increases as α increases, and most of the pooling effect is achieved with small values of α. For the same value of α, synchronization decreases with |β2 − β1|. In both systems, the expected waiting time of the more loaded system decreases with α. In system B, synchronization assures stability. Nevertheless, this comes at the cost that EW1 rises slightly with α. In general, for small values of |β2 − β1|, the expected waiting time of the less loaded system falls with α, while for large values of |β2 − β1|, it rises with α.
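The initial loads of systems A and B can be checked with a few lines of arithmetic. The sketch below evaluates the logit choice probability at α = 0, p1 = 1/(1 + exp(β2 − β1)), and splits the total load accordingly. The function name `initial_loads` is an illustrative assumption; the parameter values (ρ = 0.9, n1 = n2 = 25) are those of the example.

```python
import math


def initial_loads(beta1, beta2, rho_total=0.9, n1=25, n2=25):
    """Per-hospital offered loads (rho1, rho2) when alpha = 0."""
    p1 = 1.0 / (1.0 + math.exp(beta2 - beta1))   # share of arrivals choosing hospital 1
    lam_over_mu = rho_total * (n1 + n2)          # lambda / mu implied by the total load
    rho1 = lam_over_mu * p1 / n1
    rho2 = lam_over_mu * (1.0 - p1) / n2
    return rho1, rho2


if __name__ == "__main__":
    for name, beta2 in (("A", 1.1), ("B", 2.0)):
        rho1, rho2 = initial_loads(beta1=1.0, beta2=beta2)
        print(f"System {name}: rho1={rho1:.3f}, rho2={rho2:.3f}")
```

For system A this gives ρ1 ≈ 0.855 and ρ2 ≈ 0.945. For system B it gives ρ1 ≈ 0.484 and ρ2 ≈ 1.316, so hospital 2 is overloaded on its own, which is why it is unstable when patients ignore delay information.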

[Figure 11: Synchronization level, Cor(W1, W2), as a function of α for different values of β2 (n1 = n2 = 25, upper: system A (β1 = 1, β2 = 1.1), lower: system B (β1 = 1, β2 = 2)).]

We make similar observations when β1 = β2 and the heterogeneity relates to staffing levels (n1 , n2 ). Synchronization increases with α, and the expected waiting time of the more loaded system (the system with fewer staff) falls with α. For a fixed value of α, synchronization decreases with |n2 − n1 |. For small values of |n2 − n1 |, the expected waiting time of the less loaded system (the system with more staff) decreases with α, while for large values of |n2 − n1 |, the expected waiting time of the less loaded system increases with α. Here synchronization also assures stability.

[Figure 12: Expected waiting time, E[W], as a function of α for different values of β2 (n1 = n2 = 25, upper: EW2, lower: EW1). Panels: (a) System A (β1 = 1, β2 = 1.1); (b) System B (β1 = 1, β2 = 2).]

3.2. The importance of timely and accurate delay announcements

The waiting time estimator used by hospitals will err from time to time; indeed, such errors are inevitable with the moving average estimators employed in all of the hospitals we observed. In this section, we show that these errors limit the degree of synchronization that can be achieved. We also show that a lag in how the wait time estimator reflects the true waiting time may make the system more volatile (oscillating), and cause synchronization to fall (instead of rise) with α. We compare the following two delay estimators:

1. Moving average: Historical average over time windows of specific lengths (the method currently used by most hospitals). Let l be the time window for the moving average function. We demonstrate two cases: a) l = 4 hours; b) l = 30 minutes.

2. Head-of-Line (HOL) wait: Time waited by the customer who is currently at the head of the line when a new customer arrives. This method could be considered as a moving average with l = 0. It has been shown to be quite accurate for estimating delays in multi-server queueing systems (Ibrahim and Whitt 2009).

The hospitals in our empirical study calculate waiting times by averaging the time waited by all patients who entered the queue during the past 4 hours (a moving average with a 4-hour window). Other websites and online apps report historical averages using even longer periods of time, up to 1 year (see, for example, the online app “ED Wait Watcher” (Groeger et al. 2014)). Our results show the potential drawbacks of such delay announcements. In recent years, a few hospitals have started employing waiting time estimators that are based on shorter time windows. For example, Stanford Hospital reported a moving average with a one-hour window. Our results also quantify the potential benefits of such a practice in improving system performance.
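The two estimators can be sketched as follows. This is an illustration under stated assumptions rather than the hospitals' actual procedure: the moving average here is taken over waits that ended (service started) within the window, an empty window is reported as 0, and the function names and sample data are hypothetical.

```python
def moving_average_delay(history, now, window):
    """Average completed wait over the last `window` minutes.

    history: iterable of (time_service_started, wait) pairs, in minutes.
    Returns 0.0 if no patient started service within the window (an assumption).
    """
    recent = [wait for (started, wait) in history if now - window <= started <= now]
    return sum(recent) / len(recent) if recent else 0.0


def head_of_line_delay(queue_arrival_times, now):
    """Time already waited by the patient at the head of the line.

    queue_arrival_times: arrival times of patients still waiting, oldest first.
    Returns 0.0 if nobody is waiting.
    """
    return now - queue_arrival_times[0] if queue_arrival_times else 0.0


if __name__ == "__main__":
    # Hypothetical history: waits grow sharply over the last hour.
    history = [(0, 5), (100, 10), (200, 60), (230, 90)]
    now = 240
    print(moving_average_delay(history, now, window=240))   # 4-hour window
    print(moving_average_delay(history, now, window=30))    # 30-minute window
    print(head_of_line_delay([140, 180], now))              # HOL wait
```

In this hypothetical history, the 4-hour average is 41.25 minutes even though the most recent patient waited 90 minutes and the patient at the head of the line has already waited 100 minutes. A long window therefore reports a delay that lags well behind the current state of the queue.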


Figure 13 shows how synchronization levels change with the sensitivity parameter α for different window lengths l. Here, in contrast to what we observed with the idealistic model, synchronization first increases, but then decreases with α. When l = 4 hours, synchronization actually falls to negative levels, while for l = 30 minutes and for HOL (l = 0), synchronization is positive for all values of α. Figure 14 shows how the expected waiting time changes with the sensitivity parameter α for different window lengths l. We observe that when the averaging window is long (l = 4 hours), the expected waiting time increases with α for moderate to large values of α. This suggests that the system is better off without announcements if patients are highly sensitive to delays. But if the window used is small enough, the expected waiting time falls as sensitivity rises.

[Figure 13: Synchronization level, Cor(W1, W2), as a function of α (n1 = n2 = 25, ρ = 0.9). Panels: (a) l = 4 hours; (b) l = 30 minutes; (c) HOL Wait (l = 0).]

[Figure 14: Expected waiting time, E[W], as a function of α (n1 = n2 = 25, ρ = 0.9). Panels: (a) l = 4 hours; (b) l = 30 minutes; (c) HOL Wait (l = 0).]

Our observations reflect differences in accuracy between the estimators. Indeed, the error of the reported waiting times with respect to the true waiting times increases with l. Specifically, when α = 0.1 and l = 4 hours, the root mean square error of the reported waiting times for hospital 1, $\sqrt{E[(R_1 - W_1)^2]}$, is approximately 341.99; when l = 30 minutes, $\sqrt{E[(R_1 - W_1)^2]} \approx 32.81$, and when l = 0 (HOL), $\sqrt{E[(R_1 - W_1)^2]} \approx 12.17$. Figure 15 shows the sample path of the true waiting times versus the reported waiting times for different values of l.
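The accuracy measure used here is the ordinary root mean square error between reported and true waits. A minimal sketch of how it could be computed from paired samples follows; the function name `rmse` and the sample values are illustrative assumptions and are not data from the study.

```python
import math


def rmse(reported, true):
    """Root mean square error sqrt(mean((R - W)^2)) over paired samples."""
    if len(reported) != len(true) or not reported:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((r - w) ** 2 for r, w in zip(reported, true))
    return math.sqrt(squared / len(reported))


if __name__ == "__main__":
    # Hypothetical reported and true waits (minutes) sampled at the same instants.
    reported_waits = [30.0, 32.0, 35.0, 40.0]
    true_waits = [10.0, 45.0, 80.0, 20.0]
    print(f"RMSE = {rmse(reported_waits, true_waits):.2f} minutes")
```

Pairing each reported value R1 with the true wait W1 at the same instant matters: a lagged estimator can have the right long-run average and still show a large RMSE.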

[Figure 15: Sample path of the true and reported waiting times, W1(t) and R1(t), in hospital 1 (α = 0.1, n1 = n2 = 25, ρ = 0.9). Axes: delay (minutes) versus time (hours). Panels: (a) l = 4 hours; (b) l = 30 minutes; (c) HOL (l = 0).]

Interestingly, it is not the error alone which drives the phenomenon of performance deterioration with α. To validate this, we analyze a modified-idealistic model where we report the true waiting time plus an error term that is normally distributed with mean 0 and standard deviation σ. Large values of σ lead to inaccurate delay announcements but, unlike the moving average method, introduce no time lag. Figure 16 shows how σ affects synchronization levels. We observe that synchronization increases monotonically with α for each value of σ, which is in contrast to the nonmonotonic effect of α we observed in Figure 13. Still, the inaccuracy has an impact: synchronization will not reach its full potential when the announcement is inaccurate. This is reasonable, as patients relying on inaccurate information may inadvertently choose the more loaded queue. Hence, as the error size rises, the maximal synchronization level will fall.

[Figure 16: Synchronization level, Cor(W1, W2), for different values of α and σ (upper: σ = 10, lower: σ = 100, n1 = n2 = 25, ρ = 0.9).]

To understand why a moving average estimator performs so poorly, we next take a closer look at the sample path of the two queues for α = 0.1 (see Figure 17). We observe that when l = 4 hours, the waiting time processes of the two queues take an alternating oscillating form. Specifically, when W1(t) is large (small), W2(t) is small (large). This explains the negative correlation we observe between the two waiting times. When l = 30 minutes or 0 (HOL), the two waiting time processes are closer to each other. This observation also explains the counter-intuitive result shown in Figure 6, where initially small differences in wait times translate to large differences, and vice versa. This phenomenon is known in control theory as self-oscillation, where systems with delayed feedback may oscillate solely because of the delay (Jenkins 2013). Here the delay announcement and its influence on customer choice can be considered a control mechanism. This suggests that the time lag is the main reason for the “desynchronization” when patients are very sensitive to delay. We summarize the observations in this section as follows:

Observation 5. More accurate delay estimators lead to greater synchronization and better performance. The time lag has a negative effect on synchronization and thus performance. Large lags may desynchronize the system and lead to worse performance when patients are very sensitive to delay.
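The effect of announcement noise without a time lag can be illustrated with a small Monte Carlo sketch of the modified-idealistic model. It estimates how often a patient ends up choosing the hospital with the longer true wait when each report equals the true wait plus normal noise. The independence of the two hospitals' error terms, the 20-minute true gap, the number of draws, and the function name are illustrative assumptions.

```python
import math
import random


def prob_choose_longer_wait(true_gap, alpha, sigma, draws=20_000, seed=0):
    """Estimate P(patient picks hospital 2) when hospital 2 has the longer true wait.

    true_gap = W2 - W1 > 0 (minutes). Each hospital reports its true wait plus
    independent N(0, sigma^2) noise (assumed); preferences are symmetric.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        reported_gap = true_gap + rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)
        p1 = 1.0 / (1.0 + math.exp(-alpha * reported_gap))   # logit choice of hospital 1
        total += 1.0 - p1                                     # chance of choosing hospital 2
    return total / draws


if __name__ == "__main__":
    for sigma in (0, 10, 100):
        p = prob_choose_longer_wait(true_gap=20.0, alpha=0.1, sigma=sigma)
        print(f"sigma={sigma:>3}: P(choose longer wait) = {p:.3f}")
```

With σ = 0 the probability equals 1/(1 + e²) ≈ 0.12. As σ grows, the probability moves toward one half, because the reported gap carries less information about the true gap. This is consistent with the observation that inaccurate announcements lower the attainable synchronization without reversing the effect of α.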

[Figure 17: Sample path of the waiting time processes W1(t) and W2(t) in the two hospitals (α = 0.1, n1 = n2 = 25, ρ = 0.9). Axes: W(t) (minutes) versus time (hours). Panels: (a) l = 4 hours; (b) l = 30 minutes; (c) HOL (l = 0).]

3.3. Closing the gap during times of delay

Inspired by the empirical results of Section 2.2.2, we investigate here how the sensitivity parameter α affects the waiting time gap between two hospitals, and the rate at which such a gap will close. Following the delay announcement analysis we conducted in Section 3.2, we first test how the gap between two facilities’ waiting times, E[|W1 − W2|], changes with the sensitivity parameter α for different delay announcement estimators (Figure 18). We observe that for l = 4 hours and l = 30 minutes, the average gap first shrinks and then grows with α, and for l = 4 hours, the growth in the gap is quite significant, making the situation worse than if no announcements are made. For HOL (l = 0), the gap tends to shrink with α. These results are consistent with our analysis of delay announcement estimators (Section 3.2).

[Figure 18: Average waiting time gap, E[|W1 − W2|] (minutes), as a function of α. Panels: (a) l = 4 hours; (b) l = 30 minutes; (c) HOL.]

We next look at how fast a specific gap between two sets of waiting times closes for different values of the sensitivity parameter α. Specifically, we start the two queues from different initial states, one empty and the other with a certain number of patients in the system, and observe the time required, τ , for the two systems to close the gap to less than 10 minutes for different values of the sensitivity parameter α. We assume the systems report delays based on head-of-line waiting times. Figure 19 shows the numerical results from a simulation with one system starting empty and the other system starting with 30 patients, which leads to an initial waiting time of around 20 minutes. We observe that as patients become more sensitive to waiting, more of them choose the hospital with the shorter reported (and very likely true) waiting times, thus helping to balance the load faster. We also observe that even small sensitivity levels can reduce the time needed for the gap to close by an order of magnitude.

[Figure 19: Synchronization speed, E[τ] (minutes), as a function of α (n1 = n2 = 25, ρ = 0.9).]

4. Conclusion and future research

In this paper, we investigated the impact of ED delay announcements on patients’ choices, and the effect of patients’ choices on hospital synchronization and expected waiting times. We provide empirical evidence that patients take delay information into account and that they act on this information when the gap between waiting times at different facilities is large enough. Using numerical simulations on a simplified version of the model, we observed that synchronization between systems increases with patients’ sensitivity to waiting, the load of the system, and the accuracy of delay announcements. We also found that a mismatch between delay announcements and actual delays may cause extra oscillations in the system load when patients are very sensitive to delays.

With respect to the link between our empirical study and numerical experiments, we note that although the hospitals in our empirical study all use a moving average with a 4-hour window for delay announcements, the correlation of RWT between hospitals is still positive. This suggests that the number of patients who use this information when choosing which hospital to go to is currently relatively small. In this case, even with only a few patients acting strategically, we still gain improvements in performance as measured by expected waiting times. However, as the number of people who use this delay information grows, as evident from Figure 1(b), we will need better delay estimators to achieve optimal performance.

This paper opens several directions for future research. First, the instability caused by the mismatch between historical averages and future delays calls for more accurate machinery for delay announcements. We hypothesize that customers will be better served by estimates of future waiting times, given that wait time reports are accessed before patients travel to the ED and enter the system. Second, the delay until a physician is first seen conveys only partial information about the actual load in the ED. One might want to consider the influence of other indicators, such as ED length of stay (LOS), on patient choices. Estimating future LOS is hard, because the reason for the patient’s visit, the treatment the patient requires, and the resource requirements in the system are all unknown. Nevertheless, we believe that LOS is an important indicator for patients when considering which hospital to go to.

References

Allon, Gad, Achal Bassamboo, Itai Gurvich. 2011. “We will be right with you”: Managing customer expectations with vague promises and cheap talk. Operations Research 59(6) 1382–1394. doi:10.1287/opre.1110.0976.

Anderson, S. P., A. de Palma, J.-F. Thisse. 1996. Discrete Choice Theory of Product Differentiation. MIT Press, Cambridge, MA.

Armony, Mor, Constantinos Maglaras. 2004. On customer contact centers with a call-back option: Customer decisions, routing rules, and system design. Operations Research 52(2) 271–292.

Armony, Mor, Nahum Shimkin, Ward Whitt. 2009. The impact of delay announcements in many-server queues with abandonment. Operations Research 57(1) 66–81.

Bennidor, Raviv, Shlomo H. Israelit. 2015. Emergency department intermediate stay unit - a failed model. Technion working paper.

Carmon, Ziv, Daniel Kahneman. 1996. The experienced utility of queuing: Real time affect and retrospective evaluations of simulated queues. Working paper, Duke University.

Chalfin, Donald B., Stephen Trzeciak, Antonios Likourezos, Brigitte M. Baumann, R.P. Dellinger. 2007. Impact of delayed transfer of critically ill patients from the emergency department to the intensive care unit. Critical Care Medicine 35(6) 1477–1485.

Duda, Richard O., Peter E. Hart, David G. Stork. 2001. Pattern Classification. John Wiley and Sons, Inc, New York, USA.

Ekström, Andreas, Lisa Kurland, Nasim Farrokhnia, Maaret Castrén, Martin Nordberg. 2014. Forecasting emergency department visits using internet data. Annals of Emergency Medicine 65(4) 436–442.e1. doi:10.1016/j.annemergmed.2014.10.008.

Foley, Robert D., David R. McDonald. 2001. Join the shortest queue: Stability and exact asymptotics. The Annals of Applied Probability 11(3) 569–607.

Groeger, Lena, Mike Tigas, Sisi Wei. 2014. ED wait watcher. URL http://projects.propublica.org/emergency/.

Ibrahim, Rouba, Mor Armony, Achal Bassamboo. 2013. Does the past predict the future? The case of delay announcements in service systems. Working paper.

Ibrahim, Rouba, Ward Whitt. 2009. Real-time delay estimation based on delay history. Manufacturing & Service Operations Management 11(3) 397–415.

Ibrahim, Rouba, Ward Whitt. 2011. Real-time delay estimation based on delay history in many-server service systems with time-varying arrivals. Production and Operations Management 20(5) 654–667.

Jenkins, Alejandro. 2013. Self-oscillation. Physics Reports 525(2) 167–222.

Kostami, Vasiliki, Amy R. Ward. 2009. Managing service systems with an offline waiting option and customer abandonment. Operations Research 11(4) 644–656.

Larson, Richard C. 1987. OR forum - perspectives on queues: Social justice and the psychology of queueing. Operations Research 35(6) 895–905.

Mandelbaum, Avishai, Sergey Zeltyn. 2013. Data-stories about (im)patient customers in tele-queues. Queueing Systems 75(2-4) 115–146.

Marco, Catherine A., Mark Weiner, Sharon L. Ream, Dan Lumbrezer, Djuro Karanovic. 2012. Access to care among emergency department patients. Emergency Medicine Journal 29(1) 28–31.

Munichor, Nira, Anat Rafaeli. 2007. Numbers or apologies? Customer reactions to telephone waiting time fillers. Journal of Applied Psychology 92(2) 511–518.

Ofran, Yishai, Ora Paltiel, Dan Pelleg, Jacob M Rowe, Elad Yom-Tov. 2012. Patterns of information-seeking for cancer on the internet: An analysis of real world data. PloS one 7(9) e45921.

Plambeck, Erica, Mohsen Bayati, Erjie Ang, Sara Kwasnick, Mike Aratow. 2014. Forecasting emergency department wait times. Working paper, Stanford University.

Polgreen, Philip M, Yiling Chen, David M Pennock, Forrest D Nelson, Robert A Weinstein. 2008. Using internet searches for influenza surveillance. Clinical Infectious Diseases 47(11) 1443–1448.

Reiman, Martin I. 1984. Open queueing networks in heavy traffic. Mathematics of Operations Research 9(3) 441–458.

Ross, H.E., D.J. Murray. 1996. E.H. Weber on the tactile senses. 2nd ed. Erlbaum (UK) Taylor and Francis.

Senderovich, Arik, Matthias Weidlich, Avigdor Gal, Avishai Mandelbaum. 2014. Queue mining: Predicting delays in service processes. Advanced Information Systems Engineering. Springer, 42–57.

trends.google.com. 2011–2015. Google trends. URL trends.google.com.

Tsitsiklis, J.N., K. Xu. 2012. On the power of (even a little) resource pooling. Stochastic Systems 2 1–66.

Turner, Stephen R. E. 2000. A join the shorter queue model in heavy traffic. Journal of Applied Probability 37(1) 212–223.

Whitt, W. 1992. Understanding the efficiency of multi-server service systems. Management Science 38(5).

Yom-Tov, Elad, danah M boyd. 2014. On the link between media coverage of anorexia and pro-anorexic practices on the web. International Journal of Eating Disorders 47(2) 196–202.

Yom-Tov, Elad, Evgeniy Gabrilovich. 2013. Postmarket drug surveillance without trial costs: Discovery of adverse drug reactions through large-scale analysis of web search queries. Journal of Medical Internet Research 15(6).

Yu, Qiuping, Gad Allon, Achal Bassamboo. 2014. How do delay announcements shape customer behavior? An empirical study. Forthcoming in Management Science.