APPOINTMENT SCHEDULING WITH OVERBOOKING TO MITIGATE PRODUCTIVITY LOSS FROM NO-SHOWS

Linda R. LaGanga, Mental Health Center of Denver, 4141 East Dickenson Place, Denver, CO 80222, (303) 504-6665
Stephen R. Lawrence, Leeds School of Business, UCB 419, University of Colorado at Boulder, Boulder, CO 80309-0419, (303) 492-4351
[email protected] [email protected]
ABSTRACT

The challenge of balancing the interests of patients with those of healthcare providers is increased when patients fail to show up for scheduled appointments. Overbooking appointments mitigates the lost productivity caused by no-shows but increases patient wait time and provider overtime. In this paper, simulation analysis is used to develop and test the performance of scheduling rules that are designed specifically to accommodate excess overbooked appointments. Our analysis provides new insights into rules that perform well to increase provider productivity while balancing the increased waiting time and overtime costs of overbooked schedules.

Keywords: Appointment Scheduling, No-shows, Overbooking, Service Operations, Simulation
INTRODUCTION

When patients fail to show up for their scheduled appointments, provider productivity and effective clinic capacity are reduced (Cayirli & Veral, 2003). To mitigate this loss, health care clinicians have experimented with a number of alternative appointment scheduling policies. Some clinics overbook appointments by double-booking patients into common appointment times and relying on no-shows to allow the schedule to catch up (Chung, 2002). Others have experimented with “wave scheduling” policies that build extra appointments into a schedule to boost provider productivity and that leave other appointment slots empty (Silver, 1975; Schroer
& Smith, 1977; Barron, 1980). This combination allows a schedule to catch up after a backlog occurs, thus reducing patient waiting and reducing the need for clinic overtime. Practitioners have reported success in managing appointment schedules with these and other similar approaches, but their accounts have been anecdotal and do not analyze or describe how schedule performance relates to no-show rates or other system characteristics (Chesanow, 1996; Baum, 2001; Chung, 2002).

In this paper, we build upon and extend the double-booking, block-scheduling, and wave-scheduling policies devised by practicing clinicians to develop and measure the performance of a number of scheduling rules based on these policies. We adjust traditional appointment scheduling performance measures to capture the operating dynamics of overbooked appointment scheduling systems, determine their effectiveness when overbooking is used to compensate for the lost productivity of no-shows, and provide recommendations for improving performance in overbooked appointment scheduling systems. Our analysis is useful for schedulers and health care providers to identify and evaluate operational and policy changes that will boost clinic productivity and improve patient service.
OVERBOOKING AND PROVIDER PRODUCTIVITY

The practice of booking multiple appointments at the same appointment start time is intended to reduce the time that providers wait for patients to show up, thereby increasing productivity (Bailey, 1952). However, to increase daily productivity, an increased number of patients must be booked and served in each clinic session. Overbooking is often interpreted as “double-booking,” the practice of scheduling two patients to arrive at the same time (Rohleder & Klassen, 2002). Double-booking is a specific case of block-booking, which schedules multiple patients to show up at the same time, and is not the only option for overbooking.
Overbooking a clinic session can often be accomplished without assigning more than one patient to the same appointment time. Depending on how schedule performance is measured, other scheduling rules could be more effective. Chung (2002), for example, “loads up” the schedule by double-booking the first appointment of the hour to ensure that the provider doesn’t lose productivity if one of the two patients doesn’t show up. He reports that this method increased his bottom line profit by almost 15 percent without increasing his overhead costs, and he describes “modified-wave” scheduling as the practice of “loading up the front end of each hour and leaving open slots in the schedule later on to catch up.” This is intended to prevent long patient wait times because if the physician begins to run late, the effect isn’t cumulative. The unscheduled time at the end of the hour can be “borrowed” if needed to serve the patients who show up and may require extra service time. The scheduling rules that we develop and test in our simulation experiments are designed specifically to analyze the effects of the placement of the extra appointments in an overbooked appointment schedule. We compare traditional double-booking and other multiple-booking scheduling patterns suggested by providers with alternative scheduling patterns that use compressed inter-appointment arrival times instead of, or in combination with, multiple booking.
RESEARCH METHODOLOGY

We model overbooked appointment schedules with deterministic service times D and clinic capacity N fixed as the total number of patients that can be served within the normal operating time of a clinic session without overtime. Each patient is assigned to a specific provider, and providers do not service each other’s patients. We assume patients who show up are punctual. Appointment scheduling researchers have considered an alternative approach of compressing appointment intervals to accommodate excess appointments (Vissers & Wijngaard, 1979;
LaGanga & Lawrence, 2007). For a given show rate S, we schedule appointments for a total of K patients during a clinic session, where K = N/S rounded to the nearest integer (Shonick & Klein, 1977). The expected number of patients served in a clinic session is therefore E(X) = SK = N (or is very close to N) and the total number of overbooked patients is equal to K-N. For each of five show rate levels S = {0.9, 0.7, 0.5, 0.3, 0.1}, we modeled thirteen scheduling rules. To attempt to improve schedule performance, we modeled an additional rule for S = {0.9, 0.7} for a total of fourteen scheduling rules, as shown in Table 1. The extra rule could not be implemented for S = {0.5, 0.3, 0.1} because the number of appointments K scheduled at these show rates became too large to be accommodated by the additional scheduling rule.

------------------------------
Table 1 about here.
------------------------------

Modeling was done in two stages. First, we developed spreadsheet-based planning models of all experimental schedules to calculate patient wait time at every appointment time and for the entire schedule for the case in which every scheduled patient shows up. This analysis was useful in identifying where large wait times are likely to accumulate and to support experimentation and development of alternative scheduling rules. Then, to test the performance of various schedules with stochastic no-shows, we simulated the operation of the planning models for various levels of show rate S.

The Clinic Model and General Scheduling Rules

We model a realistic clinic session using the parameters of an actual outpatient psychiatric clinic that we studied. In this clinic, a normal morning clinic session runs for four hours from 8:00
a.m. to 12:00 p.m. The service time for each patient is 20 minutes without variation so that the clinic size is N = 12. If every patient showed up with certainty, then there would be no need to overbook, there would be no patient wait time, the provider would be utilized 100% of the time, and there would be no overtime required to serve all patients. However, the clinic experiences a significant no-show rate (S < 1), so if the target number of patients to be served remains at N = 12, then additional overbooked appointments must be added to the clinic schedule. One way to schedule extra appointments into the clinic session is to compress the inter-arrival time between appointments T proportionally to the show rate S, so that T = DS, or for the clinic under consideration, T = 20S. Table 2 summarizes the inter-arrival times between appointments for ten show-rate levels and appointment durations of D = 20 minutes.

------------------------------
Table 2 about here.
------------------------------

Table 2 shows that, in most cases, the calculated compressed appointment inter-arrival times T do not correspond to practical appointment times such as 10, 15, 20, 30, 40, 45, or 50 minutes after the hour. For the scheduling rules tested in this paper, appointment start times were set to practical clock times, multiples of 10 or 15 minutes. Thus, for every show rate tested, we included an adjusted compressed rule to move the calculated appointment start times forward or backward to the nearest practical time. We also tested schedules that compressed all appointment times uniformly by the same 5-minute multiple less than the calculated compressed time T. In an abridged representation of our planning models, Table 3 illustrates these scheduling rules in clock time for show rate S = 0.9. The number of appointments to be scheduled is 12/0.9 = 13.33, rounded down to 13.
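The two formulas used so far, K = N/S (rounded) and T = DS, can be tabulated for the five show rates tested. The following is a minimal sketch, not the authors' spreadsheet model; the function name `overbooking_plan` and the dictionary keys are illustrative assumptions.

```python
# Sketch only: tabulates K = N/S (rounded) and T = D*S for the tested show rates.
# The function name and dictionary keys are illustrative, not from the paper.
N = 12  # clinic capacity: patients served in a normal session
D = 20  # deterministic service time in minutes


def overbooking_plan(show_rate, capacity=N, service=D):
    k = round(capacity / show_rate)        # appointments to book, K = N/S rounded
    return {
        "K": k,
        "overbooked": k - capacity,        # extra appointments, K - N
        "expected_served": show_rate * k,  # E(X) = S*K, close to N
        "T": service * show_rate,          # compressed inter-arrival time, T = D*S
    }


plans = {s: overbooking_plan(s) for s in (0.9, 0.7, 0.5, 0.3, 0.1)}
for s, plan in plans.items():
    print(s, plan)
```

For S = 0.9 this reproduces K = 13 and T = 18 minutes, which is not a practical clock interval and so motivates the adjusted compressed rules described above.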
------------------------------
Table 3 about here.
------------------------------

The average patient wait time, maximum patient wait time, and overtime shown in Table 3 are for the case in which every one of the 13 scheduled patients shows up. The probability of this scenario occurring, for show rate S = 0.9, is 0.2542. If every patient shows up, then, for all of these scheduling rules in this scenario, the amount of overtime is equal to the service duration of one filled appointment, 20 minutes, because providing service to all of the patients who are scheduled requires one extra occurrence of service time duration beyond normal clinic capacity. Wait time, however, varies among rules because of varying amounts of patient backlog caused by the time intervals between appointments. As shown in Table 3, compressing all inter-appointment times to 15 minutes leads to average patient wait time of 30 minutes, with maximum wait time of one hour, when every patient shows up. The advantage of such tight schedule compression is that expected provider overtime is reduced because appointments are scheduled to begin earlier, but at the expense of extended patient wait times. If patients and clinic administrators cannot accept these wait times occurring in at least 25% of the clinic sessions on average, then such a compression rule should be eliminated from consideration. For our evaluation of schedule performance, we include a similar compressed rule for all of the show rates tested.

Analysis of Overbooking Dynamics

Overbooking a scheduling system introduces new challenges in constructing schedules and comparing performance. The first consideration is the process of scheduling the extra K-N appointments. Overbooking is achieved by adjusting the time intervals between appointments, using block scheduling of multiple patients at one or more schedule times, or combinations of these approaches to fit the extra appointments into the schedule. For high show rates, developing
potential schedules can be handled by fitting a small number of extra appointments into the schedule through adding individual appointments to the baseline schedule of N appointments spaced D time units apart. This becomes more challenging as show rates decrease because the number of appointments that must be added to the schedule becomes large, as shown in Figure 1.

------------------------------
Figure 1 about here.
------------------------------

When K-N exceeds the number of appointment slots available, then block booking of multiple patients at the same appointment time becomes unavoidable. For example, when S = 0.1, then K = 120, which is almost 4 times as large as 33, the number of available scheduling times in our clinic model, because we allow scheduling at 0, 10, 15, 20, 30, 40, 45, and 50 minutes after the start of each of the four hours in the clinic session, and at the end of session at 12:00 p.m. Another way to interpret the impact of overbooking is as the increased work added to the system, expressed as the load factor or multiplier of the number of appointments scheduled (Fetter & Thompson, 1966).

Another challenge in developing overbooked schedules is that the number of possible schedules is extremely large. Each of the 33 possible clock times in our schedule represents a scheduling block in which the number of appointments that could be scheduled is $y_j \in \{0, 1, \ldots, K\}$, $\forall j \in \{1, \ldots, 33\}$. The total number of possible schedules $\Pi$ that can be constructed for K appointments and J = 33 schedule times is $\Pi = \binom{K + J - 1}{K}$ (Fries & Marathe, 1981), as shown in Table 4 for varying show rate S, from which the number of appointments, K, is obtained as 12/S, rounded.

------------------------------
Table 4 about here.
------------------------------
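The count of possible schedules can be checked directly with the standard library. This is a minimal sketch under the definitions above (K appointments distributed over J = 33 clock times); the function name `schedule_count` is an illustrative assumption.

```python
# Sketch only: number of ways to place K appointments into J schedule times,
# Pi = C(K + J - 1, K). The function name is illustrative.
from math import comb

N = 12  # clinic capacity
J = 33  # available clock times in the session


def schedule_count(k, slots=J):
    # Multiset coefficient: each slot may hold 0..K appointments.
    return comb(k + slots - 1, k)


counts = {s: schedule_count(round(N / s)) for s in (0.9, 0.7, 0.5, 0.3, 0.1)}
for s, count in counts.items():
    print(f"S = {s}: K = {round(N / s)}, schedules = {count:,}")
```

Even the smallest case tested, K = 13 at S = 0.9, yields more than 73 billion candidate schedules, so exhaustive enumeration is impractical.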
Thus, even for the smallest possible number of overbooked appointments (one extra appointment, so that K = 13), the schedule can be constructed in over 73 billion different ways. Therefore, we need some general guidelines to identify a focused set of schedules for which we can evaluate and compare performance and choose among several alternatives that perform well. The calculation of binomial probabilities, shown in Table 5, is useful for determining how often all scheduled patients are likely to show up for an overbooked clinic session and for determining the risk that more than one patient shows up at the same time when a scheduling block has more than one appointment scheduled simultaneously. This information is useful when developing potential schedules.

------------------------------
Table 5 about here.
------------------------------

For example, if the probability of having more than one scheduled patient show up at the same time should be no greater than 0.25, then double-booked blocks should not be used unless
S ≤ 0.50, and block sizes of 5 should not be used unless S < 0.20. For S = 0.8 and K = 15, we can expect all scheduled patients to show up in less than 4% of the clinic sessions on average. Unfortunately, if we assume all patients are equally likely to show up or not show up, the no-shows could occur anywhere in the schedule. Since we cannot count on the no-shows to occur at particular points in the schedule, we need to design and evaluate the performance of potential schedules using the worst-case scenario in which all scheduled patients show up. We use maximum, rather than average, wait time as a performance measure to more accurately represent the system congestion that can occur when more patients show up during a time interval than a provider can serve, which causes one or more patients to experience very long wait times. LaGanga (2006) provides further analysis and examples.
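The binomial thresholds quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch assuming independent, identically distributed shows with probability S; the function names are illustrative and are not taken from the paper.

```python
# Sketch only: binomial show probabilities, assuming independent shows
# with probability S. Function names are illustrative.
from math import comb


def prob_exactly(shows, block_size, show_rate):
    # Binomial probability that exactly `shows` of `block_size` patients show up.
    return (comb(block_size, shows)
            * show_rate ** shows
            * (1 - show_rate) ** (block_size - shows))


def prob_all_show(show_rate, k):
    # Probability that every one of k scheduled patients shows up.
    return prob_exactly(k, k, show_rate)


def prob_more_than_one(show_rate, block_size):
    # Probability that two or more patients in the same block show up.
    return 1 - prob_exactly(0, block_size, show_rate) - prob_exactly(1, block_size, show_rate)


print(round(prob_all_show(0.9, 13), 4))       # all 13 show at S = 0.9
print(round(prob_all_show(0.8, 15), 4))       # all 15 show at S = 0.8
print(round(prob_more_than_one(0.5, 2), 4))   # double-booked block at S = 0.5
print(round(prob_more_than_one(0.2, 5), 4))   # block of 5 at S = 0.2
```

At S = 0.5 a double-booked block sits exactly at the 0.25 limit, and at S = 0.2 a block of five slightly exceeds it, which is consistent with the thresholds stated in the text.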
Wait time is minimized when all appointments are scheduled as late as possible and are spread as far apart as possible. For example, to minimize wait time in our clinic model, 13 appointments would be scheduled by placing each of them 20 minutes apart. Because the interval is equal to the service time duration, there would be no wait time, but overtime would be maximized even if no-shows were possible because the last appointment starts at the latest possible time. This analysis considers only reasonable schedules – if all appointments were scheduled at the latest time, then overtime would be at its true maximum, but it would not be reasonable to keep all of the earlier appointment times unscheduled. Next, overtime is minimized when all appointments occur as early as possible in the schedule, which means that all appointments are scheduled at the same time at the start of the clinic, but this schedule maximizes patient wait time. Thus, in our planning models, the contribution to overtime is considered as a function of the lateness of a schedule time and the number of patients scheduled at that time, based on the observation that schedules with heavier appointment weighting toward the end of the schedule tend to incur more overtime. We construct and evaluate potential wave schedules, identify schedule block sizes that contribute to congestion and overtime, and modify the schedule to attempt to alleviate these conditions. Although we can calculate the probability of every possible number of patient shows or no-shows occurring, the analytical calculation of expected wait time and overtime is intractable because overtime occurs not only when capacity is exceeded, but also as the result of the time-based appointment positions in which patients show up. Thus we turn to simulation to model the performance of our schedules at varying patient show rates. Results and analysis are presented in the next section.
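The planning calculation and a single-provider simulation can be sketched as follows. This is an illustration under stated assumptions, not the authors' spreadsheet or simulation model: patients are punctual, service takes exactly 20 minutes, the session lasts 240 minutes, and overtime is reported as the finish time minus the session length (so it can be negative when the session ends early). The names `session_metrics` and `simulate` are hypothetical.

```python
# Sketch only: single-provider clinic session with punctual patients and
# deterministic service. Names and the overtime definition are assumptions.
import random


def session_metrics(appointments, shows=None, service=20, session=240):
    """Return (average wait, maximum wait, overtime) in minutes.

    appointments: sorted appointment times in minutes from session start.
    shows: optional list of booleans; defaults to every patient showing up.
    """
    if shows is None:
        shows = [True] * len(appointments)
    free_at = 0  # time at which the provider next becomes available
    waits = []
    for arrival, present in zip(appointments, shows):
        if not present:
            continue
        start = max(arrival, free_at)
        waits.append(start - arrival)
        free_at = start + service
    average = sum(waits) / len(waits) if waits else 0.0
    return average, max(waits, default=0), free_at - session


def simulate(appointments, show_rate, reps=10_000, seed=0):
    """Estimate mean maximum wait and mean overtime over random no-shows."""
    rng = random.Random(seed)
    total_wait = total_overtime = 0.0
    for _ in range(reps):
        shows = [rng.random() < show_rate for _ in appointments]
        _, max_wait, overtime = session_metrics(appointments, shows)
        total_wait += max_wait
        total_overtime += overtime
    return total_wait / reps, total_overtime / reps


compressed = [15 * i for i in range(13)]  # 13 appointments, 15 minutes apart
spaced = [20 * i for i in range(13)]      # 13 appointments, 20 minutes apart

print(session_metrics(compressed))  # all-show case for the 15-minute rule
print(session_metrics(spaced))      # all-show case with no compression
print(simulate(compressed, 0.9))    # one simulated estimate at S = 0.9
```

In the all-show case the 15-minute rule gives an average wait of 30 minutes and a maximum of 60 minutes with 20 minutes of overtime, matching the figures quoted for Table 3, while 20-minute spacing removes waiting but leaves the same 20 minutes of overtime.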
SIMULATION RESULTS AND ANALYSIS

We developed a simulation model with stochastic patient no-shows for each of the 67 schedules that we developed and analyzed in our planning models. For each schedule simulated, we completed 10,000 replications for a total of 670,000 observations of the schedules’ performance. The number of replications was determined by conducting pilot studies preceding the main experiment that indicated that the half-widths of the 95% confidence intervals were less than 1% of the point estimates for the performance measures of interest.

In this section, we graphically present our results to display and analyze the performance trade-offs between wait time and overtime. For each show rate S, we consider the objective of choosing the schedule with the best performance, measured as the minimum of the weighted sum of maximum patient wait time, W, and provider overtime, O. Then, for π = the cost per minute of maximum patient wait time and ω = the cost per minute of provider overtime, the expected cost of using a particular schedule is
C = πW + ωO.    (1)
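Equation (1) can be applied directly to compare schedules for a given cost ratio π/ω. The sketch below uses the (wait time, overtime) pairs that appear in the slope calculations for schedules MAF, MAS, and RDI in this section; the RDI overtime of -1.08 is inferred from that arithmetic and should be treated as an assumption. The function names are illustrative.

```python
# Sketch only: evaluates C = pi*W + omega*O for the three frontier schedules.
# (W, O) pairs are read from the slope arithmetic reported for Figure 2;
# the RDI overtime of -1.08 is inferred from that arithmetic (assumption).
FRONTIER = {
    "MAF": (0.0, 17.94),
    "MAS": (16.054, 3.61),
    "RDI": (42.757, -1.08),
}


def cost(wait, overtime, pi, omega):
    # Equation (1): weighted sum of maximum wait time and overtime.
    return pi * wait + omega * overtime


def best_schedule(pi, omega, points=FRONTIER):
    # Schedule with the lowest cost among the supplied points.
    return min(points, key=lambda name: cost(*points[name], pi, omega))


for ratio in (1.0, 0.5, 0.1):
    print(f"pi/omega = {ratio}: {best_schedule(ratio, 1.0)}")
```

Under these inputs the minimizer switches from MAF to MAS to RDI as π/ω falls through roughly 0.89 and 0.18, which is the same information carried by the slopes of the efficient frontier.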
The results are useful in separating the schedules that should be considered for implementation from those that should not because they are dominated by one or more others that have smaller values for both patient wait time and provider overtime (Fries & Marathe, 1981). Following the work of Ho and Lau (1992), we can identify the best schedule that minimizes cost among those tested by finding the point where π/ω has a value between the slopes of two adjacent line segments on the efficient frontier. For example, simulation results shown in Figure 2 illustrate that the efficient frontier is formed by the three points corresponding to plotted wait time and overtime results for schedules
MAF, MAS, and RDI. Because we seek to minimize cost, the efficient frontier must form the lower bound of a convex region and therefore, it consists of only these three points.

------------------------------
Figure 2 about here.
------------------------------

For example, the slope between the plotted costs for scheduling rules MAF and MAS is $\frac{17.94 - 3.61}{0 - 16.054} = 0.89$ (with the negative sign omitted because all the slopes are negative) and between schedules MAS and RDI is $\frac{3.61 + 1.08}{16.054 - 42.757} = 0.18$. Moving from left to right along the efficient frontier, the first schedule, MAF, schedules the one overbooked appointment at the end of the clinic session, which minimizes wait time for patients and maximizes overtime for the provider, and according to our efficient frontier analysis, this rule results in the lowest cost for $\pi/\omega > 0.89$. The next schedule, MAS, results from the rule established by Bailey (1952) and Welch and Bailey (1952) that schedules two appointments at the same time at the start of the clinic session and individual successive appointments at intervals equal to the average service duration. This scheduling rule was designed to balance provider idle time with patient wait time and would be the cost-minimizing schedule if 0.18