An Empirical Investigation into the Tradeoffs that Impact On-Time Performance in the Airline Industry

Kamalini Ramdas
University of Virginia, Darden Graduate School of Business, 189 Faculty Office Building, 100 Darden Blvd., Charlottesville, Virginia 22903

Jonathan Williams
University of Virginia, Department of Economics, 220 Wilson Hall, Charlottesville, Virginia 22903

June 19, 2008

Abstract

We investigate the tradeoff between aircraft capacity utilization and on-time performance, a key measure of airline quality. Building on prior theory (Porter 1996, Schmenner and Swink 2004) and empirical work (Lapre and Scudder 2004), we expect that airlines that are close to their productivity or asset frontiers face steeper tradeoffs between utilization and performance than those that are further away. We test this idea using a detailed 10-year airline industry data set, drawing on queuing theory to disentangle the confounding effects of variability in travel time and capacity flexibility along an aircraft’s route. In accord with and building on the findings of Lapre and Scudder (2004), we find that greater aircraft utilization results in longer delays, and that this effect is worse for airlines that are close to their asset frontiers in the sense of already operating at high levels of aircraft utilization. We also find that the adverse effect of utilization on delays is greater for aircraft that face higher relative variability in travel time along their routes, and smaller for aircraft on routes with higher capacity flexibility, that is, a greater ability to substitute a different aircraft for the one originally scheduled for a particular flight. Additionally, we examine how load factor, a measure of how full an airline’s flights are and therefore a key revenue driver, affects on-time performance. Our analysis enables us to explain differences in on-time performance across airlines as a function of key operational variables, and to provide insight into how airlines can improve their on-time performance or their aircraft utilization.

Key words: On-Time Performance, Capacity Utilization, Asset Frontiers, Capacity Flexibility

1. Introduction

On-time performance is critical to customers when choosing which airline to fly, so it is a key competitive dimension in the airline industry. In 2005, twelve percent of all flights flown within the continental United States arrived more than half an hour past their scheduled arrival time.¹ Carriers with better on-time performance display comparative statistics on flight delays prominently on their websites. At the same time, airlines that are plagued by excessive flight delays receive a great deal of negative publicity. For example, a recent article about flight delays was entitled “Northwest Ranked Most Tardy Carrier” (New York Times, 2005). The following is a typical example of the type of press coverage that poor on-time performance attracts:

“Few things are certain in air travel today, but one comes close: If you’re on Delta Connection Flight 5283 from New York to Washington, you can expect to be late. The flight had the nation’s worst on-time performance in September, arriving late 100 percent of the time at Reagan National Airport, according to a recent government report” - The Washington Post, November 13, 2006

¹ Data source: Bureau of Transportation Statistics. The Bureau of Transportation Statistics defines arrival delay as the time difference between the actual and scheduled arrival time for a flight.

Since on-time performance is a key dimension of airline quality, airlines can choose to incur higher costs in order to improve on-time performance. Research in strategy and operations has shown that the tradeoff between cost and quality can vary depending on a firm’s strategic positioning. Porter (1996) suggests that firms that are close to the productivity frontier, in terms of offering any particular quality level at the lowest possible cost, face a tradeoff between cost and quality, whereas firms within the frontier can improve on both dimensions. Schmenner and Swink (1998) develop the idea of a firm’s asset frontier, which is a function of its investments in plant and equipment. They suggest that firms operating close to their asset frontiers are forced to make tradeoffs between cost and quality.

In an empirical study in the airline industry, Lapre and Scudder (2004) find support for these theories. They develop a measure of asset utilization based on aircraft utilization, and report that airlines that are closer to their asset frontiers are more likely to incur tradeoffs between cost and quality than those further away from the frontier. Here cost is measured by unit cost, defined as in the industry by annual operating cost per available seat mile, and quality is measured by the number of consumer complaints filed annually with the US Department of Transportation.

We build on the work of Lapre and Scudder (2004) by examining the tradeoff between one particular way that airlines can reduce cost, namely increasing aircraft utilization, and a specific quality dimension that is closely related to it, on-time performance. We examine how this tradeoff changes as a function of an airline’s positioning relative to its asset frontier.

An aircraft’s utilization is the fraction of the available time during which the aircraft is actually transporting passengers. Based on results from queuing theory, we would expect that, other things being equal, for any airline, reducing unit cost by increasing aircraft utilization would hurt on-time performance, due to longer delays (e.g. Taylor and Carlin 1984, Ross 1996). Linking to the notion of closeness to the asset frontier, we would expect that airlines that are operating close to their asset frontier, in that they are already at a high level of aircraft utilization, would incur a heavier penalty in terms of worsened on-time performance when they further increase aircraft utilization, relative to airlines operating further away from their asset frontier. In order to test these ideas empirically, we need to keep in mind that aircraft utilization decisions are made in the context of other related operational decisions and environmental factors. When airlines make strategic choices regarding aircraft utilization, these choices are operationalized via decisions about individual aircraft routes and flight schedules.
An aircraft route refers to an ordered sequence of flight segments to which an aircraft is assigned, where a flight segment refers to a non-stop flight between two airports. For example, ATL-EWR, EWR-BWI, BWI-CLE, CLE-MDW is a route, and ATL-EWR is a segment.² Airline routes vary in terms of a host of factors including distances flown, congestion levels at airports along the route, the relative variability in travel times along the route,³ and the degree of capacity flexibility built into the route in terms of the airline’s ability to swap in aircraft and crews when needed. Thus, depending on the nature of a route, any particular choice of aircraft utilization might have very different implications for on-time performance. We draw on queuing theory to disentangle some of these confounding effects, and use additional controls for the others. Our goal is to carefully examine how lowering unit cost via increased aircraft utilization affects on-time performance quality, and how this tradeoff varies according to an airline’s strategic positioning.

² ATL is short for Atlanta, EWR for Newark, BWI for Baltimore-Washington Airport, CLE for Cleveland and MDW for Chicago Midway.

Based on queuing theory fundamentals, the negative impact of increased utilization on average waiting times should be even worse for a queuing system with greater relative variability in service times or inter-arrival times. Therefore, in the airline context, we would expect that, other things being equal, as aircraft utilization increases, routes with greater relative variability in travel times should incur relatively longer delays. Also, queuing theory predicts that as utilization increases, a system with greater capacity flexibility should incur a smaller increase in average waiting time. At the same time, for two queuing systems with the same average utilization, the one with greater capacity flexibility should exhibit shorter average waits, implying the familiar result that a single “snake line” that feeds multiple servers will result in lower average waits than a system where there is a separate line at each server, with no movement across lines. In the airline context, we would expect that, all else equal, routes with greater capacity flexibility, in terms of the ability to swap in a different aircraft than the one scheduled for a flight, should incur lower delays at the same level of aircraft utilization. In our empirical analysis, we are able to test for these contingent effects.
Importantly, doing so enables us to perform an apples-to-apples comparison of cost-quality tradeoffs across different airlines.

³ Relative variability refers to the extent of variability in travel time relative to the mean travel time.

One way that airlines can improve profits is by lowering unit costs through increasing aircraft utilization, provided the negative implications of the accompanying reductions in on-time performance quality do not offset the gains from lower unit costs. Another important way in which airlines can increase profits is by increasing revenues, by flying fuller planes. Load factor, defined in the industry as the ratio of revenue passenger miles to available seat miles, is essentially a measure of how full an aircraft is. Due at least in part to OR-related improvements in revenue management (e.g. Talluri and van Ryzin 2004, Jacobs et al. 2000), airlines have made huge strides in increasing load factor over time. However, since flying fuller planes increases the time needed for boarding, deplaning and ground activities, we would expect that if aircraft utilization is high, increasing load factor could simultaneously worsen on-time performance.

We conduct our analysis using data obtained from the Bureau of Transportation Statistics at the US Department of Transportation. We test our hypotheses using data on flights flown within the continental US in the years 1995-2005.⁴ Our empirical analysis strongly supports our predictions about how decreasing unit costs via increasing aircraft utilization impacts on-time performance quality. We find that increasing aircraft utilization hurts on-time performance, and that this effect is even worse on routes with more unpredictable travel times or higher load factors, and mitigated on routes with greater capacity flexibility. Comparing across airlines, we find that airlines that are operating closer to their asset frontiers face significantly steeper tradeoffs between cost and quality: they pay a much higher penalty in terms of worsened on-time performance when they attempt to lower costs via increased aircraft utilization. These results are in accord with and build on the findings of Lapre and Scudder (2004). Importantly, we develop estimates of the extent to which increasing aircraft utilization affects on-time performance for different airlines, and we also develop estimates of the contingent effects of variability, capacity flexibility and load factors on this tradeoff.
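The “snake line” result mentioned in the introduction can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes Poisson arrivals and exponential service times (the textbook M/M/c queue and its Erlang C formula), which scheduled airline operations do not literally satisfy, and the function name and the numbers are purely illustrative.

```python
from math import factorial


def erlang_c_wait(arrival_rate, service_rate, servers):
    """Mean time a job waits in queue in an M/M/c system (Erlang C formula)."""
    offered = arrival_rate / service_rate      # offered load a = lambda / mu
    rho = offered / servers                    # utilization of each server
    if rho >= 1:
        raise ValueError("utilization must be below 1 for a stable queue")
    # Probability that an arriving job finds all servers busy and must wait.
    top = offered ** servers / factorial(servers) / (1 - rho)
    bottom = sum(offered ** k / factorial(k) for k in range(servers)) + top
    p_wait = top / bottom
    return p_wait / (servers * service_rate - arrival_rate)


# Two servers, each completing 1 job per hour; 1.6 jobs arrive per hour in total,
# so utilization is 80 percent in both configurations.
separate = erlang_c_wait(0.8, 1.0, 1)   # a dedicated line per server
pooled = erlang_c_wait(1.6, 1.0, 2)     # one shared line feeding both servers

print(f"separate lines: {separate:.2f} hours, pooled line: {pooled:.2f} hours")
```

At the same utilization, the pooled configuration has the shorter average wait, which is the sense in which greater capacity flexibility is expected to reduce delays.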
While closed form queuing models and queuing simulations have been used for almost half a century to schedule resources in a wide range of applications (Cachon and Terwiesch 2005), to the best of our knowledge there is no research that empirically examines the magnitude of queuing-related effects, such as the impact of capacity utilization, capacity flexibility and variability on average delays, in specific industrial contexts. For example, in a recent review of call center operations, Gans et al. (2003) plot actual waiting times as a function of agent utilization, yet these authors do not consider the use of regression analysis to simultaneously analyze the many distinct variables that impact waiting times. Knowing the magnitude of queuing-related effects is important as it provides insight into how best to make operational changes within a particular industry context. While queuing models help predict outcomes such as delays, in practice there are many different decisions that impact delays, several of which are not captured in the models. In our empirical approach, we use regression analysis to control for these other factors that may affect delays. By doing so, we are able to provide insight into how airlines might improve their on-time performance or, alternatively, increase the utilization of their fleets with the lowest on-time performance penalty. While airline managers understand the tradeoffs involved in managing aircraft utilization, models such as ours can help navigate these complex tradeoffs.

⁴ As explained later, due to data limitations we drop the year 2005 for certain carriers.

In Section 2 below, we develop our hypotheses. In Section 3 we describe the data and variables. In Section 4 we specify the model and discuss estimation. Section 5 contains a discussion of the results. Section 6 contains concluding remarks.

2. Hypothesis Development

A stream of research in strategy and operations has examined how a firm’s tradeoffs between cost, quality, delivery and flexibility depend on its strategic positioning. Porter (1996) distinguishes conceptually between price and non-price differentiation, where non-price differentiation includes attributes such as quality, delivery and flexibility. In any industry, he describes firms that offer any particular level of non-price differentiation at the lowest possible price as being on the productivity frontier of that industry. He suggests that firms that are on the frontier need to make tradeoffs between price and non-price differentiation, whereas those operating within the frontier can improve along both dimensions by improving operational effectiveness. Schmenner and Swink (2004) introduce the concept of asset frontiers, which are determined by structural choices that a firm makes, and they suggest that firms operating close to their asset frontiers need to make tradeoffs between performance dimensions, whereas firms operating further from their frontiers can advance along multiple dimensions.

Lapre and Scudder (2004) are the first to test these theories empirically, in the context of the airline industry. These authors measure an airline’s yearly unit cost as operating expenses per available seat mile, a popular industry metric, and quality as the number of consumer complaints filed with the US Department of Transportation. Examining nine major carriers over a ten-year period, they find that airlines operating closer to their asset frontiers, in terms of fleet utilization, face tradeoffs between cost and quality, whereas those operating further away from their asset frontiers are able to improve upon both dimensions simultaneously.

We build on the work of Lapre and Scudder (2004). Rather than focus on unit cost, we focus on a particular driver of unit cost, aircraft utilization,⁵ and examine the tradeoff between this cost driver and a directly related quality measure, on-time performance, as a function of a firm’s strategic positioning relative to its asset frontier.

⁵ It is quite reasonable to assume that an airline’s unit costs are decreasing in its aircraft utilization.

By viewing an airline network as a queuing system, we can derive some insight about the tradeoff between aircraft utilization and on-time performance. An aircraft can be viewed as a resource in a queuing system, with flights scheduled on the aircraft akin to jobs in a queuing system. The service time for a flight can be viewed as the total time spent in boarding, taxiing from the departure gate up until takeoff, flying, taxiing to the arrival gate after landing, and deplaning. Service time for a particular flight, e.g. the daily 6PM US Airways flight from Philadelphia to Pittsburgh, varies day-to-day due to variation in weather, airport congestion by day-of-week, etc. For an aircraft scheduled for several flights or “jobs” in the course of a day, service time also varies systematically across jobs, e.g. a flight from Philadelphia to San Francisco should take longer on average than the above flight. Similarly, there is variability in the arrival process for each flight or “job” in this queuing system. Consider again the 6PM US Airways flight above. Variability in actual departure time due to waiting for passengers from connecting flights is an example of arrival process variability. The queuing system described above is conceptually similar to a doctor’s office that admits patients by appointment only.

From queuing theory, it is reasonable to expect that, all else being equal, increased utilization would worsen on-time performance. Further, tying back to the notion that the nature of tradeoffs can differ depending on a firm’s positioning relative to its asset frontier, we expect this effect to be stronger for airlines operating closer to their asset frontiers.

H1: An airline’s on-time performance worsens as its aircraft utilization increases, and this negative effect is greater for airlines operating close to their asset frontiers.

To test this hypothesis empirically, we need some additional theory, in order to separate out the confounding effects of other variables that affect on-time performance. While increasing aircraft utilization should worsen an airline’s on-time performance, the magnitude of this effect may be higher or lower depending on other factors. One such mediating factor is relative variability in the route, i.e. the amount of variability in travel time relative to the mean travel time. As an example, compare an aircraft that spends all day every day flying roundtrips from Phoenix to Albuquerque in a clear desert climate with another that spends all day every day flying roundtrips from Minneapolis to Chicago, a route with about the same travel time but much less predictable weather. The second aircraft would exhibit greater relative variability in travel time. Queuing theory predicts that increasing utilization has a worse effect on waiting times when relative variability is higher (Ross 1996). Therefore, other things being equal, flights on the second aircraft are likely to experience greater delays than flights on the first aircraft as aircraft utilization increases.

H2: The negative effect of aircraft utilization on on-time performance is greater along routes with greater relative variability in travel time.
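The intuition behind H1 and H2 can be made concrete with a standard queuing approximation. The sketch below uses Kingman's approximation for the mean wait in a single-server G/G/1 queue; choosing this formula is an assumption made here for illustration, not the model the paper estimates, and the function name and inputs are hypothetical.

```python
def kingman_wait(utilization, cv_arrival, cv_service, mean_service_time):
    """Approximate mean wait in queue for a G/G/1 system (Kingman's formula)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must lie in [0, 1)")
    congestion = utilization / (1 - utilization)          # grows sharply near 1
    variability = (cv_arrival ** 2 + cv_service ** 2) / 2  # squared CVs, averaged
    return congestion * variability * mean_service_time


# Effect of raising utilization from 60 to 90 percent at two variability levels,
# with a mean service time of 1 hour.
for cv in (0.5, 1.0):
    low = kingman_wait(0.6, cv, cv, 1.0)
    high = kingman_wait(0.9, cv, cv, 1.0)
    print(f"cv={cv}: wait rises from {low:.2f} to {high:.2f} hours")
```

Under this approximation the wait is convex in utilization, so the same increase in utilization costs more when utilization is already high (in the spirit of H1), and the increase is scaled up by relative variability (in the spirit of H2).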
Another mediating factor that impacts how aircraft utilization affects on-time performance is the degree of capacity flexibility. Van Mieghem (2003) and Tayur, Ganeshan and Magazine (2000) review the literature on flexible capacity management. In general, more flexible capacity results in lower waiting times. In the airline context, aircraft come in different categories, such as jets vs. turboprops, which are further subdivided into aircraft families. For example, the Boeing 737, 767 and 777 are families of jets. Within each aircraft family or “fleet”, aircraft models have the same cabin configuration and crew-rating, where the latter refers to the certification required by crew members in order to work on an aircraft. Pilots are typically certified to fly only one aircraft family, while other flight crew can typically fly all families.

Capacity flexibility in the airline industry is a function of flight schedules, aircraft assignments and crew schedules. Flight schedules are developed a year prior to departure and updated every 3 months. Due to government regulations and contractual obligations, crew scheduling decisions are made well in advance of the departure date, e.g. 8 to 12 weeks out at United Airlines (Bish et al. 2004). Since aircraft assignments are an essential input into crew scheduling, these decisions must be made even earlier. In recent years, airlines have begun to adjust aircraft assignments a few weeks prior to the date of departure based on better demand information, to better match capacity with demand. This is done by choosing the aircraft closest to the right size given projected demand for a flight, and is called “demand-driven swapping” (e.g. Berge and Hopperstad 1993). More recently, researchers have developed procedures that allow for relatively late changes in aircraft assignments without changing the crew schedule, by only swapping aircraft within the same aircraft family (e.g. Bish et al. 2004, Sherali et al. 2005).

Aside from demand-driven swapping of aircraft in the weeks prior to departure, on-the-fly modification of aircraft routes also occurs frequently on the day of departure itself. When an aircraft is unexpectedly grounded⁶ or delayed, an airline is unable to fulfill its flight schedule. Operations managers attempt to minimize disruptions by modifying the flight schedule, changing the aircraft routing, and reassigning crew to flights.
For example, Continental Airlines’ System Operations Control Center, located at its headquarters in Houston, TX, monitors operations, tracks the execution of schedules, anticipates disruptions and determines the post-disruption recovery path (Yu et al. 2003). Arguello et al. (1997) describe a procedure to modify the aircraft routing in response to grounding and delays. As is typical in practice, they alter the routing by substituting for the aircraft originally assigned to fly any segment with other aircraft from the same “fleet” or aircraft family. Thus resource adjustments using flexible capacity are made both in the months and weeks preceding departure, and on the actual day of departure. Our focus is on the latter, i.e. the last-minute adjustments. For any airline, if many flights that use the same type of aircraft and crew as a particular flight are scheduled to depart close to it in time, capacity flexibility is greater. As a direct consequence of queuing theory, we expect last-minute capacity flexibility of the type discussed above to mediate the impact of aircraft utilization on on-time performance.

H3: The negative effect of aircraft utilization on on-time performance is mitigated along routes which exhibit greater capacity flexibility at airports along the route.

⁶ Grounding may be due to reasons such as unscheduled maintenance, or an FAA imposed “ground delay”, a restriction on capacity at an airport suffering bad weather, during which a reduced number of flight arrivals and departures are mandated for a certain time period.

The above three hypotheses enable us to examine how the tradeoff between unit cost, driven specifically by aircraft utilization, and on-time performance varies based on an airline’s strategic positioning. Reductions in unit cost can improve profitability, provided they do not hurt revenues. Another major lever that airlines can and do use to influence profitability is the extent to which their planes are full, measured by load factor. While airlines have made dramatic improvements in load factor over time, we hypothesize that increased load factor can hurt on-time performance, particularly when aircraft utilization is high. The logic is simple. With fuller planes, the time for boarding and deplaning should increase. For aircraft scheduled with shorter turnaround times and high utilization, the chances of not being able to turn a plane in time increase. All else being equal, if turnaround times are short and aircraft utilization is high, an increase in load factor is likely to have a worse effect on on-time performance than in the case of long turnaround times and lower aircraft utilization, because there is a smaller time window available in which to turn the plane.

H4: Increasing load factor will have a worse effect on on-time performance for highly utilized aircraft than for less utilized aircraft.

In Section 3, we discuss data and variables. While we tested our hypotheses for both on-time arrival and on-time departure performance, for space reasons we discuss only on-time arrival performance. Our results for on-time departure performance were qualitatively very similar.

3. Data and Variables

Each year, the Bureau of Transportation Statistics at the US Department of Transportation lists carriers that contribute at least one percent of total domestic scheduled-service passenger revenues as major carriers. In our analysis, we consider all carriers listed as major carriers for all or some of the years 1995-2005, excluding carriers which serve largely as a connector service to other major carriers, such as Comair, which serves as a connector for Delta Airlines. This leaves thirteen major carriers: American, Alaska, JetBlue, Continental, Delta, AirTran, America West, Northwest, ATA, TWA, United, USAir and Southwest. Of all flights flown in the continental US by these carriers in 1995-2005, we only use those flights for which we have information on the specific tail number of the aircraft that flew the flight.⁷ Based on discussions with airline industry experts, we drop flights with obviously bad data. These include flights with negative flying time or taxi time, taxi time in excess of 180 minutes, or delays in excess of 240 minutes.⁸ This results in dropping less than 1% of the flights for which we have tail number information.

Our unit of analysis is the individual flight, i.e. a non-stop flight connecting a pair of airports within the continental US on a particular day in the period 1995-2005.⁹ In what follows, for each observation let i denote a flight operated by a specific carrier. Let t denote the scheduled arrival time of the flight, where t also specifies the date of the flight. Note that i and t together specify a particular flight that occurred on a particular date and time. We will sometimes refer to this flight in short form as flight (i,t). Let k denote the route that was flown by the aircraft that was used for flight i with scheduled arrival time t, and l the partial route composed of the flights in route k that were scheduled to have been completed at time t.

⁷ The tail number, which is assigned by the Federal Aviation Administration, is a unique identifier for a particular aircraft. A significant fraction of those flights in the data for which a tail number is not reported are cancellations. In order to focus on aircraft that are utilized primarily for flights in the continental US, we drop all tail numbers equipped to fly over large water bodies. This data is only available for 1995-2004, so for carriers with international routes we use data from 1995-2004 only, and for carriers with only domestic routes we use data from 1995-2005.

⁸ In doing this, we lose far less than 1% of the upper tail of taxi times.

⁹ 1995-2004 for carriers with international routes.
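The screening rules described above can be sketched as a simple record filter. This is only an illustration: the field names (`tail_number`, `air_time`, `taxi_out`, `taxi_in`, `arr_delay`), the sample records, and the choice to apply the 180-minute cap to each taxi leg separately are assumptions made here, not details given in the paper.

```python
MAX_TAXI_MINUTES = 180    # threshold stated in the text
MAX_DELAY_MINUTES = 240   # threshold stated in the text


def keep_flight(flight):
    """Return True if a flight record passes the screening rules (all times in minutes)."""
    if not flight.get("tail_number"):
        return False                      # no tail number reported
    taxi_legs = (flight["taxi_out"], flight["taxi_in"])
    if flight["air_time"] < 0 or any(leg < 0 for leg in taxi_legs):
        return False                      # negative flying or taxi time
    if any(leg > MAX_TAXI_MINUTES for leg in taxi_legs):
        return False                      # implausibly long taxi time
    return flight["arr_delay"] <= MAX_DELAY_MINUTES


# Hypothetical records: only the first one passes every rule.
flights = [
    {"tail_number": "N101", "air_time": 55, "taxi_out": 12, "taxi_in": 5, "arr_delay": 10},
    {"tail_number": None, "air_time": 55, "taxi_out": 12, "taxi_in": 5, "arr_delay": 10},
    {"tail_number": "N102", "air_time": 55, "taxi_out": 200, "taxi_in": 5, "arr_delay": 30},
    {"tail_number": "N103", "air_time": 55, "taxi_out": 12, "taxi_in": 5, "arr_delay": 300},
    {"tail_number": "N104", "air_time": -3, "taxi_out": 12, "taxi_in": 5, "arr_delay": 0},
]
kept = [f for f in flights if keep_flight(f)]
```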

3.1. Dependent Variable

Arrival Delay: $arrdelay_{itkl}$ is the difference between flight i’s actual and scheduled arrival time.¹⁰

3.2. Explanatory Variables

Aircraft Utilization: An aircraft is typically scheduled to fly several flights each day. For each flight, the relevant measure of utilization in analyzing on-time arrival is the aircraft’s utilization up until the scheduled arrival time of the flight in question. Depending on how the flights flown over the day are scheduled for an aircraft, its utilization up until the time of each flight may go up and down over the day. Our measure of aircraft utilization captures this idea. For arrival delays, aircraft utilization $util_{itl}$ for flight i with scheduled arrival time t, using an aircraft that was scheduled to have completed flying partial route l at time t, is defined as the ratio of the average time spent taxiing and flying partial route l to the total time available from the scheduled departure time of the first flight on route l on the same date as the observation until the scheduled arrival time t. The average in the numerator of this ratio is taken over all aircraft belonging to the same carrier as the observation at hand that actually flew partial route l in the same month as the observation at hand. We average over the month in which the observation falls because utilization on any partial route may vary systematically by month, and we restrict the average to flights flown by the same carrier because taxiing and flying times may vary systematically by carrier on any partial route, for example due to the use of different aircraft types. For each carrier, we plot the distribution of the average number of flights in each partial route in a month, and drop all partial route-month combinations in which the number of flights is less than the maximum of 25 flights¹¹ and the 50th percentile of this distribution.¹²

¹⁰ Departure delay was similarly defined. As mentioned earlier, for space reasons we report results only for arrival delays. Further, consumers care about arrival delays, not departure delays.

¹¹ We also used 50 flights as a robustness check and obtained similar results.

¹² We also tried an alternative definition of aircraft utilization, in which the starting time in the denominator is 5AM, rather than the scheduled departure time of the first flight of the day. Using this definition gives qualitatively similar results, which we do not report.

As an example of how we calculate utilization, consider the following USAir partial route: PHL-PIT, PIT-CHO, CHO-PHL, depicted in Figure 1. Suppose we are looking at an observation on March 12, 2002, for which the scheduled departure time for the first flight in the route, PHL-PIT, was 7AM and the scheduled arrival time at PHL after the third flight was 12 noon. Then the denominator for the utilization calculation would be 5 hours, i.e. from 7AM to 12 noon. If the average flying and taxiing time¹³ for each of the three segments in this route was one hour in March, then utilization would take on a value of 3/5, i.e. 60 percent.

The Bureau of Transportation Statistics defines the “scheduled block time” for a flight as the difference between its scheduled arrival time and its scheduled departure time. We could alternatively have defined the numerator in the aircraft utilization calculation above as the sum of the scheduled block times for all flights flown on partial route l. However, our measure of aircraft utilization would then be biased upward relative to true utilization, to the extent that airlines include some padding in their scheduled block times in order to improve their on-time performance. Shumsky (1993) provides evidence that airlines do indeed buffer their schedules, so using scheduled times in the numerator would result in a biased measure of utilization. By measuring aircraft utilization as we do, we avoid this bias.¹⁴
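The PHL-PIT, PIT-CHO, CHO-PHL example can be reproduced with a few lines of code. This is a minimal sketch of the ratio as described in the text; the function and argument names are hypothetical, and the per-segment averages are passed in directly rather than computed from flight records.

```python
from datetime import datetime


def aircraft_utilization(avg_segment_minutes, first_sched_departure, sched_arrival):
    """Average taxi-plus-flying time on the partial route over the time available."""
    available = (sched_arrival - first_sched_departure).total_seconds() / 60
    if available <= 0:
        raise ValueError("scheduled arrival must be after the first scheduled departure")
    return sum(avg_segment_minutes) / available


# Three segments averaging one hour each, first departure at 7AM, arrival at noon.
util = aircraft_utilization(
    [60, 60, 60],
    datetime(2002, 3, 12, 7, 0),
    datetime(2002, 3, 12, 12, 0),
)
print(util)  # 0.6, i.e. 60 percent
```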

In addition, prior research has established that

there is little variation in airlines’ scheduled utilization, despite predictable weather patterns and daily demand patterns (Mayer and Sinai, 2003A). Relative Variability: relvar itl captures relative variability in service time, and is defined as the coefficient of variation of the total time spent taxiing and flying to complete partial route l.

15

The

variance and mean are calculated using all aircraft of the same carrier as the observation at hand that actually flew partial route l in the same month as the observation at hand, where the partial route l includes all flights that were scheduled to have been completed before or at time t.16 For ease of interpretation, we use standardized values for relative variability in the regression analysis. 13

Taken over USAIR flights only.

14

Note that we do not include the time taken for boarding and deplaning in this calculation. From interviews with several airline executives, we learned that the airlines omit boarding and deplaning times, and include taxiing and flying times as we do, in calculating aircraft utilization. 15

In another specification, we replaced relative variability by variance, and obtained nearly identical results, suggesting that within a route, differences in relative variability across observations are driven primarily by differences in variance. 16

We allow the relative variability to vary by month to capture monthly differences in weather patterns and congestion, and by carrier to capture systematic differences in relative variability across carriers along a particular partial route, due for example, to the use of different aircraft types.

14
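The utilization and relative variability definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the use of decimal clock hours, the sample travel times, and the choice of the population standard deviation in the coefficient of variation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def utilization(avg_block_hours, first_sched_departure, last_sched_arrival):
    """Average taxi-plus-flying time summed over segments, divided by the
    scheduled elapsed time of the (partial) route. Times are clock hours."""
    elapsed = last_sched_arrival - first_sched_departure
    return sum(avg_block_hours) / elapsed


def relative_variability(total_times):
    """Coefficient of variation: standard deviation divided by the mean.
    The population standard deviation is an assumption of this sketch."""
    return pstdev(total_times) / mean(total_times)


# Worked example from the text: three segments averaging one hour each,
# scheduled between 7AM and 12 noon.
util = utilization([1.0, 1.0, 1.0], 7.0, 12.0)
print(util)  # 0.6

# Hypothetical taxi-plus-flying totals (hours) for one carrier, partial route and month.
sample_totals = [2.8, 3.0, 3.2]
print(round(relative_variability(sample_totals), 4))
```

Using average actual times in the numerator, rather than scheduled block times, is what keeps schedule padding out of the measure.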

Capacity Flexibility: For arrival delays, $\mathit{flex}_{itl}$ is defined as the average number of aircraft of the same carrier and aircraft family as the aircraft used for flight (i,t), that were scheduled to depart in the same hour as the aircraft used for flight (i,t), at each relevant departure airport in partial route l, with the average taken over all departures scheduled prior to time t at airports in partial route l.17 For ease of interpretation, we use standardized values for capacity flexibility.

Load Factor: The load factor for flight (i,t), $\mathit{load}_{itkl}$, is defined as the average of the ratio of revenue passenger miles to available seat miles over all flight segments comprising partial route l, in the year and month in which date t falls, for the airline in question.18 For ease of interpretation, we use standardized values for load factor.

Aircraft Age: This variable measures the age of the aircraft. We include it as a control in case older aircraft are more prone to unexpected maintenance.

Aircraft Family: We use aircraft crew rating as a proxy for aircraft family, where crew rating is a dummy variable indicating the crew rating that certifies a pilot to fly the aircraft. As an example, a single crew rating applies to all aircraft in the Boeing 737 family. With only a few exceptions, each crew rating corresponds to one aircraft family. One exception is that the Boeing 757 and 767 families fall under a single crew rating. We use the aircraft family variable to control for any potential effects of aircraft size on delays. For example, our discussions with airline executives suggest that larger aircraft take longer to turn around. It is also possible that an airline may give priority to the larger aircraft in its takeoff queue, in a situation where multiple flights are delayed.

Route-Partial-Route Fixed Effect: For an observation concerning flight i with scheduled arrival time t, this fixed effect ($\mathit{FE}_{ikl}$) is an indicator variable that indicates the route k flown by the aircraft that was used for the flight, and the partial route l that was scheduled to have been completed by time t. For example, for the route CHO-PHL, PHL-PIT, PIT-PHL, PHL-CHO, CHO-PHL,19 there would be five such fixed effects, one for each flight in the route. It is reasonable to assume that an airline’s scheduling department takes account of the entire route, as well as the position of each particular flight in the route, when making scheduling decisions. For example, in the above route, the first and the last flights, although both from CHO to PHL, may be viewed very differently for scheduling purposes. This fixed effect takes account of these differences, while also controlling for haul lengths along the route, the position of each flight in the route (e.g. 1st vs. 5th flight of the day), and a number of other factors that remain fairly constant for a carrier over a route, such as the number of gates the relevant carrier operates at each airport along the route, their size, accessibility, etc.

In addition to the above variables, we use dummies for each year, month, day of week and departure time-of-day measured in four hour time intervals. Such dummies are common in airlines research, e.g. Mayer and Sinai (2003A, B), Mazzeo (2003).

Roth et al. (2008) highlight the importance of examining the validity and reliability of research measures. We find that the correlation between arrival and departure delays for each airline is 0.8 or higher, which indicates a high level of convergent validity. All major airlines today are equipped with an aircraft communications addressing and reporting system (ACARS), which is a digital datalink system for transmission of small messages between aircraft and ground stations via radio or satellite. ACARS systems automatically report taxi out, flying and taxi in times with high accuracy. Further, due to the highly regulated nature of the airline industry, all variables reported by the airlines to the Department of Transportation are closely scrutinized.20 Audit reports, e.g. Rehmann (1995), provide external evidence of high reliability of the reported measures.

17 In a second variant of this definition, we replace the average in the above definition by the minimum flexibility along the partial route. As a third variant, we replace the average above by the flexibility at the last scheduled departure prior to time t in partial route l. With both of these alternative definitions, we obtained qualitatively similar results. We report results only for the first measure.

18 Alternately, we also measure $\mathit{load}_{itkl}$ by the average of the ratio of revenue passenger miles to available seat miles for the last flight in partial route l, in the year and month in which date t falls. We obtained qualitatively similar results with either definition.
Table 1 shows descriptive measures of our key variables by carrier. We separate the carriers into two broad categories: low cost carriers and full service carriers.21 As can be seen in more detail in Figure 2, which shows average aircraft utilization22 over time by carrier, the low cost carriers, JetBlue, Airtran, America West, ATA and Southwest, are operating relatively closer to their asset frontiers, in terms of higher aircraft utilization, than the other major carriers. Figure 3 shows a histogram of arrival delays, which exhibits a typical waiting time distribution. Figure 4 shows average load factor by carrier over time, and we see a clear increasing trend in load factor.

19 CHO is short for Charlottesville, PHL for Philadelphia, and PIT for Pittsburgh.

20 e.g. see US General Accounting Office, Report to Congressional Requestors, GAO-02-710, July 2002.

21 This is similar to the classification used by the Department of Transportation.

22 In Table 1 and Figure 2, the average of utilization is computed using each aircraft’s entire day’s route, for each carrier.

4. Model Specification and Estimation

To test our hypotheses, we run separate regressions for each carrier. Notice that the dataset for each carrier can be viewed as a panel, in which each route and partial route combination represents an individual, with multiple observations for each individual over time. The advantage of this panel structure is that we can use fixed effects or within estimation (see Hayashi 2000) for each carrier, grouping by the route-partial-route fixed effects. If we were to instead run OLS without the route-partial-route fixed effects, one could think of the error term for OLS as being composed of two components. Following common convention, e.g. Evans and Kessides (1994), the first component could be assumed to remain constant over time and known to the carrier but not to the econometrician. The second component could be assumed to be i.i.d. over time and unknown to the carrier while making choices. Clearly, the presence of the first component of the error term would result in a violation of the OLS exogeneity assumption, since the error term would be correlated with the regressors. For example, if some determinants of delays, such as the average amount of congestion at airports along a route, were known to the carrier but not the econometrician, the carrier might take this information into account while scheduling utilization along a particular route. By using a fixed effects estimator, the portion of the error term that is constant over time and known to the carrier is captured by the route-partial-route fixed effects, and variations in the independent variables within a route over time are used to get consistent coefficient estimates.
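The argument above can be illustrated with a small simulation. The sketch below is not based on the paper's data: the panel dimensions, the coefficient values and all variable names are hypothetical. It generates a group-level effect that the decision maker observes and responds to, then compares pooled OLS with the within (group-demeaned) estimator and with a regression that includes one dummy per group.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_periods, true_slope = 200, 20, 2.0

# Time-constant group effect: known to the "carrier", unseen by the econometrician.
alpha = rng.normal(size=n_groups)
group = np.repeat(np.arange(n_groups), n_periods)

# The regressor is set partly in response to alpha, so the two are correlated.
x = 0.8 * alpha[group] + rng.normal(size=group.size)
y = true_slope * x + 3.0 * alpha[group] + rng.normal(size=group.size)


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with an intercept."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))


def group_demean(v, group, n_groups):
    """Subtract from each observation the mean of its own group."""
    sums = np.bincount(group, weights=v, minlength=n_groups)
    counts = np.bincount(group, minlength=n_groups)
    return v - (sums / counts)[group]


# Pooled OLS ignores the group effect and is biased away from the true slope.
pooled = ols_slope(x, y)

# Within estimator: OLS on group-demeaned data.
within = ols_slope(group_demean(x, group, n_groups),
                   group_demean(y, group, n_groups))

# Dummy-variable regression: one indicator column per group, no intercept.
dummies = np.zeros((group.size, n_groups))
dummies[np.arange(group.size), group] = 1.0
lsdv = float(np.linalg.lstsq(np.column_stack([x, dummies]), y, rcond=None)[0][0])

print(f"pooled={pooled:.3f} within={within:.3f} lsdv={lsdv:.3f}")
```

In this setup the within and dummy-variable slopes coincide up to numerical error, which is why demeaning is a practical substitute when the number of group indicators is very large.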
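For each carrier, the starting point for a fixed effects estimator is an equation of the type below, in which the dependent variable is arrival delay ($\mathit{arrdelay}_{itkl}$), and the independent variables are aircraft utilization ($\mathit{util}_{itl}$), relative variability ($\mathit{relvar}_{itl}$), capacity flexibility ($\mathit{flex}_{itl}$) and load factor ($\mathit{load}_{itl}$), as well as a control variable for aircraft age ($\mathit{age}_{it}$) and dummy variables for aircraft family ($\mathit{family}_{it}$), year ($\mathit{year}_{t}$), month ($\mathit{month}_{t}$), day-of-week ($\mathit{day}_{t}$) and departure time-of-day ($\mathit{depblock}_{it}$), and route-partial-route fixed effects ($\mathit{FE}_{ikl}$).

\[
\begin{aligned}
\mathit{arrdelay}_{itkl} = {} & \theta_0 + \mathit{util}_{itl}\,\bigl(\theta_1 + \mathit{relvar}_{itl}\,\theta_2 + \mathit{flex}_{itl}\,\theta_3 + \mathit{load}_{itl}\,\theta_4\bigr) \\
& + \mathit{age}_{it}\,\theta_5 + \mathit{family}_{it}\,\theta_6 + \mathit{year}_{t}\,\theta_7 + \mathit{month}_{t}\,\theta_8 + \mathit{day}_{t}\,\theta_9 \\
& + \mathit{depblock}_{it}\,\theta_{10} + \mathit{FE}_{ikl}\,\theta_{11} + \varepsilon_{itkl}
\end{aligned}
\]

In the above equation, $\varepsilon_{itkl}$ meets the exogeneity assumption required by OLS.23 However, a problem with the above equation is that the number of route-partial-route fixed effects $\mathit{FE}_{ikl}$ is very large, creating problems for OLS estimation. To deal with this problem, we transform the dataset as follows. For each observation, we subtract from each variable its mean value within its relevant carrier, route and partial route. This results in the following equivalent specification, in which an overbar denotes that within-group mean.

\[
\begin{aligned}
\mathit{arrdelay}_{itkl} - \overline{\mathit{arrdelay}_{itkl}} = {} & \beta_0 + \bigl(\mathit{util}_{itl} - \overline{\mathit{util}_{itl}}\bigr)\theta_1 + \bigl(\mathit{util}_{itl}\,\mathit{relvar}_{itl} - \overline{\mathit{util}_{itl}\,\mathit{relvar}_{itl}}\bigr)\theta_2 \\
& + \bigl(\mathit{util}_{itl}\,\mathit{flex}_{itl} - \overline{\mathit{util}_{itl}\,\mathit{flex}_{itl}}\bigr)\theta_3 + \bigl(\mathit{util}_{itl}\,\mathit{load}_{itl} - \overline{\mathit{util}_{itl}\,\mathit{load}_{itl}}\bigr)\theta_4 \\
& + \bigl(\mathit{age}_{it} - \overline{\mathit{age}_{it}}\bigr)\theta_5 + \bigl(\mathit{family}_{it} - \overline{\mathit{family}_{it}}\bigr)\theta_6 \\
& + \bigl(\mathit{year}_{t} - \overline{\mathit{year}_{t}}\bigr)\theta_7 + \bigl(\mathit{month}_{t} - \overline{\mathit{month}_{t}}\bigr)\theta_8 \\
& + \bigl(\mathit{day}_{t} - \overline{\mathit{day}_{t}}\bigr)\theta_9 + \bigl(\mathit{depblock}_{it} - \overline{\mathit{depblock}_{it}}\bigr)\theta_{10} + \bigl(\varepsilon_{itkl} - \overline{\varepsilon_{itkl}}\bigr)
\end{aligned}
\]

One can now run a simple OLS regression on the transformed model above, whose error term, $\varepsilon_{itkl} - \overline{\varepsilon_{itkl}}$, is i.i.d. and, as a result, satisfies the exogeneity assumption required for OLS to be valid.

23 Note that we do not include $\mathit{relvar}_{itl}$ and $\mathit{flex}_{itl}$ as direct effects, because queuing theory predicts that these variables impact delays only through their interaction with $\mathit{util}_{itl}$. See, for example, Hopp and Spearman (2000), page 270, for the equation for average waiting time in a G/G/1 queue, in which variability enters only as an interaction with utilization. Including as main effects variables that enter the regression as an interaction can reduce the significance of the coefficient of the interaction term, due to multi-collinearity. Econometric theory suggests that unless there is theoretical justification, variables that enter as interaction terms should not be entered separately as main effects (e.g. see Pindyck and Rubinfeld (1991), pages 164-165).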
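For reference, a widely used approximation for the mean time in queue of a G/G/1 system is Kingman's formula. We assume this is the kind of expression footnote 23 refers to; the symbols below are illustrative notation rather than the paper's.

```latex
\[
W_q \;\approx\; \left(\frac{c_a^2 + c_s^2}{2}\right)\left(\frac{u}{1-u}\right) t_s ,
\qquad
\frac{\partial W_q}{\partial u} \;\approx\; \left(\frac{c_a^2 + c_s^2}{2}\right)\frac{t_s}{(1-u)^2}
\]
```

Here $u$ is utilization, $t_s$ is the mean service time, and $c_a$ and $c_s$ are the coefficients of variation of interarrival and service times. Variability appears only as a multiplier of the utilization term, and the marginal effect of utilization grows as $u$ approaches one, which is consistent with entering variability only through its interaction with utilization and with expecting steeper tradeoffs at high utilization.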

In the above specification, the coefficients of all variables are identified only using variation in these variables within a route and partial route, for each carrier. This type of specification enables us to control for a number of variables that vary systematically across routes and partial routes, within each carrier. For example, an aircraft utilization of 80% can be achieved by flying two long flights with long layovers, or by flying several short flights with short layovers, each of which has very different implications for delays. Even if all flights were about the same distance, an aircraft utilization of 80% might be achieved via flying a few flights connecting airports notorious for long delays, or several flights connecting easy airports, again with different implications for delays. The route-partial-route fixed effects control for these as well as many other systematic differences.

Since flights are scheduled several months in advance of the departure date, average aircraft utilization along an aircraft’s route as well as the flexibility and variability that it will encounter along its route cannot be changed by the carrier on the day of the flight in response to an unobservable variable such as actual weather on the day of the flight, which is in the error term. Unobservable variables that affect delays can influence the setting of schedules only to the extent that they can be predicted months in advance of a flight. We expect that with the inclusion of year, month, day-of-week and departure time-of-day dummies, in addition to our route-partial-route fixed effects, our variables of interest will be largely uncorrelated with the error term. Therefore the within estimation procedure (Hayashi 2000) is appropriate.

Our observations display a correlated error structure, since any unobservable that results in the delay of one flight affects delays on all subsequent flights flown in that day by the same aircraft. For example, if the first flight of the day for an aircraft is delayed due to engine trouble, all subsequent flights will also likely be affected by this event. To account for this error structure, we calculate robust standard errors (Greene 2003). The results of our within estimation regressions using robust standard errors are reported in Table 2.
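The text does not specify which robust variance estimator was used. As one possibility, the sketch below computes a cluster-robust sandwich estimate, treating all flights flown by the same aircraft on the same day as one cluster. The clustering choice, the function name and the simulated data are assumptions of this example, and no small-sample correction is applied.

```python
import numpy as np


def ols_cluster_robust(X, y, clusters):
    """OLS coefficients and cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(clusters):
        rows = clusters == c
        score = X[rows].T @ resid[rows]  # sum of x_i * u_i within the cluster
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))


# Hypothetical data: 300 aircraft-days with 4 flights each. A shared shock per
# aircraft-day makes the errors of flights on the same day correlated.
rng = np.random.default_rng(0)
aircraft_day = np.repeat(np.arange(300), 4)
shock = rng.normal(size=300)[aircraft_day]
util = rng.normal(size=aircraft_day.size)
delay = 5.0 + 3.0 * util + shock + rng.normal(size=aircraft_day.size)

X = np.column_stack([np.ones_like(util), util])
beta_hat, se_hat = ols_cluster_robust(X, delay, aircraft_day)
print(beta_hat, se_hat)
```

When every observation is its own cluster, this reduces to the heteroskedasticity-robust (White) estimator, so the same function covers both cases.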

5. Results

Table 2 contains the results of our regression analyses by carrier, for the thirteen carriers included in our study. Across all thirteen carriers, we find as hypothesized that the coefficient of aircraft utilization is positive and highly significant. Also, as noted earlier from Table 1, certain carriers, notably Southwest, JetBlue and Airtran, are operating closer to their asset frontiers than the other carriers, in terms of having higher aircraft utilization. In Table 2, we see that for these carriers, increasing aircraft utilization has a significantly worse impact on on-time performance than for carriers that are operating further away from their asset frontiers, providing unambiguous support for hypothesis 1. Further, we find that for twelve out of thirteen carriers, the coefficient of the interaction of aircraft utilization with relative variability along the route is positive and highly significant, providing strong support for hypothesis 2. Also, the coefficient of the interaction of aircraft utilization with capacity flexibility is negative and highly significant for ten out of thirteen carriers, providing strong support for hypothesis 3.24 Thus increasing aircraft utilization results in greater delays, and at any level of utilization, greater relative variability along the route further exacerbates delays, while greater capacity flexibility reduces delays. The regressions in Table 2 also show that the interaction of utilization and load factor is positive and significant for twelve out of thirteen carriers, providing strong support for hypothesis 4. Increasing load factor leads to greater delays when utilization is high than when utilization is low. Only for JetBlue does increasing load factor have no significant impact on delays. From Table 1 and Figure 4, we see that while JetBlue’s load factor is consistently about 10 percent higher than for all other carriers, it has the lowest standard deviation, which might explain the insignificant coefficient.

24 JetBlue showed a positive and significant coefficient for this term. However, when the alternative definition of flexibility was used, the coefficient was negative and significant. Continental and America West showed insignificant coefficients.

In constructing our utilization measure, the numerator of utilization is calculated as the mean time required for a carrier to complete the partial route of interest in a particular month. As with any statistic, sampling error may be of concern. To demonstrate that this sampling error is not the source of the results, we performed an identical set of within regressions using a larger number of data points in computing the average utilization. Specifically, for each carrier, we drop all route-month combinations in which the number of flights is less than the maximum of either 50 flights (rather than 25) or the lowest 50th percentile of flights in that route-month.25 We find nearly identical results using this robust sub-sample, providing further support for our findings. In the discussion to follow, we focus on the results using the full sample.

Consider next the magnitude of the effects. For brevity, we pick one full service carrier, American, and one low cost carrier, Southwest. For American, from column 1 in Table 2, the coefficient of aircraft utilization for any fixed level of relative variability along the route, capacity flexibility and load factor can be written as (35.201 + 1.993*RELVAR - 1.425*FLEX + 0.889*LOAD). Since we used standardized values for relative variability, capacity flexibility and load factor, 35.201 represents the coefficient of aircraft utilization when the other three variables are at their mean levels, zero. Further, in this scenario, a one standard deviation increase in aircraft utilization results in a 3.978 minute increase in average arrival delays. On the other hand, for Southwest, when relative variability, capacity flexibility and load factor are at their mean levels, a one standard deviation increase in aircraft utilization results in a 9.110 minute increase in average arrival delays. Thus for a similar increase in average utilization, Southwest pays more than twice the penalty in terms of worse on-time performance. For American (Southwest), with aircraft utilization at its mean level, a one standard deviation increase in relative variability along the route, keeping capacity flexibility and load factor at their mean levels, raises expected arrival delays by 1.816 (0.308) minutes, while a similar one standard deviation increase in capacity flexibility lowers expected arrival delays by 1.298 (0.629) minutes, and a one standard deviation increase in load factor increases expected arrival delays by 0.810 (1.101) minutes.
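The expression for American's utilization coefficient can be evaluated directly. The sketch below uses only the coefficients quoted in the text; the dictionary keys and function name are hypothetical, and the "implied" standard deviation is simply backed out of the quoted 3.978 minute effect rather than taken from Table 1.

```python
# Coefficients for American as quoted in the text (Table 2, column 1).
AMERICAN = {
    "util": 35.201,
    "util_x_relvar": 1.993,
    "util_x_flex": -1.425,
    "util_x_load": 0.889,
}


def utilization_slope(coef, relvar=0.0, flex=0.0, load=0.0):
    """Coefficient of utilization at given standardized levels of relative
    variability, capacity flexibility and load factor (0 = mean level)."""
    return (coef["util"]
            + coef["util_x_relvar"] * relvar
            + coef["util_x_flex"] * flex
            + coef["util_x_load"] * load)


# At mean levels of the three moderators the slope is the main-effect coefficient.
base = utilization_slope(AMERICAN)

# One standard deviation more relative variability steepens the slope,
# one standard deviation more capacity flexibility flattens it.
more_variable = utilization_slope(AMERICAN, relvar=1.0)
more_flexible = utilization_slope(AMERICAN, flex=1.0)

# Standard deviation of utilization implied by the quoted 3.978 minute effect.
implied_sd_util = 3.978 / base

print(base, round(more_variable, 3), round(more_flexible, 3), round(implied_sd_util, 3))
```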
The size of these effects compares favorably with the impact of other important determinants of airline delays that have been studied in past research. For example, Mazzeo (2003) examines the impact of competition on flight delays, using a solo dummy to indicate whether or not a route is flown by only one carrier, and a market share variable defined as the percentage of flights on a route that are flown by a carrier. Based on Tables II and III in Mazzeo (2003), a one standard deviation increase in percentage of solo routes would increase average delays by 0.66 minutes, and a one standard deviation increase in a carrier’s market share on a route would increase average delays by 1.18 minutes.

25 This results in a smaller dataset for each carrier. Also, for space reasons, we do not report these results.

Clearly, airlines can improve their on-time performance by reducing aircraft utilization. Our discussions with managers reveal that reductions in aircraft utilization come from creating padding in an aircraft’s schedule, either by increasing the scheduled block time for flights on its route, or by increasing the scheduled turnaround times. Executives at one airline mentioned to us that when adding padding to scheduled block times, they examine the distribution of actual block times using historical data, and pick the scheduled block time for each flight so as to obtain what they consider to be a desirable service level. This analysis ignores the cascading effect of delays over the course of the day. On the other hand, airline managers also at times increase scheduled turnaround times, which reduces the impact that each flight’s actual block time has on the on-time performance of subsequent flights.

While reducing utilization improves on-time performance, our discussions with airline executives reveal a number of costs associated with reducing aircraft utilization, be it by padding scheduled block times or turnaround times. One major cost is the cost of additional aircraft that are needed if the airline intends to maintain its set of flight offerings, as padding of the schedule eats into precious capacity that would otherwise be used to schedule more flights. Airline managers take the carrier’s entire flight schedule into consideration to evaluate this implication of padding.
For example, managers at one airline told us that a ten minute padding added to the turnaround time on each of their flights would result in about 40 minutes taken out of each aircraft’s daily route. Cumulating this lost time over just a few aircraft would cover the time required for one scheduled flight. Extending this logic, an across-the-board increase in turnaround time by 10 minutes would result in the need for three extra aircraft for the fleet, to maintain the same set of flight offerings. If the airline were to decide not to purchase or lease extra aircraft, then it would forego the revenues from the flights that would be eliminated due to padding. As an example, a manager at one airline shared a simple model that estimates lost revenues per day from not flying an aircraft, as a function of the aircraft’s characteristics (seat capacity and cruise speed), and airline characteristics such as revenue per available seat mile, average segment length, and aircraft utilization. In Table 3, using this model, and data from JetBlue’s 2004 annual report, the lost revenue from not flying a 150-seat Airbus A320 is estimated at 61,710 dollars per day. In addition to the costs of either additional aircraft or lost revenues, padding of turnaround times would also significantly increase ground costs, such as the cost of leasing gates, which can be as high as one million dollars for one gate for one year.

Of course, these costs of reducing utilization, although significant, need to be compared with the cost of incurring delays. Delay costs vary based on whether delays are incurred on the ground or in the air. Both of these types of delays have certain costs in common, in particular the costs of accommodating passengers who have missed connections: hotel accommodations, taxi receipts and meal vouchers, and rebooking costs. The costs associated with lost future revenues due to passengers switching to other carriers due to poor service are also in common. In addition, with airborne delays the airline incurs additional crew costs26 and additional fuel costs, while with ground delays the airline incurs additional ground staff and gate costs.

We learned from our discussions with airline executives that while they do often try to pad the schedules on some routes, at the same time a major issue they struggle with is how to increase overall aircraft utilization, to create more revenue generating opportunities from their extremely expensive asset base. As with decreasing utilization, increasing utilization requires navigating the same tradeoff between operating costs and delay costs. Our regression-based approach can help airline managers in the complex task of aircraft scheduling, by providing a way to estimate the impact on delays of increasing utilization along specific routes. As discussed above, evaluating the schedule on a flight-by-flight basis does not take into account the cascading impact of delays. Our approach builds in the cascading impact in examining the impact of altering an aircraft’s flight schedule. Our approach also links the on-time performance penalties from increasing utilization to an airline’s strategic positioning.

26 At many carriers, crew are paid based on the maximum of actual and scheduled block time.

6. Concluding remarks

In this study, we test whether the strategic positioning of airlines relative to their asset frontiers impacts the tradeoff between cost and quality, specifically examining the tradeoff between aircraft utilization and on-time performance. We find that increases in daily aircraft utilization are a significant source of flight delays, and that on-time performance worsens to a greater degree when aircraft utilization is increased by carriers operating close to their asset frontiers, than for those operating further away from their asset frontiers. This finding is in accord with the findings of Lapre and Scudder (2004), and suggests that firms need to plan their efforts to reduce costs in a way that is consistent with their strategic positioning, in order to avoid unforeseen penalties in terms of worsened quality. Further, we find that for any level of aircraft utilization, delays are significantly higher when the relative variability in travel times on the route is higher. Also, for any level of aircraft utilization, we find that delays are significantly lower if there is greater capacity flexibility on the route. To our knowledge, ours is the first study to empirically test the impact of changes in resource utilization, capacity flexibility, system variability and system load on waiting times, in a specific industry context. An important contribution of our work is that we are able to assess the magnitude of the impact of different operational decisions, such as aircraft utilization, capacity flexibility, variability along the route and load factor, on on-time performance. Our analysis also helps understand the drivers of differences in on-time performance on particular flight segments. 
Understanding how changes in operational variables such as aircraft utilization, variability in taxiing and flying time, capacity flexibility and load factor impact expected delays on a flight segment can be valuable to an airline, particularly when it is competing head to head with another carrier on the same flight segment.

Passengers, and therefore airlines, care a great deal about flight delays, and also about flight cancellations. We are unable to examine how aircraft utilization, variability in taxiing and flying time, and capacity flexibility impact the chances that a flight will be cancelled, as we do not have access to tail number information for all cancelled flights. Without this information one cannot track what the aircraft involved in a cancellation was doing prior to the cancellation. In this paper we take airlines’ scheduling decisions as given, and control for factors that influence these decisions. Future research could go one step further and examine the drivers of schedule choice itself.

Acknowledgments

We are grateful to the Bureau of Transportation Statistics at the US Department of Transportation for providing us with the data used in this study, and in particular to Don Bright and his staff for many useful discussions. For sharing their industry perspective, we are grateful to Tim Jacobs and Hadi Purnomo at American Airlines, Ted Wang at USAirways, Esra Orhon and colleagues at Continental Airlines, Carolyn Deng and colleagues of Alaska Airways, Stuart Thomas and colleagues at Southwest Airlines, and Garth Monroe formerly at JetBlue. For useful comments, we are grateful to the departmental editor, the associate editor, and two anonymous referees. For useful discussions, we are also grateful to Sam Bodily, Federico Ciliberto, Peter Debaere, Sanjay Jain, Marvin Lieberman, Michael Meyer, Eve Rosenzweig, Robert Shumsky, Steven Stern, Christophe van den Bulte and seminar participants at the 1st Wharton Empirical Conference, Georgia Tech, London Business School, INSEAD, Kellogg, UNC Chapel Hill, Chicago and Dartmouth. We are also grateful to Paul Landefeld for assisting in data collection.


References

Arguello, M., J. Bard and G. Yu. 1997. A GRASP for Aircraft Routing in Response to Groundings and Delays. Journal of Combinatorial Optimization. 5 211–228.
Berge, M. and C. Hopperstad. 1993. Demand driven dispatch: A method for dynamic aircraft capacity assignment, models and algorithms. Operations Research. 14 153–168.
Bish, E., R. Suwandechochai and R. Bish. 2004. Strategies for Managing the Flexible Capacity in the Airline Industry. Naval Research Logistics Quarterly. 51 654–685.
Cachon, G. and C. Terwiesch. 2002. Matching Supply with Demand: An Introduction to Operations Management. The McGraw-Hill Companies, Inc.
Evans, W. and I. Kessides. 1994. Living by the “Golden Rule”: Multimarket Contact in the U.S. Airline Industry. Quarterly Journal of Economics. 109 341–366.
Gans, N., G. Koole and A. Mandelbaum. 2003. Telephone Call Centers: Tutorial, Review, and Research Prospects. Manufacturing & Service Operations Management. 5 79–141.
Greene, W. 2003. Econometric Analysis, 5th edition. Prentice Hall.
Hayashi, F. 2000. Econometrics. Princeton University Press.
Hopp, W. and M. Spearman. 2000. Factory Physics: The Foundations of Manufacturing Management. McGraw Hill Publishers, 2nd edition.
Jacobs, T., R. Ratliff and B. Smith. 2000. Soaring with Synchronized Systems: Coordinated scheduling, yield management and pricing decisions can make airline revenue take off. ORMS Today.
Mayer, C. and T. Sinai. 2003B. Why Do Airlines Systematically Schedule Their Flights to Arrive Late? Working Paper: The Wharton School, University of Pennsylvania.
Mazzeo, M. 2003. Competition and Service Quality in the US Airline Industry. Review of Industrial Organization. 22 275–296.
New York Times Business Desk. Northwest Ranked Most Tardy Carrier. New York Times, November 4 2005.
Pindyck, R. and D. Rubinfeld. 1991. Econometric Models and Economic Forecasts. McGraw Hill Press, 3rd edition.
Ross, S. 1996. Stochastic Processes. John Wiley Series 2nd edition, New York.
Roth, A.V., R. G. Schroeder, X. Huang and M. M. Kristal. 2008. Handbook of Metrics for Research in Operations Management: Multi-item Measurement Scales and Objective Items. Sage Publications.
Sherali, H., E. Bish and X. Zhu. 2005. Polyhedral Analysis and Algorithms for a Demand-Driven Refleeting Model for Aircraft Assignment. Transportation Science. 39 349–366.
Shumsky, R. 1993. Response of U.S. Air Carriers to On-Time Disclosure Rule. Transportation Research Record. 1379 9–16.
Talluri, K.T. and G.J. van Ryzin. 2004. The Theory and Practice of Revenue Management. Kluwer Academic, Norwell, Mass.
Taylor, H. and S. Karlin. 1984. An Introduction to Stochastic Modeling. Academic Press, Inc.
Tayur, S., R. Ganeshan and M. Magazine. 2000. Quantitative models for supply chain management. Kluwer Academic, New York.
Van Mieghem, J. 2003. Capacity Management, Investment, and Hedging: Review and Recent Developments. Manufacturing & Service Operations Management. 5 269–302.
Yu, G., M. Arguello, G. Song, S. McCowan and A. White. 2003. A New Era for Crew Recovery at Continental Airlines. Interfaces, January-February. 5 5–22.

Figure 1: Calculation of Utilization
Figure 2: Aircraft Utilization
Figure 3: Arrival Delays Density
Figure 4: Load Factor
Table 1: Descriptive Statistics
Table 2: Regression Results
Table 3: Estimating Revenue per Aircraft Per Day