GE Global Research

Cloud Computing for Air Traffic Management - Cost/Benefit Analysis

Liling Ren, Benjamin Beckmann, Thomas Citriniti and Mauricio Castillo-Effen

2014GRC763, July 2014

Public (Class 1)

Technical Information Series

Copyright © 2014 General Electric. Published by the American Institute of Aeronautics and Astronautics, Inc. with permission.


GE Global Research Technical Report Abstract Page

Title: Cloud Computing for Air Traffic Management - Cost/Benefit Analysis

Author(s): Liling Ren, Benjamin Beckmann, Thomas Citriniti, Mauricio Castillo-Effen

Component: Supervisory Controls & Sys Integration Laboratory, Niskayuna

Report Number: 2014GRC763

Date: July 2014

Number of Pages: 22

Class: Public (Class 1)

SSO: 200018361, 200018737, 200019935, 200016210

Key Words: Air Traffic Management, Cloud Computing, Cost Analysis

Abstract: A study of recent developments in the field of cloud computing indicated that, while challenges exist, with the proper alignment of technical and investment decisions, the transition of Air Traffic Management functions to the cloud computing environment can be achieved much faster than one might expect, and that it is already happening for applications in air transportation. This paper demonstrates a framework for transitioning Air Traffic Management functions to cloud computing for cost savings and for gains in performance and efficiency. An application modeled after a dynamic flow management system was used in a case study to identify the changes required by, and the cost associated with, transitioning the application to the cloud computing environment. The operating cost of the application hosted in the cloud computing environment was analyzed. The analysis revealed that significant operating cost savings can be achieved through this transition, along with a system that will benefit future upgrades and developments.

Manuscript received Jul 28, 2014


I. Introduction

Cloud computing technologies are intended to increase the efficiency of applications, their upgrades, and their operations, and to reduce duplication of development and data storage. These computing concepts enable the locations of use, processing, and storage to be separated or loosely coupled. Such separation allows for effective and consistent access to the same capability at multiple locations (or by multiple entities) in a synchronous or asynchronous manner. Further, modifications and upgrades of the capabilities need to be done only once, enabling reduced downtime of aircraft, Air Navigation Service Provider (ANSP), or aircraft operator Operations Control Center (OCC) capabilities for upgrades of software, models, databases, and other functions that may be accessed through a network. Such separation also allows minimal processing, storage, and software to reside at the location of use, and enables the majority of functions to be located away from the location of use, either at a fixed base or in a mobile environment, while remaining accessible on demand. The connectivity and sharing of computational resources provide scalable and virtually unlimited capacity for a particular application. A reliable and secure networking capability is key to the successful utilization of these advanced computing concepts for safety critical Air Traffic Management (ATM) applications.

As a result of these advanced networking concepts, different functional allocation schemes are possible within the decision triad of the OCC, the ANSP, and aircraft. Thus far, limited research exists that compares network-centric functional allocation strategies and their impacts on the total cost of ATM operations. In an earlier paper1, the authors presented a framework for transitioning ATM functions to cloud computing for cost savings and for gains in performance and efficiency. With the established framework, an initial analysis was carried out for NAS automation systems. The analysis indicated that it is technically feasible to transition most ATM functions to the cloud computing environment, with benefits significant to both the system owner and NAS end users. A study of recent developments in the field also revealed that, while challenges exist, with the proper alignment of technical and investment decisions, the transition of ATM functions to the cloud computing environment can be achieved much faster than one might expect, and that it is already happening for applications in air transportation.

This paper presents a case study to demonstrate the framework for transitioning ATM functions to the cloud computing environment. By examining several cloud based concepts of alternative function allocation schemes2, a dynamic flow management application (also referred to as the "Alpha System") was selected as the candidate for the case study. The application was modeled after a system that was locally installed at the 20 Air Route Traffic Control Centers (ARTCCs) in the National Airspace System (NAS). Such an application is a good example of the large scale, complex ATM systems in the NAS, and thus demonstrates well the benefits of large scale resource pooling provided by cloud computing. Unlike Air Traffic Control (ATC) automation systems such as En Route Automation Modernization (ERAM), such a flow management system is not used for separation management and has a much longer decision time horizon. For this reason, it could technically be a candidate for early adoption.
Additionally, the legacy design of the system imposes some specific challenges to the transition to the cloud computing environment; thus the study provides worthwhile insights into the transition of ATM systems in general. Through the case study, the concept of the cloud based Alpha System was introduced. The changes required by, and the cost associated with, transitioning the application to the cloud computing environment were identified. The operating cost of the cloud-based Alpha System was analyzed using tools provided by a major cloud service provider. To facilitate the discussion, background information on current ATM operations and on the potential of cloud computing is reviewed first, before the details of the methods used in, and the results from, the case study are presented.

II. Background

This section presents an overview of current ATM operations in the NAS and an overview of the potential benefits of cloud computing. This overview paints a basic landscape of where cloud computing capabilities may help.

A. Current ATM Operations
The current NAS architecture can be illustrated by a dependency map of ATM functions in the NAS, as shown in Figure 1. This map was developed based on a review of existing and planned NAS automation systems. Details of this review and of each of the ATM functions can be found in Ref. 3. Acronyms used in the map can be found in the Appendix. Dependency refers to requiring input from, or relying on the functionality of, another function. Dependencies between two ATM functions may be unidirectional or bidirectional. In the figure, ATM functions are first grouped by their owner entity in the decision triad of ANSP, OCC, and Aircraft. ANSP functions are further grouped by major systems for operations in different domains. Dependencies may represent connections via different communication channels, including air-ground voice, air-ground data link, commercial data link, and


ground network. Connections may exist between functions within the same system and between systems owned by the same entity. They may also exist between functions within different systems owned by different entities, e.g., between a function in an ANSP system and a function in the OCC system. In Figure 1, the grouping of functions and sub-functions for the ANSP is done by aligning functions representing the same major system vertically, where possible. Dependencies between functions or sub-functions within the same major system are denoted by solid black arrows, while dependencies between systems or functions within different systems are denoted by dark blue arrows. All the ANSP systems depend upon the Federal Aviation Administration (FAA) Telecommunications Infrastructure (FTI) and System Wide Information Management (SWIM). This may not necessarily reflect the current state of all existing systems, because SWIM is still under development, but it does represent FAA's vision of the Next Generation Air Transportation System (NextGen) network environment.

As seen in the dependency map, many connections exist. However, many of these connections are not integrated with automation. This inevitably creates barriers to fully utilizing the capabilities of existing systems and causes operational inefficiencies. It should also be noted that disparity, gaps, and complexity in communication are not the only issues in current ATM operations. Ideally, all the ANSP, OCC, and airborne systems should rely on the same true and complete picture of the operational environment to make informed and robust decisions. This true and complete picture represents: 1) the configuration and condition of the airspace infrastructure; 2) atmospheric and weather conditions; 3) traffic demand and situation; and 4) the behavior of individual flights and of aggregated air traffic as a whole. Traditionally, each of these systems has been developed with its own processes creating its own version of the picture, with discrepancies among assumptions, input data, and capabilities. Often, these supporting processes cost much more to develop and operate than the core decision support capabilities, not to mention the many conflicts and inconsistencies in decision making associated with this issue. Figure 1 provides a means to investigate this issue and, jointly with the analysis of advanced computing technologies, provides insights into potential improvements that may be achieved by leveraging advanced computing technologies, as discussed later in this document.

In summary, the current NAS architecture can be characterized by:
• Traditional ownership – ANSP, OCC, and Aircraft own their functions
• Systems installed at facilities where users reside
• Stove-piped systems and rigid architectures tied to hardware and software platforms
• Duplications across systems from the same and different entities
• Functionality and performance gaps in individual systems

Bounded by this NAS architecture, ATM operations in the NAS are characterized by:
• Responsive and tactical decision making with limited Four-Dimensional (4D) scope
• Inconsistent decision making across different systems and entities, or a lack of decision making process in some cases



Figure 1. ATM function dependency map.

B. Potential Benefits of Cloud Computing
Today, advanced network based cloud computing environments4,5 can deliver integrated computational capabilities as a service using shared resources supported and maintained by third parties. The commoditization of these types of systems is a leap forward from traditional computation networks, and removes the need for users to learn the technical specifics of the infrastructure, to financially invest in the infrastructure, and to perform infrastructure support and maintenance, in order to achieve the full benefits offered by such technology. Extending the capabilities of big infrastructure computing to a user's fingertips is essential to acceptance and effective utilization. Thin clients and portable field devices connect a human to the infrastructure in an acceptable and convenient way. These edge resources augment backend storage and processing resources by enabling the presentation and interpretation of data, user connectivity through secure channels, and work log auditability for certification and accreditation purposes. Moreover, an intelligent field device can self-provision based on context, such as user role or location, to support specific job functions, promoting device reuse to minimize hardware costs.

As discussed above and in a wide selection of recent literature, as an integrated solution, cloud computing provides many benefits. Within the context of ATM operations, these benefits can be summarized as:
• Lower time and cost barrier to deployment
• Reduced and predictable operating cost, including maintenance cost
• Scalable and elastic capacity
• Increased availability
• Improved life cycle development efficiency
• Enabling otherwise impossible concepts
• Standardized process/resource alignment
• Accelerated pace of technology transition

The potential benefits and benefit mechanisms listed above generally apply to any application employing cloud computing, but some of them are of vital significance to NextGen ATM operations. For instance, ATM technology transition has traditionally been a lengthy process, fraught with delays and budget overruns. With cloud-based system


integration architecture in place, newly developed technologies can be quickly integrated into ANSP facilities for operational evaluation with little to no upfront capital investment in infrastructure. Newly developed technologies can also deliver common services, allowing wider use of the technology without duplicate investments in infrastructure. These advantages can arise from a cloud-based deployment strategy: no installation, setup, and deployment of new backend processing infrastructure in facilities; reuse of self-contained functional components as services; and the portability and mobility of thin client frontend capabilities.

III. The ATM Cloud Transition Framework

For the sake of completeness, this section provides a brief review of the framework for transitioning ATM functions to the cloud computing environment. This framework includes an attribute-based analysis approach; using this approach, candidate functions for transition were identified. Based on this analysis, a high level NAS reference architecture was developed. A framework for analyzing specific function transitions was also included. Additional details of the analyses and results can be found in Refs. 1, 3, 6, and 7.

A. Attribute-based Analysis Approach
Attribute-based allocation uses reasoning frameworks (whether qualitative or quantitative) based on quality attribute-specific models8. Attribute-based allocation is especially suitable for identifying candidate ATM functions for transitioning to the cloud computing environment, because it is not only attribute based but also attribute specific; this provides the reasoning behind system decisions. Attribute-based allocation can also aid in discovering design tradeoffs. The approach used in this task is, however, not a strict instantiation of the architecture described in Ref. 8; rather, it is an application of the general principle to the specific problem to be addressed. Specifically, the analysis approach includes an attribute characterization of ATM functions at the ANSP, OCC, and Aircraft, and an attribute characterization of cloud computing technologies. The following two subsections provide a summary of this analysis; details can be found in Ref. 6.

1. Attribute Characterization of ATM Functions
The list of ATM functions shown in Figure 1 serves as a list of abstracted services offering given technical capabilities. An ATM function attribute schema was developed and refined. Individual ATM functions were analyzed to determine their characteristics according to the attribute schema, with a focus on those that are most significant to the transition to the cloud computing environment. Among all attributes, the Decision Time Horizon and the Criticality were identified as the most significant:
• Decision Time Horizon. Possible attribute values span the Planning, Strategic Traffic Flow Management (TFM), tactical ATC, and monitoring and analysis phases of ATM operations. Using time units, this attribute ranges from years, to months, weeks, days, hours, minutes, seconds (including subseconds), and post operations.
• Criticality. This is defined by combining several aspects of ATM operations. Possible values are defined as:
  o Class 1: Safety Critical.
  o Class 2: Wide Impact. Operation of the function itself may not have direct safety consequences, but it is depended upon by many other functions, including potentially safety critical functions.
  o Class 3: Non-safety Critical. However, effects accumulated over time may bring the system to a tipping point and seriously impact system wide efficiency and performance.
  o Class 4: Non-operationally Critical. Failure of these functions has no impact on the current day of operations.

Attribute values were determined by characterizing the operations of ATM functions and analyzing the requirements for Mission Services defined in FAA's "National Airspace System Requirements Document" (NAS-RD-2012)9. Where a mapping was found, the values from the NAS requirements were incorporated; for other functions, input from subject matter experts was incorporated.

2. Attribute Characterization of Cloud Computing Technologies
A cloud computing attribute schema was also developed and refined. Corresponding to the attribute characterization of ATM functions, a set of the most significant computing attributes was selected. A combination of qualitative analysis and experiments was employed to determine attribute values. The most significant attributes for cloud computing include Latency, Locality, Availability, and Security.


• Latency and Locality. Cloud resource access latency and locality are two related attributes. All else being equal, the farther the cloud server is from the edge device, the longer the latency. In practice, allocating cloud resources in regions close to major customer groups has been a means to guarantee latency, among other considerations. Ultimately, round-trip latency is what an application or user cares about; one-way network link latency is an important attribute of the infrastructure.
• Availability. For continuously operating applications, availability is a fundamental service quality attribute. A related attribute is restoration time.
• Security. Security concerns the impact on Confidentiality, Integrity, and Availability.

An experiment, carried out by accessing a simple application hosted on various cloud servers via the public internet, yielded a regression of round-trip latency, Eq. (1), with an R-squared value of 0.8993. The measured latency includes only a minimal application response time.
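Since the fitted coefficients of Eq. (1) are specific to that experiment and are not reproduced here, the sketch below only illustrates how such a round-trip latency regression could be produced. The distance and latency samples are hypothetical placeholders, not the measurements from the study.

```python
# Minimal sketch of fitting a round-trip latency regression like Eq. (1).
# All data points below are hypothetical placeholders.
import numpy as np

# Hypothetical great-circle distances to cloud regions (miles) and the
# corresponding measured round-trip latencies (ms) from a thin client.
distance_mi = np.array([100.0, 500.0, 1000.0, 1500.0, 2500.0, 5000.0, 8000.0])
latency_ms = np.array([18.0, 30.0, 45.0, 60.0, 85.0, 150.0, 230.0])

# Ordinary least squares fit: latency = a * distance + b.
a, b = np.polyfit(distance_mi, latency_ms, 1)

# Coefficient of determination, analogous to the reported 0.8993.
predicted = a * distance_mi + b
ss_res = float(np.sum((latency_ms - predicted) ** 2))
ss_tot = float(np.sum((latency_ms - latency_ms.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot

print(f"latency_ms = {a:.4f} * distance_mi + {b:.2f} (R^2 = {r_squared:.4f})")
```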

For FAA's FTI, latency (one way) is defined as the total time required to successfully transmit a unit of information across a connectivity from one Service Delivery Point (SDP) to another. This includes the time required for retransmission of lost or corrupted frames or packets, where the retransmission occurs wholly within the FTI network. Two Latency Limits (LLs) are defined for FTI Internet Protocol (IP) services; the specified LL is the 99th percentile latency over a rolling period of 24 hours. These two parameters are shown in Table 1, along with the latency of the new FTI optical backbone.

Table 1. FTI Latency Limits.

Parameter               Latency Limit (ms)
LL-1                    50
LL-2                    90
FTI Optical Backbone    1 per 100 miles interconnect (1)

1. A general rule that has been measured from the FTI optical backbone.

A cloud based application can be designed and implemented to leverage resources that support zero-interruption failover and other disaster recovery mechanisms. However, cloud computing environments, although mature, are still subject to outages. An experiment was conducted to measure the uptime of 135 distinct public cloud services over 90 days. The mean uptime of these services was 99.89% with a standard deviation of 0.8%. As seen in Figure 2, 43% of the services had no measured downtime over the 90 day period.

FTI supports various levels of Quality of Service (QoS) to provide priority to critical services. The FTI NAS operations network supports several different availability levels (over a 12-month period), with the highest being 0.9999971. Parameters of the different Reliability, Maintainability and Availability (RMA) levels are shown in Table 2.

Cloud computing complicates security issues because the abstraction provided by cloud computing disconnects users from the resources being used. Moreover, the utilized resources may be controlled by multiple entities, and the delineation of responsibility among the participating entities can be difficult to define. Therefore, ensuring that proper information assurance controls are applied at all times may require a substantial amount of coordination and trust. It should be noted that security is a common challenge in today's networked environment. Security challenges unique to cloud computing are being proactively addressed by both government and industry. Measures and practices include the selection of cloud deployment models (private cloud, community cloud, public cloud, hybrid cloud, and on-premise infrastructure) as well as measures unique to the cloud environment. Major cloud service providers are offering services that meet various industry and government security standards, including International Traffic in Arms Regulations (ITAR) as well as Federal Risk and Authorization Management Program (FedRAMP) requirements.


Figure 2. Measured percent uptime of 135 cloud services over 90 days.

Table 2. FTI RMA categories.

Parameter                                         RMA1       RMA2 (1)   RMA3       RMA4       RMA5
Minimum Availability (within a 12-month period)   99.99971%  99.99719%  99.98478%  99.79452%  99.72603%
Maximum Restoration Time (minutes)                0.1        0.98       8.00       180        240

1. FTI service configurations for RMA2 services were still under review and no IP RMA2 services had been implemented as of May 2010.10
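To make the availability levels in Table 2 concrete, the short sketch below converts each minimum availability figure into the maximum cumulative downtime it permits over a 12-month period; the restoration-time column is independent of this calculation.

```python
# Convert the FTI minimum availability levels of Table 2 into the maximum
# cumulative downtime they allow over a 12-month (non-leap-year) period.
MINUTES_PER_YEAR = 365 * 24 * 60

rma_min_availability = {
    "RMA1": 0.9999971,
    "RMA2": 0.9999719,
    "RMA3": 0.9998478,
    "RMA4": 0.9979452,
    "RMA5": 0.9972603,
}

for level, availability in rma_min_availability.items():
    downtime_min = (1.0 - availability) * MINUTES_PER_YEAR
    print(f"{level}: {availability:.5%} allows {downtime_min:8.1f} min/year of downtime")
```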

B. Candidate ATM Functions for Transitioning to Cloud Computing
An analysis was conducted to convey the relationship between ATM function attributes and computing attributes. This is essentially a projection (or mapping) of computing attributes into the ATM function design space defined by the corresponding attribute characterization. Together with the attribute characterization, this analysis provides a formal reasoning for the identification of candidate ATM functions: if a match exists, a candidate is identified. The process starts with the most significant attributes and extends to more attributes as needed. The results of this exercise are shown in Figure 3 through Figure 5, for ANSP, OCC, and Aircraft ATM functions respectively. Again, acronyms used in these figures can be found in the Appendix.

These figures show the location of various ATM functions in the design space defined by the Decision Time Horizon and the Criticality. In the figures, major ATM systems are represented by thick outlined bubbles, and sub functions are represented by thin outlined bubbles. To reduce clutter, some functions are represented as stacked bubbles. The curved grey band (from upper left to lower right) in the background schematically illustrates the elastic relationship between criticality and decision time horizon: if the decision time horizon is longer, the criticality tends to be lower. However, this is not always the case for a particular function; for example, flight procedure design has a long decision time horizon, but it is still a safety critical function. In these figures, shaded areas indicate limitations of the cloud computing capabilities. The gradual shade boundary towards the middle of the design space denotes the variability among different cloud infrastructures and different levels of service for a particular ATM function. While latency and response time can be directly translated into limitations on the decision time horizon, limitations on criticality are more complicated. The latter is correlated with the decision time horizon: as the decision time horizon extends, limitations on criticality are normally relaxed, as there is more time available to catch up in case of a system outage.
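The projection just described can be read as a simple matching rule: a function is a transition candidate when the cloud's achievable round-trip latency fits within the function's decision time horizon and the offered availability meets what its criticality class demands. The sketch below encodes that rule; the capability figures and the per-class availability thresholds are illustrative assumptions, not values taken from the figures.

```python
# Schematic sketch of the attribute-matching step behind Figures 3-5.
# Thresholds and capability figures are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AtmFunction:
    name: str
    decision_horizon_s: float  # decision time horizon, in seconds
    criticality: int           # 1 = safety critical ... 4 = non-operational

# Assumed capabilities of one cloud service level.
CLOUD_ROUND_TRIP_LATENCY_S = 0.5
CLOUD_AVAILABILITY = 0.9999

# Assumed minimum availability demanded by each criticality class.
REQUIRED_AVAILABILITY = {1: 0.99999, 2: 0.9999, 3: 0.999, 4: 0.99}

def is_cloud_candidate(func: AtmFunction) -> bool:
    """A function matches if both latency and availability constraints hold."""
    latency_ok = CLOUD_ROUND_TRIP_LATENCY_S < func.decision_horizon_s
    availability_ok = CLOUD_AVAILABILITY >= REQUIRED_AVAILABILITY[func.criticality]
    return latency_ok and availability_ok

functions = [
    AtmFunction("Flow scheduling (Alpha System)", 20 * 60, 2),
    AtmFunction("Separation surveillance", 2.2, 1),
]
for func in functions:
    verdict = "candidate" if is_cloud_candidate(func) else "needs a higher service level"
    print(f"{func.name}: {verdict}")
```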



Figure 3. Projection of cloud computing capabilities into ANSP ATM function design space.

Figure 4. Projection of cloud computing capabilities into OCC ATM function design space.


Figure 5. Projection of cloud computing capabilities into Aircraft ATM function design space.

It can be seen in Figure 3 that many of the ANSP ATM functions can be supported by cloud computing. Surveillance for separation assurance (requires 0.99999 availability) in TFDM, STARS (requires 2.2 sec latency), ERAM (requires 3.0 sec latency), and ATOP (requires 3.0-15 sec latency) are the most demanding in terms of latency and availability requirements. While surveillance for separation assurance definitely requires low latency and the highest availability and reliability, it is still supported by the best-in-class cloud computing services today. As a matter of fact, FAA's NextGen Surveillance and Broadcast Services (SBS), commonly known as Automatic Dependent Surveillance-Broadcast (ADS-B), are already hosted in commercial data centers. Although bandwidth requirements are not shown in the figure, weather systems are the most demanding in this regard; they too are successfully supported by today's network infrastructure.

From Figure 4 it can be seen that, unlike ANSP ATM functions, OCC ATM functions span a relatively narrow range of decision time horizons, from months to minutes, rather than from years to seconds or even sub-second. Even though there are Class 1 functions, they are normally not as critical as ANSP ATM functions. As such, most of the OCC ATM functions can be supported by cloud computing; indeed, many of them are already hosted by networked servers.

Figure 5 is for aircraft functions. It should be noted that, although significant improvements in air-ground data links are expected, aircraft ATM functions will continue to face more stringent latency and bandwidth (not reflected in the figure) limitations. For this reason, the vertical shaded area in Figure 5 extends further to the right. Furthermore, a number of aircraft ATM functions have decision time horizons down to the sub-second range, and all the functions identified are considered Class 1 functions. There are also operational considerations other than computing and network connectivity; for example, the ground surveillance function must by definition reside on board the aircraft. In any case, there are indeed aircraft ATM functions that could be transitioned to the cloud, for example, trajectory control functions: the computation could be done in the cloud, with the new flight plan uplinked to the aircraft for execution. In the extreme case of small Unmanned Aircraft Systems (UAS), nearly all functions could be moved to the ground, with only the control surface commands uplinked to the aircraft to guide aircraft movements.

C. Cloud-enabled NAS Reference Architecture
To look beyond existing ATM system boundaries, and to inspect the cloud computing transition in the context of the NAS as a whole, a cloud-enabled NAS reference architecture, graphically represented in Figure 6, was developed. The NAS reference architecture provides a high level abstraction of fundamental NAS


functionalities that are loosely coupled, while preserving the flexibility to allow for different architectural alternatives to be explored. The abstracted fundamental functionalities are generalized into high level services and their associated high level interfaces. In the NAS reference architecture, it is assumed that all the ground based ATM functions are supported by cloud computing. This may include an application being hosted in the cloud, running in a client connected to the cloud, or running as a separate node connected to the cloud. The abstracted services were generalized from functionalities in the NAS by separating common services from individual systems and consolidating similar functionalities.


Figure 6. Cloud-enabled NAS reference architecture.

It is also assumed that the air-ground data link will be the ultimate state of the Aeronautical Telecommunication Network (ATN), and that Internet Protocol version 6 (IPv6) will be used for both data and voice communications, with a Voice over IP (VoIP) network used for voice switching. Communications between ground systems are based on internet connections. QoS is integrated into the network infrastructure, where necessary, to satisfy communication performance requirements.

D. Cost of Transitioning to Cloud Computing
The cost of transitioning ATM functions to cloud computing depends on the technical specifics of the ATM function, the functional allocation, and the specifics of the computing infrastructure and service model selected for the deployment. For a given ATM function, the components on the right hand side of Eq. (2) are determined:

C_total = C_DID + C_IE + C_OM    (2)

where C_DID is the development, integration, and deployment cost, C_IE is the infrastructure and equipment cost, and C_OM is the operations and maintenance cost. These components of the total cost are further decomposed as follows.
• Development, Integration, and Deployment Costs
  o Development cost, including cost associated with the new system or new system components, and cost associated with re-factoring existing system components to be integrated into the ATM function
  o Integration cost, including integration with other existing systems, sensors, controls, and third party services and data sources, if needed
  o Deployment cost associated with equipment, applications, and data
• Infrastructure and Equipment Costs
  o Physical facility cost; may be eliminated if the transition eliminates in-house installation of hardware, and if there is no footprint increase on the client side
  o Network communication infrastructure to support operating the function
  o Computer workstations, commercially available software or software components, and associated replacement cost
  o Decommissioning and removal of retired assets associated with legacy deployment
• Operations and Maintenance Costs
  o Operations and maintenance of facilities and infrastructure for the function, including network communication cost
  o Operations, support, maintenance, and upgrade of system components, databases, and interfaces for the cloud deployed ATM function
  o Software license fees, considering cloud-based software license models
  o Cost associated with ecosystem sustainment, meetings and facilitation
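A minimal sketch of the Eq. (2) decomposition follows. The dollar figures are hypothetical placeholders; only the three-way structure comes from the decomposition above.

```python
# Minimal sketch of the Eq. (2) total cost decomposition.
# Dollar figures are hypothetical placeholders (in $M).
def total_transition_cost(c_did: float, c_ie: float, c_om: float) -> float:
    """C_total = C_DID + C_IE + C_OM: development/integration/deployment,
    infrastructure/equipment, and operations/maintenance costs."""
    return c_did + c_ie + c_om

# Example: a cloud deployment where the physical facility portion of C_IE
# has been eliminated, lowering that component.
c_total = total_transition_cost(c_did=3.2, c_ie=0.9, c_om=1.4)
print(f"Total transition cost: ${c_total:.1f}M")
```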

IV. Description of the Cloud Based Alpha System Concept

A case study of the Alpha System was conducted following the analysis framework reviewed in the previous section. The Alpha System is a fictitious dynamic flow management application modeled after a system that was locally installed at the 20 ARTCCs in the NAS. The functionality of the legacy Alpha System can be categorized as a Metroplex Traffic Flow and Capacity Management (MTFCM) system in the NAS reference architecture. This section presents a description of the legacy Alpha System and its cloud based transition concept. It should be noted that, for the sake of convenience, the terminology and some architectural aspects of a real world system were borrowed to model the fictitious Alpha System. The description is, however, purely conceptual and intended for research purposes; it is not intended to represent the real world system.

A. The Baseline Legacy Alpha System
1. Legacy Alpha System Architecture
The baseline legacy Alpha System is an arrival sequencing and scheduling tool. Its purpose is to reduce the less efficient Miles-In-Trail (MIT) Traffic Management Initiatives (TMIs) and to significantly expand time-based metering. The legacy Alpha System functions and interfaces are shown in Figure 7. Each of the Alpha System functions is designed as a separate process that may run on a separate server or along with other processes, depending on the computational load. In a particular installation, in addition to the core cluster of workstations and processes, a Dynamic Planner (DP) or Meter Point Dynamic Planner (MPDP) group is set up for each arrival terminal or each en route metering point to be managed by the Alpha System; thus, multiple DP/MPDP groups may exist in a given installation. Point-to-point connections and interfaces are used to provide input data from external systems. Thick client Timeline Graphical User Interface (TGUI) and Planview Graphical User Interface (PGUI) are hosted by workstations installed locally at the hosting en route ATC facility and remotely at the terminal and tower ATC facilities. Attributes of the Alpha System are shown in Table 3.

Table 3. Alpha System attributes.

Attribute               Value                       Corresponding NAS Requirements (NAS-RD-2012)9
Decision Time Horizon   Minutes, up to 20 minutes   Maximum latency: input response 1.2 s, process input 3 s, flow evaluation round trip response time 10 s.
Criticality             Class 2: Wide impact        Inherent availability rating: Efficiency Critical. Minimum Service Availability: 99.99%.



Figure 7. Legacy Alpha System functions and interfaces.

2. Legacy Hardware and Software Configuration Characteristics
At each site, there is a main string for operational use and a support string running in the back room. The legacy Alpha System hardware consists of rack mounted servers and workstations whose characteristics are listed in Table 4. At a typical site, the main string consists of 7 rack mounted servers and 5 to 15 workstations; the support string consists of 4 rack mounted servers and 5 to 9 workstations.

Table 4. Alpha System legacy hardware.

Machine Type          Characteristics
Rack Mounted Server   2 × 1.5 GHz 64-bit processors with 2 GB memory, single graphics card
Workstation           1 × 1.5 GHz 64-bit processor with 1 GB memory, dual graphics card

The Alpha System is hosted on an Intel based hardware platform running a Linux operating system. The Communication Manager (CM) is the communications hub and data manager for the other Alpha System processes; its connection management software is based on a special non-blocking socket library. Within each DP/MPDP group, multiple Route Analyzer (RA)/Trajectory Synthesizer (TS) pairs normally run to carry out the computational load of trajectory calculation for flights traversing the en route center airspace. When multiple RAs are running, the CM simultaneously distributes aircraft assignments among the available RAs to balance the load (a minimal sketch of one such distribution policy is given below).

The baseline legacy Alpha System software consists of code for system processes, offline tools, and common libraries. Source Lines Of Code (SLOC), including both source instructions and comment lines, is used as the software metric in this analysis. Values of this metric were notionally derived from estimated reference numbers for a real world system; they are shown in Table 5. The numbers in the table are derived for research purposes only; they are not meant to represent the real world system and do not reflect any specific configuration of the real world system. It is assumed that the Alpha System is written in the C and C++ languages.

3. Legacy Alpha System Installation
The Alpha System is assumed to be installed at each of the 20 en route centers in the NAS. At each center, a certain number of servers and workstations are required for the main string and support string. For each terminal area requiring a DP group, one or two additional servers are installed at the center, and workstations are normally installed at the terminal to provide remote TGUI and PGUI. For a very busy airport, an additional server may be installed at the center, and a workstation is installed at the tower facility. The number of servers and workstations installed is listed in Table 6; in the table, SVR refers to servers and WS refers to workstations.
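As a sketch of the CM load distribution mentioned above, the snippet below assigns trajectory calculation jobs to the least-loaded RA/TS pair. The least-loaded policy is an illustrative assumption standing in for the CM's actual distribution logic, which the text does not detail.

```python
# Sketch of CM-style load balancing across RA/TS pairs. The least-loaded
# policy is an illustrative assumption, not the documented CM algorithm.
import heapq

def distribute_flights(flights, num_ra_pairs):
    """Assign each flight to the RA/TS pair currently holding the fewest jobs."""
    heap = [(0, i) for i in range(num_ra_pairs)]  # (job_count, pair_index)
    heapq.heapify(heap)
    assignments = {i: [] for i in range(num_ra_pairs)}
    for flight in flights:
        load, pair = heapq.heappop(heap)
        assignments[pair].append(flight)
        heapq.heappush(heap, (load + 1, pair))
    return assignments

# Example: seven flights spread over the two RA/TS servers of a typical site.
print(distribute_flights([f"FLT{n:03d}" for n in range(1, 8)], num_ra_pairs=2))
```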


Table 5. Alpha System software metric.

Software Component                                                        SLOC
Route Analyzer (RA)                                                       17K
Trajectory Synthesizer (TS)                                               46K
Dynamic Planner (DP)                                                      73K
Weather Data Processing Daemon (WDPD)                                     17K
Timeline Graphical User Interface (TGUI)                                  294K
Planview Graphical User Interface (PGUI)                                  97K
Input Source Manager (ISM)                                                24K
Communications Manager (CM)                                               122K
Collaborative Arrival Planning (CAP)                                      13K
Host ATM Data Distribution System (HADDS) Data Interface (HDIF)           29K
Automated Radar Terminal System (ARTS) and Standard Terminal
  Automation Replacement System (STARS) Data Interface (ADIF)             11K
Enhanced Traffic Management System (ETMS) Interface                       8K
Graphical User Interface Router (GUIR)                                    3K
Shared Memory Management                                                  3K
Monitor and Control Subsystem (M&C)                                       93K
Global Includes                                                           13K
Library Tools                                                             219K
Offline Tools                                                             54K
Total                                                                     1136K

B. The Cloud Based Alpha System

1. Overview of the Cloud Based Alpha System Concept
Due to its solution domain, functionality, and capabilities, the Alpha System could also be a foundational research platform for new ATM technologies envisioned for improving system efficiency and throughput. The transition of the Alpha System to cloud computing will thus have a lasting impact on current ATM operations and on future ATM technology development as well.

The Alpha System's legacy architecture, although not designed for a virtualized environment, has a modular design that allows the system to be decomposed into loosely coupled processes and closely coupled processes. A preliminary analysis of the legacy Alpha System architecture revealed that the loosely coupled processes would be where future functionality enhancements or alternative designs could occur. Closely coupled processes are mostly external interface processes. The ISM decouples external interface processes from the rest of the Alpha System. This means that the majority of the Alpha System processes can be virtualized separately from the external system interface processes. With the development of SWIM and SBS, and the expected future development of Common Support Services – Weather (CSS-Wx) and Aeronautical Information Management Modernization (AIMM), the Alpha System's external interface processes could also be virtualized when the time is right. The popularity of the Intel/Linux platform among cloud computing service providers will simplify the process and reduce the cost of transitioning the legacy Alpha System to the cloud environment. Figure 8 depicts the architectural process to transition the legacy Alpha System into a tiered architecture hosted in a cloud environment. Note that details of the legacy Alpha System block in this figure are already shown in Figure 7.


Table 6. Alpha System server and workstation installation (– indicates none).

Center    Main        Support     Terminal    Tower       Total
          SVR   WS    SVR   WS    SVR   WS    SVR   WS    SVR   WS
1         7     10    4     7     2     1     1     1     14    19
2         7     9     4     7     5     4     3     4     19    24
3         7     10    4     8     1     1     1     1     13    20
4         7     8     4     8     1     1     –     1     12    18
5         7     11    4     9     1     1     –     1     12    22
6         7     7     4     5     1     4     1     2     13    18
7         7     7     4     7     2     3     1     2     14    19
8         7     9     4     9     1     1     –     1     12    20
9         7     9     4     6     3     5     1     1     15    21
10        7     7     4     6     1     1     –     1     12    15
11        7     7     4     8     1     1     –     1     12    17
12        7     6     4     6     1     1     1     1     13    14
13        7     6     4     6     2     2     –     1     13    15
14        7     5     4     8     1     1     –     2     12    16
15        7     7     4     8     1     2     –     1     12    18
16        7     6     4     6     1     1     –     –     12    13
17        7     6     4     7     1     1     –     1     12    15
18        7     15    4     8     2     13    1     2     14    38
19        7     7     4     7     2     2     –     –     13    16
20        7     7     4     6     2     2     –     –     13    16
Grand
Total     140   159   80    142   32    48    10    25    262   374

Figure 8. Transition Alpha System to the cloud.

To maximize resource pooling, reduce cost, and improve system efficiency, it is envisioned that the cloud based Alpha System will not be deployed in the cloud as a host of discrete systems for individual en route centers. Rather, it will be deployed as an integrated system for the NAS, or at least for a region. For example, although the Alpha System business logic, i.e., the flow sequencing and scheduling functionality, is not expected to change dramatically during the initial transition, the organization of the different processes will likely change so as to take advantage of massively parallel processing and the ability to share processing results among different en route centers. The Data Service layer will retain the same functional framework used by the ISM, but interfaces to stove-piped (point-to-point connections to specific facilities and systems) data sources will be replaced by interfaces to common data sources. A new cloud based Web Services layer is added to manage authentication and scalability along with other service management functionalities. The presentation layer consists of TGUI and PGUI client apps at en route, terminal, and tower facilities. Due to the complexity of the legacy TGUI and PGUI design (see SLOC in Table 5), the initial transition may still retain thick clients running on native Alpha System workstations. Thin client TGUI and PGUI may be developed in a phased approach to allow for-purpose thick client workstations at terminal and tower facilities to be replaced with standard workstations. Alternatively, dedicated TGUI and PGUI workstations may be completely eliminated by integrating an Alpha System thin client with existing ATC automation workstations, either as client apps or data layers. Mobile devices could also be used to access the Alpha System. This will significantly reduce Operating and Maintenance (O&M) costs. Access will also be easily provided to airline dispatchers.

2. Cloud Hosted Alpha System Business Logic
Although the functionality of the Alpha System is expected to remain the same in the initial transition, the architecture will be modified to maximize resource pooling. In a typical legacy Alpha System setup, there is one primary CM server in the operational string, but there are two servers running concatenated RA and TS processes, an indication that trajectory calculations are the most demanding processes. At the NAS level, duplicate trajectory calculations occur in Alpha Systems installed at adjacent en route centers. It thus makes sense to aggregate the closely coupled RA/TS processes into a common trajectory service, which then becomes loosely coupled with the DP/MPDP processes. In doing so, the CM functionality associated with RA/TS will be decoupled from the rest of the CM functionality. The reorganized common trajectory service will then support multiple DP/MPDP nodes (groups), even those from different en route centers. This is analogous to the Flight Data Services (FDS) concept shown in Figure 6. The CM for each node is a simplified version of the legacy Alpha System CM that only provides functionality to support the DP/MPDP. Input data sources and direct communication with GUI clients are mapped to the data service layer and web service layer respectively, which are discussed in the next two subsections. In such an architectural design, load balancing and system failover will be managed separately and independently for the common trajectory service and the DP/MPDP logic, potentially providing additional cost saving benefits. This business logic is illustrated in Figure 9.

Figure 9. Cloud based Alpha System business logic design.
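The decoupling of a common trajectory service from the per-node DP/MPDP logic can be illustrated with a small sketch. The cache-based deduplication shown here is one plausible way to realize the sharing of trajectory results across centers; the class and method names are invented for illustration.

```python
# Sketch of a common trajectory service shared by DP/MPDP nodes from
# different centers. Names and the caching scheme are illustrative.
class CommonTrajectoryService:
    """Stands in for the aggregated RA/TS pairs behind a service interface."""

    def __init__(self):
        self._cache = {}  # (flight_id, plan_version) -> computed trajectory

    def get_trajectory(self, flight_id: str, plan_version: int) -> str:
        key = (flight_id, plan_version)
        if key not in self._cache:
            # Stand-in for a concatenated RA/TS trajectory computation.
            self._cache[key] = f"4D trajectory for {flight_id} v{plan_version}"
        return self._cache[key]

service = CommonTrajectoryService()

# DP nodes at two adjacent centers request the same flight; the second
# request reuses the shared result instead of duplicating the computation.
dp_node_a = service.get_trajectory("UAL123", plan_version=4)
dp_node_b = service.get_trajectory("UAL123", plan_version=4)
assert dp_node_a is dp_node_b
```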


3. Cloud Based Alpha System Data Services
As shown in Figure 7 and Figure 8, data sources for the Alpha System can be categorized into aeronautical information, traffic data, and weather data. The legacy Alpha System does not use runtime direct connections to acquire aeronautical information; airspace configuration and routes are updated every 56 days through the adaptation process. The legacy Alpha System uses stove-piped connections to ATC automation systems installed at related ATC facilities to receive traffic data, and connects to various individual ftp servers to receive weather information. The cloud based Alpha System will fully leverage NextGen information services such as AIMM, SWIM, SBS, and CSS-Wx. When these services are fully deployed, the Alpha System's external interface processes will be virtualized, completely removing stove-piped connections.

4. Client Apps at ATC Facilities and Airline OCC
Client apps at ATC facilities represent the presentation layer of the cloud based Alpha System. The cloud based Alpha System will retain the ability to provide scheduling information to the en route automation system for display on the controller's console. Additionally, information may be provided to aircraft operators (via CAP) for display at their OCC. Since the controller's display console and potential displays at the OCC are not parts of the Alpha System, they are not considered client apps in this analysis. Client apps at ATC facilities include:
• TGUI: shows air traffic as points along vertical lines called timelines. Parameters of the TGUI are user configurable.
• Load Graph: shows the Airport Acceptance Rate (AAR) and traffic demand estimates. It is part of the TGUI software but uses a separate GUI that may display up to 9 graphs.
• PGUI: displays a map view of air traffic along with a sequence list of color coded aircraft according to sectors, together with fixes, waypoints, freeze horizons, arcs, and holding points.

As seen from Table 5, the TGUI and PGUI account for a relatively large portion of the legacy Alpha System software (approximately 26% and 9% respectively). The TGUI uses Open Software Foundation (OSF) Motif and X windows and is implemented in the C++ language; the PGUI is an X11/Motif and X windows application. This implies a relatively straightforward transition, as compared to some other systems, of the TGUI and PGUI to standard Linux workstations that are more cost effective than dedicated workstations. However, both the TGUI and PGUI are thick clients, implementing many computations adopted over time. To transition to thin client apps, re-architecture will be necessary to move many calculations from the client side to the server side.

The legacy Graphical User Interface Router (GUIR) is based on a small library of classes that wrap the UNIX sockets Application Programming Interface (API) and provide non-blocking input and output support. As seen from Table 5, the legacy GUIR is a relatively small piece of software. The GUIR is designed to avoid the CM sending the same message to remote GUIs multiple times. For the cloud based Alpha System, the GUIR needs to be ported (or redesigned) to serve the communication between the cloud and GUIs at ATC facilities. For the legacy Alpha System, this is only needed for so-called remote GUIs at terminal and tower facilities; for the cloud based Alpha System, all GUIs, including TGUIs and PGUIs at the en route centers, will be handled in the same way, i.e., as remote GUIs. This effort will essentially be integrated as part of the web service layer development.
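The GUIR's fan-out role described above can be sketched as a small publish/subscribe router: the CM hands a message to the router once, and the router delivers it to every subscribed remote GUI. The in-memory queues below replace the actual non-blocking socket layer and are purely illustrative.

```python
# Sketch of a GUIR-style fan-out router. In-memory queues stand in for
# the legacy non-blocking socket library; names are illustrative.
from collections import defaultdict
from queue import Queue

class GuiRouter:
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of client queues

    def subscribe(self, topic: str) -> Queue:
        client_queue = Queue()
        self._subscribers[topic].append(client_queue)
        return client_queue

    def publish(self, topic: str, message: str) -> None:
        # The CM sends the message once; the router fans it out so the CM
        # never sends the same message to remote GUIs multiple times.
        for client_queue in self._subscribers[topic]:
            client_queue.put(message)

router = GuiRouter()
tgui = router.subscribe("schedule_updates")
pgui = router.subscribe("schedule_updates")
router.publish("schedule_updates", "STA update: FLT001 meter fix 14:32Z")
print(tgui.get(), "|", pgui.get())
```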
5. Maintenance and Control (M&C) and System Failover
The legacy Alpha System assumes remote field support and maintenance from a central office for systems installed at all centers, and an on-premise system administrator staffed at each center to maintain the system in day to day operations. For the cloud based Alpha System, the majority of on-premise system administrator duties will be performed remotely, potentially from centralized technical support offices. In the ultimate state, with client apps running on standard workstations, on-premise IT support will be shared among different systems.

The legacy Alpha System also assumes mechanisms to automatically restart a failed process, fail over to a backup computer or process, re-spawn the process, and balance load among the remaining processes. These capabilities will be fully retained and incorporated in the cloud hosted Alpha System design, with even better performance at a lower cost. Cloud computing is designed to be redundant, and backup servers can be started within seconds without being limited by the number of physical servers installed at each center. In case of a rare catastrophic system outage, a cloud failover plan may include automated failover from one cloud host center to another, even one located in another geographic region hundreds or thousands of miles away; this would not be possible for on-premise hardware installations. On the other hand, in the cloud environment, load balancing also means the ability to scale the system down to reduce resource usage when it is not needed, so as to reduce cost. With an integrated system backup and recovery design in the cloud environment, standing provision of the Alpha System support string with a predefined fixed number of servers for individual centers is no longer needed.


This will further reduce cost. Because the hardware infrastructure is maintained by the cloud service provider, the system administrator is relieved from hardware troubleshooting on the server side and can focus on operational aspects of the Alpha System itself, which will result in increased productivity. As an alternative, the Alpha System may be deployed at two or more cloud host centers, each with a designated primary service region; for example, an east coast center and a west coast center, each primarily serving half of the Continental United States (CONUS) but able to scale up to serve the entire CONUS in case of catastrophic outages. This may be achieved with enhanced Alpha System recovery capabilities combined with additional recovery tools available on the cloud platform.

C. Cloud Based Alpha System Value Story
Based on the discussions above, the cloud based Alpha System value story can be summarized as follows, and serves as guidelines for the cost and benefit analysis:
• Cloud computing is capable of supporting the requirements of the Alpha System.
  o The Alpha System is a flow scheduling system that supports the TMC's tasks. Its decision time horizon (minutes, up to 20 minutes) is well supported.
• Cloud computing enables changes to the way the Alpha System is developed and operated.
  o The increased capability sharing will enable the elimination of duplications in current systems, namely through consolidation of capabilities such as trajectory prediction, currently hosted by servers at each ARTCC.
  o Cloud computing provides a transparent and virtual application execution environment. With such an environment, and the availability of loosely coupled common services, the configurations for development, testing and evaluation, and operational deployment will be essentially indistinguishable from one another, enabling agile development and deployment of new features and system updates.
  o The scalable and elastic capacity and fast provisioning provided by cloud computing allow the Alpha System to dynamically adapt computational capacity to traffic demand on the fly.
  o The ubiquitous access to Alpha System capabilities enabled by cloud computing has the potential to incorporate resources not previously accessible into the execution of tasks and the solution of problems, for instance, an increased level of collaboration between aircraft operators and the ANSP for flow scheduling.
• Sources of value of the transition of the Alpha System:
  o Reduced cost to invest in (including initial development and cyclic system replacement), upgrade, and operate the system
  o Accelerated technology transfer for early realization of benefits of future flow scheduling capabilities

The reduced cost to invest, upgrade, and operate the system is analyzed in the next section.

V. Cloud Based Alpha System Transition Cost Analysis
The cost analysis of transitioning to cloud computing described in Section III.D was carried out for the Alpha System as part of the case study. Both system engineering analysis and a commercial cloud tool were used. For the sake of simplicity, the cost associated with the computational infrastructure and the cost associated with the Alpha System application itself were analyzed separately. In the cost summary, the results from these two separate analyses are combined to provide an overview of the cost.
A. Total Cost of Ownership (TCO) of the Computational Infrastructure
Computational infrastructure includes servers, workstations, network equipment, installation and mounting equipment, and operating system software. The Total Cost of Ownership (TCO) of the computational infrastructure was calculated using the Amazon Web Services' (AWS') online TCO calculator11,12. The analysis for a single site is presented first to demonstrate the basic process and assumptions employed in the analysis. Then, the results for a NAS wide deployment are presented.
1. Single Site Analysis
Mapping to the cost components described in Section III.D, the AWS TCO calculator considers the following items:
 Infrastructure and Equipment Costs
o Physical facility cost, including space, power, and cooling
o Network communication infrastructure to support operating the function
o Computer servers and rack infrastructure, workstations, storage equipment, and operating system software
o Decommissioning and removal of retired assets associated with legacy deployment
 Operations and Maintenance Costs
o Operations and maintenance of facilities and infrastructure for the function, including network communication cost
To simplify the required input from the user of the tool, the AWS TCO calculator uses a set of default configurations and assumptions for various technical and economic aspects of the computational infrastructure, for both on-premises installation and AWS deployment. Additional details of these default configurations and assumptions can be found in Refs. 11 and 12. Since the type of server and the type of workstation specified in Table 4 are now obsolete, the specification input to the Amazon TCO tool was adjusted to yield a reasonable cost estimation, as shown in Table 7. The specifications of the AWS server and workstation instances provided by the AWS TCO calculator are also shown in the table.

Table 7. Alpha System hardware specification used in TCO calculations.

Machine Type        | TCO Input Parameters                                                             | AWS Instance
Rack Mounted Server | 2 single core processors, 4 GB memory, Linux, compute optimized, running 24 × 7 | c3.large, 2 vCPU, 3.75 GB memory, 2 × 16 GB SSD
Workstation         | 1 single core processor, 2 GB memory, Linux, compute optimized, running 24 × 7  | m1.small, 1 vCPU, 1.7 GB memory, 1 × 160 GB HD
Storage             | Network-attached block devices, 2 TB                                            | 1 TB, monthly incremental backup

AWS provides an entirely separate region called AWS GovCloud (US)13 that provides an environment where customers can run ITAR-compliant applications. There are currently two availability zones within AWS GovCloud. It was assumed that the Alpha System will be deployed in the AWS GovCloud. As a rule of thumb, services in the GovCloud cost 20-30% more than comparable services in the public cloud.
For the sake of simplicity, it was assumed that for a given installation, the same number of servers and workstations would be deployed in the cloud. In this case, the same number of operational string servers and support string servers could be utilized to achieve highly fault tolerant availability in the cloud. In practice, by re-aligning the configuration of virtual machine servers with computational needs, additional efficiency could be achieved while providing the same performance. This assumption thus provides a conservative cost estimate.
Three deployment models were compared:
 On-premises: legacy deployment with all servers and workstations installed at ATC facilities.
 AWS GovCloud servers with on-premises thick client workstations: cloud deployment with all servers in the AWS GovCloud, but thick client workstations installed at ATC facilities. This represents a near term deployment model. All else equal, this provides the upper bound of TCO for cloud deployment.
 AWS GovCloud server and workstation with thin client: cloud deployment with all servers and workstations in the AWS GovCloud, and thin clients on existing standard workstations or mobile devices. This represents an extreme case of the thin client model where no dedicated workstations would be installed at ATC facilities for the Alpha System, and thus no additional cost would be incurred. In reality, a certain number of dedicated on-premises workstations might still be needed, even though their number could be reduced. All else equal, this provides the lower bound of TCO for cloud deployment.
The TCO for the computational infrastructure of Site 1 listed in Table 6 was calculated for the above three deployment models. Since the AWS TCO calculator provides comparison for a one-to-one configuration only, to calculate TCO for the three deployment models described above, the tool was executed three times with different inputs. One run was for an installation with both servers and workstations, to obtain TCO for the on-premises model and the AWS GovCloud server and workstation with thin client model. Another run was for an installation with servers only, to obtain TCO for servers in the cloud. The last run was for an installation with workstations only, to obtain TCO for thick client workstations on-premises. In this last case, it was assumed that a smaller local data backup was also needed. The results are shown in Table 8.
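Before presenting those results, the shape of the server-side arithmetic can be sketched. The hourly rates below are hypothetical placeholders rather than published AWS prices, and the uplift factor is simply the midpoint of the 20-30% GovCloud rule of thumb quoted above; the actual analysis relied on the AWS TCO calculator's internal pricing assumptions.

# Rough 3-year cloud compute cost for the Site 1 fleet of Table 7.
# The hourly rates are hypothetical placeholders, NOT published AWS
# prices; the 1.25 uplift is the midpoint of the 20-30% GovCloud
# premium quoted in the text. The real analysis used the AWS TCO
# calculator's own pricing assumptions.
HOURS_3YR = 3 * 365 * 24    # machines assumed running 24 x 7

fleet = {
    # instance type: (count, assumed public-cloud $/hour)
    "c3.large server": (14, 0.105),
    "m1.small workstation": (19, 0.044),
}
GOVCLOUD_UPLIFT = 1.25

total = 0.0
for instance, (count, rate) in fleet.items():
    cost = count * rate * HOURS_3YR * GOVCLOUD_UPLIFT
    total += cost
    print(f"{instance:22s} {count:2d} x ${rate:.3f}/h -> ${cost:,.0f}")
print(f"{'3-year GovCloud total':22s} ${total:,.0f}")

The calculator performs this style of roll-up with its own rates and additionally accounts for storage, network, and IT labor, which is why Table 8 reports those items as separate lines.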


Table 8. Comparison of 3-year TCO of Alpha System computational infrastructure at Site 1.

Cost Item          | On-Premises TCO | AWS GovCloud SVR with On-Premises Thick Client WS TCO (On-Premises %) | Change | AWS GovCloud SVR & WS with Thin Client TCO | Change
SVR (14) & WS (19) | $288,542        | $148,917 (83%)                                                         | -48%   | $35,145                                    | -88%
Storage            | $61,696         | $63,076 (90%)                                                          | +2%    | $6,457                                     | -90%
Network            | $166,752        | $197,574 (70%)                                                         | +18%   | $58,406                                    | -65%
IT-Labor           | $118,800        | $81,000 (84%)                                                          | -32%   | $29,700                                    | -75%
Total              | $635,791        | $490,566 (79%)                                                         | -23%   | $129,707                                   | -80%
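The derived columns in Table 8 follow directly from its dollar figures. The short sketch below uses only numbers printed in the table to reproduce the Change columns; the column sums differ from the table's printed totals by $1 because of rounding in the source figures.

# Reproduce the "Change" columns of Table 8 from its dollar figures.
# Only numbers taken from the table are used; nothing new is assumed.
rows = {
    # cost item: (on-premises, GovCloud SVR + thick client, thin client)
    "SVR & WS": (288_542, 148_917, 35_145),
    "Storage":  (61_696, 63_076, 6_457),
    "Network":  (166_752, 197_574, 58_406),
    "IT-Labor": (118_800, 81_000, 29_700),
}

def change(old: int, new: int) -> float:
    """Relative change versus the on-premises baseline."""
    return (new - old) / old

totals = [sum(col) for col in zip(*rows.values())]
for item, (base, thick, thin) in rows.items():
    print(f"{item:9s} {change(base, thick):+.0%} {change(base, thin):+.0%}")
print(f"{'Total':9s} {change(totals[0], totals[1]):+.0%} "
      f"{change(totals[0], totals[2]):+.0%}")  # -23% and -80%, as in Table 8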

As seen in Table 8, an 80%, or $506,084, TCO savings could be achieved over a 3-year period by deploying all the servers and workstations in the AWS GovCloud. As already discussed, this is an optimistic estimate. Even though many of the dedicated on-premises workstations could be eliminated by using thin clients on standard or shared workstations or mobile devices, some dedicated workstations might still be needed. If only the servers were deployed in the cloud, a 23%, or $145,225, TCO savings would be achieved over a 3-year period. This was significantly lower than the previous deployment model, but it would still be worth pursuing. The lower savings could be attributed to the high cost of running on-premises thick client workstations, which account for nearly 80% of the TCO for this deployment model.
One observation was the increase of network cost in this model, and the overall high network cost in all three models. One of the reasons was that the assumptions used in the AWS TCO calculator did not necessarily closely reflect the network requirements of the Alpha System, particularly when only the thick client workstations were retained on premises. Specifically, in the way the AWS TCO calculator was used, the 19 thick client workstations that remained on premises were treated as a standalone system, rather than as a system with the majority of its data processing residing in the cloud. Thus, the results for this model were on the conservative side, i.e., overestimated. Because of time constraints, and because this deployment model was intended to provide an upper bound of TCO, this issue was recognized but the results were not adjusted in the current paper. In summary, a 23% to 80%, or $145,225 to $506,084, TCO savings could be achieved over a 3-year period by deploying the Alpha System for Site 1 to the AWS GovCloud.
2. NAS Wide Deployment Analysis
Following the same process as for a single site, the NAS wide deployment was analyzed. Three-year TCO figures for two possible deployment paths were calculated. The first path would be to simply transition the Alpha System servers and workstations to the cloud environment as separate systems for individual sites. In this case, even when all the sites were transitioned to the cloud, they would still be operated independently of each other, as in the legacy on-premises installation. The comparison of 3-year TCO of the three deployment models under this path is shown in Table 9.
As seen from the table, TCO results for the NAS wide individual deployment were simply a linear extrapolation of the results for a single site. For the AWS GovCloud server and workstation with thin client model, TCO savings increased as the site size increased. For the AWS GovCloud server with on-premises thick client workstation model, the cost savings increased as the number of deployed servers increased, either in terms of absolute number or in terms of the percentage of machines deployed to the cloud. In short, the more that could be deployed in the cloud, the more TCO savings could be achieved. It is obvious, however, that deploying each site as a separate system did not provide much additional benefit, because resource sharing between sites was not considered in this path. In summary, a 22% to 80%, or $2,795,158 to $10,050,865, TCO savings could be achieved over a 3-year period by deploying the Alpha System for all sites to the AWS GovCloud as separate systems.
The second path would be to transition the Alpha System at different sites as a single system. Under this path, all the servers and workstations transitioned to the cloud were considered a single deployment. Except for the TCO of on-premises thick client workstations, which had to be calculated for each of the sites separately, the TCO of servers and workstations transitioned to the cloud was calculated in a single run of the AWS TCO calculator. The comparison of 3-year TCO of the three deployment models under this path is shown in Table 10. As seen from the table, an additional 9% TCO savings can be achieved by simply transitioning the servers and workstations at all sites to the cloud as a single deployment. The table shows that a 31% to 89%, or $3,904,858 to $11,159,665, TCO savings could be achieved over a 3-year period by deploying the Alpha System to the AWS GovCloud in a single step.


Table 9. Comparison of 3-year TCO of Alpha System computational infrastructure as separately deployed systems.

Site ID | SVR | WS  | On-Premises 3-Year TCO | AWS GovCloud SVR with On-Premises Thick Client WS 3-Year TCO | Change | AWS GovCloud SVR & WS with Thin Client 3-Year TCO | Change
1       | 14  | 19  | $635,791               | $490,566                                                      | 23%    | $129,707                                           | 80%
2       | 19  | 24  | $805,786               | $540,539                                                      | 33%    | $150,355                                           | 81%
3       | 13  | 20  | $630,895               | $495,114                                                      | 22%    | $128,389                                           | 80%
4       | 12  | 18  | $604,187               | $477,848                                                      | 21%    | $122,854                                           | 80%
5       | 12  | 22  | $633,270               | $506,932                                                      | 20%    | $128,478                                           | 80%
6       | 13  | 18  | $604,187               | $480,572                                                      | 20%    | $124,677                                           | 79%
7       | 14  | 19  | $635,791               | $490,566                                                      | 23%    | $129,707                                           | 80%
8       | 12  | 20  | $618,729               | $492,390                                                      | 20%    | $125,666                                           | 80%
9       | 15  | 21  | $662,499               | $507,832                                                      | 23%    | $135,242                                           | 80%
10      | 12  | 15  | $582,374               | $456,035                                                      | 22%    | $118,635                                           | 80%
11      | 12  | 17  | $596,916               | $470,577                                                      | 21%    | $121,448                                           | 80%
12      | 13  | 14  | $587,270               | $451,488                                                      | 23%    | $119,953                                           | 80%
13      | 13  | 15  | $594,540               | $458,759                                                      | 23%    | $121,359                                           | 80%
14      | 12  | 16  | $589,645               | $463,306                                                      | 21%    | $120,042                                           | 80%
15      | 12  | 18  | $604,187               | $477,848                                                      | 21%    | $122,854                                           | 80%
16      | 12  | 13  | $567,832               | $441,493                                                      | 22%    | $115,823                                           | 80%
17      | 12  | 15  | $582,374               | $456,035                                                      | 22%    | $118,635                                           | 80%
18      | 14  | 38  | $846,746               | $701,522                                                      | 17%    | $156,422                                           | 82%
19      | 13  | 16  | $601,811               | $466,030                                                      | 23%    | $122,765                                           | 80%
20      | 13  | 16  | $601,811               | $466,030                                                      | 23%    | $122,765                                           | 80%
Total   | 262 | 374 | $12,586,641            | $9,791,483                                                    | 22%    | $2,535,776                                         | 80%
Table 10. Comparison of 3-year TCO of Alpha System computational infrastructure as a single system.

Sites | SVR | WS  | On-Premises 3-Year TCO | AWS GovCloud SVR with On-Premises Thick Client WS 3-Year TCO | Change | AWS GovCloud SVR & WS with Thin Client 3-Year TCO | Change
20    | 262 | 374 | $12,586,641            | $8,681,783                                                    | 31%    | $1,426,976                                         | 89%
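As a quick cross-check of Tables 9 and 10, and of the additional 9% savings quoted in the preceding discussion, the bottom-line figures can be compared directly; the sketch below uses only numbers already printed in the tables.

# Cross-check of the bottom lines of Tables 9 and 10. Only figures
# already shown in the tables are used; no new data is assumed.
on_prem = 12_586_641          # 20-site on-premises baseline
separate_thick = 9_791_483    # Table 9: sites deployed separately, thick clients
single_thick = 8_681_783      # Table 10: single deployment, thick clients
single_thin = 1_426_976       # Table 10: single deployment, thin clients

def saving(alternative: int) -> float:
    """Fractional 3-year TCO saving versus the on-premises baseline."""
    return (on_prem - alternative) / on_prem

print(f"separate, thick clients: {saving(separate_thick):.0%}")  # 22%
print(f"single,   thick clients: {saving(single_thick):.0%}")    # 31%
print(f"single,   thin clients:  {saving(single_thin):.0%}")     # 89%
# The single-deployment path saves roughly 9 percentage points more than
# deploying the sites separately, matching the figure quoted in the text.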

In both transition paths discussed so far, the numbers of servers and workstations were assumed to remain the same after the transition. It would be possible to consolidate and share virtual servers among different sites, potentially reducing the number of virtual servers required. For each site, reducing the number of virtual servers would in general reduce the server cost line in Table 8. This could be achieved both by eliminating simple duplications in required virtual servers and by using the virtual servers that best match the required computational tasks. Since traffic loading changes throughout the day, and the peak hours of sites distributed throughout the NAS vary, it would be possible to further reduce the number of standing virtual servers for the Alpha System. Depending on where the actual cost stands between the lower and upper bounds of the TCO, the additional savings provided by these two mechanisms could be significant. These effects are currently being studied and will be reported later.
Another potential mechanism to further reduce the TCO of the computational infrastructure in the cloud environment would be to release virtual machines when they are not in use. The TCO calculations presented so far assumed 100% virtual machine usage, that is, the virtual machines were assumed to be running full time, 24 × 7. Technically this would not be necessary, because traffic is very light around midnight, which normally would not require active use of the flow management capability. The effect of this mechanism, however, would be somewhat complicated. For one, AWS pricing would change if the server reservation model changed, which would require additional analysis. Starting and stopping the virtual machines could also have other operational implications, such as the impact on continuous data logging.
Another mechanism to reduce the number of virtual machines would be to re-architect the system for higher levels of resource alignment and sharing. In that case, however, the computational infrastructure cost savings might become a secondary effect; the increased system capability and operational efficiency might become the dominant driver. This leads to the need for the analysis outlined in the next section.
B. Cost Associated with Alpha System Application
Even though the Alpha System has a modular design, in its current form it is not yet ready to be transitioned to the cloud computing environment. Many manual tasks are still involved in local adaptation and system provisioning. Legacy software tools and libraries are still being used, making it difficult to run the system in the cloud environment. System architectural and design changes would be necessary for the transition. More importantly, to allow for significant cost savings in future upgrades and developments, changes need to be made to the system to be consistent with modern system design principles.
Mapping to the cost components described in Section III.D, the cost associated with the Alpha System application itself considers the following items:
 Development and Integration, and Deployment Costs
o Development cost, including costs associated with the new system or new system components, and cost associated with re-factoring existing system components to be integrated into the ATM function
o Integration cost, including integration with other existing systems, sensors, controls, and third party services and data sources, if needed
o Deployment cost associated with equipment, applications, and data
 Infrastructure and Equipment Costs
o Decommissioning and removal of retired assets associated with legacy deployment
 Operations and Maintenance Costs
o Operations, support, maintenance, and upgrade of system components, databases, and interfaces for the cloud deployed ATM function
o Application specific software license fees, considering cloud-based software license models
o Cost associated with ecosystem sustainment, meetings, and facilitation
The analysis of these items is underway and will be reported at a later time.
C. Cost Analysis Summary
The ultimate goal of the cost analysis is to justify the transition of an ATM function to the cloud to achieve cost savings or operational benefits. Cost savings and required new expenses will be integrated, once the individual analyses are complete, to provide a complete and consistent picture.


VI. Conclusion
This paper presented a case study of the transition of the Alpha System dynamic flow management system to the cloud computing environment. Analysis results showed that, by transitioning a legacy system to the cloud computing environment, significant operating and maintenance cost savings can be achieved. Additionally, the cloud transition will change the way the Alpha System is developed and operated. Additional work is underway to refine the analysis and to build a mini-pilot demonstration, so as to deepen the understanding of the implications of the cloud transition and to formulate future research directions on issues specific to the transition of ATM functions to the cloud computing environment.

Appendix
The following acronyms and abbreviations are used in this paper:
4D Four-Dimensional
AAR Airport Acceptance Rate
ABRR Airborne Reroute Execution
ADIF ARTS Data Interface
ADS-B Automatic Dependent Surveillance-Broadcast
AFP Airspace Flow Program
AIMM Aeronautical Information Management Modernization
AIP Aeronautical Information Publication
AIRAC Aeronautical Information Regulation and Control
ANSP Air Navigation Service Provider
API Application Programming Interface
ARTS Automated Radar Terminal System
ASCR Airborne Spacing and Conflict Resolution
ASDI Aircraft Situation Display to Industry
ATC Air Traffic Control
ATCSCC Air Traffic Control System Command Center
ATM Air Traffic Management
ATN Aeronautical Telecommunication Network
ATOP Advanced Technologies and Oceanic Procedures
AWS Amazon Web Services
CAP Collaborative Arrival Planning
CD Conflict Detection
CDM Collaborative Decision Making
CM Communications Manager
CONUS Continental United States
CPT Conflict Probe Tool
CSS-Wx Common Support Services – Weather
CTOP Collaborative Trajectory Options Program
DAMS Dynamic Airspace Management Services
DP Dynamic Planner
EFB Electronic Flight Bag
ERAM En Route Automation Modernization
ETMS Enhanced Traffic Management System
EWINS Enhanced Weather Information System
FAA Federal Aviation Administration
FDP Flight Data Processing
FOQA Flight Operational Quality Assurance
FP Flight Plan
FTI FAA Telecommunications Infrastructure
GDP Ground Delay Program
GIS Geographic Information Management Services
GPS Global Positioning System
GS Ground Stop
GUIR Graphical User Interface Router
HADDS Host ATM Data Distribution System
HD Hard Drive
HDIF HADDS Data Interface
IP Internet Protocol
IPv6 Internet Protocol version 6
ISM Input Source Manager
LL Latency Limit
M&C Monitoring and Control
MIT Miles-In-Trail
MPDP Meter Point Dynamic Planner
MTFCM Metroplex Traffic Flow and Capacity Management
NAS National Airspace System
NDB Navigation Data Base
NextGen Next Generation Air Transportation System
NOTAM Notices To Airmen
NWP NextGen Weather Processor
OCC Operations Control Center
OSF Open Software Foundation
OSPS Operations and System Performance Systems
PDARS Performance Data Analysis and Reporting System
PDB Performance Database
PGUI Planview Graphical User Interface
QoS Quality of Service
RA Route Analyzer
RAIM Receiver Autonomous Integrity Monitoring
RMA Reliability, Maintainability, and Availability
SBS Surveillance and Broadcast Services
SDIF SMS Data Interface
SDP Service Delivery Point
SDP Surveillance Data Processing
SLOC Software Line Of Code
SMS Surface Management Systems
SSD Solid-State Drive
STARS Standard Terminal Automation Replacement System
SVR Server
SWIM System Wide Information Management
TBFM Time-Based Flow Management
TCO Total Cost of Ownership
TFDM Terminal Flight Data Manager
TFM Traffic Flow Management
TFMS Traffic Flow Management System
TGUI Timeline Graphical User Interface
TMI Traffic Management Initiatives
TP Trajectory Prediction
TS Trajectory Synthesizer
UAS Unmanned Aircraft Systems
vCPU Virtual Central Processing Unit
VoIP Voice over IP
WDIF Weather Data Interface
WDP Weather Data Processing
WDPD Weather Data Processing Daemon
WS Workstation
Wx Weather

Acknowledgments
This work is partially supported by NASA under contract NNA12AB81C. The authors would like to thank Dr. Deepak S. Kulkarni and Dr. Parimal H. Kopardekar of NASA Ames Research Center, and Mr. Mark G. Ballin of NASA Langley Research Center, for their guidance and inputs, and for their direct technical contributions in several major areas presented in this paper. Without their great effort and tireless support this work would not be possible.

References
1 Ren, L., Beckmann, B., Citriniti, T., and Castillo-Effen, M., “Cloud Computing for Air Traffic Management – Framework Analysis,” 2013 IEEE/AIAA 32nd Digital Avionics Systems Conference (DASC), IEEE, New York, NY, October 6-10, 2013.
2 Ren, L., Beckmann, B., Citriniti, T., and Castillo-Effen, M., “Cloud Computing for Air Traffic Management – Framework and Benefit Analysis: Alternative Functional Allocation Schemes,” V2.00, Project Document Submitted to NASA, GE Global Research, Niskayuna, NY, December 30, 2013.
3 Ren, L., Beckmann, B., Citriniti, T., and Castillo-Effen, M., “Cloud Computing for Air Traffic Management – Framework and Benefit Analysis: Review and Characterization of Current Operations,” V2.00, Project Document Submitted to NASA, GE Global Research, Niskayuna, NY, August 10, 2013.
4 Mell, P. and Grance, T., “The NIST Definition of Cloud Computing,” Recommendations of the National Institute of Standards and Technology, Special Publication 800-145, NIST, Washington, DC, September 2011.
5 Badger, L., Grance, T., Patt-Corner, R., and Voas, J., “Cloud Computing Synopsis and Recommendations,” Recommendations of the National Institute of Standards and Technology, Special Publication 800-146, Draft, NIST, Washington, DC, May 2011.
6 Ren, L., Beckmann, B., Castillo-Effen, M., and Citriniti, T., “Cloud Computing for Air Traffic Management – Framework and Benefit Analysis: Candidate ATM Functions for Advanced Computing,” V1.10, Project Document Submitted to NASA, GE Global Research, Niskayuna, NY, September 09, 2013.
7 Ren, L., Beckmann, B., Citriniti, T., and Castillo-Effen, M., “Cloud Computing for Air Traffic Management – Framework and Benefit Analysis: Benefit Mechanisms, Potential Benefits, and Challenges,” V1.10, Project Document Submitted to NASA, GE Global Research, Niskayuna, NY, September 09, 2013.
8 Klein, M. H., Kazman, R., Bass, L. J., Carrière, S. J., Barbacci, M., and Lipson, H. F., “Attribute-Based Architecture Styles,” Technical Report CMU/SEI-99-TR-022, ESC-TR-99-022, Software Engineering Institute, Carnegie Mellon University, Pittsburgh, PA, February 1999.
9 FAA, “National Airspace System Requirements Document,” NAS-RD-2012, FAA, Washington, DC, December 4, 2012.
10 FAA, “FAA Telecommunications Infrastructure (FTI) Operational Network IP Users’ Guide,” Revision 2D, Redacted for Public Release, FAA, Washington, DC, May 2010.
11 Varia, J., “The Total Cost of (Non) Ownership of Web Applications in the Cloud,” Amazon Web Services whitepaper, Amazon, Seattle, WA, August 2012.
12 AWS, “AWS Total Cost of Ownership (TCO) Calculator,” online tool, Amazon, Seattle, WA, as of May 2014. Available at .
13 AWS, “AWS GovCloud (US) Region – Government Cloud Computing,” products and solutions web page, Amazon, Seattle, WA, as of May 2014. Available at .
