Virtualization in network and server infrastructure to support dynamic system reconfiguration in ALMA
Tzu-Chiang Shena, Nicolás Ovandoa, Marcelo Bartscha, Max Simmonda, Gastón Véleza, Manuel Roblesa, Rubén Sotoa, Jorge Ibsenb, Christian Saldiasa
aAtacama Large Millimeter/submillimeter Array, Av. Alonso de Córdova 3107, Santiago, Chile; bEuropean Southern Observatory, Av. Alonso de Córdova 3107, Santiago, Chile.

ABSTRACT
ALMA is the first astronomical project to be constructed and operated under an industrial approach, due to the huge number of elements involved. In order to achieve the maximum throughput during the engineering and scientific commissioning phases, several production lines have been established to work in parallel. This decision required modifications to the original system architecture, in which all the elements were controlled and operated within a unique Standard Test Environment (STE). Advances in the network industry, together with the maturity of the virtualization paradigm, allowed us to provide a solution that replicates the STE infrastructure without changing its network address definition. This is only possible with the Virtual Routing and Forwarding (VRF) and Virtual LAN (VLAN) concepts. The solution allows dynamic reconfiguration of antennas and other hardware across the production lines in minimum time and with zero human intervention in the cabling. We have also pushed virtualization even further: classical rack-mount servers are being replaced and consolidated into blade servers, on top of which virtualized servers are centrally administered with VMware ESXi. Hardware costs and system administration effort are thereby reduced considerably. This mechanism has been established and operated successfully during the last two years. The experience gave us the confidence to propose a solution that divides the main operation array into subarrays using the same concept, which would introduce great flexibility and efficiency into ALMA operations and may eventually simplify the ALMA core observing software, since there would be no need to deal with subarray complexity at the software level.

Keywords: ALMA, networking, virtualization, dynamic configuration, VRF, MPLS, blade server, hypervisor
1. INTRODUCTION

Nowadays, astronomical projects are distinguished not only by the high technology of their instrumentation, but also by the size and the huge number of elements involved. ALMA is a clear example of this kind of scientific project, where the large number of antennas requires an industrial approach in order to assemble, integrate and verify the correct functioning of the 66 antennas. To work efficiently and reduce the construction period of the observatory, several parallel production lines had to be established. This decision required modifications to the original system architecture, in which all the elements were controlled and operated within a unique Standard Test Environment (STE) [3]. At the beginning, duplicating the hardware that comprises an STE was the first approach, but it quickly became economically prohibitive and produced an excessive workload for the system administration team. Fortunately, the virtualization market has consolidated, and our preliminary tests demonstrated that virtualization is a viable approach. Two years after the first attempt to use virtualization, today we have 13 STEs deployed at ALMA, providing the required infrastructure and platform for the commissioning and verification processes of both the engineering and science departments. This experience is presented in the following chapters: chapter 2 summarizes ALMA operations, chapter 3 introduces the concept of an STE, chapter 4 explains the server and network virtualization concepts along with the tools developed to administer and configure resources efficiently, and finally chapters 5 and 6 present the expected evolution in the near future and our conclusions.
2. ALMA HW/SW VERIFICATION AND ARRAY OPERATION PROCESSES

The hardware verification processes involve antennas, front ends, back ends, photonic references and correlators. One of the most resource-demanding is the antenna verification process. There are four types of antennas in ALMA,
manufactured by three different vendors. Each vendor has its own assembly line at the Operations Support Facility (OSF) [2] and an STE running a reduced version of the ALMA observing software, used to verify the performance of each antenna prior to its delivery to ALMA. After an antenna is received by ALMA staff, it must pass through four verification stations before being integrated and operated as part of the main array at the ALMA Operations Site (AOS), at 5000 m above sea level [2]. At the first verification station, the antenna control and movement are verified; this includes minimizing the deformation of the main dish by adjusting its panels, and measuring the optical pointing accuracy. At station 2, the front-end and back-end equipment is integrated. Station 3 tests the antenna as a single-dish instrument, and at station 4 the antenna is integrated into the main array. These four stations are depicted in Figure 1.
Figure 1. Antenna integration and verification processes.
In simulation environments, the ALMA software is developed, integrated and tested serially as the first part of the incremental software release cycle. Activities in the main array at the AOS consist of verifying the ALMA software with real hardware, within the context of phase IV of the incremental release. Software regression tests are also executed weekly in order to consolidate new patches and bug fixes. Subsequently, the commissioning and science verification processes are performed by scientists, who also participate in the final software acceptance process (phase V of the incremental software release cycle). Finally, every two weeks a block of early science operations is executed using the main array. Simultaneously, the engineering department has to perform periodic maintenance activities on each antenna and on other array infrastructure. The aforementioned activities are shown in Figure 2.
Figure 2. ALMA parallel processes during array construction.
In order to execute all these processes efficiently, parallelization is introduced whenever possible. This puts great pressure on the deployment of the infrastructure required to support all these activities. The original design of an STE definitely had to be adapted.
3. STANDARD TEST AND PRODUCTION ENVIRONMENTS

The ALMA software was designed to run on top of a set of servers grouped as a unit under the concept of a Standard Test Environment (STE) [3]. Each STE has the same network subnet and address definitions, in order to guarantee the same environment expected by the ALMA software in scenarios such as software development, software or hardware testing, antenna commissioning and array operation. Some of these scenarios run in simulation; others operate with real hardware. There is a total of 13 STEs across ALMA. Some of them are as simple as a set of three servers with the ALMA software configured in simulation mode; at the other end, the one dedicated to controlling the main array will eventually contain up to 135 servers. Functionally, an STE is the infrastructure that supports the activities of an ALMA production line. For each production line, ALMA hardware such as antennas, photonic references, local oscillators and correlators is configured together with the ALMA software in order to perform the verification, commissioning and science operation activities. The most dynamic elements are the antennas, which jump from one production line to another until they are integrated into the main operational array, ready for science activities. Even within the main production line, the array operation, network configuration changes still occur. The ALMA array comprises 66 antennas, which can be relocated among the 198 antenna pads available for operation. Each antenna pad has its own fiber connections, and almost every antenna can be relocated to any pad. Within an antenna there are several network devices with static IP addresses, which imposes another strong restriction on the overall network design: each STE must be able to communicate with the antenna network, but an antenna's subnet cannot change across production lines.
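To make this addressing scheme concrete, the following sketch (in Python, purely illustrative: the STE names, antenna names and subnet values are hypothetical, not ALMA's actual address plan) models two STEs that replicate the same internal subnets while sharing a single, fixed set of antenna subnets:

# Illustrative sketch only: subnet values, STE and antenna names are
# hypothetical, not ALMA's actual address plan.
from ipaddress import ip_network

# Every STE replicates the same internal subnet definitions, so the ALMA
# software sees an identical environment in every production line.
STE_INTERNAL_SUBNETS = {
    "control":  ip_network("192.168.10.0/24"),
    "services": ip_network("192.168.20.0/24"),
}

# Antenna subnets exist only once across the whole observatory; an antenna
# keeps its addresses no matter which STE it is currently assigned to.
ANTENNA_SUBNETS = {
    "DV01": ip_network("10.10.1.0/24"),
    "DA41": ip_network("10.10.2.0/24"),
}

# Current antenna-to-STE assignment: one STE per antenna at any given time.
ASSIGNMENT = {"DV01": "TFINT", "DA41": "TFSD"}

def reachable_subnets(ste_name):
    """Subnets visible inside one STE: its own replica of the internal
    template plus the fixed subnets of its currently assigned antennas."""
    antennas = {a: ANTENNA_SUBNETS[a]
                for a, ste in ASSIGNMENT.items() if ste == ste_name}
    return {**STE_INTERNAL_SUBNETS, **antennas}

print(reachable_subnets("TFINT"))

Since the internal subnets overlap between STEs while the antenna subnets never change, the network must keep each STE in its own routing domain; this is exactly what the VRF mechanism of section 4.2 provides.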
4. THE SYSTEM ADMINISTRATION CHALLENGE

The configuration of each STE varies according to its assigned high-level function within the ALMA processes. This involves the network configuration, the services configuration (DHCP, DNS, TFTP, etc.) and finally the ALMA software configuration through the Telescope Monitor and Configuration Database (TMCDB) [4], a relational database which models and persists the configuration of each element of the array. A misconfiguration can cause hours or even days of downtime in the verification processes, and can be even worse if it passes unnoticed and is only detected at the science data verification level, in which case all the tests have to be repeated. The big challenge is how to administer all the available resources correctly and efficiently and how to guarantee the traceability of every configuration change in every STE, keeping in mind that configuration changes happen on a daily basis. The most important requirements regarding the administration of all the STEs are: i) operations in one STE must be isolated from operations in the other STEs, ii) the time consumed in integrating a new antenna into an STE must be minimized, iii) every configuration change must be under version control, iv) live health monitoring must be available, and v) the live resource distribution must be visible.

4.1 Server Virtualization

An STE is the base requirement to support the operation of a production line in ALMA. With the parallelization of processes, the proliferation of STEs quickly became unmanageable. At the beginning, duplicating the hardware that comprises an STE was the first approach, but it soon became economically prohibitive and produced an excessive workload for the system administration team. We therefore decided to experiment with virtualization. Starting in 2009, several virtualization technologies were evaluated, such as Xen, KVM and VMware; the final choice was VMware [5]. The decision was based mainly on the features of the administration tools rather than on performance [3]. Since then, virtualization has been the foundation for providing computing power in ALMA. Server virtualization is also used in areas beyond the STEs, not only in testing and development environments but also in production, e.g. build farms and non-real-time production servers (web, database, applications). After two years of production experience with virtualization, a new step was taken: consolidating the rack-mount servers into blade servers. Two chassis with blade servers [8] were procured and put into operation during 2011. After they demonstrated the required performance, stability and cost savings, a third one will arrive in 2012. Currently we have in production two blade-server chassis administered with the VMware ESXi hypervisor [5]. Interesting features, among others, are the High Availability and Distributed Resource Scheduler (DRS) capabilities of the
VMware cluster, which provide online virtual machine relocation upon failure of a specific host. The DRS capability also introduces more flexibility into maintenance activities, since one node can be taken offline without affecting the production environment. The memory management of the ESXi hypervisor is able to reclaim unused memory and to de-duplicate and compress memory pages; therefore, an overestimated resource allocation for virtual machines is less critical. Virtualization brought many advantages to the ALMA Software Group: rapid deployments, sizing of virtual servers according to requirements, and a higher level of redundancy. Economically speaking, blade servers have become a kind of commodity for providing computing power in ALMA. With a common hardware infrastructure, resources can easily be reclaimed after a specific production line ceases and be reassigned to a new one, with a different scenario and different requirements. Budget planning is simplified as well: there is no need to deal with cost estimates for diverse hardware combinations for upcoming projects, since the sole input to be taken into account is the estimated computing power required by new projects. The hardware and software technologies used to create and administer virtual servers in ALMA are shown in Table 1.

Table 1. Hardware and software specification used for virtualization.

Hardware: Dell blade chassis M1000e, with Cisco WS-CBS3130X-S 20 Gbps uplink, WS-CBS3032DEL-F blade switch and Brocade 5424 8 Gbps Fibre Channel; Dell blade servers M610 with two quad-core processors, 2 Gbps Ethernet ports and 2 x 8 Gbps Fibre Channel ports.

Hypervisor software: VMware ESXi 4.1 update 2; VMware vCenter 4.1 update 2.

Storage: NetApp 2040 storage, 12 TB SAS, 24 TB SATA, 4 x 4 Gb Fibre Channel.
An important milestone in virtualization was achieved during the AOS Test Interferometer (AOSTI) project [6], in which a fully virtualized STE was deployed to control an array with real hardware. This had never been done before: previously, the servers within an STE were only partially virtualized, leaving the most performance-demanding servers on bare metal. Although some minor resource adjustments are still needed to reach the optimum overall STE performance, this result gave us the confidence to proceed with the full virtualization of every single STE.

4.2 Network Virtualization

Besides server virtualization, the network infrastructure needed to be adapted to support it. One special case is the STE, which has a strict network specification in order to present the same environment to the ALMA software in every production line. In this case, replicating the network infrastructure was not a solution, neither from the economic point of view nor from that of the administrative effort, so virtualization was explored here as well. Network virtualization is not a new concept: it started decades ago with Frame Relay and ATM PVCs, later evolved into Virtual LANs (VLANs), and nowadays layer 3 virtualization is also possible with VRF technologies. Most of the network equipment in ALMA is Cisco based. In terms of virtualization, Cisco provides two technologies: Virtual Routing and Forwarding (VRF) and Multiprotocol Label Switching (MPLS) [7]. VRF is implemented in IP routers and allows multiple instances of a routing table to coexist on the same router at the same time. Since each VRF is independent, the same IP subnet can exist in two different VRFs: IP addresses can overlap between VRFs without conflicting with each other. VRF also stands for VPN Routing and Forwarding, a key element in Cisco's MPLS VPN technology; Internet service providers often take advantage of VRF to create separate virtual private networks (VPNs) for customers and to provide scalable IP MPLS VPN services. In short, virtual networks enable administrators to split a physical link into multiple virtual links that are completely isolated from one another.
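As a conceptual illustration of this isolation (a minimal sketch, not router code; the VRF names, subnets and next hops are invented), the following Python model treats each VRF as an independent routing table in which the same subnet coexists without conflict:

# Conceptual model of VRF: each VRF is an independent routing table, so the
# same subnet can exist in two VRFs without conflict. Names and addresses
# are invented for illustration.
from ipaddress import ip_address, ip_network

# One routing table per VRF (i.e. per STE); both contain the *same*
# 192.168.10.0/24 subnet, each pointing to a different next hop.
VRF_TABLES = {
    "TFINT": {ip_network("192.168.10.0/24"): "10.0.1.1"},
    "TFSD":  {ip_network("192.168.10.0/24"): "10.0.2.1"},
}

def lookup(vrf, destination):
    """Longest-prefix match restricted to one VRF's routing table."""
    dest = ip_address(destination)
    matches = [net for net in VRF_TABLES[vrf] if dest in net]
    if not matches:
        return None  # no route in this VRF
    best = max(matches, key=lambda net: net.prefixlen)
    return VRF_TABLES[vrf][best]

# The same destination address resolves to different next hops depending
# on the VRF, i.e. on which STE the packet belongs to.
print(lookup("TFINT", "192.168.10.5"))   # -> 10.0.1.1
print(lookup("TFSD",  "192.168.10.5"))   # -> 10.0.2.1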
Figures 3 and 4 show the physical and logical diagrams of the VLANs and VRFs. VLANs are used at the layer 2 switches to create logically isolated subnets, and VRFs are created at the layer 3 switches to route packets among VLANs and create virtual domains.
Figure 3. Virtual networks built on top of the physical network infrastructure.
Figure 4. Virtual LANs are grouped within a VRF to create isolated domains within the same hardware infrastructure.
In summary, the requirements imposed on the overall network design are: i) each STE has to be isolated from the other STEs, ii) in each STE, the network devices located at the antenna must be configured with the same IP addresses, iii) the number of STEs may vary according to the needs of the observatory, iv) some equipment is distributed across two computer rooms in the observatory, separated by a distance of 40 km, and v) an antenna can be assigned to any of the available antenna pads. The network layout within an STE is shown in Figure 5, in which the array-related hardware is grouped into class C subnets. A firewall is also in place in order to isolate the STE.
Figure 5. Network layout within an STE.
The resulting physical network diagram is shown in Figure 6, in which the border layer 2 switches are interconnected by means of layer 3 switches in both computer rooms. The two computer rooms are connected by redundant 10 Gbps links.
Figure 6. Physical diagram of the network devices in the computer rooms located at the OSF and the AOS.
Logically, we defined one virtual domain per STE. The corresponding subnets are created in each domain, or VRF, and are routed within it. To allow controlled access to each STE from the user network, a firewall is configured accordingly. Note that the antenna subnets are not replicated across the STEs: there is only one set of antenna subnets. Depending on its stage, an antenna is associated with a single STE at any given moment, and this association is easily realized by associating the corresponding VLAN with the proper VRF. Figure 7 shows the operational STEs: TFENG, TFINT, TFOHG, TFSD and TBAOS.
Figure 7. Logical diagram of the virtual domains created to support parallel production lines.
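A minimal sketch of this reassociation, assuming VRF-lite on a Cisco layer 3 switch (the VLAN ID, VRF name and addresses are illustrative; the production scripts described in section 4.3 are more elaborate):

# Sketch of re-homing an antenna VLAN from one STE's VRF to another with
# VRF-lite style IOS commands. The VLAN ID, VRF name and address are
# illustrative, not taken from the ALMA configuration.
def vlan_to_vrf_config(vlan_id, vrf_name, svi_address):
    """Render IOS-style lines binding a VLAN interface (SVI) to a VRF.
    Note that 'ip vrf forwarding' clears the interface address, so the
    address must be re-applied afterwards."""
    return [
        f"interface Vlan{vlan_id}",
        f" ip vrf forwarding {vrf_name}",
        f" ip address {svi_address}",
        " no shutdown",
    ]

# Moving the (hypothetical) antenna VLAN 101 into the TFINT domain: only
# the VRF association changes; the antenna's own subnet stays untouched.
for line in vlan_to_vrf_config(101, "TFINT", "192.168.101.1 255.255.255.0"):
    print(line)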
4.3 Administration tools

Together with the virtualization, we defined the following model to administer and monitor the infrastructure. The model is composed of three layers: i) the physical infrastructure (switches, routers, firewalls, blade servers, etc.), ii) the software supporting virtualization (VRF-lite, VLAN, VMware ESXi, etc.), and iii) the application servers. In parallel, administration and monitoring tools interact with all of these layers.
Figure 8. Three-layer administration and monitoring model.
Configuration evolves on a daily basis in all the production lines; to complicate matters further, the change requests come from several sources and must be executed by several experts working in shifts. To preserve homogeneity and to minimize both configuration time and the downtime caused by human errors, several tools and configuration templates have been developed in house. These tools receive high-level parameters as input, update the configuration across switches, routers and servers, and finally restart the ALMA software to reflect the changes. An example is the moveAntennas script, shown in Table 2.

Table 2. Example of a script used to reconfigure an STE.

AOS2: gns tshen:~ 1 > moveAntennas -h
Usage: moveAntennas [-s STE|-S] -p PAD -a ANTENNA [-n][-v][-w]
Options:
  -h, --help   show this help message and exit
  -s STE       STE name (TFINT, TFOHG, TFSD, TFENG, TBAOS, TBAOS2)
  -a ANTENNA   antenna to move
  -p PAD       pad to place the antenna
  -n           dry run: do not make any real changes, show the current VLAN configuration
  -v           be verbose
  -w           write the configuration to the switch's startup script
  -S           show the current VLAN configuration for the antenna
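As an indication of how such a tool can be structured, the skeleton below reconstructs the command-line front end from the help text above (a Python sketch using argparse; only the interface is shown, the back end that actually reconfigures the switches and restarts the ALMA software is omitted, and the real implementation may differ):

#!/usr/bin/env python
# Skeleton of a moveAntennas-style front end, reconstructed from the help
# text in Table 2; the switch-reconfiguration back end is omitted.
import argparse

def main():
    parser = argparse.ArgumentParser(prog="moveAntennas")
    parser.add_argument("-s", dest="ste",
                        choices=["TFINT", "TFOHG", "TFSD", "TFENG",
                                 "TBAOS", "TBAOS2"],
                        help="STE name")
    parser.add_argument("-a", dest="antenna", required=True,
                        help="antenna to move")
    parser.add_argument("-p", dest="pad", required=True,
                        help="pad to place the antenna")
    parser.add_argument("-n", dest="dry_run", action="store_true",
                        help="dry run: make no changes, show current VLANs")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="be verbose")
    parser.add_argument("-w", dest="write", action="store_true",
                        help="write configuration to the switch's startup script")
    parser.add_argument("-S", dest="show", action="store_true",
                        help="show current VLAN configuration for the antenna")
    args = parser.parse_args()
    # The real tool would translate (antenna, pad, STE) into VLAN/VRF
    # updates on the relevant switches and restart the ALMA software.
    print(f"would move {args.antenna} on pad {args.pad} into STE {args.ste}")

if __name__ == "__main__":
    main()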
Given the ability to switch ALMA hardware across STEs, we have proposed a solution that divides the main operational array into subarrays using the same concept. This would introduce more flexibility, allowing more activities to be performed in parallel within ALMA operations. The same approach may also be an alternative for simplifying the ALMA core observing software, since there would be no need to deal with subarray complexity at the software level. Additionally, a web-based dashboard shows the current setup; it allows engineers and scientists, who do not need to understand the underlying virtualization complexity, to quickly see the current configuration of the production line they are working with.
Figure 9. A web-based dashboard showing the online status of the resources configured within the TFSD STE.
Finally, all the configurations are kept under version control in CVS. Additionally, a configuration management framework, Puppet [10], is used to detect unauthorized changes in the system and automatically revert them to the correct version. The configuration files are organized in an object-oriented fashion, in which common sections are established and then inherited and specialized in each STE; this considerably reduces the number of configuration files to maintain.
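The inheritance idea translates roughly to the following (shown in Python for brevity rather than in Puppet's own manifest syntax; the class names, services and hosts are invented for illustration):

# Illustration (Python, not Puppet syntax) of the object-oriented layout:
# common configuration lives in a base definition and each STE declares
# only its own deltas. All names and values are invented.
class BaseSTE:
    ntp_servers = ["ntp1.example.org", "ntp2.example.org"]
    monitored_services = ["dhcpd", "named", "tftpd"]

class TFINT(BaseSTE):
    # Specialization: adds one STE-specific service to the common set.
    monitored_services = BaseSTE.monitored_services + ["correlator-proxy"]

class TFSD(BaseSTE):
    # Nothing overridden: inherits the common configuration unchanged.
    pass

print(TFSD.monitored_services)   # the shared baseline
print(TFINT.monitored_services)  # baseline plus the TFINT-specific delta

In the real setup, Puppet periodically enforces the declared state on each host, which is how unauthorized manual changes are reverted to the version-controlled configuration.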
5. EVOLUTION IN THE NEAR FUTURE

Having proven the architecture successful, in the coming months we will start to consolidate and standardize all the STEs across the ALMA observatory, since some of them have not been migrated yet for historical reasons. The administration tools are being improved, with special emphasis on simplifying the dashboards and making them more intuitive for users from other departments. Currently, a hardware resource such as an antenna can be switched dynamically between STEs and the result is reflected almost instantaneously at the network and operating system levels, but this is not yet the case for the ALMA software: effort needs to be invested in this area, since a restart of the ALMA software still takes a considerable amount of time. In the medium term we want to keep moving in the same direction and introduce IaaS and PaaS concepts into ALMA. Interesting open source initiatives such as Cloud Foundry, as well as commercial alternatives (Heroku, OpenShift, Google App Engine), are being evaluated to deal with the high proliferation rate of internal web applications. Another area of virtualization we are interested in is related to the ALMA archive [9]. Its current architecture is based on the ESO NGAS cluster system [9], a ten-year-old design; even though it has proven to be very stable and consolidated, it is not economically efficient, and we believe that an important cost saving can be achieved with virtualization in this area as well.
6. CONCLUSION

We have presented our virtualization experience of the last two years, showing that virtualization has been the foundation for providing new computing power and network infrastructure in ALMA. Given the industrial nature of the ALMA project, and especially during the construction phase, virtualization has been an important factor in fulfilling the observatory's requirement to establish parallel production lines in a very efficient and cost-saving way.
Nowadays, with a highly competitive and dynamic computer hardware market, the life cycle of a product is extremely short (less than two years). This may be good for the industry, but for an astronomical observatory it is a real nightmare to find the proper spares for the existing infrastructure in the medium term. An alternative could be to purchase spares for the project's lifetime, but this approach, besides being risky, is sometimes impractical due to budget restrictions. The investment in blade servers helps us to guarantee not only the continuity of support and spares, but also that we keep benefiting from the evolution of this market.
REFERENCES

[1] Shen, T., Ibsen, J., Olguin, R., Soto, R., et al., "Status of ALMA Software", Proc. ICALEPCS XIII (2011).
[2] Gonzalez, V., Mora, M., et al., "First year of ALMA site software deployment: where everything comes together", Proc. SPIE 7737, 77371Z (2010).
[3] Zambrano, M., Arredondo, D., et al., "Experience virtualizing the ALMA Common Software", Proc. ADASS XIX, 434, 477 (2010).
[4] Farris, A., Hiriart, R., "The ALMA Telescope Monitor and Configuration Database", NRAO design document (2008).
[5] http://www.vmware.com/products/vsphere/mid-size-and-enterprise-business/overview.html
[6] Olguin, R., Shen, T., et al., "Development of the Test Interferometer for ALMA", Proc. SPIE (2012).
[7] http://www.cisco.com/en/US/docs/net_mgmt/vpn_solutions_center/1.1/user/guide/VPN_UG1.html
[8] http://www.dell.com/us/business/p/poweredge-blade-servers
[9] Wicenec, A., Knudstrup, J., "ESO's Next Generation Archive System", Proc. ADASS XI, 95, 98 (2001).
[10] http://www.puppetlabs.com