A JSON-Based Markup Language for Deploying Virtual Clusters via Docker

Scott Morton, Salvador Barbosa, Ralph Butler and Chrisila Pettey
Department of Computer Science, Box 48, Middle Tennessee State University, Murfreesboro, Tennessee, USA
Abstract - Our team develops large cluster-based projects (often using MPI). At times it is necessary to simulate a cluster, and perhaps even two or more networked clusters. This occurs particularly in a testing mode, where one or more of the clusters may be involved in production work. It is possible to run multi-rank MPI jobs on a laptop or desktop, but it is more problematic to simulate a cluster or a multi-cluster environment. The technique presented in this paper uses Docker containers, a JSON system configuration description, and Python scripts to quickly and easily build and run user-defined networked systems on any of the three major host operating systems - Linux, Mac OS X, and Microsoft Windows.

Keywords: Virtual Clusters, Linux Containers, Docker, JSON, Python
1 Introduction
While configuring a cluster, or a network, requires an understanding of networking (e.g. IP addresses) and operating systems concepts (e.g. host-based authentication), most users of high performance clusters are not the system administrators and never have to concern themselves with the potentially tricky issues involved in configuring them. However, at times it would be convenient to simply use a virtual cluster that is a replica of the actual hardware. Such situations might include the system not being available due to scheduling, the need to test a simple change to a piece of software, or the system being inaccessible due to its remote location. Whatever the situation, we need to run some experiments, and we cannot run them on the actual hardware. In this instance, it would be nice to have a virtual replica of the system on our laptop. However, configuring a virtual system on our laptop suddenly puts us in the position of being the system administrator.

In order to alleviate the hurdles faced in such a situation, we proposed a markup language for describing virtual clusters in 2005 [1]. That system, based on XML, QEMU, VDE, TUN/TAP, and Python, was called VCML; it was fairly simple to use and was almost indispensable in debugging software running on various, sometimes confusing, hardware configurations. However, over time, technologies changed. Some software is no longer supported, while newer, more powerful software has become available. With the popularity of JSON [5], and the advent of Linux Containers [6] and Docker [3], we have developed a new virtual system configuration software system, VCML2. This paper describes VCML2 in section 2 and illustrates its use with some example clusters in section 3.
2 Virtual Cluster Markup Language
To understand the proposed system for describing and deploying virtual clusters, it is necessary to be somewhat familiar with the underlying software that is combined to create VCML2. This underlying software, including Linux Containers, Docker, and JSON, is discussed in section 2.1. Section 2.2 then presents the basics of VCML2.
2.1 Support Software
The project presented in this paper is about building virtual clusters on a host computer running any of the three most common operating systems: Linux, Mac OS X, and Microsoft Windows. The prior project, VCML, used virtual machine software, but this project, VCML2, uses container software. When using container software, it is important to understand two notions: image and container. An image is a passive or static entity, like an executable program. It consists of an executable operating system configured with services and other programs. A container, on the other hand, is an active entity, like a process. In fact, it actually runs as a process on the host operating system. The container process runs the operating system and provides the services and programs made available from the associated image.

Linux Containers [6], LXC, are the Linux community's attempt to create their own version of a technology that has been around for several years; two older examples are Solaris Zones [9] and FreeBSD Jails [4]. Linux containers, being relatively new, are still in development and have not fully addressed some issues such as security. However, they formed the basis for early Docker implementations, and so they are included here; Docker has since moved on to its own model.

Docker containers [3] almost immediately proved extremely useful and were quickly deployed in many production shops. One good reason for the popularity of Docker is that it is much less resource intensive to spin up hundreds of containers on a single system than to do something similar with virtual machines. Additionally, it is possible to develop software in a container on a test machine and then deploy the container with the developed software onto the production machine - thereby ensuring that the production environment is identical to the development environment. Since there is such a large Docker community, their containers are rapidly becoming production-level software. For these reasons, we chose Docker containers as the basis for VCML2.

JSON [5] originally became a de-facto standard among web developers because of its simplicity and its obvious relationship to JavaScript. Mostly due to that simplicity, its popularity spread to other communities, e.g. Python, where it has largely displaced markup languages such as XML. Indicative of this popularity is that Python has a JSON module that is part of the standard distribution. Thus it was an obvious choice for our project's system configuration language.

Python [7] is a scripting language. Given the ubiquitous nature of Python, it seems unnecessary to explain our use of it in this project, particularly given Python's support for JSON. Our current project consists of a handful of scripts that can be used to build, deploy, and manage virtual clusters configured with VCML2 and running on Docker containers. These scripts, on average, consist of about 120 lines of executable code each. We make those scripts freely available, not only to our colleagues and students, but also to anyone who would like a copy.
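To illustrate how little machinery this requires, the short sketch below loads and sanity-checks a configuration using only the standard json module. This is an illustrative sketch rather than code excerpted from our scripts; the file name config.json and the NETWORKS/NODES layout match the examples in section 3.

    import json

    # volume is optional, so it is not in the required set.
    REQUIRED_NODE_KEYS = {"hostname", "image", "router", "networks"}

    # Load the VCML2 system configuration (file name and layout as in section 3).
    with open("config.json") as f:
        config = json.load(f)

    for name, net in config["NETWORKS"].items():
        print(f"network {name}: driver={net['driver']}, subnet={net['subnet']}")

    for name, node in config["NODES"].items():
        missing = REQUIRED_NODE_KEYS - node.keys()
        if missing:
            raise SystemExit(f"node {name} is missing attributes: {missing}")
        unknown = [n for n in node["networks"] if n not in config["NETWORKS"]]
        if unknown:
            raise SystemExit(f"node {name} references undefined networks: {unknown}")
        print(f"node {name}: router={node['router']}, networks={node['networks']}")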
2.2 VCML2
As was stated previously, the goal of this project was to be able to build a wide range of virtual systems - from a couple of nodes on a single network to multiple clusters with routing between them. Additionally, we wanted to be able to do this on a single computer running any of Linux, Mac OS X, or Microsoft Windows. Specifically, we wanted to provide an easy-to-use toolset that would quickly create a virtual system for the user on the host platform of their choice. We envisioned that the user would sketch the configuration they wanted to deploy, express that configuration in a simple markup language, and then run a script that would build and deploy that configuration. We also wanted to provide management functionality in terms of routing, users, and some file system support. To accomplish this goal, we first developed the language that allows the user to express the configuration in a JSON file. To keep the language simple, there are only two types of entities - networks and nodes. Each type of entity has a set of attributes.
Networks have a name, a driver, and a subnet attribute. The name attribute is used by the nodes to specify which networks they are on; it also makes configuring the Docker containers more convenient. As for the driver attribute, we have only tested with bridge, but any driver the user's system supports can be used. The subnet is expressed in CIDR notation.

Nodes ultimately become containers that represent computers and/or routers attached to a network. In the JSON configuration file they have the following attributes: name, hostname, a list of networks to which they are attached, image, volume, and router. As with networks, the name attribute is a Docker convenience, and we usually have it match the hostname, although that is not a necessity. The image is the virtual disk from which the OS and all its facilities will be booted. It should be noted that executables, such as MPI, that are resident on the host platform can be mounted into the virtual cluster nodes instead of having to be installed into each node. The volume is a colon-separated pair giving the volume on the physical host that is mounted on the container when it boots up; the volume attribute is optional. router is a boolean attribute that indicates whether the node performs routing functions, possibly in addition to being a compute node.

Once the desired JSON system configuration file is created, the Python dbuild script can be used to build the system and start it running. dbuild must build the networks, build the nodes (containers) from their associated images, and make sure the nodes are attached to the appropriate networks.

After building the system, dbuild will configure routing. Any node that is a router has to have routing turned on and its routing tables set up. For nodes that are not routers, there has to be minimal setup of their routing tables (which network/router to use). In setting up the routing tables, our assumption is that the user wants each node to be able to communicate with every node it could possibly communicate with in the given configuration. So, if there is any communication path between two nodes (even one going through multiple routers), we set up routing tables. If the user does not want routing to occur between two nodes, then they must set the routing tables by hand or else reconfigure the system. For example, in Figure 1, m2 is configured as a router. This means that m1 can communicate with m4 or m3 through m2. If the user does not want m1 to be able to communicate with m3, then they either need to reconfigure the system as in Figure 2, where there is no m2, or, if they need m2 as a compute node, they could configure the system as in Figure 3. If the user does not want to change the configuration and does not want the default routing tables provided by our scripts, then they will need to configure their own routing tables.
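To make the work dbuild performs concrete, the following sketch shows one plausible way to drive the Docker command line from Python. It is an illustration under stated assumptions rather than the actual scripts - in particular, the "sleep infinity" container command is a placeholder of ours, since the real image presumably starts its own services - but each command used (docker network create, docker run, docker network connect, and the --sysctl flag for enabling IP forwarding on routers) is standard Docker CLI.

    import json
    import subprocess

    def sh(*args):
        # Echo each command before running it; stop on the first failure.
        print(" ".join(args))
        subprocess.run(args, check=True)

    with open("config.json") as f:
        config = json.load(f)

    # Build each network with the driver and CIDR subnet from the config.
    for name, net in config["NETWORKS"].items():
        sh("docker", "network", "create",
           "--driver", net["driver"], "--subnet", net["subnet"], name)

    # Start each node. docker run attaches a container to one network only;
    # any additional networks are attached with docker network connect.
    for name, node in config["NODES"].items():
        cmd = ["docker", "run", "-d", "--name", name,
               "--hostname", node["hostname"]]
        if "volume" in node:                    # optional host:container pair
            cmd += ["-v", node["volume"]]
        if node["router"]:                      # routers must forward packets
            cmd += ["--sysctl", "net.ipv4.ip_forward=1"]
        # Placeholder command (an assumption of this sketch); the real
        # vcml/vcml2 image may define its own entrypoint.
        cmd += ["--network", node["networks"][0],
                node["image"], "sleep", "infinity"]
        sh(*cmd)
        for extra in node["networks"][1:]:
            sh("docker", "network", "connect", extra, name)

The remaining step - computing the per-node routing table entries and installing them (for example with ip route add commands issued through docker exec) - is sketched after Figure 6 in the next section.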
Figure 1. All nodes can communicate with each other.

Figure 2. Node m1 and node m3 cannot communicate with each other.

Figure 3. All nodes can communicate with nodes m2 and m4, but m3 and m1 cannot communicate with each other.

In addition to routing, you might want root to be able to ssh among the nodes, so dbuild sets that up automatically. Also, you might want to be able to run something such as MPI without requiring the user to set up ssh keys, etc., so dbuild also sets up host-based authentication.

The final item that must be considered in building a system is the image that will be run on a container. There are three possibilities for obtaining an image. The first, and simplest, is to just use the image we provide, which will automatically be pulled from an online repository when you do the first build of a system. If, however, you want to work offline and want to make sure you have our image prior to building the system, you can do an appropriate docker pull command. And finally, if you want to build your own image, we provide a small Dockerfile that you can customize and use in an appropriate docker build command.

As was mentioned previously, there are five small Python scripts for building, deploying, and managing the virtual systems. dbuild, as described above, fully builds out and starts the configuration described in a config.json file. dstop stops the execution of the nodes, including the routers, but leaves the configuration built. dstart can be used to start the configuration up again if it is already built. dremove does a stop and then removes all of the components: nodes and networks. duseradd can be used to add a user to all currently active nodes, providing the same valid user id on the running containers as you have on the host machine.

Admittedly, you could do the same thing we provide with VCML2 by installing and using the extra tools provided by Docker and Weaveworks [10] (e.g., Weave, Compose, Swarm, etc.). However, along with the need to install and configure additional software, we found there to be a fairly steep learning curve in using these tools. Additionally, Docker Compose - the tool that allows you to specify the system configuration - is not supported for Microsoft Windows. Unlike the Docker tools, to use VCML2 you install Docker, download our five Python scripts, create your JSON configuration file, run dbuild, and the system is ready to be used. For a simple network, this whole process can be done in under ten minutes. And it will work on Microsoft Windows, Linux, and Mac OS X. The following section demonstrates VCML2's simplicity.

3 Example Virtual Systems
Shown below (Figures 4-6) are three architecture models. The simplest is a single cluster with two nodes on a shared network. The second example is of two separate systems on disparate networks that are capable of communicating with each other through a shared router. The final example is of four separate systems with two shared routers. In each figure, the diagram of the proposed system is paired with the JSON necessary to describe it. In the JSON of Figure 4 there is only one network, netA, in the NETWORKS section. In the NODES section, both m1 and m2 are described, along with the networks accessible to them and whether or not they are nodes capable of acting as routers. In Figure 5, node m2 is somewhat more interesting in that it is configured as a router that is attached to both netA and netB. In Figure 6, m5 is an example of a node where dbuild must be careful in setting up the routing table, because m5 can access two routers.
{ "NETWORKS": { "netA": { "driver": "bridge", "subnet": "172.16.0.0/26" }, }, "NODES": { "m1": { "hostname": "m1", "image": "vcml/vcml2", "router": false, "networks": ["netA"], "volume": "/nfshome/home/scott:/home/scott" }, "m2": { "hostname": "m2", "image": "vcml/vcml2", "router": false, "networks": ["netA"], "volume": "/nfshome/home/scott:/home/scott" }, } } Figure 4. Single Cluster architecture with two nodes and a shared network and the JSON file that describes it.
{ "NETWORKS": { "netA": { "driver": "bridge", "subnet": "172.16.0.0/26" }, "netB": { "driver": "bridge", "subnet": "172.16.0.64/26" }, }, "NODES": { "m1": { "hostname": "m1", "image": "vcml/vcml2", "router": false, "networks": ["netA"], "volume": "/nfshome/CNL/home/scott:/home/scott" }, "m2": { "hostname": "m2", "image": "vcml/vcml2", "router": true, "networks": ["netA","netB"], "volume": "/nfshome/CNL/home/scott:/home/scott" }, "m3": { "hostname": "m3", "image": "vcml/vcml2", "router": false, "networks": ["netB"], "volume": "/nfshome/CNL/home/scott:/home/scott" }, } }
Figure 5. Two separate systems capable of communicating via a shared router and the JSON that describes them.
{ "NETWORKS": { "netA": { "driver": "bridge", "subnet": "172.16.0.0/26" }, "netB": { "driver": "bridge", "subnet": "172.16.0.64/26" }, "netC": { "driver": "bridge", "subnet": "172.16.0.128/26" }, "netD": { "driver": "bridge", "subnet": "172.20.0.0/24" }, }, "NODES": { "m1": { "hostname": "m1", "image": "vcml/vcml2", "router": false, "networks": ["netA"], "volume": "/nfshome/home/scott:/home/scott" }, "m2": { "hostname": "m2", "image": "vcml/vcml2", "router": true, "networks": ["netA","netB","netC"], "volume": "/nfshome/home/scott:/home/scott" }, "m3": { "hostname": "m3", "image": "vcml/vcml2", "router": true, "networks": ["netC","netD"], "volume": "/nfshome/home/scott:/home/scott" }, "m4": { "hostname": "m4", "image": "vcml/vcml2", "router": false, "networks": ["netB"], "volume": "/nfshome/home/scott:/home/scott" }, "m5": { "hostname": "m5", "image": "vcml/vcml2", "router": false, "networks": ["netC"], "volume": "/nfshome/home/scott:/home/scott" }, "m6": { "hostname": "m6", "image": "vcml/vcml2", "router": false, "networks": ["netD"], "volume": "/nfshome/home/scott:/home/scott" }, } }
Figure 6. A complex system including four disparate systems with two shared routers and the associated JSON description.
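Our routing policy - set up routes wherever any communication path exists, even through multiple routers - amounts to a breadth-first search over a graph of networks and routers. A simplified, self-contained sketch of how such a next-hop table might be computed is shown below; it is illustrative only, not the exact code in our scripts. For m5 in Figure 6 it yields m2 as the next hop toward netA and netB, and m3 toward netD.

    from collections import deque

    def next_hops(node_name, nodes, networks):
        # For one node, map each remotely reachable subnet to the first-hop
        # router, i.e. a router sharing a network with the node.
        routers_on = {net: [] for net in networks}
        for name, spec in nodes.items():
            if spec["router"]:
                for net in spec["networks"]:
                    routers_on[net].append(name)

        hops = {}
        start = nodes[node_name]["networks"]
        seen = set(start)
        queue = deque((net, None) for net in start)  # (network, first-hop router)
        while queue:
            net, first_hop = queue.popleft()
            if first_hop is not None:                # not directly attached
                hops[networks[net]["subnet"]] = first_hop
            for router in routers_on[net]:
                if router == node_name:
                    continue
                for nxt in nodes[router]["networks"]:
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, first_hop or router))
        return hops

    # With the Figure 6 configuration parsed as in the earlier sketches:
    # next_hops("m5", config["NODES"], config["NETWORKS"])
    # -> {"172.16.0.0/26": "m2", "172.16.0.64/26": "m2", "172.20.0.0/24": "m3"}

Each resulting entry could then be installed in a container, for example with an ip route add <subnet> via <router-address> command run through docker exec.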
4 Conclusions
For all the times when it would be convenient to simulate a system of networked routers and compute nodes on your desktop or laptop, we have provided a relatively simple, easy technique for specifying, building, deploying, and managing possibly complex virtual systems using Docker containers. The technique, called VCML2, uses JSON with only two entity types for describing the system configuration. Building, deploying, and managing the specified system is accomplished with five small Python scripts that we make freely available to the community. VCML2 works on Linux, Mac OS X, and Microsoft Windows - a necessity for us, as all three operating systems are in use by different members of our team, which consists of both faculty and students.

While VCML2 was created to assist our team with research, one side benefit is that it is also useful for teaching. A Docker image alone can replace the practice of distributing a virtual machine either on a jump drive [2] or via download [8], which makes it possible for students to do assigned programs in an environment comparable to whatever the professor needs for the class. However, VCML2 adds the ability for students to configure networked systems that can be used for a variety of purposes - for example, to run MPI jobs, to run client-server jobs, or to learn about setting up routing, to name a few.

While we had anticipated needing to change VCML2 so that it had the ability to run distributed virtual networked systems - i.e. nodes located on multiple physical hosts - so far it has met all of our needs. In the future, if we find that we need to add that capability, we will investigate the technologies that we used in the prior VCML project and contrast them with newer technologies that may exist.
5 References
[1] Butler, Ralph, Lowry, Zach, and Pettey, Chrisila, "Virtual Clusters," Proceedings of the Eighteenth International Conference on Systems Engineering, August 2005, pp. 70-75.
[2] Butler, Ralph, Pettey, Chrisila C., and Lowry, Zach, "CPVM: Customizable Portable Virtual Machines," Proceedings of the 44th ACM Southeast Conference, March 2006, pp. 616-619.
[3] Docker. https://www.docker.com/
[4] FreeBSD Jails. https://www.freebsd.org/doc/handbook/jails.html
[5] JSON. http://www.json.org/
[6] Linux Containers. https://linuxcontainers.org/
[7] Python. https://www.python.org/
[8] Sayler, A., Grunwald, D., Black, J., White, E., and Monaco, M., "Supporting CS Education via Virtualization and Packages: Tools for Successfully Accommodating 'Bring-Your-Own-Device' at Scale," Proceedings of the 45th ACM Technical Symposium on Computer Science Education, 2014, pp. 313-318.
[9] Solaris Zones. http://docs.oracle.com/cd/E36784_01/html/E36848/zones.intro-1.html#scrolltoc
[10] Weaveworks. https://www.weave.works/company/