Virtualization

VIRTUALIZATION

A Linux-based virtualization solution for IT courses leverages an open source approach for a low-cost, scalable cluster. The tool is suited for student home use for OS and networking lab assignments.

Alessio Gaspar, Sarah Langevin, and William D. Armitage

Virtualization Technologies in the Undergraduate IT Curriculum

Virtualization techniques have had a powerful impact on a wide range of server applications with the effect of increased performance, flexibility, and security. Here, we review virtualization’s development and observe that in some cases it can provide efficient solutions to previously difficult problems, such as hands-on learning related to OSs, networking, and system administration. While these domains demonstrate a clear need for active learning and hands-on experience, they also share a common problem: allowing students unrestricted root access to campus systems and networks frequently results in corrupted systems and downed networks. Previrtualization solutions emphasized expensive and inefficient dedicated lab facilities, which, among other disadvantages, barred these courses from online implementations. We implemented a user mode Linux (UML)-based virtualization solution for this class of courses, using an open source approach for a low-cost, scalable cluster, and we demonstrated that students can use this facility to successfully complete OS and networking lab assignments from home.


IT Pro July ❘ August 2007

TO PRACTICE OR NOT TO PRACTICE: THERE’S NO QUESTION

I hear and I forget, I see and I remember, I do and I understand. (Confucius)

We need active learning and hands-on experience to complement the lecture component of a course. Yet, the practical burden (classroom management, system administration, and so on) associated with implementing and maintaining laboratories can overwhelm instructors. Instructors and institutions tend to shy away from laboratories even when courseware is readily available. Thus, hands-on practice comes at a price. What is the exact nature of these hindrances and the price tag? What are the benefits we should expect?

Practical obstacles

Let’s turn our attention to the IT curriculum’s need for hands-on practice. In the case of programming, no one will deny the positive impact of laboratories, homework assignments, group assignments, and in-class exercise sessions to help students internalize lecture concepts. When it comes to system-level courses, however, the

Published by the IEEE Computer Society

1520-9202/07/$25.00 © 2007 IEEE

general consensus on the benefits is often hindered by considerations of a more practical nature. To illustrate our argument, let’s take the example of three undergraduate courses: OSs, networking, and system administration. Common to these is the need for students to work with full privileges on the systems to engage in meaningful activities. You want your students to explore, rewrite, and even damage the OS kernel’s code; set up a network of several machines to analyze the TCP/IP traffic between them; or perform system administration tasks on a Web server. The price to pay for such flexibility is unfortunately proportional to the potential pedagogical benefits. In most cases, this requires granting your students full privileged access to a classroom workstation, thus exposing the hosting network to more security risks, and offering your technical support team (or instructor) an opportunity to spend many hours restoring the systems to a functional state for use in the next course. These issues often lead instructors and even entire institutions to avoid labs for OS or networking courses because of the sheer amount of maintenance work involved.

Early solutions

Denying the benefits of hands-on practice to students for some courses is not a satisfying solution in many respects. For OS courses, educational OSs primarily spare students the overwhelming task of becoming familiar with a production-level OS kernel in only one semester. Later, most of these OSs were used with emulators, which didn’t require students to reinstall a physical host each time they worked on their projects. What about other courses? Early on, institutions came up with their own mitigation solutions to let students use some workstations with full privileges without endangering campus network security. Such solutions included isolating entire classrooms from the campus network. While addressing security concerns, this approach deprived students of access to online resources while working. It also locked down an entire classroom for a single course or laboratory, causing scheduling and resource-allocation problems. To improve the situation, some institutions had students carry around their own hard drive and insert it into specially equipped workstations. However, what if you need students to manage several servers? What if they need them active at the same time? These approaches are not readily scalable and often involve costly investments. They are also not compatible with the distance education focus many institutions are promoting in today’s computing education landscape. Until recently, each institution developed its own tailored solutions. Nowadays, virtual machines provide a clean, widely accepted, secure, system-administration-friendly, and cost-efficient way to turn any classroom into a disposable laboratory for the previously mentioned courses.

VIRTUALIZATION TECHNOLOGIES TO THE RESCUE!

A virtual machine will allow you to run an entire simulated OS as a process or group of processes. You can have full privileges in the virtual machine while still operating as a regular user on the physical computer. This setting clearly eases security and maintenance. Reinstalling a virtual machine is generally as easy as copying its virtual disk file.

There is little doubt that, by enabling the teaching of system-related topics, virtualization technologies have had a pedagogical impact comparable with that which they had on data centers’ server consolidation. However, as with any rapidly emerging technology, buzzwords can spread along with misconceptions about key concepts or terminology. We want to approach virtualization technologies from a pedagogical perspective with a solid understanding of the technological diversity existing in the wide spectrum of available solutions, allowing us to make an educated judgment about their potential impact in the IT classroom.

Virtual services

Let’s start at the service level. We can isolate processes such as Web or email servers to improve the security of the physical server hosting them. This idea was pioneered by the chroot jails approach, which allowed a given service to function in an isolated minimal file system. If this service became compromised, the intrusion would be contained in an easily restored capsule instead of granting the attacker access to the whole file system. While this has little to do with current technologies deemed “virtualization,” the Linux V-Server project demonstrates how to extend this idea to isolate services in separate security contexts. Processes are isolated not only at the file system level but also directly at the kernel level, which allows for resource usage limits for each security context without the need to run an entire virtual machine. In fact, the same physical host’s kernel is used in each context, thus improving performance and simplifying resource sharing among them.

Hardware emulators

While the previously mentioned techniques share some conceptual roots with virtualization, hardware emulation comes to mind first when discussing virtualization technologies. Bochs, plex86, and QEmu are good examples of open source solutions that emulate Intel x86 hardware.

Emulators for other platforms also exist, including some that retrocomputing experts might find familiar, such as microcomputers (Atari, Amiga, C64). Because these programs actually emulate in software the hardware in its entirety, any OS that runs on this hardware will run seamlessly except, of course, for timing-sensitive functions. However, this emulation process is inefficient and makes these approaches less attractive for performance-conscious users. Emulation is a heavy-duty process, which, in most practical applications, is uncalled for. Let’s take the example of a Linux server on which we want to run a virtual machine dedicated to a Web server, another dedicated to an email server, and so on. If these virtual machines are to also run a Linux system, we run Linux on x86 hardware bare metal, run an x86 emulator on top, and finally another Linux on top of the virtual x86. This is particularly wasteful and motivated the emergence of two distinct families of virtualization solutions: those meant to let you run any arbitrary OS on your host but with greatly increased performance, and those meant to facilitate the execution of a virtual version of an OS on top of itself.

Efficient hardware emulators

Is hardware emulation completely obsolete? Not quite, but it had to evolve to respond to the need for speed unanimously expressed by the IT industry. QEmu illustrates this evolution perfectly; while its source code is either GPL or LGPL, its author also released an accelerator loadable kernel module that improves the performance of this hardware emulator to near native level. From the technological perspective, QEmu uses a so-called portable dynamic translator to execute x86 code. To quote the QEmu Web site:

When [QEmu] first encounters a piece of code, it converts it to the host instruction set. Usually dynamic translators are very complicated and highly CPU dependent. QEMU uses some tricks which make it relatively easily portable and simple while achieving good performances. The basic idea is to split every x86 instruction into fewer simpler instructions. Each simple instruction is implemented by a piece of C code (see ‘target-i386/op.c’). Then a compile time tool (‘dyngen’) takes the corresponding object file (‘op.o’) to generate a dynamic code generator which concatenates the simple instructions to build a function (see ‘op.h:dyngen_code()’).

This approach allows you to run an x86-based OS on top of another architecture. But what if we are trying to run an x86 virtual machine on top of an OS already running on x86 hardware? The work produced by an emulator then seems like overkill for such a restricted scenario. Other solutions are far more adept in these cases.

(Full) virtualization

Most commercial solutions (VMware, Virtual PC) use virtualization techniques, also referred to as native or full virtualization. These approaches were designed from the ground up with efficiency in mind. While emulators take each single instruction of the code and translate it into machine code that the hosting machine can execute, virtualizers assume that both the hosting and virtual systems are of the same architecture. This allows them to run the code natively and only intercept instructions that would cause problems, instead of tackling a computationally intensive translation task for every instruction. This brings near-native performance to the virtualization process. This approach is also exemplified by the Virtual Box project, which executes most code natively (virtualization) but relies on code scanning, analysis, and patching techniques when having to deal with x86 instructions that don’t lend themselves to clean virtualization.
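The contrast between emulation and full virtualization can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the instruction names and the `PRIVILEGED` set are hypothetical and do not correspond to real x86 opcodes or to how any of the products mentioned are implemented. It simply counts how many instructions a monitor has to handle in software under each approach.

```python
from dataclasses import dataclass

# Hypothetical toy instruction set; the names are illustrative, not real x86 opcodes.
PRIVILEGED = {"OUT", "HLT"}


@dataclass
class Stats:
    translated: int = 0  # instructions handled in software by the monitor
    native: int = 0      # instructions left to run directly on the host CPU


def emulate(program):
    """Emulator model: every single instruction goes through software translation."""
    stats = Stats()
    for _ in program:
        stats.translated += 1
    return stats


def virtualize(program):
    """Virtualizer model: only problematic instructions are intercepted."""
    stats = Stats()
    for op in program:
        if op in PRIVILEGED:
            stats.translated += 1
        else:
            stats.native += 1
    return stats


program = ["MOV", "ADD", "MOV", "OUT", "ADD", "HLT"]
print("emulation:     ", emulate(program))
print("virtualization:", virtualize(program))
```

In this toy run the emulator handles all six instructions in software, while the virtualizer intercepts only the two marked as privileged. That difference in software-handled work is what the near-native performance claim rests on.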

Paravirtualization

The underlying hardware architecture’s design specificities can make efficient virtualization tricky. Paravirtualization addresses this issue by running all of the virtual machines on top of a so-called hypervisor. We need to port each OS running in these virtual machines to a new architecture: the hypervisor. Porting primarily consists of making sure all calls issued by the guest OS will be either native or rely on the hypervisor’s specific features. Once the port is completed, the guest OS will no longer rely on the unwanted instructions that caused full virtualization to go through so many contortions. The obvious disadvantage of such an approach, which is at the heart of the increasingly popular Xen virtualization platform, is that an OS cannot simply be installed inside a virtual machine. It needs to be ported first to the hypervisor architecture which, in the case of closed source products, can be an unattainable goal. On the other hand, paravirtualization offers excellent performance, which led commercial Linux distributions such as Red Hat Enterprise 5 to adopt it. Furthermore, the need to port a guest OS is no longer a concern because Intel and AMD provide support for virtualization in their latest processors.
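On an x86 Linux host, this processor support is advertised through CPU flags: `vmx` for the Intel extensions and `svm` for the AMD ones, both listed in `/proc/cpuinfo`. The following is a minimal sketch of such a check. The function name `virtualization_support` and the sample strings are hypothetical, and the parsing assumes the usual `flags : ...` line layout of x86 kernels.

```python
def virtualization_support(cpuinfo_text):
    """Return 'intel', 'amd', or None depending on the CPU flags found.

    Assumes the x86 /proc/cpuinfo layout, where a line starting with
    'flags' lists the CPU feature flags after a colon.
    """
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            flags = set(value.split())
            if "vmx" in flags:   # Intel virtualization extensions
                return "intel"
            if "svm" in flags:   # AMD virtualization extensions
                return "amd"
    return None


# Hypothetical sample inputs standing in for the contents of /proc/cpuinfo.
sample_intel = "processor\t: 0\nflags\t\t: fpu vme de pse vmx est\n"
sample_plain = "processor\t: 0\nflags\t\t: fpu vme de pse\n"

print(virtualization_support(sample_intel))
print(virtualization_support(sample_plain))
```

On a real host, the same function could be applied to the text read from `/proc/cpuinfo`. A result of `None` means that only the software-based approaches described above remain available.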

Hardware support for virtualization

As is clear by now, many ingenious technologies responded to the need for speed. Today, both emulation and virtualization approaches can be further optimized through the use of virtual machine-enabled specialized hardware. Both Intel and AMD offer chips with hardware support for virtualization, thus overcoming some of the design flaws of the x86 family of processors. The Kernel Virtual Machine (KVM) project is, after the UML project, the second virtual technology to make it inside the kernel source code tree. KVM includes some QEmu code and tools to provide a virtual machine solution that relies on the availability of virtualization extensions in the CPU. KVM relies on two loadable kernel modules (LKM) loaded in the Linux kernel of the host that allow it to serve as a hypervisor for virtual machines. The first one, kvm.ko, contains the CPU-independent code and relies on a second one, either kvm-intel.ko or kvm-amd.ko, to leverage the virtualization extensions of either processor family. A modified QEmu process uses these LKM modules to handle virtual machines capable of hosting an unmodified guest OS. Unprivileged users can then create these virtual machines and handle them as simple processes, thus providing flexibility of usage in addition to increased performance.

Hardware support for virtualization thus affects emulation-based approaches but can also benefit paravirtualization technologies such as Xen. Xen requires porting the guest OS to the hypervisor. However, the newest releases of Xen leverage hardware support to waive this constraint. When running on a recent Intel or AMD CPU, Xen uses virtualization extensions to run an unmodified OS as guest, making it even more versatile.

Virtual OS vs. virtual machine

To conclude our overview, we need to discuss another category of virtual machine technologies. This one is exemplified by the User Mode Linux (UML) project, which has been part of the Linux kernel source tree for some time, thus allowing users to recompile their own virtual machines from the Linux source code. According to Jeff Dike, UML project creator, UML is a port of the Linux kernel onto itself. In the source tree, a new port named UML has been introduced and contains the code allowing the current kernel to execute as a process (guest) on top of another host Linux kernel. This approach is often referred to as a virtual OS instead of a virtual machine to underline the fact that UML is meant to execute Linux on top of Linux, benefiting from a closer integration between host and guest virtual machines.

While UML might be slower in terms of performance, this integration brings advantages of a different nature. First, UML can leverage different file systems (including the Host FS) to access the hosting server’s file system from within the UML instances without requiring setting up a complicated networking configuration to allow for the use of a remote file system such as the Network File System (NFS) or the Common Internet File System (CIFS). UML also shines when it comes to running multiple identical virtual machines. If all your students need to start similar virtual machines, most virtualization solutions (except QEmu, which has a similar feature to the one we are describing) will require each virtual machine to have its own copy of the virtual disk image. With UML, each instance will boot from the same single disk image and store modifications it makes in a copy-on-write file. This tremendously reduces disk space use on the server and facilitates system administration. Last but not least, UML integrates with the Manage Large Networks project toolkit, which enables students to build on-demand, arbitrarily complex virtual networks by specifying only a simple configuration file.

Virtual Bookmarks

The following URLs will allow the reader to explore the many virtualization projects mentioned in the article:

➤ Linux VServer; http://linux-vserver.org/
➤ Qemu; http://www.qemu.com/
➤ KVM; http://kvm.qumranet.com/kvmwiki
➤ Xen; http://www.cl.cam.ac.uk/research/srg/netos/xen/
➤ VMWare; http://vmware.com
➤ UML; http://user-mode-linux.sourceforge.net/
➤ Pacifica/vanderpool; http://en.wikipedia.org/wiki/X86_virtualization
➤ Virtual box; http://www.virtualbox.org/
➤ Win4lin; http://www.win4lin.com/
➤ Blue pill; http://theinvisiblethings.blogspot.com/2006/06/introducing-blue-pill.html
➤ Bochs; http://bochs.sourceforge.net/
➤ Plex86; http://www.plex86.org/
➤ Wine; http://www.winehq.org/
➤ Willows; http://www.willows.com/

THE IT EDUCATOR’S PERSPECTIVE

Will someone please think to the children? (Helen Lovejoy)

From a technological perspective, the previously mentioned virtualization platforms are an impressive showcase of advanced techniques from the OSs internals and compilation fields. The IT industry has acclaimed them as cost-saving, burden-alleviating, server-consolidating, security-enhancing solutions to data centers’ critical

problems. We have already outlined their potential to facilitate the deployment of laboratories for specific IT courses, and it’s obvious they have enormous potential for supporting distance-education implementations of the same topics. We believe there is, however, one more aspect to consider. Are these virtual machines fully addressing the pedagogical and classroom management issues IT instructors are faced with? With too much of a high-level perspective, implementation details can sometimes blur away only to come back to make the difference between a successful or failed implementation. In our previous section, we dug into the specifics of what various virtualization technologies were capable of in developing an educated judgment of their usefulness to our cause. It’s time now to put on our IT educators’ goggles and evaluate how these same virtualization technologies actually address the issues we are facing.

Virtualization platforms for students

Let’s start with pedagogical issues. Our primary motivation for introducing virtual machines in the classroom was to support diverse hands-on learning activities not easily obtainable by other means due to practical constraints. While the general idea of virtual machines seems to successfully address this issue, different implementations might be more or less suitable. Consider the example of Xen. This project was developed with the idea that virtual machines would be deployed by system administrators on their servers to consolidate the workload of several smaller systems. Live migration of entire virtual machines is now possible to further accommodate this goal. Many vendors offer high-availability configurations in which virtual machines, hosted on different physical servers, can back up one another in case of hardware failure. However, no matter how impressive and useful these features are, they are developed following the axiom that virtual machines would be deployed by system administrators.

What does this mean to the instructor? In Xen, domain0 belongs to the hypervisor and allows for the management of all virtual machines on the system. To allow your students to create and destroy virtual machines, you will need them to have a privileged account in domain0. Furthermore, once they have this access, they are entitled to manage all deployed virtual machines, not only their own. While some instructors might be satisfied if their students could access some virtual machines that could be automatically restarted through a script on their behalf, a networking course instructor would usually require students to span multiple virtual machines and organize them into an arbitrary virtual network. Projects such as Xen Worlds are successful for the first type of course. However, instructors end up having to assign virtual machines to students and locate them on fixed hardware nodes on which they will be kept up and running, waiting for use by designated student(s). The nature of the experimentation you want your students to conduct, and therefore the pedagogical strategy you can employ, depends upon the chosen virtualization technology’s capability to let users without privileged access to the hosting server handle virtual machines as regular processes. From this perspective, QEmu/KVM and UML are both better suited to the kind of lightweight experimenting that our students need to engage in, for example, creating entire virtual networks, observing them, tearing them apart, and so on. This is especially true when we combine these technologies with the manage large networks tool, which allows students to automatically build and deploy entire virtual networks on-demand, using only a text-based description written in a simple configuration language (see Figure 1).

Potential for distance education

Still, from a pedagogical viewpoint but further from the classroom, the applicability of virtualization technologies to distance education depends a great deal on their ability to provide a usable interface over potentially low-speed Internet connections. In the case of a VMware server, the default interface is a GUI rendering the entire virtual machine’s desktop on the client’s computer. Other virtualization technologies rely primarily on command line interfaces, which, while less convivial, can enable a larger audience to access remotely the virtual machine revolution. Also, a simple Secure Shell (SSH) connection to a Linux host coupled with X11 tunneling can let students open only the graphical applications they need and have to render only those individual windows remotely rather than an entire desktop, thus improving bandwidth usage.

Less administration … or not

The second aspect we need to discuss further is classroom management. Virtual machines promise simplification of the laboratory management without compromising the hosting network’s security. Do all the previously discussed technologies fare similarly in this noble endeavor? Windows-based solutions generally require virtualization software installation and maintenance on each workstation used by students. This, of course, limits flexibility in terms of where students can work but, fortunately, commercial solution providers have rectified this point by offering free-to-download versions of their virtual machine

playing software. This allows institutions to overcome the otherwise overwhelming cost of deploying such software on all campus workstations and enables students to install such software on their personal computers. This discussion leaves one important element out of the picture: what about the virtual disk images? Can we ask students to carry a thumb drive with a 500-Mbyte virtual disk image? What if they need to work on several virtual machines for a networking course? What if all these images differ by only a couple of megabytes? Based on commercial offerings from Microsoft and VMware (B. Anderson and T.E. Daniels, “Xen Worlds: Xen and the Art of Computer Engineering Education,” Proc. ASEE Ann. Conf. and Exposition, 2006), some institutions improved on this obviously awkward solution by having disk images stored centrally on a server and delivered on-demand to the workstations. What might be seen as a merely wasteful solution in terms of bandwidth within the boundaries of a campus can prevent students from working remotely from home via less-than-stellar Internet connection speeds. Now, what about having students manipulate entire virtual networks in these conditions?
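A rough calculation makes these questions concrete. The sketch below is only an estimate under stated assumptions: the 500-Mbyte image size comes from the text, while the class size of 30 students, the 2-Mbyte per-student difference, and the 1 Mbit/s home link are hypothetical values chosen for illustration. It compares storing one full image per student with sharing one base image plus per-student differences, as in the copy-on-write scheme described earlier for UML.

```python
def transfer_minutes(size_mbytes, link_mbits_per_s):
    """Minutes needed to move size_mbytes over a link of the given speed."""
    return size_mbytes * 8 / link_mbits_per_s / 60


def storage_mbytes(students, image_mbytes, delta_mbytes=None):
    """Total disk space: full copies, or one shared image plus per-student deltas."""
    if delta_mbytes is None:
        return students * image_mbytes
    return image_mbytes + students * delta_mbytes


IMAGE_MBYTES = 500       # image size quoted in the text
STUDENTS = 30            # assumed class size
DELTA_MBYTES = 2         # assumed "couple of megabytes" of per-student changes
HOME_LINK_MBITS = 1.0    # assumed slow home connection

print(f"download time: {transfer_minutes(IMAGE_MBYTES, HOME_LINK_MBITS):.0f} min")
print(f"full copies:   {storage_mbytes(STUDENTS, IMAGE_MBYTES)} Mbytes")
print(f"shared image:  {storage_mbytes(STUDENTS, IMAGE_MBYTES, DELTA_MBYTES)} Mbytes")
```

Under these assumptions a single image takes over an hour to download, and a class of 30 needs 15,000 Mbytes of full copies against 560 Mbytes with a shared base image. The exact figures matter less than the order of magnitude, which is why shipping whole images to each student scales poorly.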

Cost factor

Another aspect linked to the already mentioned need to install virtualization suites on each machine is the cost factor. Many institutions neglect this aspect because most commercial virtualization vendors provide software to run a preconstructed virtual machine as a free download, eliminating part of the licensing fees. However, because each workstation needs to run the virtualization software, the hardware requirements for comfortable use of the system are above those of a basic workstation. While this extra cost might be acceptable at most institutions, this investment will sit mostly idle and unused. It’s paradoxical, once again, that the deployment of some virtualization solutions in an instructional setting results in the wasting of more hardware. In the data center, virtualization reduces this very same effect by consolidating partially used multiple servers into virtual machines running on a single system where the usage rate is much higher.

THE FUTURE OF VIRTUALIZATION IN IT EDUCATION

Here we address some of the limitations discussed in the previous section. Virtualization clearly has helped tremendously in enabling students to learn by doing in both face-to-face and distance education settings. However, the IT educational community is still in the process of getting these technologies to where we need them. In the following, we will use a National Science Foundation-sponsored project on which we have been working, Scalable, Open-Source, Fully-Transparent and Inexpensive Clustering for Education (Softice), to illustrate what we consider the three milestones needed for a successful integration of virtualization in teaching practice and infrastructure.

Figure 1. (a) Virtual network with (b) corresponding manage large networks project file.

(a) [Diagram: hosts Webserv, User, and Observer connected to a LAN hub.]

(b)
    global {
        project web-net
    }
    switch lan-hub {
        type hub
    }
    host webserv {
        network eth0 {
            switch lan-hub
            address 192.1.1.2
            netmask 255.255.255.0
        }
    }
    host user {
        network eth0 {
            switch lan-hub
            address 192.1.1.3
            netmask 255.255.255.0
        }
    }
    host observer {
        network eth0 {
            switch lan-hub
            address 192.1.1.4
            netmask 255.255.255.0
        }
    }
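The project file in Figure 1(b) follows a regular pattern: one `global` block, one `switch` block, and one `host` block per machine. As an illustration of how little text a student has to write, the sketch below generates a file of that shape from a list of hosts. It is an assumption-laden example. The helper `mln_project` is hypothetical and not part of the manage large networks toolkit, and it only reproduces the keywords visible in Figure 1(b) rather than the tool’s full configuration language.

```python
def mln_project(name, switch, hosts, netmask="255.255.255.0"):
    """Build project-file text shaped like Figure 1(b).

    `hosts` is a list of (hostname, ip_address) pairs; every host gets a
    single eth0 interface plugged into the one hub named `switch`.
    """
    lines = [
        f"global {{ project {name} }}",
        f"switch {switch} {{ type hub }}",
    ]
    for host, address in hosts:
        lines += [
            f"host {host} {{",
            "    network eth0 {",
            f"        switch {switch}",
            f"        address {address}",
            f"        netmask {netmask}",
            "    }",
            "}",
        ]
    return "\n".join(lines)


# The three hosts and addresses shown in Figure 1.
project_text = mln_project(
    "web-net",
    "lan-hub",
    [("webserv", "192.1.1.2"), ("user", "192.1.1.3"), ("observer", "192.1.1.4")],
)
print(project_text)
```

Adding a fourth machine to the virtual network then amounts to appending one more pair to the list.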

Milestone #1: virtualization in the classroom Using virtual machines was a first step in the right direction—one which most IT educators have already taken.As we discussed in the previous section,this has,however,often resulted in installing and maintaining virtualization suites on July ❘ August 2007 IT Pro

41

VIRTUALIZATION

Scalable and Transparent Open Source Classroom many campus workstations.While classrooms are no longer Management for Linux-Based Laboratories,” IEEE Int’l reserved and isolated for use by a single laboratory, they still Conf. Engineering Education, Instructional Technology, need to be specially equipped. Similarly, students need to Assessment and E-Learning, 2006, and A. Gaspar and S. Langevin, “New machines, along with virtual disk images, if they want to use Approaches for Linux-Based Undergraduate OS them for course work. While this situation represents Concepts Labs.,” Proc, IEEE Int’l Conf. Engineering progress compared to earlier solutions, many institutions Education, Instructional Technology, Assessment and Ealready perceived the intrinsic limitations of such a decenLearning, 2006, ) strived tralized scheme and moved toward providing students access from the beginning to employ only open source technoloto virtual machines as a service over the network. This was gies, thus limiting a cost factor that might keep some instithe motivation for the Softice project (see http://softice. tutions from adopting commercial lakeland.usf.edu/) we started three years ago and which is now progres- The CS and IT education solutions. Using open source technologies also allowed us to adapt sively becoming a standard for communities have the tools that institutions already deploying virtualization.This brings used and combine them with one us to our second milestone. already implemented another in ways that wouldn’t be remote access to possible with closed source soluMilestone #2: remotely tions. Besides these two classic accessible virtualization virtual machines. 
arguments in favor of open source The second step is allowing stusolutions in general, we also came dents to access their virtual machines to realize, as other implementations started appearing in from anywhere on campus, or even home.This increases flexthe literature, that none of these limitations were affectibility while opening new perspectives for distance education. ing our infrastructure. To explain why, we need to introThe CS and IT education communities already report many duce the main aspects of the Softice infrastructure. successful implementations allowing remote access to virtual Our students work on Linux virtual machines, hosted on machines. However, the previous section already underlined a single Linux system accessible over the Internet. This how some of these implementations are still imposing counternegates the need to transfer virtual disk images to the stuproductive constraints, which we summarize as follows: dents’ workstations (Stockman, 2005), which, in the case of distance education, would quickly make it unbearable for • Requiring students to carry around virtual hard drives students—having to download several gigabytes of disk images on USB drives or requiring each equipped workimages. How do students work on their virtual machines station to download gigabytes of images from a centralthen? They connect to our Linux system using regular user ized server (M. Stockman, J. Nyland, and W. Weed, accounts and then create their own virtual machines and vir“Centrally-Stored and Delivered Virtual Machines in tual networks using the manage large networks tool and the Networking/System Administration Lab,” ACM UML.Their work session is similar to connecting to any Unix SIGITE Newsletter, vol. 2, no. 2, 2005, pp. 4-6). system and they use free software tools available on every • Requiring all virtual machines assigned to each student platform (SSH client, X-windows server for a GUI interto be kept constantly up on several physical machines face). 
This allows students to use any type of workstation, (Anderson, 2006). This allows students to use preasrunning any OS, to access their virtual machines. The softsigned virtual machines but denies them the capability ware deployed is cheap and small enough to be pushed on to create virtual machines as needed to experiment with any campus machine and even on students’ personal comflexible virtual networks. puters.Access over the network even proved more fluid than • Limiting the number of simultaneous connections to the when using commercial products such as VMware server virtual machine server (C. Border, “The Development insofar that,instead of transferring a view of entire desktops, and Deployment of a Multi-User, Remote Access our students use mainly command line interfaces and, when Virtualization System for Networking, Security, and in need of a GUI,render each application window separately System Administration Classes,” Proc. ACM SIGCSE using the X11 protocol tunneled inside their SSH connecConf., Special Interest Group in Computer Science tion.Because they can open applications individually instead Education, ACM press, 2007, pp. 576-580). of having to render the entire desktop, distance learners can work comfortably over slow Internet connections. The Softice project (A. Gaspar and C. Godwin, “Rootkits & Loadable Kernel Modules: Exploiting the Linux Kernel for Fun and (Educational) Profit,” J. Computing Milestone #3: scalable virtualization hosting Sciences in Colleges, vol. 22, no. 2, 2006, pp. 244-250 and A. Let’s now propose a third step and discuss the last part of Gaspar, S. Langevin, and W. Armitage, “Inexpensive, our Softice infrastructure. Serving remote access to virtual 42


machines over the Internet might sound feasible on a single Linux server, but this approach would clearly pose a scalability problem. As more students, instructors, and courses use such an infrastructure, or if each student suddenly needs access to more virtual machines (for example, in networking courses), a single server will become overwhelmed and the institution will have to purchase another one to split the workload. If demand increases over the years, the traditional data center waste cycle will occur: many servers will be purchased, deemed obsolete, and consolidated into another newer, larger server.

Is there a way to break this vicious circle and adopt, from the get-go, a scalable hardware infrastructure to host our virtual machines? We suggest that a possible solution is to host our virtual machines not on a single Linux server but on a load-balancing cluster. In Softice, we used the Warewulf clustering toolkit to implement a TCP/IP load-balancing cluster made of recycled classroom PCs. Students connect to a single IP, the master node, which redirects their connection to an available cluster node waiting on a private LAN (see Figure 2). From the student’s perspective, the system appears as a single machine.

Figure 2. Softice cluster infrastructure. (Diagram labels: master node, computing nodes, GB switch, isolated subnetwork, Internet, classrooms, campus wireless access, home access.)

As demand increases, more inexpensive nodes can be added to scale up the computing power. This solution is also scalable from the system administration point of view because the master node is the only machine to administer regardless of how many nodes are attached. Provisioning tools integrated in the Warewulf clustering toolkit make this possible. The end result is an easy-to-administer, easy-to-use cluster that will scale up to increased demands without causing a linearly increasing workload for system administrators.

Our most recent work on this aspect of Softice has been to reconsider single system image (SSI) clustering technologies, which proved incompatible with our virtualization software a couple of years ago but might now be mature enough to let us go beyond what Warewulf can offer. Instead of redirecting incoming SSH connections to an underused node, we are working on enabling the master node to simply start all processes for all students and migrate them to unused cluster nodes once its workload becomes high. Several open source projects offer SSI clustering capabilities: Mosix, openMosix, OpenSSI, and Scyld Beowulf. A successful integration of both SSI and virtualization technologies might even spell out a fourth milestone in leveraging virtual machines in the IT classroom. Flexibility would be increased, system administration would remain scalable since Warewulf already uses cluster node provisioning techniques also found in SSI systems, and, most importantly, the students’ virtual machines would be automatically and dynamically migrated from node to node to maximize the usage of our hardware and our students’ comfort.

Virtualization is a fast-moving, exciting technological field, which, when correctly understood, can help IT instructors overcome long-standing pedagogical hindrances while also allowing their institutions to leverage their hardware investment. We have tried to demystify these new technologies while keeping their application in IT education as our primary focus. The conclusion is positive, with some lessons learned from various early implementations reported in the CS and IT education literature. The potential is there, and there is much more we can achieve to properly welcome the virtualization revolution in our IT classrooms. ■

Alessio Gaspar is assistant professor in the Information Technology Department at the University of South Florida. Contact him at [email protected].

Sarah Langevin is a student at the University of South Florida. Contact her at [email protected].

William D. Armitage is interim chair of the Department of Information Technology at the University of South Florida. Contact him at [email protected].

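As a rough illustration of the load-balancing idea described under Milestone #3, where the master node redirects each incoming connection to an available cluster node, the following Python sketch picks the least-loaded node for a new session. It is not taken from Warewulf or from Softice: the `Node` class, its `capacity` field, the node names, and the least-loaded policy itself are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A compute node on the private LAN (hypothetical model, not a Warewulf API)."""
    name: str
    active_sessions: int
    capacity: int  # assumed maximum number of sessions this node may host

    @property
    def load(self) -> float:
        # Fraction of capacity in use; 1.0 means the node is full.
        return self.active_sessions / self.capacity


def pick_node(nodes):
    """Return the least-loaded node that still has room, or None if all are full."""
    available = [n for n in nodes if n.active_sessions < n.capacity]
    if not available:
        return None
    # On a tie, min() keeps the first node in list order.
    return min(available, key=lambda n: n.load)


def dispatch(nodes, student):
    """Assign a student's incoming session to a node and record it."""
    node = pick_node(nodes)
    if node is None:
        raise RuntimeError(f"no cluster node available for {student}")
    node.active_sessions += 1
    return node.name


if __name__ == "__main__":
    cluster = [Node("node1", 3, 4), Node("node2", 1, 4), Node("node3", 4, 4)]
    for student in ("alice", "bob", "carol"):
        print(student, "->", dispatch(cluster, student))
```

Under this assumed policy, adding capacity amounts to appending another `Node` to the list, which mirrors the article's point that students only ever see the master node while inexpensive nodes are added behind it.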