
NETWORK DISTRIBUTED COMPUTING USING DCEZ

Chuck Pheatt
Emporia State University
1200 Commercial
Emporia, KS 66801
(620) 341-5637
[email protected]

ABSTRACT

This paper introduces a client-server suite called DCEZ (easy distributed computing), which provides a readily configurable and easy-to-use network distributed computing framework. The suite allows users to operate a distributed computing environment and solve real problems with a hybrid peer-to-peer (P2P) grid. It is anticipated that this tool will be most useful in a pedagogical setting, such as an introductory computing course, or as an adjunct to a computationally intensive course that addresses parallel computing issues.

INTRODUCTION

Distributed computing has been defined in a number of different ways [1]. For years, authors have proposed and implemented distributed computing systems, and have developed numerous initiatives, frameworks and architectures to permit distributed processing of data and objects across a network of connected systems. One approach that has received much attention in recent years is a framework in which users harness the idle CPU cycles and storage space of numerous networked systems to work together on a computationally intensive problem. The best known of these initiatives are SETI@Home, a scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI) [2], and Folding@Home, a distributed computing project that studies protein folding, misfolding, aggregation, and related diseases [3]. These and other projects use a framework developed and promoted by Berkeley Open Infrastructure for Network Computing (BOINC) [4], which is an open-source software platform for computing using volunteered resources. This computing approach is commonly known as hybrid peer-to-peer (P2P) grids [5] and massively parallel systems computing [1].
THE IDEA AND QUESTION

The notoriety and popularity [6, 7] of initiatives such as SETI@Home and Folding@Home spawned the idea of integrating hands-on exploration of such technologies into the undergraduate computer science curriculum. This approach is also strongly encouraged in [8]. The ACM Computing Curricula 2001 for Computer Science [9], specifically the sections:

• AL4. Distributed algorithms [core].
• CN4. High-performance computing [elective].

speak to the use of such technologies with the language:

• “Explain the distributed paradigm”.
• “Explain one simple distributed algorithm”.
• “Design, code, test, and debug programs using techniques of numerical analysis, computer simulation, and scientific visualization”.

This raises the question, “How can we provide undergraduate CS students with an infrastructure that can be easily deployed and used to explore the distributed computing paradigm?”

IMPLEMENTATION CONSTRAINTS

Ideal environments for students to explore high-performance computing include access to supercomputers or to dedicated clusters that implement standard parallel frameworks such as PVM [10] (a software package that permits a heterogeneous collection of Unix and/or Windows computers hooked together by a network to be used as a single large parallel computer) and MPI [11] (a standard specification for message-passing libraries for a wide variety of parallel and distributed computing environments). Access to such resources is generally limited to Tier I research universities and is generally not available to undergraduate students who are simply exploring such technologies. An alternative to having a significant turnkey resource, or significant funds for creating a computational resource, is to build a cluster computing resource from retired or computer lab computers using available infrastructures. Approaches that might be considered include:

• Using Unix or Linux platforms with PVM or MPI and building a dedicated cluster using a group of retired or unused computers.
• Implementing BOINC on a Linux server platform, and using the BOINC client software with available computer lab computers.
• Implementing a broad-based enabling technology such as the Globus Toolkit [12], which is an open source software toolkit used for building grids with services written in a combination of C and Java.
• Using a framework such as BCCD [13] or Thin-Oscar [14] to create a temporary cluster that students can experiment with using laboratory computers.
• Using a commercial clustering product such as DeskGrid [15] to implement a “BOINC-like” cluster using laboratory computers.
• Using a framework like JavaSpaces [16] to implement a generic worker-foreman model, in which generic workers exist permanently on many nodes, and custom "foremen" send out data to analyze and specific instructions on how to do just that.

From the author’s experience in experimenting with the aforementioned solutions, several limitations become painfully apparent:

• Dedicated clusters can be created with PVM or MPI; however, a significant amount of Unix/Linux expertise is required to implement such solutions. In addition, significant care and feeding of the cluster is required to keep it operational.
• Implementing a BOINC solution requires downloading 20-25 Mbytes of source code and wading through several hundred pages of documentation. BOINC was designed for addressing significant research problems, and is not directed at casual cluster computing.
• The Globus Toolkit solution is simply overwhelming based on the size of the required binaries (for example, the Solaris 9 binary installer is 168MB) and the well over 1,000 pages of core documentation.
• Using BCCD or Thin-Oscar implementations is in general low impact, and configuration proceeds fairly rapidly (1 to 2 hours for a first try), but yields only a temporary implementation. If the configured computers are rebooted, the client software disappears.
• Solutions like DeskGrid provide a no-cost trial version, but limit the level of grid computing that can be performed. Obtaining licenses is an alternative if funding is available.
• JavaSpaces requires significant knowledge of Java for even a small implementation.

Based on these experiences, a need was identified for a readily accessible, easily installable and semi-permanent clustering solution. These factors led to the design, development and implementation of the DCEZ (easy distributed computing) client-server suite.

DCEZ DESIGN CONSIDERATIONS

Important attributes for a distributed computing infrastructure in a learning environment were identified. Key elements included:

• The basic framework should consist of a single server and several clients associated with that server. This framework may be replicated any number of times, which allows a single computer lab to host several clusters simultaneously.
• The server and client software must be able to be installed and made ready for computing within 5 to 10 minutes for a small cluster (5 machines). Installation and use instructions must be simple enough to be contained on a single 8½ by 11 inch sheet of paper.
• The client software should be minimal. It should have as little effect on OS performance as possible. After client installation, the software should be available even if a client machine is rebooted.
• The infrastructure must protect against network attacks. Message digests and digital signatures should protect against the distribution of viruses and spurious requests.
• Existing applications in common languages (C, C++, Java) should run as applications with little or no modification. An application may consist of several files (program and data) and may generate several output files.
• Many different projects should be able to use the cluster simultaneously. Projects should be treated as independent and added to a central server priority queue.
• The clients should not interfere with routine use of client computers. To this end, clients should be available for participating in the cluster only when they have been unused for some period of time. The use of a screensaver as a client appears to address this requirement.
• The server software should be intuitive and require little or no interaction after computing on the clients has been requested. Run information should be contained in a script file associated with the task to be performed.
• The client-server suite should run in a Windows environment. This gives instructors with access to Windows-based labs an environment in which to easily exercise the suite.
• Requirements for the existence of DNS, DHCP and other network infrastructure should be minimal.
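The message-digest requirement in the list above can be illustrated with a short sketch. It is not the DCEZ protocol, which this paper does not describe: the shared key, the function names sign_package and verify_package, and the choice of HMAC-SHA256 are all assumptions made for illustration. Note also that a keyed digest only approximates a digital signature, since server and client hold the same secret.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real deployment would provision this securely.
SHARED_KEY = b"example-cluster-key"


def sign_package(payload: bytes, key: bytes = SHARED_KEY) -> str:
    """Return a hex HMAC-SHA256 tag that a server could send with a job package."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_package(payload: bytes, tag: str, key: bytes = SHARED_KEY) -> bool:
    """Client-side check: accept the package only if the tag matches."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # compare_digest performs a constant-time comparison of the two tags.
    return hmac.compare_digest(expected, tag)
```

A client following this sketch would refuse to run any program file whose tag fails verification, which addresses both tampered payloads and requests from a sender that does not know the key.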

DCEZ IMPLEMENTATION AND USE

The client and server were both implemented using Microsoft Visual C++ 2003 for a Windows target platform. The executables are of modest size, with the client program being 0.3MB and the server requiring 2MB. Both executables are supplied with an installation program. The client program is a screensaver that displays a bitmap graphic and is configured through the Windows desktop configuration menu. The server program is a Windows program using a single document interface, with a configuration file used to tune the server environment. Installing the client software requires that the user specify the following information in the application settings dialog box:

• Server computer name or server IP address.
• Communications port number (6499 is the default and may be changed here).
• Selection of an alternate bitmap graphic for the screensaver to display.
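The three client settings listed above can be pictured as a small validated record. The sketch below is illustrative only; the class name ClientSettings, its field names, and the validation rules are assumptions, and only the default port of 6499 comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PORT = 6499  # default communications port given in the text


@dataclass(frozen=True)
class ClientSettings:
    """Hypothetical container for the three client dialog fields."""

    server: str                        # server computer name or IP address
    port: int = DEFAULT_PORT           # must match the port the server listens on
    bitmap_path: Optional[str] = None  # None means use the built-in graphic

    def __post_init__(self) -> None:
        if not self.server.strip():
            raise ValueError("server name or IP address is required")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be in the range 1-65535")
```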

The server software requires no configuration other than modifying the communications port number if the client default has been changed. A server configuration file is provided if the user wishes to modify client priority levels or heartbeat times. When the server is invoked, the interface shown in Figure 1 is displayed. The server interface displays all active client computers as well as two logs:

• The event log: this log displays client activity such as joining the cluster, leaving the cluster, or any error conditions encountered.
• The job log: this log displays the job activity of the clients. As new work is received or completed by the clients, this information is displayed.
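One way a server could maintain an active-client list and an event log from periodic heartbeats is sketched below. This is a guess at the general technique, not the DCEZ implementation: the class ClientRegistry, its method names, and the rule that a client is dropped after one missed timeout are all hypothetical.

```python
import time
from typing import Dict, List, Optional


class ClientRegistry:
    """Hypothetical server-side table of clients and their last heartbeat."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}
        self.event_log: List[str] = []

    def heartbeat(self, client: str, now: Optional[float] = None) -> None:
        """Record a heartbeat; log a join event the first time a client is seen."""
        now = time.monotonic() if now is None else now
        if client not in self._last_seen:
            self.event_log.append(f"{client} joined the cluster")
        self._last_seen[client] = now

    def prune(self, now: Optional[float] = None) -> None:
        """Drop clients whose last heartbeat is older than the timeout."""
        now = time.monotonic() if now is None else now
        for client, seen in list(self._last_seen.items()):
            if now - seen > self.timeout_seconds:
                del self._last_seen[client]
                self.event_log.append(f"{client} left the cluster")

    def active_clients(self) -> List[str]:
        """Return the names of clients currently considered active."""
        return sorted(self._last_seen)
```

The optional now argument exists only so the behaviour can be exercised with fixed timestamps; a running server would let it default to the monotonic clock.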

Work is submitted to the clients by clicking the “Run a Job” button on the server window. A dialog box requests the name of a configuration file.

Figure 1. Server Interface

The configuration file provides location information on program and data files as well as the number of iterations (individual runs) to be spawned. Iterations in this context are the number of individual executions of an executable requested. Each iteration is identified by its iteration number, which may be thought of as the rank identifier used in MPI programming.

SUPPLIED APPLICATIONS

A number of example applications are provided with the DCEZ software to illustrate the effectiveness of the client-server suite and to provide users with example programs they can use as the basis for their own work. They include:

• The obligatory parallel programming hello world program. This program simply reports the name, performance (in MHz) and OS of each client computer in the cluster to the server.
• A FireStarter program, which is a variation of an introductory MPI application from [17]. This Monte Carlo simulation models the spreading of a fire in a large forest using a cellular automata approach.
• An MD5 brute force attack program, which simply tries every single combination of a key against encrypted data. The user can specify key length as well as the alphabet used.
• A Pi estimation program using arbitrary precision arithmetic.
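The role of the iteration number as an MPI-style rank can be illustrated with the brute force example. The sketch below is not the supplied program; it assumes the target is given as an MD5 hex digest, and the function name search_slice and the round-robin split on the first key character are hypothetical choices. Each iteration searches only its own slice of the key space, so the iterations can run on different clients without coordination.

```python
import hashlib
from itertools import product
from typing import Optional


def search_slice(target_hex: str, alphabet: str, key_length: int,
                 iteration: int, iterations: int) -> Optional[str]:
    """Search only the keys whose first character belongs to this iteration.

    First characters are dealt out round-robin, so iteration i takes
    alphabet[i], alphabet[i + iterations], and so on.
    """
    if key_length < 1:
        return None
    for first in alphabet[iteration::iterations]:
        # Enumerate every combination of the remaining key positions.
        for rest in product(alphabet, repeat=key_length - 1):
            candidate = first + "".join(rest)
            if hashlib.md5(candidate.encode()).hexdigest() == target_hex:
                return candidate
    return None
```

With this split, at most one iteration returns a key and the rest return None, so the server only needs to collect the outputs and report the non-empty one.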

DCEZ PERFORMANCE

The performance of the client-server suite was evaluated in an existing university computer lab. Computers available to the client-server suite included:

• 25 computers running at 3.2GHz with Windows XP.
• 20 retired computers running at 200MHz to 1GHz with a mix of Windows NT, 2000 and XP.

The cluster was benchmarked using a Linpack benchmark [18]. Based on the recorded performance measure Rmax = 17.8 GFlops, the cluster exhibited performance that would have ranked in the top 50 supercomputers in the Top 500 listing of November 1995 [19]. This is respectable performance from otherwise wasted CPU cycles.

DCEZ AVAILABILITY

The client-server binaries, test programs and associated installation programs are available from the author.

ACKNOWLEDGEMENTS

The author would like to thank:

• Susan Adam and Brad Nagel for sharing their OpenGL screensaver program for use in this project.
• Kris Haney for his original shared directory implementation of the cluster/client suite.
• Matt Jackson for his efforts on the final implementation of DCEZ.

REFERENCES

[1] Goff, M., Network Distributed Computing, Fitscapes and Fallacies, Upper Saddle River, NJ: Prentice Hall, 2004.
[2] University of California, Search for Extraterrestrial Intelligence (SETI), 2005, http://setiweb.ssl.berkeley.edu, retrieved November 10, 2005.
[3] Pande, V. and Stanford University, Folding@Home, 2005, http://folding.stanford.edu, retrieved November 10, 2005.
[4] University of California, Berkeley Open Infrastructure for Network Computing, 2005, http://boinc.berkeley.edu, retrieved November 10, 2005.
[5] Wikipedia, Peer-to-peer, http://en.wikipedia.org/wiki/Peer-to-peer, retrieved November 10, 2005.
[6] Androutsellis-Theotokis, S. and Spinellis, D., A survey of peer-to-peer content distribution technologies, ACM Computing Surveys, 36(4): 335-371, December 2004.
[7] Imagenation, Overview of Grid Computing, http://www.theopensourcery.com/osrevGrids.htm, 2004, retrieved November 10, 2005.
[8] Nevison, C.H., "Parallel Computing in the Undergraduate Curriculum," Computer, 28(12): 51-56, December 1995.
[9] ACM/IEEE-CS Joint Curriculum Task Force, Computing Curricula 2001.
[10] Geist, A., et al., PVM: Parallel Virtual Machine: A Users' Guide and Tutorial for Network Parallel Computing, Cambridge, MA: The MIT Press, 1994.
[11] Gropp, W., et al., Using MPI - 2nd Edition: Portable Parallel Programming with the Message Passing, Cambridge, MA: The MIT Press, 2nd edition, 1999.
[12] Globus Alliance, University of Chicago, Welcome to Globus, 2005, http://www.globus.org, retrieved November 10, 2005.
[13] Gray, P., University of Northern Iowa, BCCD, http://bccd.cs.uni.edu, 2005, retrieved November 10, 2005.
[14] Mugler, J., et al., OSCAR Clusters, Proceedings of the Ottawa Linux Symposium (OLS'03), Ottawa, Canada, July 23-26, 2003.
[15] Info Designs, Inc., Grid Computing on Desktop PC’s, http://www.deskgrid.com, retrieved November 10, 2005.
[16] Freeman, E., Hupfer, S. and Arnold, K., JavaSpaces, Principles, Patterns, and Practice, Upper Saddle River, NJ: Pearson Education, 1st edition, 1999.
[17] The Shodor Education Foundation, Inc., The National Computational Science Institute, Computational Science Education Reference Desk, Basic MPI, http://www.shodor.org/refdesk/Resources/Tutorials/ParProgProtocols/index.php, 2004, retrieved November 10, 2005.
[18] Longbottom, R., Linpack 100x100 Benchmark In C/C++ For PCs, 2003, http://homepage.virgin.net/roy.longbottom/oldones.htm, retrieved November 10, 2005.
[19] TOP500.Org, List for November 1995, http://www.top500.org/lists/lists.php?Y=1995&M=11, retrieved November 10, 2005.