Effective Use of Networked Reconfigurable Resources

A. V. Staicu 1, J. R. Radzikowski 1, K. Gaj 1, N. Alexandridis 2, and T. El-Ghazawi 2

1 George Mason University, 4400 University Drive, Fairfax, VA 22030
2 George Washington University, 725 23rd Street NW, Washington DC 20052
Abstract. Distributed reconfigurable resources, such as FPGA-based accelerator boards,1 are expensive and often underutilized. Therefore, it is important to permit sharing of these resources, at least among the members of the same organization. Traditional resources, such as the CPU time of loosely coupled workstations, can be shared using a variety of existing distributed computing systems. We analyzed twelve of these systems, and chose five of them, belonging to the class of Job Management Systems with a Central Job Scheduler, as most suitable for extension to reconfigurable hardware. These five systems, LSF, CODINE, PBS, Condor, and RES, were further compared based on their functional characteristics and ease of extension. LSF was shown to be superior according to the majority of the 25 identified requirements. The general architecture of the extended system was developed, and the exact way of extending LSF, CODINE, and PBS to manage FPGA-based accelerator boards was proposed.

1 This work was sponsored by the Department of Defense under the LUCITE contract #MDA904-98-C-A081.
I. INTRODUCTION

This paper reports on a research effort to create a distributed computing system interface for the effective utilization of networked reconfigurable computing resources. The objective is to construct a system that can leverage resources underutilized at a given time to serve other users who currently have the need, in a grid-computing-like style. The targeted resources are workstations and clusters equipped with Field Programmable Gate Array (FPGA) boards serving as reconfigurable coprocessors, such as one can find in academic and government research labs. In order to take advantage of previous related work, our strategy is to extend an efficient Commercial Off the Shelf (COTS) Job Management System (JMS) [1, 2]. Such extensions should provide the ability to recognize reconfigurable resources, monitor and understand their current loading, and effectively schedule them for incoming remote user requests, with little impact on local users. This also includes providing local users with proper tools to control the degree to which they wish to share their own resources and how others may use such resources.
Our effort started with a study that aimed at a comparative evaluation of currently available job management systems and a conceptual design of how to architect such a system for managing networked reconfigurable resources [2]. Twelve systems were considered in this study
and evaluated (and ranked) with respect to their suitability for being extended to handle FPGA reconfigurable computing resources and tasks. The compared systems are LSF, PBS, Condor, RES, Compaq DCE, Sun Grid Engine/CODINE, Globus, Legion, NetSolve, MOSIX, AppLES, and NWS. In order to facilitate the comparison, twenty-five major requirements were identified, and a scoring system was developed to gauge the compliance of these systems with the functional requirements and to rank them accordingly [1, 2, 6, 7]. In addition, the twelve systems were also clustered into several categories in order to easily appreciate the functional and operational differences among them and to help the decision-making.
The remainder of the paper is organized as follows. In Section II, we classify the compared systems into six distinctive categories, and choose the category that is most suitable for extension to reconfigurable hardware. In Section III, we give a comparative overview of the five systems belonging to the selected category. In Section IV, we identify the features of a system that support extension to reconfigurable hardware, and propose a general architecture of the extended system. Finally, we describe the means of extending three selected Job Management Systems, and identify a single JMS that offers the best functional characteristics combined with ease of extension to reconfigurable hardware.
II. CLASSIFYING THE COMPARED SYSTEMS

The twelve compared systems can be classified into several major categories, based on their primary objective, major modules, and distribution of control. These groups are defined below, together with a discussion of their relevance to the project goals.
A. Job Management Systems with a Central Scheduler
The objective of a JMS is to let users execute jobs on a non-dedicated cluster of workstations with a minimum impact on owners of these workstations by using computational resources that can be spared by the owners. The system should be able to perform at least the following tasks:
a. monitor all available resources,
b. accept jobs submitted by users together with resource requirements for each job,
c. perform centralized job scheduling that matches all available resources with all submitted jobs according to the predefined policies,
d. allocate resources and initiate job execution,
e. monitor all jobs and collect accounting information.
To perform these basic tasks, a JMS with a Central Scheduler must include at least the following major functional units [4]:
1. User server – which lets users submit jobs and their requirements to a JMS (task b), and additionally may allow the user to inquire about the status and change the status of a job (e.g., to suspend or terminate it).
2. Central job scheduler – which performs job scheduling and queuing based on the resource requirements, resource availability, and scheduling policies (task c).
3. Resource manager – including
• Resource monitor (tasks a and e), and
• Job dispatcher (task d).
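To make the interplay of these units concrete, the sketch below shows, in Python, how a central scheduler might match submitted jobs against the resources reported for each execution host and hand the matches to a dispatcher. The job and host records, the resource names, and the first-fit policy are illustrative assumptions of ours, not the mechanism of any particular JMS.

```python
# Illustrative sketch of central job scheduling (not the algorithm of any
# specific JMS): jobs declare resource requirements, hosts report available
# resources, and the scheduler matches them under a simple first-fit policy.

from dataclasses import dataclass

@dataclass
class Job:
    job_id: int
    requirements: dict            # e.g. {"cpus": 2, "mem_mb": 512}

@dataclass
class Host:
    name: str
    resources: dict               # current availability reported by the resource monitor

def satisfies(host: Host, job: Job) -> bool:
    """True if the host currently offers every resource the job asks for."""
    return all(host.resources.get(r, 0) >= amount
               for r, amount in job.requirements.items())

def schedule(jobs, hosts):
    """First-fit matching: assign each queued job to the first suitable host."""
    dispatch = []                 # (job, host) pairs handed to the job dispatcher
    for job in jobs:
        for host in hosts:
            if satisfies(host, job):
                for r, amount in job.requirements.items():
                    host.resources[r] -= amount   # reserve the allocated resources
                dispatch.append((job, host))
                break
    return dispatch

if __name__ == "__main__":
    hosts = [Host("ws1", {"cpus": 4, "mem_mb": 2048}),
             Host("ws2", {"cpus": 1, "mem_mb": 512})]
    jobs = [Job(1, {"cpus": 2, "mem_mb": 1024}), Job(2, {"cpus": 1, "mem_mb": 256})]
    for job, host in schedule(jobs, hosts):
        print(f"job {job.job_id} -> {host.name}")
```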
Figure 1 Major functional blocks of a Job Management System with a Central Scheduler. [Figure: the User Server passes jobs and their requirements to the Job Scheduler, which combines scheduling policies with the available resources, resource requirements, and resource allocation and job execution handled by the Resource Manager (Resource Monitor and Job Dispatcher) to place the user job.]

We have classified the following five systems as centralized Job Management Systems:
• LSF – Load Sharing Facility
• Sun Grid Engine / CODINE
• PBS – Portable Batch System
• Condor
• RES
Systems belonging to this group are of primary interest to us from the viewpoint of the project objectives. All these systems can potentially be extended to support distributed computations using FPGA accelerator boards.

B. Job Management Systems without a Central Scheduler
The primary difference compared to the first group is the lack of a central job scheduler. The system is able to collect information about all available resources and accept job submissions, but the decision about matching jobs with available resources is made locally by a user server or a local job scheduler. A system of this kind often works as a high-level job management system, which can submit jobs to several lower-level centralized job management systems, based on information about their currently available resources. This group of systems includes:
• Globus
• Legion
• NetSolve
Systems belonging to this group are of secondary interest to us from the viewpoint of the direct project goals. However, after extending the capabilities of a centralized JMS, a JMS without a central job scheduler can be used to create a high-level Distributed Job Management System in order to achieve additional scalability.

C. Distributed Operating System
A Distributed Operating System differs from a Job Management System by operating on processes instead of jobs, and by being oriented primarily towards tightly coupled clusters of homogeneous workstations [5]. The only member of this category is
• MOSIX.
Since one of the primary requirements of the project is support for loosely coupled clusters of heterogeneous workstations, MOSIX and other systems of this type should not be considered as potential candidates for extension.

D. Parameter Study Schedulers
A Parameter Study Scheduler is a system geared specifically towards the distribution of parameter study applications over networked resources. It is used to schedule jobs belonging to a single application on multiple workstations, and to assign input data to these jobs, depending on the current use of resources on these workstations. The application in this case should involve solving a given task for a large number of parameter sets. Each computational node works only on a small subset of a large parameter space; the size of this subset depends on the current computational capabilities of the given node. The instances of the program running on the different workstations do not communicate with each other, but rather return partial results to the central node. The only representative of this group is
• AppLES.
This type of system is not really a JMS and does not fulfill many of the major requirements formulated in our project.
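As a rough illustration of this mode of operation, the sketch below splits a parameter space among nodes in proportion to an assumed per-node capability score and gathers the partial results centrally. The capability scores, the node names, and the worker function are hypothetical; a real parameter study scheduler such as AppLES uses far more refined adaptive strategies.

```python
# Toy illustration of parameter-study scheduling: the parameter space is split
# among nodes in proportion to their (assumed) current capability, each node
# evaluates its subset independently, and partial results return to the center.

def split_parameters(params, capabilities):
    """Assign each node a slice of the parameter list proportional to its capability."""
    total = sum(capabilities.values())
    assignment, start = {}, 0
    nodes = list(capabilities)
    for i, node in enumerate(nodes):
        share = round(len(params) * capabilities[node] / total)
        end = len(params) if i == len(nodes) - 1 else start + share
        assignment[node] = params[start:end]
        start = end
    return assignment

def evaluate(p):
    """Stand-in for the real computation performed for one parameter set."""
    return p * p

if __name__ == "__main__":
    params = list(range(100))                      # the large parameter space
    capabilities = {"node_a": 4.0, "node_b": 1.0}  # assumed relative node capabilities
    results = {}
    for node, subset in split_parameters(params, capabilities).items():
        results[node] = [evaluate(p) for p in subset]   # would run remotely in practice
    print({n: len(r) for n, r in results.items()})
```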
E. Resource Monitors and Forecasters Resource Monitors and Forecasters are similar to the resource monitor, which is a standard part of each JMS system, as defined in point “A”. The difference is that the system of this type does not only monitor computational resources, but also predicts their availability and performance in the future. This system uses numerical models to generate forecasts regarding availability of resources in a given time frame. The only representative of this group is • Network Weather Service. The system of this type may be combined with a centralized JMS to extend its capabilities.
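As a toy example of such a numerical model (far simpler than the predictor families used by real forecasters such as the Network Weather Service), the sketch below predicts near-term CPU availability as the mean of a sliding window of recent measurements; the window size and the sample values are arbitrary.

```python
# Minimal sketch of a resource forecaster: predict the next value of a monitored
# quantity (e.g. fraction of idle CPU) as the mean of a sliding window of recent
# measurements. Real forecasters use richer families of predictors.

from collections import deque

class SlidingWindowForecaster:
    def __init__(self, window: int = 5):
        self.samples = deque(maxlen=window)   # keep only the most recent measurements

    def observe(self, value: float) -> None:
        """Record a new measurement from the resource monitor."""
        self.samples.append(value)

    def forecast(self) -> float:
        """Predicted availability for the next time step (window mean)."""
        if not self.samples:
            raise ValueError("no measurements observed yet")
        return sum(self.samples) / len(self.samples)

if __name__ == "__main__":
    f = SlidingWindowForecaster(window=3)
    for idle_fraction in (0.9, 0.7, 0.8, 0.6):
        f.observe(idle_fraction)
    print(f"forecast idle CPU fraction: {f.forecast():.2f}")
```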
F. Distributed Computing Interface Distributed Computing Interface is a set of interoperable libraries that can be linked with application programs or operating systems to support client-server applications. Distributed Computing Interface provides • secure communication between client and server, based on public key certificates; • directory services, including the means of creating, updating, and accessing directories of currently active servers, public key certificates, etc., • distributed file services (e.g., a client access to remote disks), and • time services (e.g., synchronization of local clocks on several distributed workstations).
III. COMPARATIVE OVERVIEW OF JOB MANAGEMENT SYSTEMS WITH A CENTRAL JOB SCHEDULER
The main features of Job Management Systems with a Central Job Scheduler can be classified into the following categories [1, 2, 3, 6, 7, 8, 9]:
i. Operating System Support, Flexibility, and User Interface
ii. Scheduling and Resource Management
iii. Efficiency and Utilization
iv. Fault Tolerance and Security
v. Documentation and Technical Support.

Analysis of the first set of requirements, "Operating System Support, Flexibility, and User Interface," for LSF, CODINE, PBS, Condor, and RES leads to the following conclusions (Table 1):
• The majority of the systems are available in the public domain. The only system that does not have a public domain version is LSF. PBS has both commercial and public domain versions.
• The majority of the systems operate under several versions of UNIX, including Linux. The only system that does not directly support Linux is RES. LSF is the only JMS that fully supports Windows NT. In Condor, there is a limitation that jobs submitted from Windows NT can only be executed on machines running Windows NT. CODINE, PBS, and RES currently do not support Windows NT.
• The majority of the systems have both graphical and command-line interfaces. The only system without a graphical user interface is RES.

Analysis of the second set of requirements, "Scheduling and Resource Management," leads to the following conclusions (Table 2):
• Batch jobs are supported by all job management systems, while interactive jobs are not supported by Condor and RES.
• Full parallel job support is offered only by LSF and CODINE; both systems support programming, testing, execution, and run-time management of parallel applications. PBS has limited parallel job support (it cannot control or monitor resource usage of all processes belonging to a parallel job). Condor can act as a resource manager for the PVM (Parallel Virtual Machine) daemon. The current version of Condor does not support MPI.
• Accounting is available in all centralized job management systems, except for RES.
Table 1 Comparison of selected Job Management Systems from the viewpoint of Operating System Support, Flexibility, and User Interface.

                   | LSF        | CODINE        | PBS                          | Condor        | RES
Distribution       | commercial | public domain | commercial and public domain | public domain | restricted
Source code        | no         | yes           | public domain version only   | yes           | restricted
Operating System Support:
  Solaris          | yes        | yes           | yes                          | yes           | yes
  Linux            | yes        | yes           | yes                          | yes           | yes
  Tru64            | yes        | yes           | yes                          | no            | no
  Windows NT       | yes        | no            | no                           | partial       | no
User Interface     | GUI & CLI  | GUI & CLI     | GUI & CLI                    | GUI & CLI     | CLI
Table 2 Comparison of selected Job Management Systems from the viewpoint of Scheduling and Resource Management.

                 | LSF | CODINE | PBS     | Condor         | RES
Batch jobs       | yes | yes    | yes     | yes            | yes
Interactive jobs | yes | yes    | yes     | no             | no
Parallel jobs    | yes | yes    | partial | limited to PVM | no
Accounting       | yes | yes    | yes     | yes            | no
Analysis of the third set of requirements, "Efficiency and Utilization," leads to the following conclusions (Table 3):
• Stage-in and stage-out are supported by LSF, PBS, and Condor, while CODINE and RES do not support these capabilities.
• In LSF, CODINE, and Condor, jobs can be checkpointed and migrated to another machine with a lighter load. If no checkpointing is supported by a given job (e.g., commercial software), and the job is preempted from a heavily loaded host, the job can be rerun from the beginning on another machine. In LSF and CODINE, a job can be automatically migrated if it is suspended for an extended period of time, to assure load balancing. In CODINE, exceeding a predefined load value may also cause migration (a simple load-threshold sketch of such a decision follows these lists). PBS and RES support neither job migration nor load balancing.
• LSF MultiCluster allows dividing the computational resources of a very large or geographically dispersed organization into a number of autonomous clusters. This configuration enables load sharing and batch processing among clusters, and can support systems with multiple thousands of nodes. A similar capability is not available in any other JMS.

Analysis of the fourth set of requirements, "Fault Tolerance and Security," leads to the following conclusions (Table 4):
• LSF supports kernel-level checkpointing on all operating systems that provide it. LSF also supplies a user-level library to provide checkpointing on operating systems that do not provide kernel-level support. Checkpointing in Condor can be done using user-level libraries. Checkpointing in CODINE can be done in the same way as in LSF; however, it must be based on foreign checkpointing libraries, such as the Condor libraries. In PBS, only kernel-level checkpointing is supported, on the small number of operating systems that provide this mechanism.
• In LSF and CODINE, if the master host becomes unavailable or its daemons fail, another host takes over automatically. In LSF, only the total number of hosts in the system limits the number of transitions; in CODINE, only one additional host is specified as a shadow master host. In all JMSs, jobs can be automatically rerun if the execution host fails.
• Authentication is supported by all systems, and strong authorization by all systems except RES.
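The load-triggered migration mentioned above can be pictured with the following sketch; the load metric, the threshold, and the checkpoint flag are illustrative assumptions rather than the policy of any particular JMS.

```python
# Illustrative sketch of a load-triggered migration decision: if a host's load
# exceeds a configured threshold, a checkpointable job is checkpointed and moved
# to the least-loaded host; a non-checkpointable job is rerun from the beginning.
# Thresholds and the "checkpointable" flag are illustrative assumptions.

def pick_target(hosts_load, exclude):
    """Least-loaded host other than the one being vacated."""
    candidates = {h: load for h, load in hosts_load.items() if h != exclude}
    return min(candidates, key=candidates.get)

def migration_action(job, current_host, hosts_load, load_threshold=2.0):
    """Return a description of what the JMS would do with this job."""
    if hosts_load[current_host] <= load_threshold:
        return "stay"                                   # host load acceptable, do nothing
    target = pick_target(hosts_load, exclude=current_host)
    if job.get("checkpointable", False):
        return f"checkpoint and migrate to {target}"    # resume from checkpoint elsewhere
    return f"rerun from the beginning on {target}"      # no checkpoint support

if __name__ == "__main__":
    hosts_load = {"ws1": 3.4, "ws2": 0.5, "ws3": 1.1}   # e.g. 1-minute load averages
    print(migration_action({"name": "simulate", "checkpointable": True}, "ws1", hosts_load))
    print(migration_action({"name": "render", "checkpointable": False}, "ws1", hosts_load))
```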
Analysis of the last set of requirements, “Documentation and Technical Support,” leads to the following conclusions (Table 5):
Table 3 Comparison of selected Job Management Systems from the viewpoint of Efficiency and Utilization.

                       | LSF       | CODINE | PBS  | Condor | RES
Stage-in and stage-out | yes       | no     | yes  | yes    | no
Timesharing            | yes       | yes    | yes  | yes    | no
Process migration      | yes       | yes    | no   | yes    | no
Dynamic load balancing | yes       | yes    | no   | no     | no
Scalability            | excellent | good   | good | good   | good
Table 4 Comparison of selected Job Management Systems from the viewpoint of Fault Tolerance and Security.

                      | LSF                        | CODINE                     | PBS                      | Condor                   | RES
Checkpointing         | yes                        | using external libraries   | only kernel-level        | yes                      | no
Daemon fault recovery | master and execution hosts | master and execution hosts | only for execution hosts | only for execution hosts | no
Authentication        | yes                        | yes                        | yes                      | yes                      | yes
Authorization         | yes                        | yes                        | yes                      | yes                      | no
• Documentation is excellent for LSF, and good for the majority of the other systems, except RES. The documentation of LSF is superior to that of the other systems from the viewpoint of organization, completeness, readability, and the clear division of information into parts required by LSF administrators, users, and programmers.
• The technical support teams of Platform Computing Corporation (LSF), Veridian Systems, Inc. (PBS), and the Condor Team at the University of Wisconsin-Madison were contacted several times during our study [3]. The typical issues concerned problems with installation and configuration of the systems, and clarifications regarding the capabilities of each JMS. The technical support of Sun Grid Engine was not used during our study. The technical support of LSF was found to be excellent: a typical response time was in the range of 2-3 hours, and the answers to our questions were complete and accurate. The technical support of PBS was good: a typical response time was in the range of 24 hours, although follow-up communication was sometimes necessary to clarify answers to our questions. The technical support of Condor was average: a typical response time was in the range of 2-3 days, and selected answers required follow-up communication.
IV. EXTENSION TO RECONFIGURABLE HARDWARE

A. Features of the Job Management Systems supporting extension
The specific features of Job Management Systems that support extension to reconfigurable hardware include:
• the capability to define new dynamic resources;
• strong support for stage-in and stage-out, in order to allow an easy transfer of the FPGA configuration bitstreams, data inputs, and results between the submission host and the execution host with reconfigurable hardware (a toy stage-in/stage-out sketch follows this list);
• support for Windows NT and Linux, which are the two primary operating systems running on PCs that can be extended with commercially available FPGA-based accelerator boards with the PCI interface.
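The stage-in/stage-out requirement can be illustrated with the sketch below, which copies a bitstream and input data to a work directory before execution and retrieves the results afterwards. The directory layout and file names are hypothetical; a real JMS would move these files across the network as part of job dispatch.

```python
# Toy sketch of stage-in/stage-out for an FPGA job: before execution, the
# configuration bitstream and input data are copied from the submission host's
# spool area to the execution host's work directory; after execution, results
# are copied back. Directory layout and file names are illustrative only.

import shutil
from pathlib import Path

def stage_in(job_dir: Path, work_dir: Path, files=("design.bit", "input.dat")):
    """Copy the bitstream and input files needed by the job to the execution host."""
    work_dir.mkdir(parents=True, exist_ok=True)
    for name in files:
        shutil.copy2(job_dir / name, work_dir / name)

def stage_out(work_dir: Path, job_dir: Path, files=("results.dat",)):
    """Copy result files produced on the execution host back to the submission host."""
    for name in files:
        shutil.copy2(work_dir / name, job_dir / name)

if __name__ == "__main__":
    # Local directories stand in for the submission and execution hosts.
    job_dir, work_dir = Path("spool/job42"), Path("work/job42")
    job_dir.mkdir(parents=True, exist_ok=True)
    (job_dir / "design.bit").write_bytes(b"\x00bitstream")
    (job_dir / "input.dat").write_text("1 2 3\n")
    stage_in(job_dir, work_dir)
    (work_dir / "results.dat").write_text("done\n")   # pretend the accelerated job ran
    stage_out(work_dir, job_dir)
    print((job_dir / "results.dat").read_text().strip())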
B. General architecture of the extended system
The general architecture of the extended system is shown in Figure 2. The primary component of this extension is an external resource monitor that tracks the status of an accelerator board and periodically communicates this status to the resource monitor. The resource monitor transfers this information, periodically or on request, to the job scheduler, which uses it to match each job that requires acceleration with an appropriate host. Job requirements regarding the new reconfigurable resource are specified during job submission to the user server, and are enforced by the job scheduler in the same way as requirements regarding the default built-in resources.
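To illustrate how the scheduler might use the reported board status, the sketch below treats the board's availability as one more resource in the matching step; the resource name "fpga", the job flag, and the matching rule are our own illustrative assumptions, not the interface of any particular JMS.

```python
# Illustrative sketch of matching jobs that request FPGA acceleration against
# hosts whose external resource monitors report a dynamic "fpga" resource.
# Resource names, host records, and the matching rule are assumptions made for
# illustration only.

def match_host(job, hosts):
    """Return the name of the first host that satisfies the job's requirements."""
    for name, resources in hosts.items():
        if job.get("needs_fpga") and resources.get("fpga", 0) < 1:
            continue                      # job requires a free accelerator board
        if resources.get("cpus", 0) < job.get("cpus", 1):
            continue                      # not enough spare CPUs on this host
        return name
    return None                           # job stays queued until resources free up

if __name__ == "__main__":
    # "fpga" is the dynamic resource periodically refreshed by the external monitor.
    hosts = {"ws1": {"cpus": 2, "fpga": 0},      # board present but currently busy
             "ws2": {"cpus": 4, "fpga": 1}}      # board reported as available
    print(match_host({"name": "crypto_bench", "needs_fpga": True, "cpus": 1}, hosts))
    print(match_host({"name": "compile", "needs_fpga": False, "cpus": 2}, hosts))
```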
C. Extending selected JMSs to manage reconfigurable resources
LSF, CODINE, and the most recent version of PBS (5.1.1) all support defining and managing new dynamic resources using configuration files only, without any need for changes in the source code of the JMS programs. This feature can be used to extend each of these three JMSs to manage FPGA-based accelerator boards. The new resource that needs to be added to a given JMS represents the availability of the accelerator board to JMS users. An external resource monitor is written according to the specification for
• ELIM, the External Load Information Manager, in LSF,
• the load sensor in CODINE, and
• a shell escape in the MOM configuration file in PBS.
This daemon is started by the local resource manager (LIM in LSF, cod_execd in CODINE, and MOM in PBS), and communicates with the resource monitor using standard output. Extending Condor and RES to provide similar functionality would require changes in the source code of these JMSs.
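A minimal sketch of such an external resource monitor is given below in Python. It follows the general reporting convention described above, periodically writing the resource name and its current value to standard output, where the local resource manager can pick it up. The probe for the accelerator board, the lock-file convention, and the resource name "fpga" are hypothetical placeholders; the exact output format expected by each JMS should be taken from its own documentation (ELIM for LSF, load sensor for CODINE, MOM shell escape for PBS).

```python
# Sketch of an external resource monitor for an FPGA accelerator board.
# It periodically reports a single dynamic resource ("fpga": 1 if the board is
# free, 0 if busy or absent) on standard output, where the local resource
# manager (LIM / cod_execd / MOM) can read it. The probe below is a placeholder;
# a real monitor would query the board through its API or driver.

import os
import sys
import time

RESOURCE_NAME = "fpga"             # hypothetical name of the new dynamic resource
REPORT_INTERVAL_S = 30             # reporting period, chosen arbitrarily here
LOCK_FILE = "/var/run/fpga.lock"   # placeholder convention: lock file means "board in use"

def board_available() -> int:
    """Placeholder probe: the board counts as free when no lock file exists."""
    return 0 if os.path.exists(LOCK_FILE) else 1

def main() -> None:
    while True:
        # One resource is reported per line as a "count name value" record; the
        # precise format each JMS expects is defined in its own documentation.
        sys.stdout.write(f"1 {RESOURCE_NAME} {board_available()}\n")
        sys.stdout.flush()         # the resource manager reads our stdout as a stream
        time.sleep(REPORT_INTERVAL_S)

if __name__ == "__main__":
    main()
```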
Figure 2 Extension of a Centralized Job Management System to recognize, monitor, and schedule reconfigurable resources. [Figure: the architecture of Figure 1 (User Server, Job Scheduler, and Resource Manager with Resource Monitor and Job Dispatcher) extended with an External Resource Monitor that reports the status of the FPGA board; the user job accesses the FPGA board through the FPGA board APIs.]
V. SUMMARY

LSF outperforms all other Centralized Job Management Systems in terms of operating system support, scalability, documentation, and technical support. It is also one of only two systems that fully support parallel jobs and checkpointing, and that offer strong resistance to master host failure.
CODINE performs extremely well in multiple categories, such as parallel job support, job migration, load balancing, and resistance to master host failure. The major drawbacks of CODINE include the lack of support for Windows NT, no support for stage-in and stage-out, and only externally supported checkpointing.
The primary weaknesses of PBS include no support for Windows NT, very limited checkpointing, no job migration or load balancing, and limited parallel job support.
Condor distinguishes itself from the other systems by its strong checkpointing. It is also one of the oldest and best understood job management systems. The main weaknesses of Condor include no support for interactive jobs, limited support for parallel jobs, and average technical support.
Surprisingly, ease of extension to support FPGA-based accelerator boards appeared to be a minor factor in the comparison. Three out of the five systems, LSF, CODINE, and PBS v. 5.1, seem to be easily extendable without the need for any changes in their source code. The remaining systems, Condor and RES, can also be relatively easily extended, taking into account the full access to their source code.
Taking into account the combined results of our study, we recommend choosing LSF as the primary system to be extended to support FPGA-based accelerator boards.
REFERENCES
[1] M. A. Baker, G. C. Fox, and H. W. Yau, "Cluster Computing Review," Northeast Parallel Architectures Center, Syracuse University, Nov. 1995.
[2] T. El-Ghazawi, et al., Conceptual Comparative Study of Job Management Systems, Technical Report, February 2001, available at http://ece.gmu.edu/lucite/reports.html.
[3] T. El-Ghazawi, et al., Experimental Comparative Study of Job Management Systems, Technical Report, August 2001, available at http://ece.gmu.edu/lucite/reports.html.
[4] K. Hwang and Z. Xu, Scalable Parallel Computing: Technology, Architecture, Programming, McGraw-Hill, 1998.
[5] A. Goscinski, Distributed Operating Systems: The Logical Design, Addison-Wesley, 1991.
[6] J. P. Jones, "NAS Requirements Checklist for Job Queuing/Scheduling Software," NAS Technical Report NAS-96-003, April 1996, available at http://www.nas.nasa.gov/Pubs/TechReports/NASreports/NAS-96-003/
[7] J. P. Jones, "Evaluation of Job Queuing/Scheduling Software: Phase 1 Report," NAS Technical Report NAS-96-009, Sep. 1996, available at http://www.nas.nasa.gov/Research/Reports/Techreports/1996/nas-96-009-abstract.html
[8] C. Byun, C. Duncan, and S. Burks, "A Comparison of Job Management Systems in Supporting HPC ClusterTools," Proc. SUPerG, Vancouver, Fall 2000.
[9] O. Hassaine, "Issues in Selecting Job Management Systems," Proc. SUPerG, Tokyo, April 2001.