User-Oriented Job Management in a Grid Environment
Cuiju Luan 1, Guanghua Song 1, Yao Zheng 1, Guiyi Wei 1,2
1 College of Computer Science and Center for Engineering and Scientific Computation, Zhejiang University, 310027, P. R. China
2 Zhejiang Gongshang University, Hangzhou, 310035, P. R. China
[email protected],
[email protected],
[email protected],
[email protected]
Abstract

In a grid environment, job management is an important issue, and much can be done to improve its efficiency, flexibility, convenience and security. This paper presents a user-oriented job management system together with its detailed design and implementation. The proposed job management offers reusable job definitions and flexible, automated file operations. It uses a workflow mode: the whole management process is divided into several separate steps, and a job management wizard guides the user through each step. All the grid user needs to do is define the job by supplying the necessary information at runtime. To save memory and manage jobs better, the system adopts a two-level storage method: the first level stores the job index information, while the second keeps the detailed information of each job in an XML file. Our experiments show that this approach is user friendly and makes jobs reusable.

Keywords: Grid, job management, two-level storage, job reusability, file operation
1. Introduction

The Multidisciplinary Applications-oriented Simulation and Visualization Environment (MASSIVE) [1] has been developed and deployed at the Center for Engineering and Scientific Computation (CESC) in Zhejiang University (ZJU). It provides a grid-based platform for geometrical modeling, discretization, scientific computing and visualization, whose advanced services can be accessed easily in a visual manner. Scientists are thus able to concentrate on the science, while platform developers can focus on delivering services that can be assembled as building blocks to create more elaborate ones. A typical usage would be the generation of a mesh using a meshing service on an IRIX platform, the solution of a CFD/CSM problem with the previously created meshes on a PC cluster, and the collaborative visualization of the numerical results with equipment such as a display wall and a BARCO stereo projection system at the CESC.

A key issue in the MASSIVE project is the capability to manage jobs effectively, including job definition, scheduling, execution, monitoring, and so on. In the MASSIVE project, most program codes need to be executed many times in order to obtain a reasonable result, while the executions usually differ only in a few parameters. To meet such needs, there should be a mechanism by which a job, once defined, can be reused later, with the user only having to change some parameters. Moreover, this kind of application requires very flexible file/folder operations, which can be described in the job definition and performed automatically.

Existing grid job management mechanisms [2-5] mainly address the problem of job scheduling and emphasize high-performance or high-throughput computing; few projects, however, are concerned with how to manage jobs effectively. To address the above problems, we present the user-oriented job management adopted in the MASSIVE project. It provides a mechanism for job reusability. Furthermore, the system provides a set of flexible file/folder operations, which can be described in the job definition and performed automatically. The goal of the job management is to make operations in the grid environment easier and more efficient, and to make jobs reusable.

The rest of this paper is arranged as follows. Section 2 details the user-oriented job management. Section 3 presents an application to illustrate the characteristics of the job management. Finally, we conclude with a summary of our work and an outline of our plans for continuing research in Section 4.
Proceedings of the 2005 Fifth International Conference on Computer and Information Technology (CIT'05), 0-7695-2432-X/05 $20.00 © 2005 IEEE
2. User-Oriented Job Management
In this paper, a job is defined as an object running on the grid in order to solve a certain problem. Such an object designates the resources to be used, the executable file or command to be executed on the remote resources, the input and output data or files, and the manner in which the results are to be dealt with. The user-oriented job management consists mainly of five parts: Job Creator, Job Scheduler, Job Executor, Job Monitor and File Controller. Its architecture is shown in Figure 1.

Figure 1. Architecture of the user-oriented job management. [The figure shows the user interacting with the Job Creator; jobs pass through the Job Scheduler, Job Executor and Job Monitor, supported by the File Controller and a local job manager with file storage; remote job execution goes through the GRAM server, gatekeeper and jobmanager.]

2.1 Job creator

The function of the job creator is to collect the information about the job: the basic information, the resource filtering information and other advanced information. The basic information includes the job name, the job type, the process count, the executable file(s), the running parameters and the working directory. The resource filtering information expresses the user's resource requirements, such as operating system, CPU, memory and disk system. When the user describes this kind of information, the system allocates the logical resources to the current job. The advanced information covers parameter and resulting files, and how to transfer and deal with them. We provide two methods, based on GASS and GridFTP respectively, to transfer files.

2.2 Job scheduler

The job scheduler allocates physical resources to the jobs according to their logical resources. The resources that can be allocated come from the VOs to which the user belongs. Resources are discovered and published by the information service of the MASSIVE, and filtered by the user's resource requirements. The job then obtains the resources specified by the user or by the system automatically. Logical resources are the virtual resources that meet the resource filtering condition. They are allocated by the job creator, and are converted by the job scheduler to physical resources existing in the VOs. This procedure comprises two phases [6]:

Filtering phase. The MASSIVE can discover resources automatically and provide their relevant properties. At this stage, any resources that do not meet the filtering condition are removed. For example, if the condition is "TotalMemory >= 512M && FreeMemory >= 128M", then resources with less than 512MB of total memory or less than 128MB of free memory are removed, and the remainder constitutes the pool of resources waiting to be scheduled.

Allocating phase. The resources obtained from the first phase can be allocated manually or automatically. We provide two automatic resource scheduling algorithms, a heuristic-based stochastic algorithm and a heuristic-based greedy algorithm, to map the logical resources to the physical ones. The merit of the two-phase resource allocation is that when the VOs or the physical resources change, the only thing that needs to be done is to reconstruct the map from the logical to the physical resources. After resources are allocated successfully, the job can be submitted to a job executor through the message mechanism.
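The two-phase allocation can be sketched in a few lines of Python; the resource attributes and the greedy ranking used here are illustrative assumptions rather than the actual MASSIVE heuristics:

```python
def filter_resources(resources, min_total_mb, min_free_mb):
    """Phase 1: drop resources that fail the filtering condition,
    e.g. "TotalMemory >= 512M && FreeMemory >= 128M"."""
    return [r for r in resources
            if r["total_mb"] >= min_total_mb and r["free_mb"] >= min_free_mb]

def allocate_greedy(logical_count, pool):
    """Phase 2 (greedy variant): map each logical resource to the
    physical resource with the most free memory still available."""
    ranked = sorted(pool, key=lambda r: r["free_mb"], reverse=True)
    if logical_count > len(ranked):
        raise ValueError("not enough physical resources in the pool")
    return [r["host"] for r in ranked[:logical_count]]

resources = [
    {"host": "cesc11.zju.edu.cn", "total_mb": 4096, "free_mb": 2048},
    {"host": "cesc12.zju.edu.cn", "total_mb": 1024, "free_mb": 512},
    {"host": "cesc32.zju.edu.cn", "total_mb": 256,  "free_mb": 64},
]
pool = filter_resources(resources, 512, 128)   # cesc32 is filtered out
hosts = allocate_greedy(2, pool)               # two logical resources
print(hosts)   # ['cesc11.zju.edu.cn', 'cesc12.zju.edu.cn']
```

Note how re-running only the second phase suffices when the physical pool changes, which is exactly the merit of the two-phase split described above.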
2.3 Job executor

The job executor is a daemon process. Its function is to execute the jobs and to communicate with the user interface, sending back the jobs' identifiers, which are used by the job monitor to monitor and control the jobs. It adopts the first-come-first-served (FCFS) algorithm. The job executor constantly receives the messages sent by the job scheduler. On receiving a job execution request, it creates a thread and sends the job designated in the message to the corresponding gatekeeper [7]; it then obtains the job identifier (the job contact), which is sent back to the corresponding thread of the job scheduler and used to monitor and control the progress of the job.
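A minimal sketch of such an FCFS executor daemon, assuming a simple in-process message queue and a mocked gatekeeper submission (the real submission goes through GRAM):

```python
import queue
import threading

def submit_to_gatekeeper(job_name, host):
    # Placeholder for the real GRAM submission through the gatekeeper;
    # the returned string stands in for the job contact identifier.
    return "https://%s:2119/%s" % (host, job_name)

def executor_loop(requests, contacts):
    # FCFS: requests are taken strictly in arrival order.
    while True:
        req = requests.get()
        if req is None:          # sentinel: stop the daemon
            break
        def run(req=req):
            contact = submit_to_gatekeeper(req["name"], req["host"])
            contacts.put((req["name"], contact))  # sent back for monitoring
        threading.Thread(target=run).start()      # one thread per request

requests, contacts = queue.Queue(), queue.Queue()
executor = threading.Thread(target=executor_loop, args=(requests, contacts))
executor.start()
requests.put({"name": "Quzhou", "host": "cesc12.zju.edu.cn"})
requests.put(None)
executor.join()
name, contact = contacts.get()
print(name, contact)   # Quzhou https://cesc12.zju.edu.cn:2119/Quzhou
```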
2.4 Job monitor

The job monitor is responsible for monitoring the running jobs and their sub-jobs. It can respond to job state queries, and can suspend, wake up, kill or reschedule a job. Moreover, it can update the job state and display the job's standard output during execution, so the user can steer the job with it. With the job-killing function, when one sub-job on a site fails, the system can kill the whole job, discard any interim results and free the system resources used by the job, saving time and resources; the job can then be rescheduled. Because the system saves the job identifier, the job monitor can exit after the job has been submitted to the job executor; when the user logs in again, he can still monitor and control the previously submitted job. This function is very convenient for users whose jobs need a long time to run.
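The job-killing behaviour might be sketched as follows; the state names and job fields are assumptions for illustration:

```python
def monitor_step(job):
    """One monitoring pass: if any sub-job has failed, kill the rest,
    discard interim results and mark the job for rescheduling."""
    if any(s["state"] == "FAILED" for s in job["subjobs"]):
        for s in job["subjobs"]:
            if s["state"] == "ACTIVE":
                s["state"] = "KILLED"      # frees the site's resources
        job["interim_results"] = []        # discard partial output
        job["state"] = "PENDING_RESCHEDULE"
    elif all(s["state"] == "DONE" for s in job["subjobs"]):
        job["state"] = "DONE"
    return job["state"]

job = {"state": "ACTIVE", "interim_results": ["part1.dat"],
       "subjobs": [{"site": "cesc11", "state": "ACTIVE"},
                   {"site": "cesc12", "state": "FAILED"}]}
print(monitor_step(job))   # PENDING_RESCHEDULE
```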
2.5 File controller

The file controller provides capabilities to transfer files and folders securely, conveniently, efficiently and flexibly. It is implemented on top of the GridFTP in the Globus Toolkit. The file controller offers comprehensive file/folder operations: creating and deleting files/folders, checking whether they exist locally or remotely, etc. Files and folders can be transferred over multiple channels or in third-party style, and partial file transfer has also been implemented. Whenever a job needs files or folders to be transferred, the file controller is invoked. The job creator provides an interface for defining file/folder operations to be carried out before and after the execution of the job. As parts of the job definition, all the file operations are performed by the system automatically; that is, once the user has defined a job and submitted it, no further intervention is needed. The parameter files are transferred to the execution nodes before the job executes, while the results are delivered to the appointed locations and the data are deleted from the execution nodes as required.
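The declarative pre-/post-execution file operations could look like this sketch; the operation vocabulary is an assumption, and local copies stand in for GridFTP transfers to keep the example self-contained:

```python
import os
import shutil
import tempfile

def run_file_ops(ops, stage):
    """Execute every operation registered for 'pre' or 'post'."""
    for op in ops:
        if op["stage"] != stage:
            continue
        if op["action"] == "transfer":
            shutil.copy(op["src"], op["dst"])   # GridFTP in the real system
        elif op["action"] == "delete":
            os.remove(op["path"])

workdir = tempfile.mkdtemp()
param = os.path.join(workdir, "quzhou.in")
with open(param, "w") as f:
    f.write("solver parameters")
staged = os.path.join(workdir, "staged.in")

ops = [
    {"stage": "pre",  "action": "transfer", "src": param, "dst": staged},
    {"stage": "post", "action": "delete",   "path": staged},
]
run_file_ops(ops, "pre")    # before execution: stage the parameter file
# ... job runs ...
run_file_ops(ops, "post")   # after execution: delete the staged copy
```

Because the operations are data attached to the job definition, they travel with the job and are replayed automatically on every submission, which is what lets the user "leave free" after submitting.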
2.6 Job storage structure

The job management adopts a two-level storage structure, JobFile and XMLFile, with a one-to-many relationship between them.

2.6.1 JobFile. JobFile is a random-access file used to save all the jobs. It consists of tuples of <RecordValid, RecordJobID, RecordXMLFile, RecordIndex>. RecordValid denotes whether the current record is valid; if a job is deleted by the user or the XMLFile of a job is corrupted, this field is 0, i.e., invalid. RecordJobID is the job ID; it serves as the key of the current record and identifies a job uniquely. RecordXMLFile is the index of the XMLFile and is used to locate the job's XMLFile. RecordIndex is the index of the current record. Because the JobFile saves the basic information of all the jobs, its fields are condensed to occupy less storage space.

2.6.2 XMLFile. We use an XML file to save all the information for a job, including the basic information, the resource filtering information and the advanced information. It is easy to understand and human-readable. Using the XML file format to save jobs also makes it convenient to import jobs from outside the system: as long as a job is described in the required format, it can be imported into the system and thereby enter the job management flow.

2.7 Job management flow

The operations of job management are organized as a workflow, which includes creating, scheduling, submitting, executing and monitoring a job. The job information is carried forward from the first operation to the last. Within the workflow the job definition can be saved or deleted at any moment, so the job can be halted at any step and restarted later. Because the job definition can be saved or opened at will, the user can define a job and submit it later, or a job can be submitted several times after reallocating physical resources or changing parameters. This brings greater convenience to the user, improves efficiency and realizes the job's reusability. In the workflow, jobs are in one of three job queues, the waiting, scheduled and finished queues, according to their states. The waiting queue keeps the un-submitted jobs, the scheduled queue keeps the running jobs, and the finished queue keeps the jobs that have been executed, successfully or not. The workflow diagram of the job management is illustrated in Figure 2.
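The two-level storage of Section 2.6 can be sketched as follows, assuming an illustrative fixed-size record layout for the JobFile and hypothetical element names for the XMLFile:

```python
import struct
import xml.etree.ElementTree as ET

# First level: one condensed fixed-size index record per job:
# valid flag, job ID, XMLFile name, record index.
RECORD = struct.Struct("<B I 32s I")

def pack_record(valid, job_id, xml_file, index):
    return RECORD.pack(valid, job_id, xml_file.encode(), index)

def unpack_record(raw):
    valid, job_id, xml_file, index = RECORD.unpack(raw)
    return valid, job_id, xml_file.rstrip(b"\0").decode(), index

raw = pack_record(1, 42, "job42.xml", 0)
valid, job_id, xml_file, index = unpack_record(raw)

# Second level: the full, human-readable job definition in XML.
job = ET.Element("job", name="Quzhou", type="multiple")
ET.SubElement(job, "executable").text = "/home/cesc32/pasis"
ET.SubElement(job, "arguments").text = "-f quzhou.in"
xml_text = ET.tostring(job, encoding="unicode")
print(valid, job_id, xml_file)   # 1 42 job42.xml
```

Scanning the compact JobFile answers "which jobs exist and are they valid?" without touching the XML files; only when a job is opened or reused is its XMLFile parsed.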
Figure 2. The workflow diagram of job management. [The figure shows new, historical and imported jobs entering through the Job Creator, passing to the Job Scheduler, Job Executor and Job Monitor, with the jobs held in the waiting, scheduled and finished queues and stored in the JobFile and its associated XML files.]

There are three ways for a job to enter this workflow: the first is to create a completely new job, the second is to open a historical job saved in the file storage, and the last is to import a job from outside the system, as long as its format conforms to that of the system's job file. The job creating and job scheduling operations in the workflow are reversible. Using the workflow to conduct the whole process makes the complicated operations easy to perform, so that the user can pay more attention to problem solving; hence it improves the quality and efficiency of the job.
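The three entry paths and the three state queues might be sketched as follows (the queue handling here is an illustrative assumption):

```python
queues = {"waiting": [], "scheduled": [], "finished": []}

def enter_workflow(source, payload):
    """source is 'new', 'historical' or 'import'; all three paths
    end up as an un-submitted job in the waiting queue."""
    if source == "new":
        job = {"name": payload, "definition": None}   # filled in by the wizard
    elif source in ("historical", "import"):
        job = {"name": payload["name"], "definition": payload}
    else:
        raise ValueError("unknown job source: %s" % source)
    queues["waiting"].append(job)
    return job

def submit(job):
    queues["waiting"].remove(job)
    queues["scheduled"].append(job)

def finish(job):
    queues["scheduled"].remove(job)
    queues["finished"].append(job)

job = enter_workflow("historical", {"name": "Quzhou"})
submit(job)
finish(job)
print([len(q) for q in queues.values()])   # [0, 0, 1]
```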
3. An Application with the Job Management in the MASSIVE

An application is presented here to illustrate how to define and execute a job with the user-oriented job management, and how to reuse the job definition. The application is a structural analysis of a crank. It uses the MPICH-G2 [8] library to pass messages between processors. The resources, located at the CESC, Zhejiang University, consist of an SGI Onyx 3900 supercomputer, a Dawning PC cluster, six SGI workstations and nearly forty PCs.

In one of the cases, the job is submitted to the cluster manually from a desktop PC. The input data are one mesh file and one parameter file detailing the solving process and method. Here the mesh file is generated on the SGI Onyx 3900 supercomputer and the parameter file is edited on the local PC; both must be transferred to the cluster before the job executes. Finally, the output data are one data file and some log files.

In this simulation, the three resources involved are the supercomputer, the cluster and the desktop PC. Their full domain names are cesc11.zju.edu.cn, cesc12.zju.edu.cn and cesc32.zju.edu.cn, respectively. The basic job information is as follows: the job name is Quzhou, the job type is multiple, the schedule type is manual, the process count is 8, the executable file is "/home/cesc32/pasis", which is located at cesc32.zju.edu.cn and should be transferred to the cluster, the job parameter is "-f quzhou.in", the working directory is "./WHY", and the resource selected is cesc12.zju.edu.cn.

The corresponding advanced job information mainly concerns the file transfers. Before the job execution, the parameter file "quzhou.in" is transferred from cesc32.zju.edu.cn to cesc12.zju.edu.cn, while the data file "quzhou.geom" is transferred from cesc11.zju.edu.cn to cesc12.zju.edu.cn. After the job execution, the resulting data file "quzhou.dat" is transferred from cesc12.zju.edu.cn to cesc32.zju.edu.cn. All this information is described in the job definition.

The above job was defined some time ago and can be reused. All the user needs to do is edit the parameter file "quzhou.in" locally; the job definition itself remains the same. Figure 3 shows the reused job, named Quzhou, listed in the historical job list; the workspace interface in Figure 3 is used to get the basic job information. Figure 4 shows the file transfer information defined in the job definition.

Figure 3. Getting the basic information of a job

Figure 4. Specifying file operations of a job
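The Quzhou job described above, written out as the kind of job definition the text discusses. The field names are hypothetical, while the hosts, files and parameter values come from the text:

```python
quzhou = {
    "name": "Quzhou",
    "type": "multiple",
    "schedule": "manual",
    "process_count": 8,
    "executable": "/home/cesc32/pasis",   # staged from cesc32.zju.edu.cn
    "arguments": "-f quzhou.in",
    "workdir": "./WHY",
    "resource": "cesc12.zju.edu.cn",
    "file_ops": [
        # before execution
        {"stage": "pre",  "file": "quzhou.in",
         "src": "cesc32.zju.edu.cn", "dst": "cesc12.zju.edu.cn"},
        {"stage": "pre",  "file": "quzhou.geom",
         "src": "cesc11.zju.edu.cn", "dst": "cesc12.zju.edu.cn"},
        # after execution
        {"stage": "post", "file": "quzhou.dat",
         "src": "cesc12.zju.edu.cn", "dst": "cesc32.zju.edu.cn"},
    ],
}

# Reuse: between runs only the locally edited "quzhou.in" changes;
# every field above stays the same.
pre_transfers = [op for op in quzhou["file_ops"] if op["stage"] == "pre"]
print(len(pre_transfers))   # 2
```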
According to the workflow, the next step is to allocate resources to the job; the process is presented in Figure 5. The system finds the available resources automatically; all the user has to do is select resources according to the resource information listed in the interface. Because the job definition meets the requirements of the current application, nothing needs to be changed. On pressing the submit button, the job is sent to the job executor and then submitted to the remote cluster. The file controller is responsible for the file transfers, and the job monitor provides the state and standard output of the running job.
Figure 5. Allocating resources for a job

With the job management, if the computing result is unsatisfactory, the user need only modify the parameter file "quzhou.in" and then re-submit the job Quzhou, reusing the job definition.

4. Conclusions and Future Work

The main issues addressed in this paper are: the reusability of the job definition, which frees users from repetitive, trivial work; the flexible and comprehensive file/folder operations, which meet the demands on file operations; and the workflow, which makes the complicated operations easy to perform. The overall result shows that the user-oriented job management is easy to use and efficient, and that the job definition is reusable; the flexibility of job management is thus well supported. In the future, we will progressively carry out more experiments to test the overall performance of this architecture and improve it accordingly. In addition, we plan to study how to transfer the parameter files and resulting files more efficiently.

Acknowledgments

This research project is financially supported by the National Natural Science Foundation of China through the National Science Fund for Distinguished Young Scholars under grant No. 60225009. We appreciate helpful discussions among the members of the Grid Computing Group at the CESC, Zhejiang University, and would like to thank Xin Huang, Chaoyan Zhu, Wei Wang and Xuqing Zhu for their work in the project.

References

[1] Guiyi Wei, Guanghua Song, Yao Zheng, Cuiju Luan, Chaoyan Zhu, and Wei Wang, MASSIVE: A Multidisciplinary Applications-Oriented Simulation and Visualization Environment, Proceedings of the 2004 IEEE International Conference on Services Computing (SCC 2004), Shanghai, China, IEEE Computer Society, Los Alamitos, California, 2004, pp. 583-587.

[2] Rajkumar Buyya, David Abramson, and Jonathan Giddy, Nimrod/G: An Architecture for a Resource Management and Scheduling System in a Global Computational Grid, Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, Beijing, 2000, pp. 283-289.

[3] J. Frey, T. Tannenbaum, M. Livny, I. Foster, and S. Tuecke, Condor-G: A Computation Management Agent for Multi-Institutional Grids, Cluster Computing, 5(3), 2002, pp. 237-246.

[4] S. J. Chapin, D. Katramatos, J. Karpovich, and A. Grimshaw, Resource Management in Legion, Future Generation Computer Systems, 15(5-6), 1999, pp. 583-594.

[5] H. Casanova, A. Legrand, D. Zagorodnov, and F. Berman, Heuristics for Scheduling Parameter Sweep Applications in Grid Environments, Proceedings of the 9th Heterogeneous Computing Workshop (HCW 2000), Cancun, Mexico, May 2000, pp. 349-363.

[6] Chuang Liu, Lingyun Yang, Ian Foster, and Dave Angulo, Design and Evaluation of a Resource Selection Framework for Grid Applications, Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11), Edinburgh, Scotland, 2002, pp. 24-26.

[7] K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith, and S. Tuecke, A Resource Management Architecture for Metacomputing Systems, Proceedings of the 4th Workshop on Job Scheduling Strategies for Parallel Processing, 1998, pp. 62-82.

[8] N. Karonis, B. Toonen, and I. Foster, MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface, Journal of Parallel and Distributed Computing, 63(5), May 2003, pp. 551-563.