Monitoring of Jobs and their Execution for the LHC Computing Grid

Ralph Müller-Pfefferkorn1, Reinhard Neumann1, Stefan Borovac2, Ahmad Hammad2,3, Torsten Harenberg2, Matthias Hüsken2, Peter Mättig2, Markus Mechtel2, David Meder-Marouelli2, Peer Ueberholz4

1 Center for Information Services and High Performance Computing, Technische Universität Dresden, D-01062 Dresden, Germany
2 Bergische Universität Wuppertal, Gaußstraße 20, D-42119 Wuppertal, Germany
3 Forschungszentrum Karlsruhe, Hermann-von-Helmholtz-Platz 1, D-76344 Eggenstein-Leopoldshafen, Germany
4 Hochschule Niederrhein, Reinarzstraße 49, D-47805 Krefeld, Germany

Abstract. To process the large amounts of data produced at the Large Hadron Collider at CERN, users typically submit hundreds or thousands of jobs. This requires that users can keep track of their individual jobs. In this paper, we present a user-centric system to monitor jobs within such a scenario, i.e. to monitor their execution and their resource usage. The system supplies users with useful monitoring information and interactive visualisations, and helps them to find the reasons for job failures.

1 Introduction

A major challenge in Grid computing is the processing of large amounts of data. The Large Hadron Collider Computing Grid (LCG) project aims to provide the particle physics community of the LHC collider at CERN with an environment that is able to analyse the petabytes of data produced every year. A typical analysis scenario consists of several hundreds or thousands of jobs, each reading and analysing some small part of the data, e.g. one or several sets of collision events. Another scenario is the simulation of particle physics events, whose results are written to the event store. Both scenarios share the problem that detailed tracking of single jobs is more or less impossible. Thus, a framework which provides the user with status and resource information and gives them the possibility to react in case of failures would be desirable. Existing job monitoring tools in the LCG/gLite environment, e.g. the LCG job monitoring tools [1], provide only limited functionality to the user: they are either command line tools which deliver simple text strings for every job, or the information they provide is very limited (e.g. status only). Other existing monitoring tools (e.g. the LCG Real Time Monitor [2] or GridICE [3]) focus on monitoring the infrastructure rather than the user's application/job.

The High Energy Particle Physics Community Grid project1 (HEPCG) [4] of the German D-Grid Initiative [5] wants to contribute to the functionality the LCG Grid provides to physicists. One major goal is to improve the monitoring of jobs, their execution and their resource usage. The focus is on the development of tools for end users, thus taking their ease of use into account. In the following, the job execution monitor, developed at the Bergische Universität Wuppertal, and the job and resource usage monitoring, developed at the Technische Universität Dresden, are described in sections 2 and 3, respectively. We also discuss their current status and future plans.

2 The Job Execution Monitor

2.1 Goal

The purpose of this component is to monitor the execution of script files within the user jobs. Typically, the first thing to be executed on a worker node is not a binary executable but a script file which first sets up the environment. This step includes, e.g., setting some environment variables or downloading data files from a storage element. The setup process often includes commands which are known to be the source of many job failures. It is the goal of the Job Execution Monitor to monitor the execution of such critical commands and report their success or failure to the user.
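The basic idea of checking a critical command and reporting its outcome can be sketched in a few lines of Python; the function name and the print-based reporting are illustrative assumptions, not part of the Job Execution Monitor itself, which publishes its reports to R-GMA instead.

```python
import subprocess

def run_and_check(command):
    """Run one setup command in a shell and report its exit status."""
    result = subprocess.run(command, shell=True,
                            capture_output=True, text=True)
    status = "success" if result.returncode == 0 else "failure"
    # The real monitor would publish this report to R-GMA, not stdout.
    print(f"{command!r}: {status} (exit code {result.returncode})")
    return result.returncode

# e.g. a typical environment-setup step
run_and_check("echo simulated data download")
```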

2.2 Architecture

The monitoring part of the Job Execution Monitor is added automatically to the user's job via a modified job submission command. In the JDL file, the user's executable is replaced by the monitoring executable, which sets up the monitoring framework on the worker node and starts the original executable, in most cases a script file. The core module of the Job Execution Monitor is the script wrapper. It is responsible for loading the language wrapper modules for the supported languages. In order to gain detailed information about the job execution, a given script file is executed command by command. After a command has finished, its exit status is checked and logged. The script wrapper publishes the logging data to the Relational Grid Monitoring Architecture (R-GMA) [6], which is used in LCG/gLite to store monitoring data. The structure of the Job Execution Monitor is modular, so that further language wrapper modules can easily be added to support more scripting languages. For every command, the script wrapper asks every language wrapper whether the command contains a script file that the language wrapper can handle. If the command matches a scripting language, control is handed to the appropriate wrapper module.

1 Funded by the German Federal Ministry of Education and Research (BMBF) under Grant No. 01AK802C.
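The dispatch logic just described — the script wrapper asking each language wrapper in turn whether it can handle a command — can be sketched as follows; the class and method names, and the matching on file extensions, are assumptions made for illustration.

```python
class LanguageWrapper:
    """Base class: a wrapper claims commands that start scripts it understands."""
    def matches(self, command):
        raise NotImplementedError
    def execute(self, command):
        raise NotImplementedError

class BashWrapper(LanguageWrapper):
    def matches(self, command):
        return command.strip().endswith(".sh")
    def execute(self, command):
        return f"bash wrapper ran {command}"

class PythonWrapper(LanguageWrapper):
    def matches(self, command):
        return command.strip().endswith(".py")
    def execute(self, command):
        return f"python wrapper ran {command}"

class ScriptWrapper:
    """Executes a script command by command, delegating nested scripts."""
    def __init__(self, wrappers):
        self.wrappers = wrappers

    def run_command(self, command):
        # Ask every language wrapper whether it handles this command.
        for wrapper in self.wrappers:
            if wrapper.matches(command):
                return wrapper.execute(command)
        # Plain command: execute directly; exit-status logging omitted here.
        return f"executed {command}"
```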

Fig. 1: Structure of the bash wrapper

The current version of the Job Execution Monitor is able to monitor the execution of bash and python scripts. The structure of the bash wrapper is more complex, as python already provides mechanisms to retrieve information about the execution of python commands. Therefore, the bash wrapper is described in more detail in the following section. As the goal of this part of the software is to avoid errors and to help find them, the Job Execution Monitor must not produce additional errors itself. Therefore, if the Job Execution Monitor encounters any internal error during its own execution, the user job is executed without any monitoring, i.e. without any interference.
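This fail-safe behaviour — falling back to unmonitored execution whenever the monitor itself fails — amounts to a guard of the following shape; the function names are illustrative, not taken from the actual implementation.

```python
def run_job(executable, monitor):
    """Run the user job under monitoring, but never let the monitor
    itself cause a job failure."""
    try:
        return monitor(executable)
    except Exception:
        # Any internal monitoring error: run the job plain, without interference.
        return executable()
```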

2.3 The bash wrapper

The bash wrapper operates in two steps, which are illustrated in figure 1. First, the given bash shell script is analysed by the bash parser. It detects commands and adds an escape sequence in front of each of them. When the modified script is launched, this causes the actual command to be passed as a command line argument to another python module called BashInvoke.py. The second step is the actual execution. The modified bash script is launched as a new process and runs in parallel to the bash wrapper. Instead of executing the commands in the bash script, only the python module BashInvoke.py is called. This module passes the command string to the bash wrapper, which executes the command in a separate shell that is kept open as long as the modified script is running. After the command has finished, its results (i.e. the data written to stdin, stdout, stderr and the exit code) are returned to BashInvoke.py. The module publishes this data to the standard streams, and the modified script can handle the data in the same way as the original script.
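The first step — rewriting the script so that every command is routed through BashInvoke.py — might look roughly like the sketch below. The paper does not specify the exact escape mechanism, so the line-by-line rewriting and the invoker string are simplified assumptions; a real bash parser would also have to handle multi-line constructs, pipes and control flow.

```python
import shlex

def rewrite_script(lines, invoker="python BashInvoke.py"):
    """Prefix every plain command line so it is executed via the invoker,
    which forwards the command string to the bash wrapper."""
    rewritten = []
    for line in lines:
        stripped = line.strip()
        # Keep empty lines, comments and the shebang untouched.
        if not stripped or stripped.startswith("#"):
            rewritten.append(line)
        else:
            # shlex.quote keeps the original command intact as one argument.
            rewritten.append(f"{invoker} {shlex.quote(stripped)}")
    return rewritten
```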

2.4 The python wrapper

The structure of the python wrapper is simpler compared to the bash wrapper, because python is able to trace every command. Thus, the execution can be redirected and handled by the python wrapper. Hence, the python wrapper acts in three steps:

read environment: The python wrapper reads all environment variables from the execution shell mentioned in the previous section and sets up the environment for the following execution of the python script.

running: The python wrapper defines a trace function for every command, which is able to write logging information and is responsible for the execution of the command.

write back environment: The python wrapper writes back all environment variables to the execution shell. After that, bash scripts can continue as if the python script had been executed inside the execution shell.
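The tracing mechanism the python wrapper relies on is Python's built-in sys.settrace hook; the minimal sketch below shows how a user script can be executed under a trace function that logs every line. The logging target is a plain list here rather than R-GMA, and the function names are illustrative.

```python
import sys

def make_tracer(log):
    """Build a trace function that records every executed line."""
    def trace(frame, event, arg):
        if event == "line":
            log.append((frame.f_code.co_filename, frame.f_lineno))
        return trace  # keep tracing inside this frame
    return trace

def run_monitored(source, log):
    """Execute python source under a trace function, as the wrapper does."""
    code = compile(source, "<user-script>", "exec")
    sys.settrace(make_tracer(log))
    try:
        exec(code, {})
    finally:
        sys.settrace(None)  # always restore normal execution
```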

2.5 Future Work

It is planned to include the Job Execution Monitor in the LCG monitoring tools. Then there will be no need for a separate job submission command; the monitoring could be switched on with a simple entry in the job's JDL file. Additionally, we want to develop an expert system to automatically recover from certain error conditions in case of failed commands.

3 The Job and Resource Usage Monitoring

3.1 Goals and Architecture

The main goal of the development is the creation of a user-centric job and resource usage monitoring. Users are both the job submitters, who want to see what is going on with their hundreds or thousands of jobs, and the resource providers, who want information about the usage of their resources. In short, these are the three main objectives from the user's point of view:

1. Easy access and handling - only limited knowledge about monitoring is needed by the user
2. Support for the users with graphical representations of the information that allow interaction to obtain further, detailed information
3. Authentication, authorisation and secure data transmission, for data privacy reasons

The architectural design is illustrated in the sketch in figure 2. It modularly separates information gathering, information storage and retrieval, analysis of the monitoring data, and their visualisation in the user interface.

Fig. 2: Architecture of the Job and Resource Usage Monitoring System

Information Gathering: Currently, the LCG Job Monitoring (lcg-mon-wn) [1] is used to collect monitoring data. On the worker nodes, small job monitoring applications can be started together with the user job; the user simply needs to set an environment variable to enable the monitoring. It provides information about the status of a job and monitors basic resources (e.g. CPU and memory usage). Additionally, the status of the jobs is read from the LCG Job Status Monitoring [1], which itself uses information from the Logging & Bookkeeping service [7].

Information Storage and Retrieval: The monitoring services write the collected data to R-GMA, the Relational Grid Monitoring Architecture [6], which is used in LCG/gLite to store monitoring data. R-GMA is a kind of distributed relational database in which data are stored in tables. As the name implies, R-GMA follows the Grid Monitoring Architecture standard of the Global Grid Forum. So-called producers put data into the storage; consumers are used to retrieve them.

Analysis: On the analysis server, an R-GMA consumer reads the data from R-GMA for further processing. For both users and resource providers, the data are preprocessed to provide useful information. Authorisation is done here to allow or restrict data access according to policies, for data privacy reasons. The data readout and analysis are provided as a Tomcat/Axis-based [8, 9] Web Service. As the amount of monitoring information can be quite significant, the data retrieval and analysis step is separated from the visualisation.

Fig. 3: The monitoring is integrated into GridSphere. Here, a simple example graph of a histogram; the red line and ellipse illustrate the interactivity: clicking in the pie chart reveals more information.

Visualisation: The visualisation does not just put data into static histograms; it provides the user with intelligent, interactive graphical representations of the information. Clicking on histograms, time lines or other charts serves the user with detailed information about the chosen item or leads to displays with extended information. Zooming into the visualisations also extends their usefulness. The visualisation routines are Java-based applets running in the user's browser.

User Interface: Web browsers are a common and familiar tool for most internet users. Thus, access to the monitoring information is browser based. GridSphere [10], a portal framework especially designed for Grid needs, is the integrating platform for the visualisation. Additionally, GridSphere already provides needed functionality such as user management and credential retrieval. The Analysis Web Service is called by a GridSphere service on the visualisation server, which provides the data to the monitoring portlet. The data are fed into the visualisation applets, which are integrated into the portlet. Figures 3 and 4 show first prototypes of the graphical representations of the monitoring data and their integration into GridSphere. All components are coded in Java for interoperability reasons, except for the LCG monitoring tools, which use Python.
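The producer/consumer model of R-GMA can be illustrated with any relational store. In the sketch below, sqlite3 merely stands in for R-GMA's distributed tables, and the table layout is an assumption made for illustration — it is not the schema used by LCG/gLite.

```python
import sqlite3

# In R-GMA, a producer inserts monitoring tuples into a named table
# and a consumer retrieves them with SQL-like queries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE JobStatus (job_id TEXT, status TEXT, ts INTEGER)")

def produce(job_id, status, ts):
    """Producer side: publish one status tuple."""
    conn.execute("INSERT INTO JobStatus VALUES (?, ?, ?)", (job_id, status, ts))

def consume(job_id):
    """Consumer side: read back all status records for one job."""
    cur = conn.execute(
        "SELECT status, ts FROM JobStatus WHERE job_id = ? ORDER BY ts",
        (job_id,))
    return cur.fetchall()

produce("job-001", "Scheduled", 1)
produce("job-001", "Running", 2)
```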

Fig. 4: Another visualisation example: the first prototype of a timeline - the temporal development of one quantity (here, the status of the jobs). Every horizontal row shows the information for one job; the colours encode the status. The ability to zoom in and out of the data is indicated by the scrollbars of the display.

3.2 Current Status and Future Work

The HEPCG project started in September 2005. The initial focus of the job and resource usage monitoring was the setup of a prototype of the whole infrastructure described above. It was realised by November 2006, including the web service, the visualisation and the GridSphere integration. Future releases will include more advanced features, such as authentication mechanisms with role handling for access to different levels of data, analysis algorithms especially for resource usage data for resource providers, persistent storage of the monitoring data, and new monitoring metrics. Nevertheless, the first prototype will already be provided to users within the HEPCG project, to get immediate feedback and to improve it according to their needs.

References

1. L. Field, F. Naz, et al. User level tools documentation, 2006. http://goc.grid.sinica.edu.tw/gocwiki/User tools.
2. G. Moont. The LCG Real Time Monitor. http://gridportal.hep.ph.ic.ac.uk/rtm/.
3. C. Aiftimiei, S. Andreozzi, G. Cuscela, N. De Bortoli, G. Donvito, S. Fantinel, E. Fattibene, G. Misurelli, A. Pierro, G.L. Rubini, and G. Tortone. GridICE: Requirements, Architecture and Experience of a Monitoring Tool for Grid Systems. In Proceedings of the International Conference on Computing in High Energy and Nuclear Physics (CHEP2006), Mumbai, India, February 2006.
4. HEPCG. High Energy Physics Community Grid, 2005. http://www.hepcg.org.
5. D-Grid Initiative, 2005. http://www.hepcg.org/index.php?id=1&L=1.
6. R. Byrom et al. Fault Tolerance in the R-GMA Information and Monitoring System. In European Grid Conference 2005, volume 2470 of LNCS, pages 751-760, Amsterdam, Netherlands, February 2005. Springer-Verlag, Berlin/Heidelberg/New York.
7. EGEE JRA1 CZ. Logging and bookkeeping, 2006. http://egee.cesnet.cz/en/JRA1/index.html.
8. The Apache Software Foundation. Apache Axis - a SOAP engine, 2006. http://ws.apache.org/axis/.
9. The Apache Software Foundation. Apache Tomcat - a servlet container, 2006. http://tomcat.apache.org/.
10. J. Novotny, M. Russell, and O. Wehrens. GridSphere: A Portal Framework for Building Collaborations, 2005. http://www.gridsphere.org:80/gridsphere/gridsphere?cid=publications.
