Automatic services discovery, monitoring and visualization of grid environments: the MapCenter approach

Franck Bonnassieux1, Robert Harakaly1, and Pascale Primet2

1 CNRS UREC ENS-Lyon, 46, Allée d'Italie, 69364 LYON Cedex 07, France
{Franck.Bonnassieux, Robert.Harakaly}@ens-lyon.fr
2 LIP RESO ENS-Lyon, France
[email protected]

Abstract. The complexity of Grid environments is growing as more projects and applications appear in this fast-evolving domain. Widespread applications are distributed over thousands of computing elements, various communities of users are aggregated into virtual organizations, resource availability is increasingly dynamic, and the final locations of jobs cannot be foreseen. In such contexts, new paradigms and constraints that deeply impact monitoring and visualization possibilities must be addressed. As nodes and even sites appear and disappear quickly in a grid, automated resource and service discovery must be performed and relaunched regularly, at a frequency matched to the dynamicity of grid elements. The numerous production sites interconnected by grids have their own security policies and system administration principles and procedures, which may in some cases be outsourced. A global Grid monitoring system must be efficient and non-intrusive, so that it respects security policies and does not interfere with local site rules such as logging and auditing. On top of nodes and sites, the large number of applications and virtual organizations generates many different abstract views of grid environments. A Grid visualization tool must be open and flexible enough to represent all the corresponding virtual views.

The Grid visualization tool MapCenter has been designed to cope with these new challenges. Our approach is to decompose the architecture into three main layers:
– a visualization layer, based on a new presentation model,
– a monitoring layer, implemented by an efficient multi-threaded polling framework,
– an independent resource discovery and modelling layer, based on various back-ends.

This architecture and the open and flexible presentation model have been fully described in [3]. This model can match any logical approach of the Grid computed from the monitoring layers underneath. Logical and graphical views can be created to visualize all the dashboards needed by end users (scientists, industrial users, system administrators, organization managers ...).

This paper emphasises specific monitoring techniques and data access back-ends that achieve more efficient and fair checking of Grid service availability. It also underscores the automatic discovery and modelling capabilities that allow MapCenter to be deployed quickly on many Grid environments and to keep their status up to date.

In parallel with status visualization features, MapCenter is designed to aggregate all available grid information and to present logical views for accessing it. Various grid information system technologies are available, and several back-ends have been developed in PHP to grant access to end users. In particular, an LDAP browser and an SQL query interface are available, and R-GMA [4] back-ends are under development. Many dashboards can be created using these back-ends, which enable users to access all kinds of grid information from central points without any knowledge of the collection and storage technologies used underneath.

In order to obtain a non-intrusive system, MapCenter performs all monitoring from a central point, although several MapCenter servers can be distributed for fault tolerance. No specific node or agent needs to be deployed on grid sites, and a complete MapCenter monitoring solution can be set up in a few hours. The global multi-threaded scheduling architecture has been completely reviewed, and timeouts on all types of sensors (ICMP, TCP, UDP, HTTP ...) have been implemented and tuned in order to improve performance and ensure high scalability. A key point for a Grid monitoring framework is to use standard protocols, both to cope with heterogeneity and to keep real grid resources unaware of any polling, without interfering side effects on grid end systems. Stealth TCP port scanning has been implemented; it decreases the number of packets exchanged and the processor load on the server and, above all, remote end nodes and applications are not informed of this kind of scanning and so do not generate unwanted entries in log files. UDP port scanning capabilities have also been added to check specific Grid UDP services.
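The central, agent-less polling with per-sensor timeouts described above can be illustrated with a minimal Python sketch. It is not MapCenter's implementation: the function names `check_tcp` and `poll_all`, the worker count and the timeout value are all assumptions made for illustration. It also uses an ordinary full TCP connect rather than the stealth (half-open) scan mentioned in the text, which would require raw sockets and elevated privileges.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def check_tcp(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper: a plain connect() check, not a stealth SYN scan.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def poll_all(targets, max_workers=32, timeout=2.0):
    """Poll every (host, port) pair from a thread pool.

    Returns a dict mapping each (host, port) to True (up) or False (down).
    The timeout bounds how long a single unresponsive service can hold
    a worker thread.
    """
    targets = list(targets)
    if not targets:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda t: check_tcp(t[0], t[1], timeout), targets)
        return dict(zip(targets, results))
```

Under these assumptions, a call such as `poll_all([("ce01.example.org", 2119)])` (hypothetical host and port) would return a one-entry status dictionary. Because each probe is bounded by its timeout, a few dead sites cannot stall the whole polling cycle, which is the property the tuned timeouts in the text are aiming for.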

                                                                                                                              

Fig. 1. Discovery processing

To cope with the dynamicity and ubiquity of Grids, automatic discovery and logical view creation are implemented and now widely used in MapCenter. Resources are stored in various types of grid information systems, and specific stubs have been developed for each of them (e.g. LDAP queries for Globus MDS [5], an IBP client [6] for L-BONE, or CGI script browsing for PlanetLab ...). Figure 1 presents the general design of the discovery mechanism: the generators use a specific stub to access the grid information system, then all discovered objects are geographically located using the NetGeo [7] tool from CAIDA, and finally an input data file is generated that contains all the objects, symbols, maps and links representing this grid. These data files also contain logical views, which represent any abstract level of grid representation, and graphical views, which are in general geographic maps with animated Grid sites displayed. A data file is then used by the MapCenter daemon itself, which polls all objects and dynamically generates all HTML pages.
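The three discovery steps (stub, geographic location, data file generation) can be sketched as a small pipeline. Everything named here is hypothetical: `GridObject`, `discover`, `render_data_file` and the one-line-per-object text layout are illustrative assumptions, and the sketch does not reproduce MapCenter's actual data file format or the NetGeo interface. The stub and the locator are passed in as plain callables so that an LDAP, IBP or CGI-based stub could be swapped in.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple

Location = Tuple[float, float]  # (latitude, longitude)


@dataclass
class GridObject:
    """Hypothetical record for one discovered grid element."""
    name: str
    host: str
    location: Optional[Location] = None


def discover(stub: Callable[[], Iterable[GridObject]],
             locate: Callable[[str], Optional[Location]]) -> List[GridObject]:
    """Run one discovery pass.

    `stub` queries a grid information system and yields objects;
    `locate` maps a host name to coordinates, or None if unknown.
    """
    objects = []
    for obj in stub():
        obj.location = locate(obj.host)
        objects.append(obj)
    return objects


def render_data_file(objects: Iterable[GridObject]) -> str:
    """Render discovered objects in an illustrative (invented) text format."""
    lines = []
    for obj in objects:
        if obj.location is None:
            lines.append(f"object {obj.name} {obj.host} unlocated")
        else:
            lat, lon = obj.location
            lines.append(f"object {obj.name} {obj.host} {lat:.2f} {lon:.2f}")
    return "\n".join(lines)
```

Keeping the stub and the locator as separate callables mirrors the separation in Figure 1: supporting a new information system would only require a new stub, while the location step and the file generation stay unchanged.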

Fig. 2. Graphical view example: USA map for L-BONE

MapCenter [10] is now used by several major Grid environments: DataGrid [11], DataTag [12], CrossGrid [13], PlanetLab [14], L-BONE [15], RLS [9], and AtlasGrid [16]. Regarding performance, the DataGrid static version currently monitors more than 200 objects (computing or storage element front-ends, each managing farms of worker nodes), and the total time for polling the various services running on all these objects is less than one minute. One MapCenter server is able to monitor more than ten thousand grid front-ends, potentially representing several million worker nodes, with a global polling period of less than ten minutes. The advanced automatic discovery feature is used on most of the grid projects monitored. As an example, a MapCenter site has been created for RLS, and logical views of the replication services, showing all dependencies between these elements, are generated each day (which master copy can update which replica, which replica can be updated by which one ...). Another example is the graphical view of L-BONE USA grid sites (Figure 2), regenerated each night and monitored in real time.

MapCenter has been designed to be integrated with future technologies arising in the Grid community, especially OGSA [8]. In particular, automatic service discovery can be realized through the Registry interface, monitoring through SubscribeToNotificationTopic of the NotificationSource interface, and access to information through FindServiceData of the GridService interface. However, the performance issues involved will have to be carefully studied. Deployment on new grid environments will continue over the next years.

References
1. Tierney, B., Aydt, R., Gunter, D., Smith, W., Taylor, V., Wolski, R., Swany, M.: "A Grid Monitoring Service Architecture", Global Grid Forum Performance Working Group, 2001.
2. Tierney, B., Crowley, B., Gunter, D., Holding, M., Lee, J., Thompson, M.: "A Monitoring Sensor Management System for Grid Environments", Proceedings of the IEEE High Performance Distributed Computing conference (HPDC-9), August 2000.
3. Bonnassieux, F., Chanussot, F., Harakaly, R., Primet, P.: "MapCenter: An Open Grid Status Visualization Tool", PDCS Conference, Louisville, Kentucky, 2002.
4. Fisher, W. S.: "Relational Model for Information and Monitoring", Grid Forum Informational Draft GWD-GP-7-1.
5. Czajkowski, K., Fitzgerald, S., Foster, I., Kesselman, C.: "Grid Information Services for Distributed Resource Sharing", Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), IEEE Press, August 2001.
6. Bassi, A., Beck, M., Moore, T., Plank, J.: "The Logistical Backbone: Scalable Infrastructure for Global Data Grids", Asian Computing Science Conference 2002, Hanoi, Vietnam, December 2002.
7. CAIDA NetGeo, the Internet Geographic Database, http://www.caida.org/tools/utilities/netgeo/
8. Foster, I., Kesselman, C., Nick, J., Tuecke, S.: "The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration", Open Grid Service Infrastructure WG, Global Grid Forum, June 22, 2002.
9. Chervenak, A., Deelman, E., Foster, I., Guy, L., Hoschek, W., Iamnitchi, A., Kesselman, C., Kunst, P., Ripeanu, M., Schwartzkopf, B., Stockinger, H., Stockinger, K., Tierney, B.: "Giggle: A Framework for Constructing Scalable Replica Location Services", Proceedings of Supercomputing 2002 (SC2002), November 2002.
10. MapCenter Web Site for DataGRID, http://ccwp7.in2p3.fr/mapcenter
11. European DataGRID project, http://web.datagrid.cnr.it
12. DataTAG Project, http://datatag.web.cern.ch/datatag
13. CrossGrid Project, http://www.eu-crossgrid.org/
14. PlanetLab, http://www.planet-lab.org/
15. Logistical Backbone, http://loci.cs.utk.edu/lbone/
16. Atlas Grid, http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO/grid/