ASTRONOMICAL DATA ANALYSIS SOFTWARE AND SYSTEMS XIV ASP Conference Series, Vol. 347, 2005 P. L. Shopbell, M. C. Britton, and R. Ebert, eds.

Astronomical Database Related Applications in the Grid.it Project

Alessandra Volpato,1 Giuliano Taffoni,2 Serena Pastore,1 Claudio Vuerli,2 Andrea Baruffolo,1 Riccardo Smareglia,2 Giuliano Castelli,2 Fabio Pasian,2 and Leopoldo Benacchio1
National Institute for Astrophysics, 00136 Rome, Italy

Edgardo Ambrosi and Antonia Ghiselli
National Institute for Nuclear Physics - CNAF - Bologna, Italy

Abstract. We describe the work done in the context of the Grid.it project to access astronomical catalogues and archives through a grid environment. Two approaches are tested: job-oriented and web-service-oriented. The crucial aspect of this work is the development and deployment of an IVOA-compliant Data Source Engine model in the grid middleware.

1. Introduction

The Astronomical Observatories of Trieste and Padova are involved in the Grid.it project, which aims at ‘Enabling platforms for high-performance computational grids orientated to scalable virtual organizations’. In this framework we explore the use of grid technologies for developing astrophysical applications. The testbed for grid applications is the Italian INFN Production Grid for Scientific Applications, based on the LCG-2 (LHC Computing Grid, Robertson 2001) distribution, which is the evolution of the software produced by the European DataGrid project.3 Within Grid.it,4 the group at the Astronomical Observatory of Padova (OAPd) focused on porting to the Grid an existing system for the consultation of large astronomical catalogues, which currently serves the Second Guide Star Catalog (GSC-II) on the net (Benfante et al. 2001), while the Astronomical Observatory of Trieste (OATs) is dealing with the similar problem of integrating in the Grid the archive of observational data from the Italian Galileo National Telescope (TNG) (Smareglia et al. 2003) and, as a further step, of providing processing for the retrieved data through grid-enabled pipelines. Since the LCG-based grid infrastructure does not offer an adequate model for DBMSes, INFN and INAF institutes are collaborating to design and implement an architecture to integrate DBMSes in the existing

1 Osservatorio Astronomico di Padova, Padova, Italy
2 Osservatorio Astronomico di Trieste, Trieste, Italy
3 http://www.eu-datagrid.org/
4 http://grid-it.cnaf.infn.it/


[Figure 1 diagram; labels: Astrophysical Application, UI, RB, SOAP, querying, Astro Catalog engine (x2), RDF inference & querying, IVOA metadata catalog engine, Grid Environment]

Figure 1. Grid DSE schema, illustrating the Grid procedure to locate and query an astronomical database with IVOA standards. The Grid environment is equipped with a metadata engine that is first located by the Resource Broker. A Grid node accesses the metadata source engine, which replies to the node with the location of the astrophysical source engine. The Grid node then queries the astrophysical DB, which can be internal or external to the grid environment.

grid infrastructure. In this paper we describe our effort to model the DBMS in the grid environment, and the work done by OAPd and OATs to guarantee access to archives and catalogues using the pre-DBMS grid infrastructure.

2. A Data Source Engine model for LCG

In the current grid architecture, the Information Services (IS) encompass semantic and interaction models for Grid resources. They can retrieve run-time data and information from the defined Grid node elements: Computing Elements (CE) and Storage Elements (SE). However, to publish and discover astrophysical data sources and services, the Data Source Engine (DSE) must be added to the IS as a new Grid resource entity. The basic assumption driving this work is that a DSE can be modeled in terms of a batch system architecture and can be evolved into a new Grid resource analogous to a Computing Element. Deploying a DSE implies defining the information schema used to advertise the DSE resources on the Grid Monitoring and Discovering System, recording not only physical characteristics but also relevant semantic information, with the additional constraint of compliance with the Resource and Data-Service schemas proposed by IVOA (Hanisch & Quinn 2003). Moreover, the Resource Specification Language instruction set must be enhanced to manage query activities, and the Local Job Manager (LJM) capabilities must be extended to deal with the newly defined ‘query jobs’. To remain compatible with the IVOA standards, a library must be integrated with the LJM so that query jobs can be specified as IVOA-compliant ADQL documents. Finally, the Grid Resource Information Index Backend should be interfaced with the Grid-DSE IP and the WN MOM should be evolved in order to

[Figure 2 diagram; labels: ?wsdl, WSDL, INFO SERVICE (InfoService), ?info, XML, QUERY SERVICE (QueryService), SQL, ADQL, VOTable, Web Services]

Figure 2. The Web Services implemented by OAPd to access the GSC-II astronomical catalogue.

reflect DSE backend capabilities. Currently, a prototype solution is undergoing the test phase: it involves the integration of the new G-DSE schema in the IS, thus allowing the discovery of existing Astrophysical Data Sources within the Grid.
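The idea of advertising a DSE with both physical and semantic attributes, and of discovering it through the IS, can be illustrated with a minimal sketch. It is not the G-DSE schema itself: the record fields, the `discover` function, and the host names below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataSourceEngine:
    """Hypothetical record that a DSE might publish in the Information Services."""
    endpoint: str            # where query jobs would be sent (made-up host name)
    dbms: str                # physical characteristic of the backend
    query_languages: tuple   # languages accepted for query jobs, e.g. ("SQL", "ADQL")
    subjects: frozenset = field(default_factory=frozenset)  # semantic metadata


def discover(registry, subject, language):
    """Return the DSEs that cover `subject` and accept `language` query jobs."""
    return [
        dse for dse in registry
        if subject in dse.subjects and language in dse.query_languages
    ]


# Two illustrative entries; neither describes a real deployment.
registry = [
    DataSourceEngine("dse-a.example.org", "dbms-a", ("SQL", "ADQL"),
                     frozenset({"catalogue"})),
    DataSourceEngine("dse-b.example.org", "dbms-b", ("SQL",),
                     frozenset({"archive"})),
]

matches = discover(registry, "catalogue", "ADQL")
```

In such a scheme the matching step plays the role that resource matchmaking plays for Computing Elements, with the query language and the semantic subject replacing queue and CPU attributes.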

3. Astronomical Archives and Catalogues: the grid access

The goal is to develop a grid-integrated system for accessing large astronomical catalogues and archives. This is the first step towards using the grid to access, visualize and process astronomical data.

3.1. Web Service Approach

OAPd built a prototype system to access the GSC-II astronomical catalogue (Benfante et al. 2001) following the web services (WS) paradigm (Newcomer 2002). The system consists of two web services: a Meta-data Access service and a Query service (see Fig. 2). The Meta-data Access service retrieves metadata about catalogues by connecting to a “lightweight” DB running on the same machine as the service container. It answers the info request with an XML file containing the catalogue metadata in an IVOA-compliant format. The Query service answers queries expressed in SQL or ADQL with a VOTable containing the result set. The two services are installed on a WN connected to the Production Grid; they are deployed within an Apache web server equipped with Tomcat and Axis, and secured by the EDG security packages (an extension of the standard Grid Security Infrastructure specific to web applications). Clients and services interact through SOAP messages over https. Currently, the services can be accessed either interactively, through JSP, or in batch mode, by submitting a Job Description Language (JDL) file that requests the execution of a command-line client application from a machine with the User Interface Grid software installed.
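As an illustration of the kind of SOAP message a client could send to the Query service, the following sketch builds a SOAP 1.1 envelope carrying an SQL statement. The service namespace, the `query`, `language` and `statement` element names, and the table and column names in the example statement are assumptions made for this sketch; the actual message layout is defined by the WSDL of the service, which is not reproduced here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:QueryService"                # hypothetical service namespace


def build_query_envelope(statement, language="SQL"):
    """Wrap a query statement in a SOAP 1.1 envelope (element names are assumed)."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{SERVICE_NS}}}query")
    ET.SubElement(request, f"{{{SERVICE_NS}}}language").text = language
    ET.SubElement(request, f"{{{SERVICE_NS}}}statement").text = statement
    return ET.tostring(envelope, encoding="unicode")


# Illustrative statement; "objects", "ra", "dec" and "mag" are made-up names.
statement = "SELECT ra, dec FROM objects WHERE mag < 12"
message = build_query_envelope(statement)
```

The resulting string would be posted over https to the service endpoint, with the client certificate handled by the transport layer rather than by the message itself.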

[Figure 3 diagram; labels: NETWORK ENVIRONMENT, Web Browser with GSI-style certificate, Astro DB, Web Services, Service provider, GRID User through UI, SOAP, Service Requestor, Certificate, GSI Security, WSDL, UDDI, Secure/GRID web service container, Service Broker, GRID Information System, GRID Environment]

Figure 3. An illustration of a client node accessing the GSC-II astronomical catalogue through a Web Service grid infrastructure.

3.2. A Job-oriented approach

OATs developed an application-specific layer on top of the grid middleware to access an astronomical archive and produce on-the-fly calibration of the requested data (Taffoni et al. 2004). The testbed for this application is the Long Term Archive of the Telescopio Nazionale Galileo (LTA-TNG) located at OATs (Smareglia et al. 2003). We designed the connection service as a client/server grid application based on a query service (QS), a client service (CS) and a reduction driver.

The QS is a Java application installed on a CE, which uses the OJDBC driver to connect to the DB machine. Security relies on username/password authentication and a direct cable connection. The QS is queried by the CS, which is installed on each UI and allows users of the INAF VO to access data. The CS consists of a set of shell scripts wrapping the Globus job submission tools. Copying and registering the query output and/or images is implemented as a Java application that uses the Replica-Manager client Java classes to access the Replica-Manager grid service. The user must supply a logical file name (LFN) for each file.

A reduction driver was also implemented. At this stage our interest lies in calibrating a set of astronomical images by removing the instrumental signature. The calibration driver is based on Eclipse. To deploy the “data calibration on demand” service, the Eclipse software must first be installed on the grid WN. A collection of shell scripts is then used to run a reduction pipeline, which is submitted to the grid using JDL. We customized the UI by designing a graphical application, based on Java Swing, to manage the access and reduction services. It allows users to make a query, display the results, save the query to a grid file, and save data to grid files. It also compiles the JDL file needed to run the reduction and submits it.
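The step in which the user interface compiles a JDL file for the reduction can be sketched as follows. This is only an illustration, not the code of the graphical application: the function names, the script name and the image names are hypothetical, while `Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox` and `OutputSandbox` are standard JDL attributes.

```python
def jdl_list(items):
    """Format a Python sequence as a JDL list, e.g. {"a", "b"}."""
    return "{" + ", ".join(f'"{item}"' for item in items) + "}"


def build_reduction_jdl(script, images, stdout="reduce.out", stderr="reduce.err"):
    """Return the text of a JDL file that runs `script` on the given images."""
    arguments = " ".join(images)
    lines = [
        f'Executable = "{script}";',
        f'Arguments = "{arguments}";',
        f'StdOutput = "{stdout}";',
        f'StdError = "{stderr}";',
        # The wrapper script travels with the job; logs come back to the UI.
        f"InputSandbox = {jdl_list([script])};",
        f"OutputSandbox = {jdl_list([stdout, stderr])};",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical script and image names, used only for illustration.
jdl_text = build_reduction_jdl("reduce.sh", ["img1.fits", "img2.fits"])
```

The generated text would then be written to a file and handed to the job submission command available on the UI.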

[Figure 4 diagram; labels: ojdbc (query), wget (data), Archive service, Query service, User Interface, Data registry service, Computing Element, RLS, query/data, Service client, High Speed Grid SE Disk, GSI security, reduction, Resource Broker, Primergy, Grid WN, data request, Computing Driver, GRID]

Figure 4. The job-oriented data access. We show the workflow for a job-oriented database access with a calibration driver that can be called on demand. Security is based on GSI and OJDBC. A graphical user interface allows users to run queries and reductions.

4. Conclusions and future work

We verified that it is possible to access astronomical data through a grid environment and to perform simple data processing. Our tests show that a crucial need is the integration in the Grid of a DSE to collect and query metadata on the available resources in an IVOA format. Our future work will focus on the complete modification of the Grid IS to finally integrate the DSE, and on refining the applications developed so that they take full advantage of the presence of a DSE metadata collector.

Acknowledgments. This work is done with the support of the Italian Government and particularly of MIUR.

References

Benfante, L., Volpato, A., Baruffolo, A., & Benacchio, L., 2001, in ASP Conf. Ser., Vol. 238, ADASS X, ed. F. R. Harnden, Jr., F. A. Primini, & H. E. Payne (San Francisco: ASP), 160
Hanisch, R. J., & Quinn, P. L., 2003, http://www.ivoa.net/pub/info/
Newcomer, E., 2002, Understanding Web Services, Addison-Wesley
Robertson, L., 2001, CERN/2379/rev
Smareglia, R., Becciani, U., Caproni, A., Gheller, C., Guerra, J. C., Lama, N., Longo, G., Pasian, F., & Zacchei, A., 2003, MmSAI, 74, 514
