Metadata-driven Processing in the BIMA Image Pipeline

Astronomical Data Analysis Software and Systems XI
ASP Conference Series, Vol. 281, 2002
D. A. Bohlender, D. Durand, and T. H. Handley, eds.

Raymond L. Plante, Damien Guillaume¹, David M. Mehringer, Richard M. Crutcher¹

National Center for Supercomputing Applications, University of Illinois, Champaign, IL 61820-5518, USA

Abstract. In the BIMA Data Archive, metadata are encoded in XML and are stored along with the data collections they describe. The metadata summarize not only the contents of the collections but also their history and the original scientific intent. In this presentation, we illustrate how this information is used to drive automated processing of the data through the BIMA Image Pipeline. The data management chores of the pipeline are implemented as metadata transformations using XSLT; these include deciding when certain processing should take place and generating the processing scripts based on scientific intent. We also describe a framework for metadata schema definition and publishing; such a framework allows applications to adapt automatically to new schemas. We demonstrate this idea with our tool for searching data collections. A generalized framework such as this will be important for the Virtual Observatory, which must not only manage diverse data collections but also evolve to handle new types of data in service of the new science that the Observatory will inspire.

1. Introduction

The BIMA Archive and Image Pipeline, which automatically processes data from the BIMA Millimeter Interferometer, has been designed as a metadata-driven system (Plante 2001). By this we mean that descriptions of data products, i.e., metadata, are used to move data through the system, connecting them with processing and users. Data management is achieved by transforming metadata from one form to another. XML is an ideal technology for such a system, particularly because of the XSL Transformations (XSLT) standard (Clark 1999), which can be used to carry out the transformations that drive the system. To keep the system general and flexible, we strove to keep the core software as schema-independent as possible; that is, it avoids dependencies on the exact terms or concepts used to describe the data. As a result, the system can be applied to new collections that use a different schema, and the schema can be updated as a data collection evolves without breaking the software.
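The idea of schema-independent core software can be illustrated with a minimal sketch. The pipeline itself performs such tests as XSLT transformations; the sketch below substitutes Python's standard `xml.etree.ElementTree` path lookups as a stand-in. The element names (`status`, `intent/mode`), the rule table, and the function names are hypothetical and are not taken from the BIMA schema. The point is only that the rules are held as data, so a schema change means editing the rule table rather than the engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record. The element names are illustrative only
# and are NOT the actual BIMA schema.
COLLECTION_XML = """
<collection name="00jul02.raw">
  <date>2000-07-02</date>
  <status>complete</status>
  <intent>
    <mode>spectral-line</mode>
  </intent>
</collection>
"""

# Rules are data, not code: (description, element path, expected text).
# An expected value of None means the element only has to exist and be non-empty.
RULES = [
    ("collection is complete", "status", "complete"),
    ("observing mode is known", "intent/mode", None),
]


def failed_rules(root, rules):
    """Return the descriptions of all rules the metadata does not satisfy."""
    failures = []
    for description, path, expected in rules:
        node = root.find(path)
        text = (node.text or "").strip() if node is not None else ""
        if not text or (expected is not None and text != expected):
            failures.append(description)
    return failures


def can_process(xml_text, rules=RULES):
    """True when every rule passes for the given metadata document."""
    return not failed_rules(ET.fromstring(xml_text), rules)
```

Calling `can_process(COLLECTION_XML)` evaluates every rule against the record. Supporting a collection described with different terms would, under these assumptions, require only a different `RULES` table.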

¹ Astronomy Department, University of Illinois Urbana–Champaign, 605 E. Springfield Avenue, Champaign, IL 61820-5518, USA


Figure 1. [Diagram. Component labels: Archive System, Ingest Engine, Describe Service. Step annotations: 1. sends "creation" event; 2. requests XML description of collection; 3. tests metadata to determine if processing can take place. Example collection: 00jul02.raw, 2000-07-02.]