Grid computing in Earth Science

M. Petitdidier (IPSL/LATMOS)
In collaboration with D. Weissenbach (IPGP), J. Raciazek (IPSL), L. Schenini (Geoazur), H. Schwichtenberg (SCAI), and the ESR VO and SEEGrid partners

http://www.euEarthScienceGrid.org

Introduction

A continuous involvement of Earth Science in the Grid since 2000:
• DataGrid (2000-2004): IPSL with ESA and KNMI
• EGEE I, II and III (2004-2010): IPSL and IPGP; Earth Science cluster (IPSL coordination); MPI working group and training activity; committee of the EGEE conferences
• DEGREE (2006-2008): FP6 EU project, IPSL deputy coordinator; roadmap (Cossu et al., 2010)
• CYCLOPS, SEEGRID, EnviroGrids…
• EGI (2010-2014)
• Dissemination: booth and special sessions at the EGU conference

13/10/2011

Monique Petitdidier

Main concepts about Grid

• User and provider communities
• Resources from different organisations; medium- and long-term dynamic collaboration
• Users from different institutes and organisations, but from the same scientific and technical community and/or projects
• Trust relationship among users and providers
• Security: authentication via certificate, and authorization via Virtual Organization (virtual team)
• Data policy: restrictive access, no replica…
• Synergy among all the different scientific and technical communities (EGI): HEP, BioMed, A&A…

e-collaboration for sharing resources, data, algorithms, tools, results, distributed expertise, scientific and technical experience…

ES Virtual Organisations

Earth Science Research (ESR): generic VO
• Located since 2004 at SARA (NL)
• Coordination: M. Petitdidier (IPSL) & H. Schwichtenberg (SCAI)
• 56 registered users, ~20 registered applications; from France: IPGP, IPSL, LOA, BRGM, LGEI/E. Mines Alès
• 50 sites in Europe (12 countries, ~20 000 CPUs)
• Atmosphere, climate, hydrology, seismology, environment…

Related VOs (~7) in Europe
• Limited projects: CYCLOPS, SEEGRID; or with only a national dimension: in Ireland, Russia…
• EGEODE, the Geocluster VO (CGG Veritas is no longer part of EGI or of France Grilles)
  • Since 2004 at CC-IN2P3
  • CGG seismic processing and imaging software
  • ~30 users

For a long time the observations from various sensors have been an effective way to discover and study the behaviour of the earth system with all its components (Cossu et al., Earth Science Informatics, 2010)

DATA ANYWHERE

ES Data Intensive Applications

Data integration and data analysis
• Multi-attribute and 4D (3D + time) data sets
• Exploration and visualization of large, distributed volumes of data
• Integration of large-volume and/or numerous data sets from distributed data resources
• Distributed data analysis (HTC) of large data volumes (> 100s of TB) => new large data sets
• Data curation and data fitness

Data modeling
• Large-scale parameter exploration (HTC)
• Large-scale simulations (HPC) producing large volumes of synthetic data to be analyzed
• Large-scale inversion and assimilation (HPC) applications (high memory and storage)

Orchestration workflows
• Orchestration of data analysis and data modeling applications with efficient data transfers between Data, Grid and HPC infrastructures

Phenomena as a function of latitude, longitude, altitude and time: worldwide community

Nowadays our total knowledge about the complex Earth system is contained in models and measurements; how we put them together has to be managed cleverly… The technical challenge is to put together databases and computing resources to answer the ES challenges (Cossu et al., 2010).

Applications on EGEE and SEEGrid

Application domains and partners (grouped in slide order):
• Flood: EMA (France), IISAS (Slovakia), SRI (Ukraine). Figure: flood map for the Normanton area, Queensland, Australia, derived from RADARSAT-2, 14-17 Feb 2009 (SRI)
• Climate - GRelC; Biodiversity: CMCC (Italy), IPSL (France), Univ. Cantabria (Spain), SCAI (Germany); BRGM (France), Footways (France), JKI (Germany). Climatological scenarios
• Climate - El Niño: Univ. Cantabria – EELA2
• Meteorology: AUTH, IASA, NOA (Greece); SRI (Ukraine); GCRAS (Russia). Figure from IASA
• Seismology; Geosciences - Geocluster: CGGVeritas (France), DSI-IRD, Geoazur, IPGP, IPGS, ISTEP, Sisyphe, UBO; AUTH (Greece), IPGP (France), Tubitak Ulakbim (Turkey), Univ. Patras (Greece), INFP (Romania). Figure from E. Clévédé (IPGP)
• Pollution; Geospatial platform: IMAA (Italy), INFN (Italy), EMA (France), IMHO (Portugal). Figure panels: a. Basic scenario; b. 50% reduction of NOx
• Hydrology: IPP-BAS (Bulgaria), IASA (Greece), EnvVO-SEEGRID; CRS4 (Italy), Univ. Genève, Univ. Neuchâtel, INHGA (Romania), Institute for Water Resources "Jaroslav Cerni", Belgrade, and CSASA at University of Kragujevac, Serbia. Figure: forecasting seawater intrusion in a coastal aquifer (from J. Kerrou, Univ. Neuchâtel)

CYCLOPS: Real Time and Near Real Time Applications for Civil Protection (data integration, high-performance computing and distributed environment for simulations)
Figure: CYCLOPS platform, with the following components: Presentation and Fruition Services; Business Logic Services; Spatial Data Infrastructure Services; Geospatial Services; Advanced Grid Services; GRID Platform (EGEE); Interoperability Platform; Security Infrastructure; Environmental Monitoring Resource Infrastructure; Processing Systems; Data Systems; Sensor Systems

Organisations and initiatives shown as logos: EGU, GMES, WDCS, GEOSS, IEEE, ESA, OGC, INSPIRE, Civil Protection, NASA, ERCIM, UN-SPIDER, CODATA, SEIS

• European ES Grid teams; 20-25 countries
• Tools used: Open Geospatial Consortium components for GIS, access to distributed metadata and databases, parametric tools…
• Numerical methods: Monte Carlo, parametric studies, simulations with MPI, large collections of independent jobs
• Relation with civil protection and the United Nations: UN-SPIDER, UNEP
• Participation in standardisation bodies (OGF, OGC, GMES…)

Earth Observation by Satellite: Atmosphere

PARASOL: clouds and aerosol remote sensing from the PARASOL mission (and the A-train)
Inter-comparison of PARASOL data with lidar data aboard CALIPSO, and MODIS algorithm validation
• Full-resolution processing of the PARASOL data on EGEE by LOA, Univ. Lille, France (F. Ducos)
• Support: D. Weissenbach (IPGP, France)
• Input: 12.4 TB; output: 1.7 TB
• 3-4 months instead of ~18 months

METOP/IASI: Infrared Atmospheric Sounding Interferometer
Studies of photochemical pollution processes in the lower and medium-range troposphere
• Analysis of twice-a-day data on EGEE by IPSL/LISA, France (M. Eremenko)
• Support: D. Weissenbach (IPGP)

Millions of Scenarios

Footprint: FP6 European project (eu-footprint.org). Pesticide risk assessment and management in water resources
• Coordinator: BRGM (I. Dubus)
• 2-4 million scenarios combining various pesticides, weather conditions, crops and soils, each taking around 1 h
• Database used for agriculture => Footways (www.footways.eu)

OSSE - geoQAIR (IPSL/LISA: G. Dufour, M. Eremenko & co.)
• Pseudo-observations to study the impact of the satellite mission Mageaq on the air quality survey in the lower troposphere
• 45 million pixels at 1 min per pixel => 750 000 h
• Application ongoing
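The compute budgets quoted on this slide follow from simple arithmetic, which the short Python sketch below reproduces. The task counts and per-task durations are those on the slide; the figure of 1 000 concurrent cores and the helper names are assumptions made only for illustration, and the elapsed-time estimate ignores queueing and data-transfer overheads.

```python
def cpu_hours(n_tasks: int, minutes_per_task: float) -> float:
    """Total CPU time in hours for n_tasks independent tasks."""
    return n_tasks * minutes_per_task / 60.0


def wall_clock_days(total_cpu_hours: float, concurrent_cores: int) -> float:
    """Ideal elapsed time in days, ignoring queueing and transfer overheads."""
    return total_cpu_hours / concurrent_cores / 24.0


# OSSE figures from the slide: 45 million pixels at 1 minute each.
osse_hours = cpu_hours(45_000_000, 1)

# Footprint figures from the slide: 2 to 4 million scenarios at about 1 hour each.
footprint_hours = (cpu_hours(2_000_000, 60), cpu_hours(4_000_000, 60))

# Hypothetical assumption (not from the slides): 1 000 cores usable at once.
osse_days = wall_clock_days(osse_hours, 1_000)

print(f"OSSE: {osse_hours:,.0f} CPU hours")
print(f"Footprint: {footprint_hours[0]:,.0f} to {footprint_hours[1]:,.0f} CPU hours")
print(f"OSSE on 1 000 cores: about {osse_days:.1f} days")
```

Under that assumed level of concurrency, the 750 000 h of the OSSE application would take roughly a month of elapsed time, which illustrates why such workloads are spread over many Grid sites.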


gCSMT: Grid earthquake CMT determination

• Operational platform in production in the French data centre GEOSCOPE: E. Clévédé (IPGP), G. Patau (IPGP)
• Embarrassingly parallel inversion with parameter space exploration
• Synthetic seismogram database dynamically built and stored on the Grid
• Support: D. Weissenbach (IPGP)
• Operating since 2004, with continuous development since then
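To illustrate what "embarrassingly parallel inversion with parameter space exploration" means in practice, here is a minimal Python sketch. It is not the gCSMT code: the misfit function is a toy stand-in for the comparison between synthetic and observed seismograms, and the parameter grid, function names and job count are all hypothetical. The point is only that each chunk of the parameter grid can be evaluated by an independent job, and that the partial results are merged at the end.

```python
from itertools import product


def misfit(params, observed):
    """Toy misfit: squared distance between a parameter tuple and a target.
    Stands in for comparing synthetic and observed seismograms."""
    return sum((p - o) ** 2 for p, o in zip(params, observed))


def split(items, n_jobs):
    """Cut the list of parameter tuples into n_jobs independent chunks."""
    return [items[i::n_jobs] for i in range(n_jobs)]


def run_job(chunk, observed):
    """One independent job: best (misfit, params) pair found in its chunk."""
    return min((misfit(p, observed), p) for p in chunk)


def explore(axes, observed, n_jobs):
    """Exhaustive exploration; jobs share nothing, so each could run anywhere."""
    grid = list(product(*axes))
    partial = [run_job(c, observed) for c in split(grid, n_jobs) if c]
    return min(partial)  # merge step: keep the overall best


# Hypothetical 3-parameter grid (illustrative values only).
axes = [range(0, 10), range(0, 10), range(0, 10)]
best_misfit, best_params = explore(axes, observed=(3, 7, 1), n_jobs=8)
print(best_misfit, best_params)
```

Here the jobs run one after another in a single process; on a Grid, each call to `run_job` would be submitted as a separate job, and only the small per-job results would need to be gathered.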


Seismology Applications

GEOCLUSTER
• Software platform for seismic data processing, imaging and reservoir characterization
• Developed and deployed on EGEE by CGG-Veritas

GEOSCOPE
• 30 years of continuous records (~30 stations, ~2 TB)
• Data curation: daily noise average analysis of the continuous seismic records archived in the Geoscope data centres, E. Stutzman (IPGP)
• Support: D. Weissenbach

SENU3D
• 3D seismic wave propagation in complex geological media on a local scale, and ground motion prediction
• MPI job
• E. Delavaud & J.P. Vilotte (IPGP)
• Support: G. Moguilny

HYDROLOGY (J. of Hydrology, 403, 189-199, 2011)

Domain | Model | Data | Parallel system | Web interface | Goal
Flood, Slovakia | Multi-models | Meteorological data, forecast outputs | Distributed job (MPI) | Portal | Demonstration
Flood, Ukraine | No | Satellite data | Distributed jobs (OpenMP) | Portal + Web-GIS | Real time, UN-SPIDER
Flash flood, France | ALTHAIR | Local data + prediction outputs | Independent jobs (scenario analysis) | Portal + Web-GIS | Real time, in connection with SPC-GD
Subsurface hydrology, Italy | CODESA3D | Local and distributed | Independent jobs (Monte Carlo) | GRIDA3 AQUAGRID portal + Web-GIS | Water resources management
Subsurface hydrology, Serbia | LizzaPAKP | Local | Distributed job (MPI) + parametric analysis | No | Water management
Hydrology catchment, FP7 EU | SWAT | Local | Independent jobs (Monte Carlo) | GANGA | Survey / water management

Flash flood
V. Thieron, P.A. Ayral, S. Sauvagnargues-Lesage – LGEI, France

After several dramatic events:
SPC-GD: official monitoring and forecasting of flash floods for around 20 watersheds

Testbed including:
• Data: rain gauges, water level stations, rainfall intensity from weather radar
• Simulation: ALTHAIR (Ayral et al., 2007)

• Ensemble of simulations => more compute resources
• SDI => OGC platform (cf. CYCLOPS)
• Feasibility tests

[Figure: map of the SPC-GD area with station markers, showing Annonay, Valence, Privas, Aubenas, Montelimar, Bollene, Orange, Vaison-la-Romaine, Ales, Uzes, Avignon, Nimes, Tarascon, Arles and Montpellier, with Paris, Lyon, Bordeaux, Toulouse and Marseille for location; scale bars in kilometres.]

Flash Flood – G-ALTHAIR

This improved system gives forecasters a wide range of forecasting options across all of the SPC-GD watersheds. Potentially, a forecaster can launch more than 600 forecasting simulations almost simultaneously.

Results

EU FP7 EnviroGRIDS: Building Capacity for a Black Sea Catchment Observation and Assessment System
N. Ray & EU partners

Past, present and future states
• Data collection: not so easy
• SWAT: Soil and Water Assessment Tool (Arnold et al., 1998) => long-term averages of river discharge, precipitation, actual and potential evapotranspiration, soil moisture and aquifer recharge (Abbaspour, 2011)
• ArcSWAT => OGC
• Grid: important for high-resolution computing and for sharing data and results

Climate application: Earth System Grid Federation (ESGF) and EGI

Earth System Grid Federation (ESGF): distributed infrastructure developed to support CMIP5 (the Coupled Model Intercomparison Project, Phase 5), an internationally coordinated set of climate model experiments involving climate modelling centres from all over the world.

Applications on the Grid: impact studies on climate change, regional climate prediction, parametric studies, analysis of the different models

Testbed (IPSL, SCAI): inter-operation of ESGF and EGI security protocols; MPI jobs

Barriers for Grid adoption
• Different security protocols for data infrastructures and databases: they differ among themselves and from those of EGI
• Inter-operability: problem addressed via Federated Identity Systems for Scientific Collaborations, in EGI and in various ES organisations; ESGF-EGI discussion ongoing
• Orchestration workflow: see next slide
• Still too complex for end users
• Lack of information, not enough local support

Technology at the service of science

Orchestration workflow of data analysis and data modeling applications with efficient data transfers between Data and Grid, HPC or cloud computing infrastructures

[Figure: a researcher accessing EO data, climate data and weather data.]

An architecture hiding the complexity of the underlying infrastructures through mechanisms to manage automatic mapping of credentials, data access adapters and calls to heterogeneous security protocols
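One way to picture such an architecture is an adapter layer behind a single gateway, sketched below in Python. Everything here is hypothetical and purely illustrative: the class names, the protocol labels ("x509-proxy", "openid") and the credential table are assumptions, not the design of any existing ES gateway. The sketch only shows the idea that the researcher names a data source and a data set, while credential mapping and protocol differences stay hidden.

```python
from abc import ABC, abstractmethod


class DataAdapter(ABC):
    """Common interface seen by the workflow, whatever the back end."""
    protocol: str  # label of the security protocol this back end expects

    @abstractmethod
    def fetch(self, dataset: str, credential: str) -> str:
        ...


class GridStorageAdapter(DataAdapter):
    protocol = "x509-proxy"  # illustrative label only

    def fetch(self, dataset, credential):
        return f"grid:{dataset} via {credential}"


class ClimateArchiveAdapter(DataAdapter):
    protocol = "openid"  # illustrative label only

    def fetch(self, dataset, credential):
        return f"archive:{dataset} via {credential}"


class CredentialMapper:
    """Maps one user identity to the credential each protocol expects."""

    def __init__(self, table):
        self._table = table

    def credential_for(self, user, protocol):
        return self._table[(user, protocol)]


class Gateway:
    """Single entry point: picks the adapter and the matching credential."""

    def __init__(self, adapters, mapper):
        self._adapters = adapters
        self._mapper = mapper

    def get(self, user, source, dataset):
        adapter = self._adapters[source]
        credential = self._mapper.credential_for(user, adapter.protocol)
        return adapter.fetch(dataset, credential)


# Hypothetical user and credential table.
mapper = CredentialMapper({
    ("alice", "x509-proxy"): "proxy-of-alice",
    ("alice", "openid"): "openid-of-alice",
})
gateway = Gateway(
    {"grid": GridStorageAdapter(), "climate": ClimateArchiveAdapter()},
    mapper,
)
print(gateway.get("alice", "grid", "tile42"))
print(gateway.get("alice", "climate", "cmip5-run1"))
```

Adding a new infrastructure then means writing one more adapter and extending the credential table, without changing the workflow code that calls `gateway.get`.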


Feedback: Publications
• Around 40 papers in peer-reviewed international journals
• Special issues:
  (1) Grid in Earth Sciences, Earth Science Informatics, 2009-2010, 13 papers. Eds: M. Petitdidier, R. Cossu, P. Fox, P. Mazzetti, H. Schwichtenberg, W. Som de Cerff
  (2) Development of virtual organizations, applications and services for earth science on grid e-infrastructure, Earth Science Informatics, 2010, 11 papers. Eds: C. Özturan, V. Kotroni, E. Atanassov
• Proceedings and books: > 50 (difficult to estimate)
• Around 7 French theses including results obtained on the Grid

Conclusion

The Grid is now part of the ES research landscape in France and Europe, through application results and EU projects, but dissemination is still needed:
• Data integration and data analysis of large volumes
• Data-intensive simulation, inversion and assimilation

ES community challenges:
• A service-oriented architecture integrating Data, Grid and HPC infrastructures
• Lower the barrier of uptake through scientific gateways

ES European community integration:
• ES ESFRI projects: Atmosphere (ICOS, IAGOS, EISCAT3D); Ocean (EURO-ARGO, EMSO); Solid Earth (EPOS-PP)…
• IT initiatives: Environment (ENVRI), Seismology (VERCE), Data infrastructures (EUDAT), Hydrometeorology (DRIHM)…
• EGI, PRACE and other standardization bodies