Grid computing in Earth Science
M. Petitdidier (IPSL/LATMOS)
In collaboration with D. Weissenbach (IPGP), J. Raciazek (IPSL), L. Schenini (Geoazur), H. Schwichtenberg (SCAI), the ESR VO and SEEGrid partners
http://www.euEarthScienceGrid.org
Introduction
A continuous involvement of Earth Science in the Grid since 2000:
• DataGrid (2000-2004): IPSL with ESA and KNMI
• EGEE I, II & III (2004-2010): IPSL & IPGP
  - Earth Science cluster: IPSL coordination
  - MPI working group and training activity
  - Committee of the EGEE conferences
• DEGREE (2006-2008): FP6 EU project, IPSL deputy coordinator; roadmap (Cossu et al., 2010)
• CYCLOPS, SEEGRID, EnviroGrids…
• EGI (2010-2014)
Dissemination: booth and special sessions at the EGU conference
13/10/2011
Monique Petitdidier
Main concepts about Grid
User and provider communities
• Resources from different organisations; medium- and long-term dynamic collaboration
• Users from different institutes and organisations, but from the same scientific & technical community and/or projects
• Trust relationship among users & providers
  - Security: authentication via certificate & authorization via Virtual Organization (virtual team)
  - Data policy: restrictive access, no replica…
Synergy among all the different scientific & technical communities (EGI): HEP, BioMed, A&A…
e-collaboration for sharing resources, data, algorithms, tools, results, distributed expertise, scientific and technical experience…
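The two-step trust model above (authentication via certificate, then authorization via Virtual Organization membership) can be illustrated with a toy sketch. Everything in it is hypothetical: the certificate subjects, the VO registry, the site description and the function names are invented for illustration and do not reproduce any actual Grid middleware interface.

```python
from dataclasses import dataclass

# Hypothetical registry: VO name -> certificate subjects registered in that VO.
VO_MEMBERS = {
    "esr": {"/O=Example/CN=Alice", "/O=Example/CN=Bob"},
    "egeode": {"/O=Example/CN=Carol"},
}

# Hypothetical set of subjects holding a certificate from a trusted authority.
TRUSTED_SUBJECTS = {
    "/O=Example/CN=Alice",
    "/O=Example/CN=Bob",
    "/O=Example/CN=Carol",
}


@dataclass(frozen=True)
class Site:
    """A resource provider and the VOs it agrees to serve."""
    name: str
    supported_vos: frozenset


def authenticate(subject: str) -> bool:
    """Step 1: is the certificate subject known and trusted?"""
    return subject in TRUSTED_SUBJECTS


def authorize(subject: str, vo: str, site: Site) -> bool:
    """Step 2: does the site serve this VO, and is the user a member of it?"""
    return vo in site.supported_vos and subject in VO_MEMBERS.get(vo, set())


def may_run(subject: str, vo: str, site: Site) -> bool:
    """A job is accepted only if both steps succeed."""
    return authenticate(subject) and authorize(subject, vo, site)


site = Site("site-a", frozenset({"esr"}))
```

With this toy registry, `may_run("/O=Example/CN=Alice", "esr", site)` succeeds, while a trusted user from another VO, or an unknown subject, is refused.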
ES Virtual Organisations
Earth Science Research (ESR): generic VO
• Located since 2004 at SARA (NL)
• Coordination: M. Petitdidier (IPSL) & H. Schwichtenberg (SCAI)
• 56 registered users, ~20 registered applications; from France: IPGP, IPSL, LOA, BRGM, LGE/E. Mines Alès
• 50 sites in Europe (12 countries, ~20 000 CPUs)
• Atmosphere, Climate, Hydrology, Seismology, Environment…
Related VOs (~7) in Europe
• Limited projects: CYCLOPS, SEEGRID; or with only a national dimension: Ireland, Russia…
• EGEODE – Geocluster VO (CGG Veritas is no longer part of EGI or of France Grilles)
  - Since 2004 at CC-IN2P3
  - CGG seismic processing and imaging software
  - ~30 users
For a long time the observations from various sensors have been an effective way to discover and study the behaviour of the earth system with all its components (Cossu et al., Earth Science Informatics, 2010)
DATA ANYWHERE
ES Data Intensive Applications
Data integration and data analysis
• Multi-attribute and 4D (3D + time) data sets
• Exploration and visualization of distributed large volumes of data
• Integration of large-volume and/or large numbers of data sets from distributed data resources
• Distributed data analysis (HTC) of large data volumes (> 100s TB) => new large data sets
• Data curation and data fitness
Data modeling
• Large-scale parameter exploration (HTC)
• Large-scale simulations (HPC) producing large volumes of synthetic data to be analyzed
• Large-scale inversion and assimilation (HPC) applications (high memory and storage)
Orchestration workflows
• Orchestration of data analysis and data modeling applications with efficient data transfers between Data, Grid and HPC infrastructures
Phenomena as a function of latitude, longitude, altitude and time: worldwide community
Nowadays our total knowledge about the complex Earth system is contained in models and measurements, how we put them together has to be managed cleverly… The technical challenge is to put together databases and computing resources to answer the ES challenges. (Cossu et al.,2010)
Applications on EGEE and SEEGrid
• Flood: EMA (France), IISAS (Slovakia), SRI (Ukraine)
  [Figure: flood map for the Normanton area, Queensland, Australia, derived from RADARSAT-2, 14-17 Feb 2009 (SRI)]
• Climate (GRelC; climatological scenarios) and Biodiversity: CMCC (Italy), IPSL (France), Univ. Cantabria (Spain), SCAI (Germany); BRGM (France), Footways (France), JKI (Germany)
• Climate (El Niño): Univ. Cantabria – EELA2
• Meteorology: AUTH, IASA, NOA (Greece); SRI (Ukraine); GCRAS (Russia)
  [Figure from IASA]
• Seismology and Geosciences (Geocluster): CGGVeritas (France), DSI-IRD, Geoazur, IPGP, IPGS, ISTEP, Sisyphe, UBO; AUTH (Greece), IPGP (France), Tubitak Ulakbim (Turkey), Univ. Patras (Greece), INFP (Romania)
  [Figure from E. Clévédé, IPGP]
• Hydrology: IPP-BAS (Bulgaria), IASA (Greece), EnvVO-SEEGRID; CRS4 (Italy), Univ. Genève, Univ. Neuchâtel, INHGA (Romania), Institute for Water Resources "Jaroslav Cerni" (Belgrade) and CSASA at the University of Kragujevac (Serbia)
  [Figure: forecasting seawater intrusion in a coastal aquifer, from J. Kerrou, Univ. Neuchâtel]
• Pollution and Geospatial platform: IMAA (Italy), INFN (Italy), EMA (France), IMHO (Portugal)
  [Figure panels: a. basic scenario; b. 50% reduction of NOx]
• Real-time and near-real-time applications for Civil Protection (data integration, high-performance computing and distributed environment for simulations)
  [Figure: CYCLOPS platform architecture. Labels: Business Logic Services; CYCLOPS Infrastructure; Spatial Data Infrastructure Services; Geospatial Services; Advanced Grid Services; GRID Platform (EGEE); Processing Systems Infrastructure; Data Systems; Sensor Systems; Interoperability Platform; Presentation and Fruition Services; Environmental Monitoring Resource Infrastructure; Security Infrastructure]
Related organisations and initiatives (logos): G-OWS, GEOSS, EGU, GMES, WDC, IEEE, ESA, OGC, INSPIRE, Civil Protection, NASA, ERCIM, UN-SPIDER, CODATA, SEIS
• European ES Grid teams: 20-25 countries
• Tools used: Open Geospatial Consortium components for GIS, access to distributed metadata and databases, parametric tools…
• Numerical methods: Monte Carlo, parametric studies, simulations with MPI, large collections of independent jobs
• Relations with civil protection and the United Nations: UN-SPIDER, UNEP
• Participation in standardisation bodies (OGF, OGC, GMES…)
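The "large collection of independent jobs" pattern listed under numerical methods can be sketched for a Monte Carlo study as follows. This is a minimal illustration under stated assumptions, not any of the applications above: the quantity estimated (the probability that a standard normal variable exceeds a threshold) is a hypothetical stand-in for a real model, and the jobs run sequentially here where a Grid would dispatch them to different worker nodes.

```python
import random


def run_job(seed, n_samples, threshold=1.0):
    """One independent job: it owns its random stream, so it needs no
    communication with other jobs and can run on any worker node."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if rng.gauss(0.0, 1.0) > threshold)
    return hits, n_samples


def merge(results):
    """Combine the partial counts returned by all jobs into one estimate."""
    hits = sum(h for h, _ in results)
    total = sum(n for _, n in results)
    return hits / total


# Eight hypothetical jobs of 10 000 samples each, one distinct seed per job.
results = [run_job(seed, 10_000) for seed in range(8)]
estimate = merge(results)
```

Because each job is fully determined by its seed, a failed job can simply be resubmitted without affecting the others, which is what makes this pattern a good fit for high-throughput infrastructures.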
Earth Observation by Satellite: Atmosphere
Intercomparison of PARASOL data with lidar data aboard CALIPSO, and MODIS algorithm validation
(clouds and aerosol remote sensing from the PARASOL mission and the A-train)
• Full-resolution processing of the PARASOL data on EGEE by LOA, Univ. Lille, France (F. Ducos)
• Support: D. Weissenbach (IPGP, France)
• Input: 12.4 TB; output: 1.7 TB
• 3-4 months instead of ~18 months
Infrared Atmospheric Sounding Interferometer (METOP/IASI)
Studies of photochemical pollution processes in the lower and medium-range troposphere
• Analysis of twice-a-day data on EGEE by IPSL/LISA, France (M. Eremenko)
• Support: D. Weissenbach (IPGP)
Millions of Scenarios
Footprint: FP6 European project (eu-footprint.org)
Pesticide risk assessment and management in water resources
• Coordinator: BRGM (I. Dubus)
• 2-4 million scenarios combining various pesticides, weather conditions, crops and soils, each taking around 1 h
• Database used for agriculture => Footways (www.footways.eu)
OSSE-geoQAIR (IPSL/LISA: G. Dufour, M. Eremenko & co.)
• Pseudo-observations to study the impact of the satellite mission Mageaq on the air quality survey in the lower troposphere
• 45 million pixels at 1 min per pixel => 750 000 h
• Application ongoing
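The compute budgets quoted above follow from simple arithmetic, sketched below. The task counts and per-task durations are those on the slide; the core count passed to `wall_clock_days` is a purely hypothetical figure chosen to show how the total scales when tasks run concurrently.

```python
def cpu_hours(n_tasks, minutes_per_task):
    """Total compute time, in hours, for n independent tasks."""
    return n_tasks * minutes_per_task / 60.0


def wall_clock_days(total_cpu_hours, n_cores):
    """Idealised elapsed time if the tasks are spread evenly over n_cores
    (ignores queueing, data transfer and job failures)."""
    return total_cpu_hours / n_cores / 24.0


# OSSE-geoQAIR: 45 million pixels at 1 min per pixel.
osse_hours = cpu_hours(45_000_000, 1)

# Footprint: 2 to 4 million scenarios at about 1 h (60 min) each.
footprint_low = cpu_hours(2_000_000, 60)
footprint_high = cpu_hours(4_000_000, 60)

# Hypothetical: 1000 cores used concurrently for the OSSE study.
osse_days_on_1000_cores = wall_clock_days(osse_hours, 1000)
```

Here `osse_hours` evaluates to 750 000 h, matching the slide, and the Footprint range works out to 2 to 4 million CPU hours.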
gCSMT: Grid earthquake CMT determination
Operational platform in production at the French data centre GEOSCOPE: E. Clévédé (IPGP), G. Patau (IPGP)
• Embarrassingly parallel inversion with parameter space exploration
• Synthetic seismograms database dynamically built and stored on the Grid
• Support: D. Weissenbach (IPGP)
• Operating since 2004, with continuous developments since
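The idea of an embarrassingly parallel inversion by parameter space exploration can be sketched as a grid search in which every candidate is evaluated independently. This is not the gCSMT code: the two parameters, the `synthetic` forward model and the misfit are hypothetical toy stand-ins, and a local thread pool stands in for the Grid jobs.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def synthetic(depth_km, duration_s):
    """Hypothetical forward model: a real one would compute synthetic seismograms."""
    return [depth_km * 0.1, duration_s * 0.5, depth_km * 0.01 + duration_s]


def misfit(params, observed):
    """Sum of squared differences between synthetic and observed values."""
    syn = synthetic(*params)
    return sum((s - o) ** 2 for s, o in zip(syn, observed))


def explore(observed, depths, durations, workers=4):
    """Evaluate every point of the parameter grid independently and keep the best.
    Each evaluation depends only on its own parameters, so the evaluations
    could be submitted as separate jobs."""
    grid = list(product(depths, durations))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda p: misfit(p, observed), grid))
    best = min(range(len(grid)), key=scores.__getitem__)
    return grid[best], scores[best]


# Toy "observation" generated from known parameters, then recovered by the search.
observed = synthetic(20, 8)
best_params, best_score = explore(
    observed, depths=range(10, 51, 10), durations=range(2, 13, 2)
)
```

Since the toy observation was generated on a grid point, the search recovers `(20, 8)` exactly; with real data the best grid point would only approximate the source parameters.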
Seismology Applications
GEOCLUSTER
• Software platform for seismic data processing, imaging and reservoir characterization
• Developed and deployed on EGEE by CGG-Veritas
GEOSCOPE
• 30 years of continuous records (~30 stations, ~2 TB)
• Data curation: daily noise average analysis of the continuous seismic records archived in the Geoscope data centres
• E. Stutzman (IPGP); support: D. Weissenbach
SENU3D
• 3D seismic wave propagation in complex geological media on a local scale, and ground motion prediction
• MPI job
• E. Delavaud & J.P. Vilotte (IPGP); support: G. Moguilny
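A daily noise analysis of continuous records lends itself to per-day processing, which a minimal sketch can illustrate. The assumptions are loud ones: the slide does not say which statistic is used, so the root-mean-square amplitude per UTC day is chosen here only as a plausible example, and the `(timestamp, amplitude)` input format is hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime, timezone
from math import sqrt


def daily_rms(samples):
    """samples: iterable of (unix_time_seconds, amplitude) pairs.
    Returns {UTC date: RMS amplitude of that day's samples}."""
    acc = defaultdict(lambda: [0.0, 0])  # date -> [sum of squares, count]
    for t, amplitude in samples:
        day = datetime.fromtimestamp(t, tz=timezone.utc).date()
        acc[day][0] += amplitude * amplitude
        acc[day][1] += 1
    return {day: sqrt(total / count) for day, (total, count) in acc.items()}


# Three toy samples: two on 1970-01-01 and one on 1970-01-02 (UTC).
samples = [(0, 3.0), (3600, 4.0), (86400, 2.0)]
result = daily_rms(samples)
```

Each day's value depends only on that day's samples, so days (and stations) can be processed as independent jobs.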
HYDROLOGY (J. of Hydrology, 403, 189-199, 2011)
Domain | Model | Data | Parallel system | Web interface | Goal
Flood, Slovakia | Multi-models | Meteorological forecast outputs | Distributed job (MPI) | Portal | Demonstration
Flood, Ukraine | No | Satellite data | Distributed jobs (OpenMP) | Portal + Web-GIS | Real time, UN-SPIDER
Flash flood, France | ALTHAIR | Local data + prediction outputs | Independent jobs (scenario analysis) | Portal + Web-GIS | Real time, in connection with SPC-GD
Subsurface hydrology, Italy | CODESA3D | Local and distributed | Independent jobs (Monte Carlo) | GRIDA3 AQUAGRID portal + Web-GIS | Water resources management
Subsurface hydrology, Serbia | LizzaPAKP | Local | Distributed job (MPI) + parametric analysis | No | Water management
Hydrology catchment, FP7 EU | SWAT | Local | Independent jobs (Monte Carlo) | GANGA | Survey / water management
Flash flood
V. Thieron, P.A. Ayral, S. Sauvagnargues-Lesage – LGEI, France
After several dramatic events:
• SPC-GD: official monitoring and forecasting of flash floods for around 20 watersheds
Testbed including:
• Simulation: ALTHAIR (Ayral et al., 2007)
• Data: rain gauges, water level stations, rainfall intensity from weather radar
• Ensemble of simulations => more compute resources
• SDI => OGC platform (cf. CYCLOPS)
• Feasibility tests
[Figure: map of the study area with point markers and labelled towns (Annonay, Valence, Privas, Aubenas, Montelimar, Bollene, Vaison-la-Romaine, Orange, Uzes, Ales, Nimes, Avignon, Tarascon, Arles, Montpellier), an inset map of France (Paris, Lyon, Bordeaux, Toulouse, Marseille), and scale bars of 0-100 km and 0-700 km]
Flash Flood – G-ALTHAIR
Results
This improved system gives forecasters a wide range of forecasting options across all the SPC-GD watersheds. Potentially, a forecaster can launch more than 600 forecasting simulations almost simultaneously.
EU FP7 EnviroGRIDS: Building Capacity for a Black Sea Catchment Observation and Assessment System
N. Ray & EU partners
• Past, present and future states
• Data collection: not so easy
• SWAT: Soil and Water Assessment Tool (Arnold et al., 1998) => long-term averages of river discharge, precipitation, actual & potential evapotranspiration, soil moisture & aquifer recharge (Abbaspour, 2011)
• ArcSWAT => OGC
• Grid: important for high-resolution computing and for sharing data and results
Climate application: Earth System Grid Federation (ESGF) and EGI
• ESGF: distributed infrastructure developed to support CMIP5 (the Coupled Model Intercomparison Project, Phase 5), an internationally coordinated set of climate model experiments involving climate modelling centres from all over the world
• Applications on the Grid: impact studies on climate change, regional climate prediction, parametric studies, analysis of the different models
• Testbed: IPSL, SCAI
  - Inter-operation of ESGF and EGI security protocols
  - MPI job
Barriers for Grid adoption
• Different security protocols for data infrastructures and databases
  - They differ among themselves
  - They differ from those of EGI
  - Interoperability: problem addressed via Federated Identity Systems for Scientific Collaborations, in EGI and in various ES organisations
  - ESGF-EGI discussion ongoing
• Orchestration workflow (see next slide)
• Still too complex for end users
• Lack of information, not enough local support
Technology at the service of science
Orchestration workflow of data analysis and data modeling applications, with efficient data transfers between Data and Grid, HPC or cloud computing infrastructures
[Figure labels: EO data, Climate data, Weather data, Researcher]
An architecture hiding the complexity of the underlying infrastructures through mechanisms that manage the automatic mapping of credentials, provide data access adapters and call heterogeneous security protocols
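The architecture described above, with automatic credential mapping and data access adapters behind a single entry point, can be sketched with a simple adapter pattern. All names here are hypothetical (the `Gateway` class, the adapter classes, the credential strings); the sketch only shows the shape of the idea, not an existing service.

```python
from abc import ABC, abstractmethod


class DataAccessAdapter(ABC):
    """Common interface hiding how each infrastructure is actually accessed."""

    @abstractmethod
    def fetch(self, dataset: str, credential: str) -> str:
        ...


class ClimateArchiveAdapter(DataAccessAdapter):
    def fetch(self, dataset, credential):
        # A real adapter would speak the archive's own protocol here.
        return f"climate:{dataset} via {credential}"


class EOArchiveAdapter(DataAccessAdapter):
    def fetch(self, dataset, credential):
        return f"eo:{dataset} via {credential}"


class Gateway:
    """Single entry point: picks the adapter and maps the user's identity to
    the credential each infrastructure expects."""

    def __init__(self):
        self._adapters = {}
        self._credential_maps = {}

    def register(self, source, adapter, credential_map):
        self._adapters[source] = adapter
        self._credential_maps[source] = credential_map

    def fetch(self, user, source, dataset):
        # Raises KeyError if the user has no mapped credential for this source.
        credential = self._credential_maps[source][user]
        return self._adapters[source].fetch(dataset, credential)


gateway = Gateway()
gateway.register("climate", ClimateArchiveAdapter(), {"alice": "token-alice"})
gateway.register("eo", EOArchiveAdapter(), {"alice": "proxy-alice"})
```

The researcher calls `gateway.fetch("alice", "eo", "scene-001")` without knowing which security protocol or access method sits behind each source; supporting a new infrastructure means adding one adapter and one credential map.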
Feedbacks: Publications
• Around 40 papers in international peer-reviewed journals
• Special issues:
  (1) Grid in Earth Sciences, Earth Science Informatics, 2009-2010, 13 papers. Eds: M. Petitdidier, R. Cossu, P. Fox, P. Mazzetti, H. Schwichtenberg, W. Som de Cerff
  (2) Development of virtual organizations, applications and services for earth science on grid e-infrastructure, Earth Science Informatics, 2010, 11 papers. Eds: C. Özturan, V. Kotroni, E. Atanassov
• Proceedings & books: > 50 (difficult to estimate)
• Around 7 French theses including results obtained on the Grid
Conclusion
The Grid is now part of the ES research landscape in France and Europe, through application results and EU projects, but dissemination is still needed:
• Data integration and data analysis of large volumes
• Data-intensive simulation, inversion and assimilation
ES community challenges:
• A service-oriented architecture integrating Data, Grid and HPC infrastructures
• Lowering the barrier to uptake through scientific gateways
ES European community integration:
• ES ESFRI projects: Atmosphere (ICOS, IAGOS, EISCAT3D); Ocean (EURO-ARGO, EMSO); Solid Earth (EPOS-PP)…
• IT initiatives: Environment (ENVRI), Seismology (VERCE), Data infrastructures (EUDAT), Hydrometeorology (DRIHM)…
• EGI, PRACE and standardization bodies