Virtual Research Environments: enabling a step change in geoscience research globally
Lesley Wyborn¹ and Helen Glaves²
¹National Computational Infrastructure, Australian National University
²British Geological Survey
© National Computational Infrastructure 2016
Virtual Research Environments: enabling a step change in geoscience research globally
Lesley Wyborn¹, Helen Glaves² and ????³
¹National Computational Infrastructure, Australian National University
²British Geological Survey
³Someone from the Geosciences in the US?
This presentation has an identity crisis
• Virtual Research Environments are currently funded:
– in Europe as Virtual Research Environments
– in Australia as Virtual Laboratories
– in the USA as Science Gateways
• Elsewhere they have been called:
– Co-laboratories
– Virtual Observatories
– Collaborative Interactive Environments
– Analytics Platforms/Engines
• All enable sharing of resources and common infrastructures over the Internet
GSA, Denver, Colorado, 2016
What is a VRE? Source: http://www.worldatlasbook.com/europe/europe-political-map.html
‘A Virtual Research Environment or VRE is an online tool for researchers to facilitate sharing and collaboration. VREs give you an integrated online environment with access to shared documents and resources needed in the course of a research project’ (Universiteit Leiden 2016)
‘A VRE is a set of online tools to facilitate or enhance the research process. A VRE can aid with collaboration and communication amongst members of a research group, whether they share an office or work on different sides of the world’ (University of Newcastle, 2016)
‘A virtual research environment (VRE) or virtual laboratory is an online system helping researchers collaborate. Features usually include collaboration support (Web forums and wikis), document hosting, and some discipline-specific tools, such as data analysis, visualisation, or simulation management’ (Wikipedia, 2016)
What is a science gateway? Source: http://www.freelargeimages.com/map-of-usa-772/
‘A Science Gateway is a community-developed set of tools, applications, and data that are integrated via a portal or a suite of applications, usually in a graphical user interface, that is further customized to meet the needs of a specific community’
What is a virtual laboratory? Source: http://www.travel-australia-online.com/maps-of-australia.html
National eResearch Collaboration Tools and Resources (NeCTAR)
See: https://nectar.org.au/
‘Virtual Laboratories are rich domain-oriented online environments that draw together research data, models, analysis tools and workflows to support collaborative research across institutional and discipline boundaries’
Consensus view???
A VRE, science gateway or virtual laboratory is an online system supporting collaborative research that harnesses the power of the Internet to support a more dynamic, online approach to collaborative working
Key features:
– Provide access to data resources that are accessible online
– Enable online use of discipline-specific tools, such as data analysis, visualisation, or simulation management
– Online access to compute resources
– Collaboration support (Web forums and Wikis)
– May include publication management and teaching tools
VREs are important in fields where research is primarily carried out in teams spanning multiple institutions and even countries
VREs are the enablers of transdisciplinary research
Who Uses them?
• VREs are very diverse and can range from
– individual researchers working on distributed data resources,
– to teams of highly skilled researchers with online access to substantial High Performance Computing environments that facilitate in situ processing of large volumes of data using community codes developed through international cooperative efforts
• They can be developed by any discipline on almost any infrastructure
In Australia: 13 Virtual Labs from Diverse Research Groups
1. All Sky Virtual Lab (Astronomy)
2. Virtual Geophysics Laboratory
3. Virtual Hazards, Impact & Risk Laboratory
4. Humanities Network Infrastructure
5. Endocrine Genomics Virtual Lab
6. ALVEO
7. Characterisation Virtual Laboratory
8. Marine Virtual Laboratory
9. Industrial Ecology Virtual Laboratory
10. Climate & Weather Science VL
11. Biodiversity & Climate Change VL
12. Microbial Genomics Virtual Laboratory
13. Genomics Virtual Laboratory
Source: http://www.travel-australia-online.com/maps-of-australia.html
Serving a diversity of Use Cases and Competency of Users
VREs down under
[Figure: the Australian virtual laboratories (Climate, Astronomy, Genomics, Hazards, Geophysics, Marine, Characterisation, Humanities, Biodiversity, Cities, Species) positioned by three scales: "Needs HPC" to "Doesn't Need HPC"; "Gigabytes of Data" to "Petabytes of Data"; and "Users: Fewer, More CS Skilled" to "Users: More, Less CS Skilled".]
Source: http://www.travel-australia-online.com/maps-of-australia.html
Common Components
[Figure: a Virtual Lab surrounded by its common components]
– Provenance & reproducibility
– Software sustainability
– Impact measures
– User engagement
– Knowledge transfer & skills development
– Software reuse
Slide Courtesy Of Michelle Barker, NeCTAR, Australia
Working together to solve common problems
[Figure: several Virtual Labs linked through Nectar common projects]
Slide Courtesy Of Michelle Barker, NeCTAR, Australia
Solving Common Problems, Sharing Core Infrastructures
[Figure: a Virtual Lab surrounded by the shared problems and core infrastructures]
– Researcher training
– Movement of data
– Data storage
– Security & authentication
– Provenance & reproducibility
– Advocacy & coordination
– Compute access
– User support
– Data management
– Research collaboration platforms
– Software sustainability
– Skills development & knowledge transfer
Slide Courtesy Of Michelle Barker, NeCTAR, Australia
Sharing core components and infrastructures across multiple VLs enhances sustainability and is more cost-effective
Introducing the Virtual Geophysics Laboratory
Data Services (1) on a browser
Layers discovered via remote registries
Layers consist of numerous remote data services
Data Services (2) on a browser
Software as a Service on a browser
A variety of different scientific codes are already available in the form of “Toolboxes”. Currently toolboxes correspond to VM images with codes installed; future versions will allow you to ‘build your own’ toolbox from a set of scientific codes and utilities
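The toolbox idea can be sketched in a few lines of Python. This is only an illustration, not the VGL implementation: the `Toolbox` class, the image identifiers and the `build_your_own` helper are hypothetical names, and pairing eScript and Underworld with particular images is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Toolbox:
    """A named bundle of scientific codes backed by one VM image."""
    name: str
    vm_image: str   # hypothetical image identifier
    codes: tuple    # names of the codes installed on the image


# Hypothetical catalogue of prebuilt toolboxes (one VM image each).
PREBUILT = [
    Toolbox("escript-toolbox", "img-escript", ("eScript",)),
    Toolbox("underworld-toolbox", "img-underworld", ("Underworld",)),
]


def find_prebuilt(code):
    """Current model: pick an existing image that already has the code."""
    return [t for t in PREBUILT if code in t.codes]


def build_your_own(name, codes, base_image="img-base"):
    """Possible future model: describe a custom toolbox from chosen codes."""
    available = {c for t in PREBUILT for c in t.codes}
    missing = sorted(set(codes) - available)
    if missing:
        raise ValueError(f"unknown codes: {missing}")
    return Toolbox(name, base_image, tuple(sorted(codes)))


custom = build_your_own("my-toolbox", ["eScript", "Underworld"])
print(custom)
```

In this sketch a prebuilt toolbox maps one-to-one to an image, while a custom toolbox is just a description that some later (unspecified) step would turn into an image.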
Software as a Service (2) on a browser
Flexibility in what computing resources to utilise
Monitoring jobs from a browser
Wyborn AGU 2013 IN43B-05
Components of the Virtual Geophysics Laboratory
Data Services: Magnetics, Gravity, DEM
Processing Services: eScript, Underworld
Compute Services: NCI Petascale, NCI Cloud, NeCTAR Cloud, Amazon Cloud, Desktop
Enablers (e.g., OGC ‘Glue’): Service Orchestration, VGL Portal, Scripting Tool, Provenance Metadata
Dynamic Virtual Geophysics Laboratories
[Figure: example laboratories assembled from these components, with the labels Mag., Grav., DEM, eScript, Underworld, VGL Portal, NCI Cloud and NCI Petascale]
Repurposing to a Virtual Hazards Laboratory
Data Services: Magnetics, Gravity, DEM, Landsat, Bathymetry
Processing Services: ANUGA, EQRM
Compute Services: NCI Petascale, NCI Cloud, NeCTAR Cloud, Amazon Cloud, Desktop
Enablers (e.g., OGC ‘Glue’): Service Orchestration, VGL Portal, Scripting Tool, Provenance Metadata
Unchanged [slide annotation; the Compute Services and Enablers lists match those of the Virtual Geophysics Laboratory]
Dynamic Virtual Hazards Laboratories
[Figure: example laboratories assembled from these components, with the labels Mag., Grav., DEM, Bathy DEM, EQRM, ANUGA, VGL Portal, NCI Petascale, NCI Cloud and Amazon Cloud]
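The repurposing step can be sketched as a small data model. This is a minimal illustration rather than anything taken from the VGL code base: the `VirtualLab` class and `repurpose` function are hypothetical, and the sketch assumes that the "Unchanged" label applies to the compute services and enablers. The component names come from the slides.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VirtualLab:
    """A laboratory described as four groups of components."""
    name: str
    data_services: frozenset
    processing_services: frozenset
    compute_services: frozenset
    enablers: frozenset


def repurpose(lab, name, data_services, processing_services):
    """Swap the domain-specific parts; keep compute and enablers as they are."""
    return replace(
        lab,
        name=name,
        data_services=frozenset(data_services),
        processing_services=frozenset(processing_services),
    )


geophysics = VirtualLab(
    name="Virtual Geophysics Laboratory",
    data_services=frozenset({"Magnetics", "Gravity", "DEM"}),
    processing_services=frozenset({"eScript", "Underworld"}),
    compute_services=frozenset(
        {"NCI Petascale", "NCI Cloud", "NeCTAR Cloud", "Amazon Cloud", "Desktop"}
    ),
    enablers=frozenset(
        {"Service Orchestration", "VGL Portal", "Scripting Tool", "Provenance Metadata"}
    ),
)

hazards = repurpose(
    geophysics,
    name="Virtual Hazards Laboratory",
    data_services={"Magnetics", "Gravity", "DEM", "Landsat", "Bathymetry"},
    processing_services={"ANUGA", "EQRM"},
)

# The shared core is reused without modification (assumption stated above).
reused = (
    hazards.compute_services == geophysics.compute_services
    and hazards.enablers == geophysics.enablers
)
print(hazards.name, "reuses shared core:", reused)
```

Only the data and processing groups are passed to `repurpose`, which mirrors the slide's point that a new laboratory is mostly a new selection of domain components on top of a shared core.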
Repurposing to a Virtual Environmental Laboratory
Data Services: Climate Records, Species, Land Use, DEM, Landsat, Bathymetry
Processing Services: Wind Modelling, Analytics
Compute Services: NCI Petascale, NCI Cloud, NeCTAR Cloud, Amazon Cloud, Desktop
Enablers (e.g., OGC ‘Glue’): Service Orchestration, VGL Portal, Scripting Tool, Provenance Metadata
Unchanged [slide annotation; the Compute Services and Enablers lists match those of the Virtual Geophysics Laboratory]
Dynamic Virtual Environmental Laboratories
[Figure: example laboratories assembled from these components, with the labels Sat., Species, DEM, Weather, Bug tracking, Tsunami, VGL Portal, Amazon Cloud and NCI HPC]
Repurposing to a Virtual Geochemistry Laboratory?
[Figure: a Virtual Laboratory assembled from Data Services, Processing Services, Compute Services and Enablers (e.g., OGC ‘Glue’)]
Interfaces are critical
Sharing core components across multiple VLs enhances sustainability and is more cost efficient
To advance VREs further
To achieve our vision of virtual environments in which applications can access data from multiple domains anywhere, and then process it at their preferred location using the most appropriate software, there are three key issues in moving forward:
1. Technical – more effort needs to be put into standardisation of the interfaces that enable distributed systems to be loosely coupled and interact in real time
2. Social – more effort is needed in raising awareness of the potential of virtual environments and in working collaboratively to build globally shared infrastructures, particularly around software. Data may be local but software should be global!
3. Sustainability – is critical and needs to be part of the development plan; globally sharing developments will also enhance sustainability
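The technical point about standardised interfaces and loose coupling can be illustrated with a short Python sketch. It is a toy under stated assumptions: the `DataService` and `ProcessingService` protocols, their method names and the in-memory stand-ins are all hypothetical and do not represent any OGC or VGL interface.

```python
from typing import Protocol


class DataService(Protocol):
    """Anything that can return values for a bounding box."""
    def fetch(self, bbox: tuple) -> list: ...


class ProcessingService(Protocol):
    """Anything that can reduce a list of values to a result."""
    def run(self, values: list) -> float: ...


class InMemoryGrid:
    """Stand-in data service with fixed values (no network access)."""
    def __init__(self, values):
        self._values = list(values)

    def fetch(self, bbox):
        # The bounding box is ignored in this toy example.
        return list(self._values)


class MeanValue:
    def run(self, values):
        return sum(values) / len(values)


class MaxValue:
    def run(self, values):
        return max(values)


def run_workflow(data: DataService, process: ProcessingService, bbox):
    """Depends only on the two interfaces, not on concrete services."""
    return process.run(data.fetch(bbox))


grid = InMemoryGrid([1.0, 2.0, 3.0])
bbox = (0, 0, 1, 1)
print(run_workflow(grid, MeanValue(), bbox))
print(run_workflow(grid, MaxValue(), bbox))
```

Because `run_workflow` knows only the two interfaces, either side can be replaced without touching the other, which is the property the slide asks standardised interfaces to provide between distributed systems.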