Strategy for the UK Research Computing Ecosystem
October 2011
Cover images credits: (left to right) ITER Organization 2011; Virtual Physiological Human; Dr Kevin Stratford, EPCC; Dr Christopher Woods (University of Bristol), Dr Alessio Lodola (University of Parma) and Prof. Adrian Mulholland (University of Bristol).
Foreword
The current economic turmoil engulfing the world is having a serious impact on the UK’s economy. High-tech manufacturing and services are likely to be key factors in Britain’s recovery. But, in order to compete effectively at the international level, we must exploit our world-class research and innovation in academia, industry and government. To do this, the UK needs world-class facilities to enable key research. These include a computing "e-infrastructure" ranging from the desktop to the highest-performing supercomputers coupled with high-performance networks and data repositories.
This report outlines the urgent need for the UK to develop and deploy an integrated research computing infrastructure. It was born out of the many major challenges experienced hitherto across the UK academic community in gaining access to sufficiently robust resources to pursue its scientific objectives and to attain its many ambitious goals. Indeed this report represents a true consensus, involving as it does all those active in all forms of research computing from the arts and humanities to the sciences and engineering, from the desktop to the largest computers, and encompassing collaborations via campus and regional resources.
Our work started only five short months ago in May 2011. The report has only been made possible through the support of the UK e-Science Institute, funded by the UK Engineering and Physical Sciences Research Council, which provided ample funding and a national platform for consultative discussions through an e-Science Institute Mini-theme project. In particular, I must thank UCL Vice Provost for Research, Professor G David Price, for allowing me to lead this effort. Professor Price also introduced me to Brian Collins, formerly Chief Scientist at the UK Government Department of Business Innovation and Skills, whose advice has been invaluable.
I would also like to thank Chris Mellor, Jo Newman, Clem Harris, David de Roure and Malcolm Atkinson for their steadfast support throughout this project. I am grateful to the members of the UK e-Science Institute Working Group and other friends and colleagues who have assisted in the compilation of this report. They are listed in Appendix A. Three people in particular have made the writing of this report a genuine pleasure rather than a burden. They are Mark Parsons, Nour Shublaq and Frannie Wray: thank you all for your never failing assistance and for your friendship. Our efforts are already having an impact. Even prior to its completion, the report has played a significant role in initiating high-level policy discussions and funding decisions within the UK Government Department of Business, Innovation and Skills. We have become actively involved in providing input to an official UK Government Report commissioned by the Science and Universities Minister, the Rt Hon David Willetts MP. We hope our report will continue to exert influence long into the future.
Professor Peter Coveney University College London October 2011
Executive Summary
The fragmented UK funding regime and a lack of a coordinated policy and strategy for a research computing ecosystem have become increasingly problematic as researchers from both industry and academe have sought to access a wider range of facilities to support their research. These have resulted in situations where charging models have driven behaviour and the optimal resources have not always been affordable by specific projects nor have they been available when needed.
The overall objective of this document is to develop a strategy to overcome the fragmentation of the UK’s e-infrastructure. This fragmentation exists because the resources comprising that infrastructure are funded via different bodies representing the various, diverse user communities. The goal is to develop a holistic approach for the UK’s Research Computing Ecosystem involving the different Research Councils, funding bodies and user communities in a coherent collaboration. It is expected that this strategy will embrace not only the Research Councils and user communities, but will also provide input to government (BIS) and University Vice-Chancellors in their deliberations.
This report makes detailed recommendations which, in outline, propose that:
• A body should be established to govern the Research Computing Ecosystem. This body should represent all stakeholders, report directly to Government and follow a 10-year rolling plan. This Ecosystem needs to be holistically governed¹, recognising the roles of hardware, software and people and the different timescales involved;
• This rolling plan should cover the entire computing ecosystem down to an appropriate level, encompassing software, people and hardware including compute, data storage and networks and Cloud-based resources with the aims of maximising the benefits to UK research whilst realising optimum return on investment;
• There need to be flexibly funded mechanisms which enable rapid responses to hardware and software developments and to new research opportunities and requirements. Conversely, there needs to be long-term funding for ambitious software development projects;
• Researchers should be able to use the correct scale of infrastructure for their problem and move between levels easily, ranging from the research-group level, to university, regional and national facilities. Peer review and allocation mechanisms should adhere to international best practice and be responsive to the needs of research. Restrictive and detrimental charging mechanisms must be avoided and replaced by incentives. There should be a forward plan for funding, supporting all disciplines, both existing and new, which ensures that scientists have access to adequate resources. This is essential to ensure that UK science maintains its competitive position;
• There should be investment in developing the next generation of computationally aware researchers not only from the physical and life sciences, but also from the social sciences, arts and humanities. Cross-disciplinary interaction focused on productivity and participation should be encouraged with subsequent funding for research and deployment of solutions from mathematics, computer science and software engineering. Appropriate training needs to be given across a wide range of disciplines to ensure that researchers are able to use e-Infrastructure effectively.
¹ The term “govern” includes management, coordination, budgetary control and incentivisation.
Table of Contents
Introduction
The current status of the UK Research Computing Ecosystem
Building a UK Research Computing Ecosystem that delivers
Funding a sustainable Research Environment
Conclusions and Recommendations
Contributors
Purpose of this Document
The purpose of this document is to propose a holistic strategy for research computing in the UK which reflects the interests of the wider research community for whom a range of computing resources is a necessary tool for engagement in national and international research activities. This strategy presents a basis for the development of an implementation plan involving all the relevant stakeholders.
Overall objective
The overall objective of this document is to develop a strategy to overcome the fragmentation of the UK’s e-infrastructure. This fragmentation exists because the resources comprising that infrastructure are funded via different bodies representing the various, diverse user communities. The goal is to develop a holistic approach for the UK’s Research Computing Ecosystem involving the different Research Councils, funding bodies and user communities in a coherent collaboration. It is expected that this strategy will embrace not only the Research Councils and user communities, but will also provide input to government (BIS) and University Vice-Chancellors in their deliberations.
This strategy takes forward the discussions at an ‘all parties’ meeting held at the National e-Science Centre in March 2011, which included representatives from most key stakeholder communities (including national Supercomputing service providers, University High Performance Computing (HPC) providers, the e-Science community, Research Councils, JISC², JANET³, HEFCE⁴, SFC⁵, DELNI⁶ and HEFCW⁷). The Town Meeting on July 8th and the Oxford Meeting on September 9th brought these and other interested communities together again to continue and expand the dialogue started in March, leading to a strategic document for presentation to funding agencies, the UK government and UK Vice-Chancellors.
This document builds on the discussions at the March meeting and takes forward the strong consensus of the need for and benefits of a coherently managed e-Infrastructure in the UK confirmed at the meetings on July 8th and September 9th. In planning the Town Meeting and the Oxford Meeting, and in the subsequent development of this document, particular attention has been paid to the requirement to ensure that the wider research community is directly involved, and that their needs, concerns and ambitions are fully represented in the resultant strategy document.
² Joint Information Systems Committee
³ Joint Academic Network
⁴ Higher Education Funding Council for England
⁵ Scottish Funding Council
⁶ Department for Employment and Learning, Northern Ireland
⁷ Higher Education Funding Council for Wales
Introduction
Setting the scene
Research across a wide range of disciplines requires an ecosystem of computational resources (e-infrastructure) that can allow distributed collaboration and computation, large-scale simulation and analysis and fast access to data and facilities. To solve the complex problems arising from this research, an effective ecosystem requires the interoperation of high-performance computing, cloud computing, mid-range computer systems both University and departmental, databases, high-bandwidth networks, repositories of complex multimedia text, image and video resources and domain specific semantic vocabularies, instruments, sensors, software and skilled people, all often geographically dispersed, both nationally and internationally.
Nearly every field of research in the UK is dependent on e-infrastructure and this dependence will only increase. The growing importance of data collections to both ‘hard’ science and the humanities, and the need for a long-term commitment to developing and maintaining the necessary data infrastructure, should be noted.
The UK has world-leading skills in the exploitation of e-infrastructure for scientific research. These skills now need to be complemented by access to world-class computational and data resources in order to maintain the competitiveness of UK research and to underpin a competitive and innovative UK industry. In other words, an appropriate, fully integrated, coherently and comprehensively managed research computing infrastructure is now essential.
The capabilities of e-Infrastructure have been growing rapidly for over 50 years. Their full exploitation has required the continual development of new algorithms, tools, software and methods of working. Generally, academic research has played a pioneering role, pushing the development and exploitation of new technology.
The results and spin-offs from this research have been rapidly taken up by industry and widely deployed, often creating new mass markets and changing our lives. The primary vehicle for this technology transfer is skilled graduates and researchers who have developed their expertise at the cutting edge of academic research. Research computing also generates tools and software that facilitate the adoption of new applications by industry. The emergence of the highly dynamic European Independent Software Vendor (ISV) community bears clear testimony to this. Continued rapid growth in e-infrastructural capabilities offers the prospect of new kinds of knowledge. We will acquire deeper knowledge of how nature works, as our models are refined and expose phenomena that are inaccessible to experiment. We will be able to predict events reliably, enabling us to take remedial actions. We will acquire knowledge of behaviour without understanding how it comes about for systems that are too complex to model. Ultimately, there may be no natural or man-made phenomena whose behaviour cannot be predicted, giving us unprecedented control over our lives, our businesses, our environment and the economy.
The importance of the Research Computing Ecosystem
Such an ecosystem is essential to address important issues central to science, society and industry. For example, data-intensive processing is an important new field involving the processing of biological data. The growth in this area has been driven by so-called “next-generation” sequencing technologies, proteomics, and imaging technologies, all of which have been widely deployed across the life sciences, from medicine to field-based agriculture. The sequencing capacity per pound spent has doubled around every 6 months over the last four years, leading to a drop from ~$1 million for a genome sequence to around $10,000 today. This has underpinned profound new research avenues in diverse applications, such as cancer biology, the underlying genetic components of common disease and crop improvement, with resultant societal and economic benefits.
Whilst it is difficult to present a simple return on investment for computing infrastructure and HPC in particular, it should be noted that competitor nations are investing heavily in this area. The video “The Power of Supercomputing”⁸ produced by DreamWorks on behalf of the USA’s Council on Competitiveness contains the key message “The country that out-computes will be the one that out-competes.” As a further demonstration of the importance of HPC, the following should be noted:
• The 2012 US budget explicitly references exascale;
• In the State of the Union address (Jan 2011) President Obama highlighted the use of supercomputing as a way for the US to maintain its competitive economic edge. He made it clear that the US government will provide “cutting edge scientists and inventors with the support they need”;
• The US is currently spending about $2.5 billion per annum on HPC according to IDC, who also say that spending will have to rise to $5 billion in order for the US to remain competitive.
Significant societal data have been collected from a variety of sources. Understanding the behaviour of individuals, groups and societies is increasingly dependent upon a digital infrastructure for storage and analysis of digital archives. Studies of population growth and movement, food management, water resources, climate change and global economies are all crucially important for society and all require an effective national computing ecosystem and its integration with international resources, often at a global scale.
UK fusion research has benefitted tremendously from access to facilities such as HECToR to study turbulent plasma processes. UK scientists are contributing to the international initiative, ITER, where the objective is to produce fusion power that exceeds the heating power by a factor of 10. This initiative is an important step towards the goal of developing fusion energy. A UK HPC infrastructure is absolutely essential for strengthening the competitive position of UK scientists relative to their international colleagues in this important area, which potentially has very significant economic implications. In the biomolecular area, detailed computer simulations add an essential extra dimension to the investigation of biological molecules by allowing functionally relevant motions to be visualised. This is transforming the fundamental understanding of biochemistry and molecular evolution, and has practical applications such as the design of new drugs and environmentally friendly catalysts. In astrophysics there is an intimate connection between theoretical research, which today is overwhelmingly reliant on modelling and simulations using HPC, and the exploitation of data from expensive observational facilities. The UK continues to be a recognised world leader in theoretical astrophysics. For example, the most cited paper in astronomy published in Nature during the 2000s presents the "Millennium simulation", in which UK-based researchers played a key role. The Millennium simulation data have subsequently been used in at least 400 other papers by astronomers across the world. Furthermore, a research computing ecosystem is central to the exploitation of diverse scientific resources both in the UK and in Europe. The government has recently invested £97 million in the Diamond light source. This resource can only be fully exploited if it is supported by large-scale, high-performance computer modelling, for example, of molecules and cells. 
⁸ http://www.isgtw.org/visualization/why-advanced-computing-matters
Further examples of the uses of e-infrastructure combining computing, networking and data include:
• Research into climate change, dispersion of pollutants, next-generation power sources, energy distribution, nanoscience and new materials;
• Responses to emergencies and other time-critical incidents. These currently include simulation and prediction of fires, earthquakes and extreme weather events, but in the future will extend to real-time clinical decisions;
• The retrieval and mining of complex structured information from large corpora of historical texts, and the integration of this information with historical databases;
• The creation and documentation of ultra-high-resolution 3D scans of museum objects and associated machine-readable metadata, and making the resultant data widely available;
• Patient-specific computer models for personalised and predictive healthcare developed through the European Virtual Physiological Human programme;
• Engineering design in the automobile and aerospace industries, where the need for costly physical experiments can be avoided;
• Medicine, where complex surgery, tumour imaging and cancer treatments depend on advanced computing;
• Drug design, which depends both on advanced computing and access to large, diverse databases;
• Novel public transport planning services devised using accessible public data;
• Simulation of population dynamics and aging to predict future care requirements in support of planning by administrations.
To remain at the leading edge of data processing, computational methods and modelling, and for their results to be exploited by UK industry, commerce, healthcare and government, the users of e-infrastructure need access to appropriate databases, software and computing resources. By nurturing the innovative capability of this community, a supply of technologists, equipped with those data-intensive and computational skills necessary to support new industry and economic activity, can be assured.
The way forward
Investments in e-infrastructure are increasing in the USA, mainland Europe, Japan and China in recognition of its importance to research. An effective strategy and consequent implementation plan need to be put in place for the provision of a research computing ecosystem which provides the UK research community with the resources necessary for it to remain internationally competitive. It is important that these resources can interface to those of international initiatives to maximise the capabilities of UK science.
Sustaining the rapid growth in e-infrastructural capabilities today presents new challenges. Even if technological progress is disruptive, rather than evolutionary, a holistic approach will be necessary to exploit it effectively. Any strategy for research computing must support all relevant aspects in a balanced way. If we can do this and better provide and manage e-infrastructure, then the overall vision of UK science can be much more ambitious, the quality of its output greatly enhanced and its contribution to the development of the economy made much more significant.
Summary
The main points of this section are:
• An integrated computational infrastructure, which is comprehensively and coherently managed, is essential for world-class research and must be compatible with similar international initiatives (see recommendation #1 below);
• Our competitors are investing heavily in e-infrastructure (see recommendations #1, #7, #8 below);
• A research computing infrastructure requires the interoperation of computing, data, networks, sensors and people (see recommendations #1, #2, #5 below);
• Such an ecosystem is essential to address important issues central to science, society and industry (see recommendations #7, #11 below);
• Any strategy for research computing must support all relevant aspects in a balanced way (see recommendations #5, #6, #7 below);
• A successful strategy will enable research that is more ambitious, of higher quality and contributing more significantly to the national economy (see recommendation #8 below).
The current status of the UK Research Computing Ecosystem
The technical landscape of the Research Computing Ecosystem
Currently, research computing in the UK is in general rather well resourced thanks to several years of investment via, for example, SRIF, the e-Science programme and the national high-performance computing services. These are complemented by national data resources. Hardware facilities range from local desktop machines through departmental or University-provided clusters to powerful supercomputers and include a number of large-scale data repositories. Underpinning and providing access to these facilities is the SuperJANET network. For example, the National Service for Computational Software, addressing the needs of the chemistry community with both software and hardware, is an important national resource. Complementary to this, the National Grid Service and its European counterpart, EGI, are important components of a national and internationally collaborative research computing infrastructure.
This apparent richness, however, masks a landscape in which UK provision of computational resources is patchy and uncoordinated. Many of the most important facilities have grown out of specific research projects, with little thought to sustainability or broader exploitation, and often with little connection to institutional or national strategies.
Increasingly, a prerequisite for UK researchers to participate and lead at international and national levels is a research ecosystem that provides effective and efficient access to a wide range of computational resources, including data services and user-friendly support software, tools and libraries to aid the development of new software. In other words, it is essential for the UK to have a research ecosystem which is continually upgraded as technology, software, systems and research requirements evolve.
It should be noted that software and data are frequently crucial to research and have useful lifetimes much longer than those of individual hardware technologies and research projects. A balanced research computing ecosystem has to sustain that key software and data.
The political landscape of the Research Ecosystem
The current funding of the UK research ecosystem comes from many different sources and has evolved as a result of different funding bodies providing financial support for its different components. As a consequence the UK research ecosystem is currently managed and sustained by a range of funding and policy organisations, each of which represents different interest groups with different objectives. This fragmented structure provides some competitive incentives, but it also prevents users from efficiently using the research ecosystem, be it for scientific research, industrial applications or the management of large and valuable data collections.
For example, in the case of the UK’s highest-performance computing systems, there is currently no long-term strategy against which regional or institutional investments may be planned or optimised. The UK funding and scientific cases have to be constructed on an ad hoc basis with no guarantee that support will be forthcoming.
At a European level, the need for a more strategic approach to infrastructure has recently been recognised through coordinating initiatives such as PRACE (HPC), EGI (distributed computing), GEANT (networking), ELIXIR (biomedical data) and DARIAH (humanities). UK participation in these initiatives is often limited and focused around specific short-term research programmes, with little coordination between them. The engagement in PRACE, in particular, emphasises the lack of coherence in the UK strategy. Initially the UK was a leading player in this initiative, but subsequently took a significantly lesser role based on a decision by the Research Councils with very little input from the scientific community.
Why the current state is not favourable for the UK
The fragmented UK funding regime and a lack of a coordinated policy and strategy in the area of a research computing ecosystem have become increasingly problematic as researchers seek to access a wider range of facilities to support their research. As a result: Researchers wishing to tackle the largest and most challenging scientific problems are denied access to the higher end. There is a corresponding lack of support for “long-tail” research applications. Without any guarantee of future access to the necessary resources there is little incentive to invest years of effort into software development or common approaches. The lack of strategic planning and coordinated infrastructure encourages researcher communities to focus on meeting their own short-term needs, thereby inhibiting interdisciplinary research.
UK Science has suffered for over two decades from the lack of a coherent policy for high-performance computing and from consequent gaps in its provision. These have resulted in situations where charging models have driven behaviour and the optimal resources have not always been affordable by specific projects nor have they been available when needed.
There have been several instances where the fragmentation in the UK ecosystem has meant that UK scientists have only been able to remain competitive through exploiting systems overseas. Whilst this has been possible because of the quality of the scientific contribution of UK scientists to international collaborations, this approach is not sustainable if the UK cannot offer reciprocal resources. This requires an appropriate international HPC strategy underpinned by a strong, coherent national strategy addressing resources at all relevant scales.
In summary, the fragmented organisation of the current research infrastructure and the lack of a sustained vision, sustained international collaboration and sustained funding for a UK research ecosystem have resulted in a role of diminishing importance for the UK in the global research infrastructure landscape. As a result, the UK research ecosystem is increasingly unable to support ambitious endeavours such as peta/exascale supercomputing initiatives, data-intensive and data-dependent computing activities and large-scale collaborative digital science projects. The UK is no longer internationally competitive in the use of e-Infrastructures for science. This means we have to follow the leaders. There is no single point of engagement for international activities which makes it difficult for the UK to leverage and coordinate activities across national boundaries.
How to overcome these deficiencies
This unstructured research ecosystem with many differing funding components can be coordinated if a coherent and comprehensive strategy and management system are introduced. This will require a guaranteed 10-year budget which will enable the ecosystem to respond flexibly to the extremely rapid changes in both hardware and software and to nurture and reward high-quality long-term software development.
We should therefore remove the current impediments to scientific progress by working to unify the UK Research Computing Ecosystem, aiming to create a research environment where the compute, storage and networking resources at national facilities and universities are integrated and accessible and where the HPC resources seamlessly interoperate with the high-quality networking, storage and post-processing resources that modern researchers require.
Summary
The main points of this section are:
• The current UK provision of computational resources is patchy and uncoordinated (see recommendations #1, #2, #3, #8 below);
• It is essential for the UK to have a research ecosystem which is continually upgraded as technology, software, systems and research requirements evolve (see recommendations #3, #8 below);
• The current funding of the UK research ecosystem comes from many different sources, each of which represents different interest groups with different objectives (see recommendation #1 below);
• The fragmented UK funding regime and a lack of a coordinated policy and strategy for a research computing ecosystem seriously hamper the research community (see recommendation #1 below);
• A coherent and comprehensive strategy and management system for the UK’s e-infrastructure is needed (see recommendation #1 below).
Building a UK Research Computing Ecosystem that delivers
The vision for a research computing ecosystem
A successful research computing ecosystem for the UK must include the computing technologies, applications, support and training needed to deliver world-leading scientific research. The vision of the ecosystem must go beyond existing requirements. This ecosystem should allow its diverse user community to develop new methodologies and uses of the data and computing resources it delivers to enable new research across a broad range of disciplines.
This infrastructure needs to be flexible and able to incorporate new resources as they become available. For example, a current trend in the public sector is away from capital investment in physical infrastructure towards the use of commercial services on a pay-by-use basis through the Cloud. The research computing infrastructure needs to be able to exploit such resources where appropriate.
Clearly, the management of such an infrastructure must both reflect the requirements of users, that is, it should be driven by research needs in biology, physics, chemistry, the humanities etc., and be guided by that specialist technical knowledge necessary to enable it to operate effectively and economically. In this respect, a homogeneous, centralised management approach, supported by a robust and effective implementation plan, is essential for a production-quality, integrated service meeting the needs of academia and industry. Furthermore, such an approach is needed for coherent management of resources and to ensure value for money. An integrated and holistic approach would achieve economies as provisioning could be statistically planned, with negligible waste and with procurement and operational savings similar to those of commercial cloud systems.
The UK e-Infrastructure spans disciplines, geography and scale.
It ranges from departmental and University resources to regional, national and international provision and spans the whole range of data and compute resources. Increasingly the interdisciplinary teams tackling large-scale scientific and socio-economic challenges come from multiple collaborating organisations, and their research requires the federation of large-scale computing and data resources and the necessary networking infrastructure to allow these to be brought together. The currently fragmented ecosystem greatly hampers the potential productivity of this new research paradigm. An opportunity exists to transform the UK’s Research Computing Ecosystem and in so doing to enhance greatly the productivity of the UK’s research base across all Research Council priorities.
A holistic approach is needed
Academic and industry users of the UK’s e-Infrastructure have a broad spectrum of diverse but overlapping needs. A more holistic and coordinated approach to e-Infrastructure provision is proposed that will allow more effective and efficient provision and use of resources, removing practical but invisible barriers.
The current fragmented approach to e-Infrastructure provision, particularly with regard to access to compute and data, creates barriers to its use. This results in redundant, excessive bureaucracy that inhibits creativity and fosters a “silo mentality” in the user community. Without clear paths from one scale of resources to another, projects become stuck at a particular scale and fail to realise their true potential. Coupled with this, the requirements for a high-bandwidth network infrastructure are clear, particularly with the growth of data in all disciplines and the growth of many data-intensive disciplines. Cooperation between resource providers and Research Councils is essential to make the communities aware of the potential offered by the Research Computing Ecosystem and to support users effectively in achieving their goals.
The requirements of a successful research computing ecosystem
A successful ecosystem requires community awareness and coordination between providers, funding bodies and expertise. It also requires incentives to secure the commitment and involvement of both industry and academe. An important role of the body governing the Research Computing Infrastructure will be to put in place appropriate incentives to ensure that infrastructure is fully exploited and that the maximum benefit is realised for UK research through participation, collaboration and conformance to standards and best practice.
Involving the communities, both from industry and academe, when defining the research ecosystem is of prime importance, but once established it is also crucial that appropriate training is available. Likewise, it is important that community software is able to make optimal use of the available e-infrastructure resources. In particular we need to include basic and advanced theory and computational skills in undergraduate and graduate education. Except for the most able and motivated, graduate students in computational fields are not equipped with the skills (mathematics, numerical analysis, sequential and parallel programming, computer science, software design, etc.) necessary to contribute immediately to the development, or sometimes even the use, of applications. In a similar fashion, the provision of effective outreach and appropriate training to the UK user community remains a key requirement.
Research has a long-term reliance on e-infrastructure. Therefore, it is crucial that the UK e-infrastructure receives sustained and stable funding across the whole spectrum of provision from departmental, to university, regional and national levels. The strategy for this investment needs to span the whole Research Computing Ecosystem and Research Council priorities. By working together, opportunities for scientific discovery can be enhanced in all areas.
Summary
The main points of this section are:
• A successful research computing ecosystem needs effective management supporting the development of new methodologies across a broad range of disciplines (see recommendations #1, #5 below);
• This ecosystem needs to span disciplines, geography and scale (see recommendations #3, #6 below);
• The current fragmented approach to e-Infrastructure provision, particularly with regard to access to compute and data, creates barriers to its use (see recommendation #9 below);
• Training is an important aspect of e-infrastructure provision (see recommendations #7, #11 below);
• It is crucial that the UK e-infrastructure receives sustained and stable funding across the whole spectrum of provision from departmental, to university, regional and national (see recommendations #1, #2 below).
Funding a sustainable Research Environment
The opportunities
A move to an integrated e-infrastructure, which is comprehensively and coherently managed, presents many opportunities to provide a better, more cost-effective and better monitored service. These include:
• Economies of scale: These comprise availability of larger resources to individual users, more efficient utilisation of resources and stronger purchasing power with suppliers;
• Partnership with industry: The use by industry of national resources would bring benefits through the economies of scale. Conversely, industrial resources could be made available to the academic community thereby increasing the overall capability of the e-Infrastructure. Not only industry, but also charitable research organisations such as Cancer Research UK could be engaged in this way. A further benefit of such a partnership would be the stimulation of knowledge exchange between the UK’s universities, national facilities, research charities and businesses;
• Wider use of resources: The widest possible use of a national e-Infrastructure is important because it supports the case for that infrastructure and establishes a broader base to finance such infrastructure. This is clearly to the benefit of both industry and academe where the industrial or academic justification might not, on its own, be strong enough, but where, in the case of shared use, a valid case could be made;
• Better evaluation of the benefits of e-Infrastructure: It is important to measure the benefits of e-Infrastructure in order to assess its return on investment. This is necessary to justify funding. More coherent management of resources is an essential part of this process. It should be noted that societal benefits, such as improved medicine and healthcare, are an important component in assessing the overall return on investment.
The benefits of a unified research computing community
The benefits of a unified research computing ecosystem have been clearly presented in the preceding text. In summary, these benefits include access to better resources (computation, networks and data), expertise and training, and support for new, interdisciplinary projects extending over the boundaries of the Research Councils. The priority must be better research, unencumbered by the overheads of redundant bureaucracy and the unproductive, competitive manoeuvring of individual stakeholders. The current fragmented community is leading to isolated pockets of resources and expertise which do little to benefit the overall community and which act as a barrier to wider-scale collaboration. There is a clear need to combine resources and knowledge to improve the competitiveness of UK science and to maximise the return on the investment made by the taxpayer.
The outcome of this community effort to define the Research Computing Ecosystem
The UK Research Computing Ecosystem needs to embrace all the Research Councils and stakeholders to offer a holistic approach to the provision of an essential UK-wide research tool. This ecosystem needs to have the following features:
• It should be managed by a single body. This is essential for coherent and comprehensive management of the relevant resources. The remit of this body should extend down to an appropriate level, that is, it should not interfere with local autonomy where to do so would be neither reasonable nor productive (see recommendation #1 below);
• This body should have an annual budget which extends forward at least ten years. This is necessary for appropriate planning and deployment of resources including training and software development (see recommendation #1 below);
• Funding should cover the entire computing ecosystem: software, people, all levels and forms of hardware including compute, data storage, networks and Cloud-based resources (see recommendation #2 below);
• There needs to be a flexibility of funding enabling rapid responses to hardware and software developments and to new research opportunities and requirements (see recommendation #3 below);
• There needs to be long-term funding for ambitious software development projects, including the rewriting of legacy code, with the guarantee that suitable hardware resources will be provided to protect investment in software and to maximise the impact of its development (see recommendation #4 below);
• The ecosystem needs to maximise the use of resources at both the UK and international levels. This is essential if UK research is to remain world-class (see recommendation #8 below).
Conclusions and Recommendations
There are a number of major considerations that argue strongly for a paradigm shift in the current arrangements for the provision of a research computing ecosystem. It is clear that research computing will not realise its full potential without integrated and coherent management of the entire ecosystem involving significant restructuring of many research communities and funding streams. The following recommendations focus on fostering innovation, and enabling better science in the face of anticipated budget constraints.
• Recommendation #1: The Research Computing Ecosystem should be governed by a single body that represents all stakeholders, both providers and users and from academe and industry, and reports directly to BIS (or the equivalent government body responsible for UK research and innovation). The remit of this body should include a rolling ten-year strategic plan that recognises the diversity of user requirements and rapid technological developments, and coordinates investment and operations to maximise the benefits to UK research.
• Recommendation #2: The strategic plan should cover the entire computing ecosystem down to an appropriate level, encompassing software, people, hardware including compute, data storage and networks and Cloud-based resources.
• Recommendation #3: The funding agencies should recognise the need for flexibility of funding and put in place mechanisms to enable rapid responses to hardware and software developments and to new research opportunities and requirements.
• Recommendation #4: There needs to be long-term funding for ambitious software development projects, including the rewriting of legacy code, particularly targeting community codes, and for their on-going maintenance and support. This should be the end point of a support and development framework that starts with support for exploratory new projects. Access to the most expensive resources should require demonstration that the research quality and technical implementation meet the highest standards.
• Recommendation #5: Investments in research computing need to be holistically managed and recognise the different timescales involved. The current approach is hardware-centric and short-term (reflecting rapid technological development), but does not recognise the critical and long-term role that software and people play in the enterprise.
• Recommendation #6: The strategic plan should recognise that need exists at all levels of research computing from the research-group level, to university, regional and national facilities. It should not be a case of one or the other. Researchers should be able to use the correct scale of infrastructure for their problem and move between levels easily.
• Recommendation #7: Appropriate training needs to be given across a wide range of disciplines to ensure that researchers are aware of the capabilities afforded by e-Infrastructure and are able to use it effectively. The provision of effective outreach and appropriate training of the UK user community is a key requirement.
• Recommendation #8: The strategy should include a forward plan for funding which reflects the needs across all disciplines and which ensures that scientists have access to adequate resources at appropriate levels. This is essential to ensure that UK science maintains its competitive position. New user communities will need access to all levels of provision and there should be plans to stimulate and meet that need.
• Recommendation #9: It should be recognised that charging mechanisms for access can have a detrimental effect on the quality and range of the research supported and the utilisation of the resources. Peer review and allocation mechanisms should adhere to international best practice and be responsive to the needs of research.
• Recommendation #10: There should be encouragement for cross-disciplinary interaction focused on productivity and participation (communities, not just teams) with subsequent funding for research and deployment of solutions from mathematics, computer science and software engineering.
• Recommendation #11: There should be investment in developing the next generation of computationally aware researchers. This applies not only to the physical and life sciences, but also to the social sciences, arts and humanities.
By implementing these recommendations the UK Research Computing Ecosystem can be transformed to deliver a long-term e-Infrastructure that not only meets the needs of its current users, but can also respond rapidly to their changing needs and stimulate innovative use of these vital resources. This can only be to the benefit of UK research and to the UK’s economy and ultimately will maximise the taxpayer’s return on investment.
Contributors
This report was part-financed as an e-Science mini-Theme through the e-Science Institute at The University of Edinburgh. The mini-theme was proposed by:
Prof Peter Coveney, University College London
Prof Richard Kenway OBE, The University of Edinburgh
Theme leaders:
Prof Peter Coveney, University College London
Prof Richard Kenway OBE, The University of Edinburgh
Prof Ron Perott, Queen’s University Belfast
Prof David de Roure, University of Oxford
Prof Anne Trefethen, University of Oxford
Report editor:
Prof Francis Wray, Kingston University
Other contributors:
Prof Malcolm Atkinson, The University of Edinburgh
Dr Philip Biggin, University of Oxford
Dr Ewan Birney, European Bioinformatics Institute
Dr Richard Blake, Science and Technology Facilities Council
Dr Pete Bond, University of Cambridge
Prof David Britton, University of Glasgow
Dr John Brooke, The University of Manchester
Dr Richard Bryce, The University of Manchester
Dr Stephen Butcher, Higher Education Funding Council for England
Dr Leo Caves, The University of York
Dr Stuart Dunn, King’s College London
Prof Jonathan Essex, University of Southampton
Prof Jonathan Flynn, University of Southampton
Prof Carlos Frenk, Durham University
Dr Neil Geddes, Science and Technology Facilities Council
Prof Carole Goble, The University of Manchester
Dr Clare Gryce, University College London
Dr Derek Groen, University College London
Dr Sarah Harris, University of Leeds
Dr Richard Henchman, The University of Manchester
Dr Andrew Herbert, Microsoft Research EMEA
Dr Syma Khalid, University of Southampton
Prof Charles Laughton, The University of Nottingham
Dr Brian Lawrence, Natural Environment Research Council
Dr Alex Leach, The University of York
Prof Ben Leimkuhler, The University of Edinburgh
Dr Julien Michel, The University of Edinburgh
Prof Adrian Mulholland, University of Bristol
Prof Sally Norman, University of Sussex
Dr Mark Parsons, The University of Edinburgh
Prof Mike Payne, University of Cambridge
Dr Colin Roach, Culham Centre for Fusion Energy
Prof Michael Robb, Imperial College London
Dr David Salmon, JANET
Prof Mark Samson, University of Oxford
Dr Nour Shublaq, University College London
Dr Ian Stewart, University of Bristol
Prof Arthur Trew, The University of Edinburgh
Dr Alexander Voss, University of St Andrews
Dr Christopher Woods, University of Bristol
Dr Jeremy Yates, University College London
This report can be downloaded from www.esi.ac.uk/files/esi/ResearchComputing.pdf
Printed on behalf of the UK eScience community by The University of Edinburgh