The Concept of Grid. Grid-technologies and cloud computing in science, education and business
Tatiana Strizh
Scientific Secretary, Laboratory of Information Technologies, JINR

Dubna, 28.10.10 T.Strizh (LIT, JINR)

1. Challenges of scientific computing – distributed computing and Grid
2. Cloud computing
3. Grids and Clouds

T.Strizh (LIT, JINR)

• In 1961, computing pioneer John McCarthy predicted that "computation may someday be organized as a public utility".

• In the mid-1990s, the term "Grid" was coined to describe technologies that would allow consumers to obtain computing power on demand. T.Strizh (LIT, JINR)

Challenges of scientific computing – distributed computing and Grid T.Strizh (LIT, JINR)

Industry Journey: Old World → New World
• Static → Dynamic
• Solo → Shared
• Physical → Virtual
• Manual → Automated
• Application → Service

T.Strizh (LIT, JINR)

We do e-Science

• "e" as in e-mail: digital, distributed
• Science that is:
  • computationally intensive,
  • operates on massive digital data sets,
  • carried out in a distributed network environment.
• High-Throughput vs High-Performance Computing:
  • HTC: distributed (serial tasks), free cycles, cheap
  • HPC: compact (parallel tasks), booked years ahead, expensive
• High-Energy Physics is a textbook example of e-science

T.Strizh (LIT, JINR)

New instruments, more data, more scientists, more computers

1989 - WWW born at CERN ... today

T.Strizh (LIT, JINR)

Modern HEP data processing: a workflow of very different tasks
• Event generation (Pythia)
• Detector simulation (Geant)
• Hit digitization
• Reconstruction
• Analysis data preparation
• Analysis, results (ROOT)

Slide adapted from Ch.Collins-Tooth and J.R.Catmore

T.Strizh (LIT, JINR)
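As an illustration of the final "Analysis (ROOT)" step of the workflow above, here is a minimal PyROOT sketch that fills and stores a histogram; the input file, tree and branch names are hypothetical and not taken from any particular experiment.

```python
# Minimal analysis sketch with PyROOT (assumes a local ROOT installation).
# "reconstructed_events.root", the "Events" tree and the "pt" branch are hypothetical.
import ROOT

f = ROOT.TFile.Open("reconstructed_events.root")
tree = f.Get("Events")

# Book a histogram of transverse momentum and fill it from the tree
h = ROOT.TH1F("h_pt", "Transverse momentum;p_{T} [GeV];events", 100, 0.0, 200.0)
for event in tree:
    h.Fill(event.pt)

# Store the result for later plotting with ROOT
out = ROOT.TFile("analysis_output.root", "RECREATE")
h.Write()
out.Close()
```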

Software for HEP experiments

T.Strizh (LIT, JINR)

Grid to the rescue

T.Strizh (LIT, JINR)

Origin of the word "Grid"
• Ian Foster and Carl Kesselman organized a workshop entitled "Building a Computational Grid" at Argonne National Laboratory in 1997; at that moment the term "Grid" was born.
• It refers to computing grids by analogy with power grids:
  • many producers,
  • competing providers,
  • simple for end-users.
• Spelled "grid" or "Grid".

T.Strizh (LIT, JINR)

Why Grid? — The Changing Nature of Work

Collaborative & Dynamic: project-focused, globally distributed teams, spanning organizations within and beyond company boundaries.

Distributed & Heterogeneous: each team member/group brings its own data, compute, and other resources into the project.

Data & Computation Intensive: access to computing and data resources must be coordinated across the collaboration.

Concurrent Innovation Cycles: resources must be available to projects with strong QoS, and must also reflect enterprise-wide business priorities.

IT must adapt to this new reality. T.Strizh (LIT, JINR)

What's a Grid?
1999, The Grid: Blueprint for a New Computing Infrastructure: "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities."
2001, The Anatomy of the Grid: Enabling Scalable Virtual Organizations: ". . . coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations." (The International Journal of High Performance Computing Applications, Volume 15, Number 3, Fall 2001)

2002, Ian Foster's checklist:
• Computing resources are not administered centrally;
• Open standards are used;
• Non-trivial quality of service is achieved.
(GridToday, July 2002)

T.Strizh (LIT, JINR)

Five big ideas
• Resource sharing: global sharing is the very essence of grid computing.
• Secure access: trust between resource providers and users is essential, especially when they don't know each other. Sharing resources conflicts with security policies in many individual computer centres, and on individual PCs, so getting grid security right is crucial.
• Resource use: efficient, balanced use of computing resources is essential.
• The death of distance: distance should make no difference; you should be able to access computing resources from wherever you are.
• Open standards: interoperability between different grids is a big goal, driven forward by the adoption of open standards for grid development, making it possible for everyone to contribute constructively. Standardization also encourages industry to invest in developing commercial grid services and infrastructure.
Slide from Grid Café

T.Strizh (LIT, JINR)

The Grid Paradigm
• A distributed "supercomputer" making use of fast WANs
• Access to a great variety of resources with a single pass: a digital certificate
• Distributed data management
• A new scientific tool

The Grid is a result of IT progress: Mainframe → Workstations → PC Farm → The Grid
(Graph from "The Triumph of the Light", G. Stix, Sci. Am., January 2001) T.Strizh (LIT, JINR)

GÉANT - Pan-European research and education network

34 NRENs, ~40M users, 38 European countries
Dark fibre core among 19 countries: Austria, Belgium, Croatia, Czech Republic, Denmark, Finland, France, Germany, Hungary, Ireland, Italy, Netherlands, Norway, Slovakia, Slovenia, Spain, Sweden, Switzerland, United Kingdom
GÉANT network topology: lighting dark fibre for greater network performance
50k km leased lines, 12k km dark fibre

T.Strizh (LIT, JINR)

GÉANT International Connectivity: links to networks in other world regions include extensive connectivity to North America as well as to TEIN (Asia-Pacific), ALICE (Latin America), EUMEDCONNECT (Mediterranean), ORIENT (China) and the UbuntuNet Alliance (Southern Africa). GÉANT is also working towards connecting to Central Asia (CAREN) and South-Eastern Africa. T.Strizh (LIT, JINR)

From the conventional HPC…

To the Grid

T.Strizh (LIT, JINR)

Middleware lets users simply submit jobs to the Grid without having to know where the data is or where the jobs will run. The software can run the job where the data is, or move the data to where CPU power is available. Using the Grid and middleware, all the user has to do is submit a job and pick up the results. T.Strizh (LIT, JINR)

Acting as the gatekeeper and matchmaker for the Grid, middleware
• monitors the Grid,
• decides where to send computing jobs,
• manages users, data and storage,
• checks the identity of the user through the use of digital certificates.
For users, the Grid behaves like one integrated computer system with a single log-on; the middleware handles all the negotiations, the submission of jobs and the collation of the results. T.Strizh (LIT, JINR)

• A digital certificate is a file stored securely on a user's computer which allows the Grid to correctly identify that user. Certificates are issued by a Certification Authority, with numerous steps to ensure that the person applying is who they say they are.
• The middleware automatically extracts the user's identity from the digital certificate and uses it to log them in. This means users don't have to remember user names and passwords to log onto the Grid; they are automatically logged on with their Grid certificate.
• After this seamless identification process, the middleware finds the most convenient and efficient places for the job to run and organises efficient access to the relevant data. It deals with
  • authentication to the different sites being used,
  • running the jobs,
  • keeping track of progress,
  • letting the user know when the work is complete and transferring the result back. T.Strizh (LIT, JINR)
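The slides later name gLite as one of the middleware flavours used by the LHC experiments; as a rough, user-side sketch (not an official API), this certificate-then-submit flow could be scripted around the gLite 3.x WMS command-line tools as below. The VO name and JDL file name are hypothetical placeholders.

```python
# Sketch of the gLite WMS user workflow: proxy, submit, poll, fetch output.
# Assumes the gLite client tools are installed; "myexperiment" and "analysis.jdl"
# are hypothetical placeholders.
import subprocess
import time

def run(cmd):
    """Run a middleware command and return its standard output."""
    print("$", " ".join(cmd))
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# 1. Create a short-lived VOMS proxy from the user's Grid certificate
run(["voms-proxy-init", "--voms", "myexperiment"])

# 2. Submit a job described in a JDL file; the broker decides where it runs
out = run(["glite-wms-job-submit", "-a", "analysis.jdl"])
job_id = out.strip().splitlines()[-1]   # simplified: assume the job URL is the last line

# 3. Poll the status until the job is done, then retrieve the output sandbox
while "Done" not in run(["glite-wms-job-status", job_id]):
    time.sleep(60)
run(["glite-wms-job-output", "--dir", "results", job_id])
```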

http://www.eugridpma.org/members/worldmap

Grid Computing aims to "enable resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations". Grids provide a distributed computing paradigm or infrastructure that spans multiple virtual organizations (VOs), where each VO can consist of either physically distributed institutions or logically related projects/groups. T.Strizh (LIT, JINR)

Grids for different needs
• National grids are hosted by one country. Such grids are useful in emergency situations, such as earthquakes or terrorist attacks. National grids can also support scientific investigations, such as studies of climate change, space station design and environmental cleanup.
• Project grids are created to work on a specific goal. They are typically constructed from shared resources for a limited time and are designed to meet the needs of multi-institutional research groups and "virtual teams". The LHC Computing Grid (LCG) is an example of a project grid; it was set up to help with the Large Hadron Collider high-energy physics experiment.
• Private grids, sometimes called local grids or intra-grids, are used by institutions such as hospitals and corporations. These grids are relatively small and centrally managed.
• etc. T.Strizh (LIT, JINR)

Global Community

T.Strizh (LIT, JINR)

Who uses Grid

T.Strizh (LIT, JINR)

EGEE (Enabling Grids for E-sciencE)

The aim of the project is to create a global, pan-European computing infrastructure of the Grid type:
- Integrate regional Grid efforts
- Represent leading grid activities in Europe

10 Federations, 27 Countries, 70 Organizations

EGEE-III INFSO-RI-222667

EGEE-III Enabling Grids for E-sciencE

Flagship Grid infrastructure project co-funded by the European Commission

Main objectives:
– Expand/optimise the existing EGEE infrastructure, include more resources and user communities
– Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives

EGEE-III INFSO-RI-222667

Duration: 2 years; Consortium: ~140 organisations across 33 countries; EC co-funding: €32 million

The EGEE project - Bob Jones - EGEE'08 - 22 September 2008


EGEE – What can we deliver? Enabling Grids for E-sciencE

• Infrastructure operation – Currently includes >270 sites across 50 countries – Continuous monitoring of grid services & automated site configuration/management

– Support ~300 Virtual Organisations from diverse research disciplines

• Middleware – Production-quality middleware distributed under a business-friendly open source licence

• User Support - Managed process from first contact through to production usage – Training – Expertise in grid-enabling applications – Online helpdesk – Networking events (User Forum, Conferences etc.) EGEE-III INFSO-RI-222667

Collaborating e-Infrastructures Enabling Grids for E-sciencE

Potential for linking ~80 countries by 2008 EGEE-III INFSO-RI-222667

Enabling Grids for E-sciencE

Archeology Astronomy Astrophysics Civil Protection Comp. Chemistry Earth Sciences Finance Fusion Geophysics High Energy Physics Life Sciences Multimedia Material Sciences … EGEE-III INFSO-RI-222667

350 sites, 55 countries, 120,000 CPUs, 26 PetaBytes, >15,000 users, >300 VOs, >370,000 jobs/day

The LHC Machine

T.Strizh (LIT, JINR)

What is an LHC "Tier1" centre?
• WLCG: Worldwide LHC Computing Grid
  • A CERN project aiming to provide the HEP computing infrastructure
  • Tiered structure: Tier0 at CERN, about a dozen regional Tier1s, local Tier2s
• A WLCG Tier1 provides:
  • storage for replicated data (tapes and disks),
  • a data indexing service,
  • computing power,
  • a 24/7 on-call support system,
  • infrastructure: network, power, cooling, safety etc.,
  • file transfer services between Tiers,
  • experiment-specific interfaces ("VOBoxes"),
  • database services,
  • etc.

Illustration: Sandbox Studio

T.Strizh (LIT, JINR)

ATLAS Multi-Grid Infrastructure

Graphics from a slide by A.Vaniachine

T.Strizh (LIT, JINR)

Grids in LHC experiments
• Almost all Monte Carlo production and data processing today is done via the Grid.
• There are 20+ Grid flavours out there; almost all are tailored for a specific application and/or specific hardware.
• LHC experiments make use of only 3 Grid flavours: gLite, ARC and OSG.
• All experiments develop their own higher-level Grid middleware layers:
  • ALICE – AliEn
  • ATLAS – PanDA, GANGA, DDM
  • LHCb – DIRAC, GANGA
  • CMS – ProdAgent, CRAB, PhEDEx
T.Strizh (LIT, JINR)

Additional benefits of a Grid system
During the development of the LHC Computing Grid, many additional benefits of a distributed "grid" system became apparent:
• Multiple copies of data can be kept at different sites, ensuring access for all scientists involved, independent of geographical location.
• Spare capacity can be used optimally across multiple computer centres, making the system more efficient.
• Having computer centres in multiple time zones eases round-the-clock monitoring and the availability of expert support.
• There are no single points of failure.
• The cost of maintenance and upgrades is distributed, since individual institutes fund local computing resources and retain responsibility for them, while still contributing to the global goal.
• Independently managed resources have encouraged novel approaches to computing and analysis.
• So-called "brain drain", where researchers are forced to leave their country to access resources, is reduced when resources are available from their desktop.
• The system can be easily reconfigured to face new challenges, so it can evolve dynamically throughout the life of the LHC, growing in capacity to meet the rising demands as more data is collected each year.
• It provides considerable flexibility in deciding how and where to provide future computing resources.
• It allows the community to take advantage of new technologies that may appear and that offer improved usability, cost effectiveness or energy efficiency. T.Strizh (LIT, JINR)

Bioinformatics and Grid
• Many large clusters are utilized for
  • services: sequence similarity (BLAST queues),
  • research: molecular modelling (folding, docking), training of novel predictors.
• Jobs are typically short (~3 minutes), but plentiful (all-against-all comparisons → 10^12 jobs).
• Considerable preparation is needed for a single job (a couple of gigabytes of data to transfer).

T.Strizh (LIT, JINR)

Biomedical applications
• Biomedicine is also a pilot application area.
• More than 20 applications are deployed and being ported.
• Three sub-domains:
  • medical image processing,
  • biomedicine,
  • drug discovery.
• They use the Grid as a platform for collaboration (they don't need the same massive processing power or storage as HEP). T.Strizh (LIT, JINR)

Application example: WISDOM
• A Grid-enabled drug discovery process for neglected diseases
  • In silico docking: compute the probability that potential drugs dock with a target protein
  • The goal is to speed up and reduce the cost of developing new drugs
• WISDOM (World-wide In Silico Docking On Malaria):
  • Three large-scale deployments, with more than 6 centuries of computation achieved in 190 days
  • 3.5 TB of data produced
  • Up to 5,000 computers in 50 countries
  • Some promising in-vitro tests, with relevant biological results
  • Classical pharmaceutical development needs 15 years; using WISDOM, 3 years T.Strizh (LIT, JINR)
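The scale of such campaigns comes from splitting one virtual screening run into a very large number of short, independent Grid jobs. Purely as an illustration (the file names, docking script and JDL fields below are hypothetical and do not describe the actual WISDOM tooling), a parameter sweep of this kind could be generated like this:

```python
# Hypothetical generator of per-compound docking jobs for a Grid parameter sweep.
# The ligand files, run_docking.sh script and target protein are placeholders.
compounds = [f"compound_{i:06d}.mol2" for i in range(1000)]   # one ligand per job

jdl_template = """Executable    = "run_docking.sh";
Arguments     = "target_protein.pdb {ligand}";
InputSandbox  = {{"run_docking.sh", "target_protein.pdb", "{ligand}"}};
OutputSandbox = {{"scores.txt"}};
"""

for ligand in compounds:
    with open(f"dock_{ligand}.jdl", "w") as jdl:
        jdl.write(jdl_template.format(ligand=ligand))

# Each generated JDL file would then be submitted with the middleware's job
# submission command, as sketched in the middleware section above.
```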

neuGRID

neuGRID is a user-friendly grid-based e-Infrastructure which enables the neuroscience community to collect and archive large amounts of imaging data and to access resources for computationally intensive data analyses. neuGRID allows neuroscientists to identify neurodegenerative disease markers through the analysis of brain images, thanks to an innovative new set of distributed medical and grid services. The neuGRID e-Infrastructure provides European neuroscientists working in the field of Alzheimer's disease imaging with an environment that has hooks to the largest image datasets to date (North American ADNI and AddNeuroMed), together with applications and algorithms that can work with these images. The infrastructure is designed to be expandable to other medical applications. T.Strizh (LIT, JINR)

Astronomy needs Grid, too
• Enormous datasets, massive computing, innovative instrumentation
  • Dozens of new surveys launched recently
  • Many (10-100) terabytes per survey
  • High data rates
  • 10-100 researchers per survey
  • International collaborations (almost always)
  • Data is non-proprietary (usually)

T.Strizh (LIT, JINR)

Computational Chemistry
• GEMS (Grid Enabled Molecular Simulator) application
  • Calculation and fitting of electronic energies of atomic and molecular aggregates (using high-level ab initio methods)
  • The use of statistical kinetics and dynamics to study chemical processes
• Virtual monitors
  • Angular distributions
  • Vibrational distributions
  • Rotational distributions
  • Many-body systems
• End-user applications
  • Nanotubes
  • Life sciences
  • Statistical thermodynamics
  • Molecular virtual reality
T.Strizh (LIT, JINR)

Fusion
• Large nuclear fusion installations, e.g. the International Thermonuclear Experimental Reactor (ITER)
  • Distributed data storage and handling needed
  • Computing power needed for:
    • making decisions in real time,
    • solving kinetic transport → particle orbits,
    • stellarator optimization → the magnetic field to contain the plasma.
T.Strizh (LIT, JINR)

Earth Science Applications
• Community: many small groups that aggregate for projects (and separate afterwards)
• The Earth: a complex system
  • Independent domains with interfaces: solid Earth – ocean – atmosphere
  • Physics, chemistry and/or biology
• Applications:
  • Earth observation by satellite
  • Seismology
  • Hydrology
  • Meteorology, space weather
  • Climate
  • Geosciences
  • Mars atmosphere
  • Pollution
  • Database collection
T.Strizh (LIT, JINR)

Earth Sciences: earthquake analysis
• Seismic software determines the epicentre, magnitude and mechanism of an earthquake
  → may make it possible to predict future earthquakes
  → assess the potential impact on specific regions
• Analysis of the Indonesian earthquake (28 March 2005):
  • Data from the French seismic sensor network GEOSCOPE were transmitted to IPGP within 12 hours after the earthquake
  • A solution was found within 30 hours after the earthquake occurred, 10 times faster on the Grid than on local computers
  • Results: not an aftershock of the December 2004 earthquake; different location (a different part of the fault line, further south); different mechanism
• Rapid analysis of earthquakes is important for relief efforts T.Strizh (LIT, JINR)

T.Strizh (LIT, JINR)

T.Strizh (LIT, JINR)

BEinGRID (business experiments in GRID) has been successfully conducting real-world experiments to provide, use and validate Grid technologies that meet today's business challenges: 25 business pilots.
• Advanced Manufacturing: Computational Fluid Dynamics; Integration of Engineering and Business Processes in Metal Forming; New Product and Process Development; Shipbuilding Integrates Grid Technology; Workflows on Web 2.0
• Environment and e-Science: Grid-based Groundwater Modelling with FEFLOW; Earth Observation; Seismic Processing and Reservoir Simulation
• Financial Sector: Financial Portfolio Management; Risk Management in Finance; Data Recovery Service; Anti-Money Laundering in Grid (AMONG)
• Telecommunication: Anti-Fraud Grid-Based System
• Tourism: Travel CRM
• Agriculture: Grid Technologies in Agro-Food Business (AgroGrid)
• Health: Enhanced IMRT Planning using Grid Services on Demand with SLAs (BEinEIMRT)
• Retail and Logistics: Retail Management; Collaborative Environment in Supply Chain Management for Pharmaceutics; Sales Management System; Logistics and Distribution Optimisation; Grid Technologies within B2B Networks
• Media: Textile Grid Portal; Movie Post-Production Workflow; Visualisation and Virtual Reality; Virtual Hosting Environment
T.Strizh (LIT, JINR)

eLearning and grids
Benefits of eLearning:
• Greater flexibility in terms of time, location and structure
• It can be tailored to the individual
• Improved accessibility to training and education, regardless of age, ability or social integration
• It is cost-effective
• It encourages student-to-student interaction

T.Strizh (LIT, JINR)

European Learning Grid Infrastructure (ELeGI)
• From 2004 to 2007, the European Learning Grid Infrastructure (ELeGI) project aimed to incorporate grids into eLearning.
• A Learning Grid is an enabling architecture based on three pillars (Grid, Semantics and Educational Modelling), allowing the definition and execution of learning experiences.
T.Strizh (LIT, JINR)

On-demand Grid services for higher education and training in Earth observation: eLearning (eGLE)
Consortium: Western University of Timisoara, Romanian Space Agency, Technical University of Cluj-Napoca, National Institute for Aerospace Research "Elie Carafoli"
• Lesson (model, content, authoring, management, execution, etc.)
• Learning policies (presentation, interaction techniques, knowledge evaluation, learning tracking)
• Grid
  • Distributed tools (lesson management, authoring, execution, etc.)
  • Services (functionality, completeness, description, searching, composition)
  • Workflow-based process description
• Earth observation (subjects, user community, data, information visualization, teaching)
T.Strizh (LIT, JINR)

How to set up your own Grid

Slide from O.Smirnova (NDGF)

T.Strizh (LIT, JINR)

T.Strizh (LIT, JINR)

A Grid infrastructure for training and education consists of three grid sites located at JINR and one site in each of the following organisations:
• Institute of High Energy Physics, IHEP (Protvino),
• Institute of Mathematics and Information Technologies of the Academy of Sciences of the Republic of Uzbekistan, IMIT (Tashkent, Uzbekistan),
• Sofia University "St. Kliment Ohridski", SU (Sofia, Bulgaria),
• Bogolyubov Institute for Theoretical Physics, BITP (Kiev, Ukraine),
• National Technical University of Ukraine "Kyiv Polytechnic Institute", KPI (Kiev, Ukraine).
T.Strizh (LIT, JINR)

Useful References:
• Grid Café: http://www.gridcafe.org
• OPEN GRID FORUM: http://www.ogf.org
• GLOBUS: http://www.globus.org
• TERAGRID: http://www.teragrid.org
• Open Science Grid: http://opensciencegrid.org
• LCG: http://lcg.web.cern.ch/LCG
• EGEE: http://www.eu-egee.org
• EGEE-RDIG: http://www.egee-rdig.ru
• EGI: http://web.eu-egi.eu
• International Science Grid This Week: http://www.isgtw.org
• GridClub: http://gridclub.ru
• Grid at JINR: http://grid.jinr.ru

T.Strizh (LIT, JINR)

What next?

T.Strizh (LIT, JINR)

European e-Infrastructure Enabling Grids for E-sciencE

• Need to prepare a permanent, common Grid infrastructure
• Ensure the long-term sustainability of the European e-infrastructure, independent of short project funding cycles
• Coordinate the integration and interaction between National Grid Initiatives (NGIs)
• Operate the European level of the production Grid infrastructure for a wide range of scientific disciplines, linking the NGIs

EGEE-III INFSO-RI-222667

The EGEE project - Bob Jones - EGEE'08 - 22 September 2008


Cloud computing

T.Strizh (LIT, JINR)

"Cloud computing is perhaps the most talked-about shift in the technology industry today. The concept of running applications from the cloud is quickly evolving from a futuristic vision to a commercially viable alternative for mainstream business. A recent survey of global 2,000 companies revealed that 30% are already using cloud infrastructure to host their applications, and another 20% plan to do so by next year."

Bill Loumpouridis (http://billloumpouridis.syscon.com/)

The term cloud is used as a metaphor for the Internet. This is because in computer network diagrams the internet is often illustrated as a cloud. T.Strizh (LIT, JINR)

One of the first milestones for cloud computing was Salesforce.com in 1999, which pioneered the concept of delivering enterprise applications via a simple website. http://www.salesforce.com/

https://www.mturk.com/mturk/welcome

http://aws.amazon.com/

The next development was Amazon Web Services in 2002, which provided a suite of cloud-based services including storage, computation and even human intelligence through the Amazon Mechanical Turk. Then in 2006, Amazon launched its Elastic Compute Cloud (EC2) as a commercial web service that allows small companies and individuals to rent computers on which to run their own computer applications.

Another big milestone came in 2009, as Web 2.0 hit its stride, when Google and others started to offer browser-based enterprise applications through services such as Google Apps.
http://code.google.com/intl/en/appengine/
http://www.computerweekly.com

T.Strizh (LIT, JINR)

T.Strizh (LIT, JINR)

There are dozens of different definitions of Cloud Computing, and there seems to be no consensus on what a Cloud is. On the other hand, Cloud Computing is not a completely new concept. It has intricate connections to:
• the Grid Computing paradigm,
• utility computing,
• cluster computing,
• distributed systems in general.

I. Foster et al. Cloud Computing and Grid Computing 360-Degree Compared

T.Strizh (LIT, JINR)

Definition A large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet. Cloud Computing is a specialized distributed computing paradigm; it differs from traditional ones in that 1) it is massively scalable, 2) can be encapsulated as an abstract entity that delivers different levels of services to customers outside the Cloud, 3) it is driven by economies of scale, 4) the services can be dynamically configured (via virtualization or other approaches) and delivered on demand.

I. Foster et al. Cloud Computing and Grid Computing 360-Degree Compared

T.Strizh (LIT, JINR)

What is driving Cloud Computing?

Customer:
• In one word: economics
• Faster, simpler, cheaper to use cloud applications
• No upfront capital required for servers and storage
• No ongoing operational expenses for running a datacenter
• Applications can be accessed from anywhere, anytime

Vendor:
• Easier for application vendors to reach new customers
• Lowest-cost way of delivering and supporting applications
• Ability to use commodity server and storage hardware
• Ability to drive down data center operational costs
• In one word: economics

T.Strizh (LIT, JINR)

Andy Bechtolsheim, Chairman & Co-founder, Arista Networks T.Strizh (LIT, JINR)

What is the relationship between virtualization and cloud computing? Virtualization is the ability to run “virtual machines” on top of a “hypervisor.” A virtual machine (VM) is a software implementation of a machine (i.e., a computer) that executes programs like a physical machine. Each VM includes its own kernel, operating system, supporting libraries and applications. A hypervisor provides a uniform abstraction of the underlying physical machine. Multiple VMs can execute simultaneously on a single hypervisor. Virtualization is seen as an enabler for cloud computing, allowing the cloud computing provider the necessary flexibility to move and allocate the computing resources requested by the user wherever the physical resources are available.

T.Strizh (LIT, JINR)
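As a small, concrete illustration of the hypervisor abstraction described above, the libvirt Python bindings can enumerate the virtual machines running on a local hypervisor; the KVM/QEMU connection URI below is the conventional one and is an assumption about the local setup.

```python
# Sketch using the libvirt Python bindings (assumes libvirt-python and a local
# KVM/QEMU hypervisor reachable at the standard "qemu:///system" URI).
import libvirt

conn = libvirt.open("qemu:///system")

# Each "domain" is one virtual machine managed by the hypervisor
for dom in conn.listAllDomains():
    state, max_mem, mem, vcpus, cpu_time = dom.info()
    print(dom.name(), "vCPUs:", vcpus, "memory (KiB):", mem)

conn.close()
```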

How are clouds classified?
• "Service style": depending on the portion of the software stack delivered as a service, *-as-a-Service (*aaS), where * = I, P, S, D, H, etc.
• Cloud "types" refer to the nature of access and control with respect to the use and provisioning of virtual and physical resources.

T.Strizh (LIT, JINR)

Main cloud services

SaaS: allows users to run existing online applications.
• Free, or paid via subscription
• Accessible from any computer
• Facilitates collaborative working
• Generic applications are not always suitable for business use
• Examples: Google Docs, Zoho Creator, Pixlr, etc.

PaaS: allows users to create their own cloud applications using supplier-specific tools and languages.
• Rapid development at low cost
• Private or public deployment
• Limits developers to the provider's languages and tools
• Risk of vendor lock-in
• Examples: Force.com, Google App Engine, etc.

IaaS: allows users to run any applications they please on cloud hardware of their own choice (collections of virtualized computer hardware resources, including machines, network and storage).
• Examples: Amazon Web Services (EC2), GoGrid, etc.

http://explainingcomputers.com/clouddir.html
T.Strizh (LIT, JINR)
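As an IaaS example in the spirit of the list above, renting and releasing an EC2 machine can be scripted; the sketch below assumes the classic boto Python library, and the credentials, region and machine image ID are placeholders rather than working values.

```python
# Sketch of renting a virtual machine from Amazon EC2 with the classic boto library.
# Credentials, region and AMI ID are placeholders, not working values.
import boto.ec2

conn = boto.ec2.connect_to_region(
    "us-east-1",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Start one pay-as-you-go instance from a machine image
reservation = conn.run_instances("ami-12345678", instance_type="m1.small")
instance = reservation.instances[0]
print("Started instance:", instance.id)

# ... use the machine, then release it so billing stops
conn.terminate_instances(instance_ids=[instance.id])
```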

What are cloud types?

Public cloud

Public clouds provide access to computing resources for the general public over the Internet. The public cloud provider allows customers to self-provision resources, typically via a web service interface. Customers rent access to resources as needed on a pay-as-you-go basis. Public clouds offer access to large pools of scalable resources on a temporary basis without the need for capital investment in data center infrastructure.

Private cloud

Private clouds give users immediate access to computing resources hosted within an organization's infrastructure. Users self-provision and scale collections of resources drawn from the private cloud, typically via web service interface, just as with a public cloud. However, because it is deployed within the organization's existing data center—and behind the organization's firewall—a private cloud is subject to the organization's physical, electronic, and procedural security measures and thus offers a higher degree of security over sensitive code and data. In addition, private clouds consolidate and optimize the performance of physical hardware through virtualization, and can thus markedly improve data center efficiency while reducing operational expense.

Hybrid cloud

A hybrid cloud combines computing resources (e.g., machines, network, storage, etc.) drawn from one or more public clouds and one or more private clouds at the behest of its users.

...

T.Strizh (LIT, JINR)

What are the barriers to Cloud Computing?

Customer perspective:
• Data security: many customers don't wish to trust their data to "the cloud"; data must be locally retained for regulatory reasons.
• Latency: the cloud can be many milliseconds away; not suitable for real-time applications.
• Application availability: organisations cannot switch from existing legacy applications, and equivalent cloud applications do not exist.

Vendor perspective:
• Service level agreements: what if something goes wrong? What is the true cost of providing SLAs?
• Business models: SaaS/PaaS models are challenging, with much lower upfront revenue.
• Customer lock-in: customers want open/standard APIs; vendors need to continuously add value.

Not all applications work on public clouds, and each application is unique.
T.Strizh (LIT, JINR)

Cloud security: the grand challenge
The most important classes of cloud-specific risks identified by ENISA (Cloud Computing: Benefits, Risks and Recommendations for Information Security):
• LOSS OF GOVERNANCE: in using cloud infrastructures, the client necessarily cedes control to the Cloud Provider (CP) on a number of issues which may affect security. At the same time, SLAs may not offer a commitment to provide such services on the part of the cloud provider, thus leaving a gap in security defences.
• LOCK-IN: there is currently little on offer in the way of tools, procedures or standard data formats or service interfaces that could guarantee data, application and service portability. This can make it difficult for the customer to migrate from one provider to another, or to migrate data and services back to an in-house IT environment. This introduces a dependency on a particular CP for service provision, especially if data portability, as the most fundamental aspect, is not enabled.
• ISOLATION FAILURE: multi-tenancy and shared resources are defining characteristics of cloud computing. This risk category covers the failure of mechanisms separating storage, memory, routing and even reputation between different tenants (e.g., so-called guest-hopping attacks). However, it should be considered that attacks on resource isolation mechanisms (e.g., against hypervisors) are still less numerous and much more difficult for an attacker to put into practice than attacks on traditional OSs.
• COMPLIANCE RISKS: investment in achieving certification (e.g., of industry standards or regulatory requirements) may be put at risk by migration to the cloud: if the CP cannot provide evidence of its own compliance with the relevant requirements, or if the CP does not permit audit by the cloud customer (CC). In certain cases, it also means that using a public cloud infrastructure implies that certain kinds of compliance cannot be achieved.
• MANAGEMENT INTERFACE COMPROMISE: customer management interfaces of a public cloud provider are accessible through the Internet and mediate access to larger sets of resources (than traditional hosting providers), and therefore pose an increased risk, especially when combined with remote access and web browser vulnerabilities.
• DATA PROTECTION: cloud computing poses several data protection risks for cloud customers and providers. In some cases, it may be difficult for the cloud customer (in its role as data controller) to effectively check the data handling practices of the cloud provider and thus to be sure that the data is handled in a lawful way. This problem is exacerbated in cases of multiple transfers of data, e.g., between federated clouds.
• INSECURE OR INCOMPLETE DATA DELETION: when a request to delete a cloud resource is made, as with most operating systems, this may not result in true wiping of the data. Adequate or timely data deletion may also be impossible (or undesirable from a customer perspective), either because extra copies of data are stored but are not available, or because the disk to be destroyed also stores data from other clients. In the case of multiple tenancies and the reuse of hardware resources, this represents a higher risk to the customer than with dedicated hardware.
• MALICIOUS INSIDER: while usually less likely, the damage which may be caused by malicious insiders is often far greater. Cloud architectures necessitate certain roles which are extremely high-risk; examples include CP system administrators and managed security service providers.
S. Purser, OGF28
T.Strizh (LIT, JINR)

Cloud security: the grand challenge
• Cloud computing can represent an improvement in security for non-critical applications and data.
• But transparency is crucial: customers must be given a means to assess and compare provider security practices.
• In the current state of the art, migrating critical applications and data to the cloud is still very risky (even to private clouds).
• It is not currently clear to what extent the Cloud Computing model can be applied to applications that require high levels of security.
S. Purser, OGF28

T.Strizh (LIT, JINR)

http://cloudtaxonomy.opencrowd.com/ T.Strizh (LIT, JINR)

http://aws.amazon.com T.Strizh (LIT, JINR)

http://code.google.com/intl/en/appengine/

T.Strizh (LIT, JINR)

http://www.ibm.com/ibm/cloud/

T.Strizh (LIT, JINR)

http://www.microsoft.com/cloud/ T.Strizh (LIT, JINR)

CLOUDS FOR BUSINESS, SURE, BUT WILL WE SEE CLOUDS FOR SCIENCE? Cloud computing makes it easier than ever before for businesses to access computing power. But can cloud computing also help scientists? Researchers have very specific IT requirements that so far have needed special grid-style computing power. But what if Google or Amazon could one day provide a computing cloud that perfectly suited our scientists? Would we still need scientific computing grids?

T.Strizh (LIT, JINR)

Eucalyptus, OpenNebula and Nimbus are three major open-source cloud-computing software platforms. The overall function of these systems is to manage the provisioning of virtual machines for a cloud providing infrastructure-as-a-service (IaaS). These open-source projects provide an important alternative for those who do not wish to use a commercially provided cloud.

T.Strizh (LIT, JINR)

T.Strizh (LIT, JINR)

T. Kielmann OGF25

When is your HPC application ready for the Cloud? An HPC checklist:
• If there are no issues with licenses, IP, secrecy, sensitive data, privacy, legal or regulatory issues, . . .
• If your application is (almost) architecture-independent, not optimized for a specific architecture (i.e. single-process, loosely coupled low-level parallel, I/O-robust)
• If it's just one application and zillions of parameters
• If latency and bandwidth are not an issue
• If time (wait, wall, run) doesn't really matter
• If your job is low-priority, with simple SLAs, and can be re-run, . . .
T.Strizh (LIT, JINR)

The Virtual Computing Lab (VCL)

Mladen A. Vouk, "Using VCL to Power Clouds", OGF25

T.Strizh (LIT, JINR)

VCL is a cloud computing idea developed at the North Carolina State University (NCSU) through a collaboration of its College of Engineering and IBM Virtual Computing Initiative to address a growing set of computational needs and user requirements for the university. The integration of HPC in VCL significantly increases resource utilization by the reuse of blade servers. This method allows the infrastructure to be shared among user requests, which leads to a greater availability of resources.

T.Strizh (LIT, JINR)

What is Cloud Computing? (Videos)
http://tinyurl.com/kse4nh
http://www.youtube.com/watch?v=hplXnFUlPmg&feature=related
http://www.youtube.com/watch?v=XdBd14rjcs0
http://www.youtube.com/watch?v=QJncFirhjPg&feature=related
http://www.youtube.com/watch?v=ae_DKNwK_ms&fmt=22
http://www.youtube.com/watch?v=QJncFirhjPg&NR=1
Getting started with Cloud Computing: http://library.dzone.com/sites/all/files/refcardz/rc082-010d-cloudcomputing.pdf
Examples:
Amazon AWS: http://aws.amazon.com/
IBM Cloud: http://www.ibm.com/ibm/cloud/
Google App Engine: http://code.google.com/appengine/
Microsoft Azure Services Platform: http://www.microsoft.com/azure/default.mspx
Oracle: http://www.oracle.com/technology/tech/cloud/index.html
SUN: http://www.sun.com/solutions/cloudcomputing/
HP: https://h10078.www1.hp.com/cda/hpms/display/main/hpms_content.jsp?zn=bto&cp=1-11^40898_4000_100__&jumpid=go/cloudassure

HPC http://www.hpcinthecloud.com/ T.Strizh (LIT, JINR)

Grids and Clouds
Cloud Computing not only overlaps with Grid Computing; it has indeed evolved out of Grid Computing and relies on Grid Computing as its backbone and infrastructure support.
• A Grid is a collection of computers, usually owned by multiple parties and in multiple locations, connected together so that users can share access to their combined power.
• A Cloud is a collection of computers, usually owned by a single party, connected together so that users can lease access to a share of their combined power.
Grid computing is a relatively established form of distributed computing, adopted early by the high-energy physics community and used now by all kinds of scientists. Cloud computing has recently experienced a surge and is a service now offered by a growing number of IT companies. Grids and clouds are much the same: both have adopted the concept of IT "as a service", although grids are more likely to offer free access to shared resources, while clouds have a "pay-as-you-go" approach. T.Strizh (LIT, JINR)

GF1 - The First Grid Forum, June 16 - 19, 1999, GGF1 - The First Global Grid Forum March 4-7, 2001, OGF19 - 19th Open Grid Forum, January 29 - February 2, 2007

OGF28, March 15-18, 2010, Munich, Germany
Grids have matured to production infrastructures in academic and commercial usage scenarios. In addition, Cloud computing is evolving as a promising new paradigm for infrastructure provisioning. For academics, such distributed computing and service infrastructures are considered key enablers for the future of eScience; commercial application of such technologies leads to challenges and considerations similar to those in eScience.
Technical topics:
• Future of Clouds; Grids and Clouds; Clouds and HPC
• Data Management, Access and Repositories for Distributed Computing
• Security in Grids and Clouds
• Production Grid Interoperability
• Software Sustainability for DCI
• Green IT, Energy Efficiency

OGF30/Grid2010, October 25-28, 2010, Brussels, Belgium
Topics of interest may include (but are not limited to):
• Data Management in Grids and Clouds
• Standards Roadmap for Science Clouds
• Interoperability of Grids and Clouds
• Performance Measurement of Grids and Clouds
• Science Clouds across North America, Europe, and Asia
• Grid and Cloud Security: what we can learn from each other
• Federated Distributed Computing Infrastructures
• Performance of Security Implementations
• Campus Grids and Clouds, Developing Production Infrastructures
• Identity Management and Virtual Organizations
• OGF Standards in Production Environments
• Inter-Cloud Interoperability and SLA Management

T.Strizh (LIT, JINR)

Open Cloud Computing Interface The OGF Open Cloud Computing Interface Working Group (OCCI-WG) is developing a clean, open API for 'Infrastructure as a Service' (IaaS) based Clouds.

http://www.occi-wg.org/

T.Strizh (LIT, JINR)
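To give a flavour of the interface, the sketch below shows roughly what an OCCI 1.1 "create compute resource" request looks like in the text rendering over HTTP; the endpoint, port and attribute values are made up, and a real deployment would also require authentication (for example an X.509 proxy), which is omitted here.

```python
# Rough sketch of an OCCI 1.1 request to create a virtual machine.
# The endpoint is hypothetical and authentication is intentionally left out.
import requests

endpoint = "https://cloud.example.org:8787/compute/"

headers = {
    "Content-Type": "text/occi",
    # The kind of resource requested: an "infrastructure compute" instance
    "Category": 'compute; scheme="http://schemas.ogf.org/occi/infrastructure#"; class="kind"',
    # Requested attributes of the new virtual machine
    "X-OCCI-Attribute": "occi.compute.cores=2, occi.compute.memory=4.0",
}

# POSTing to the compute collection asks the provider to create a new VM
resp = requests.post(endpoint, headers=headers, verify=False)
print(resp.status_code, resp.headers.get("Location"))
```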

Innovative Projects in Cloud Computing Architectures

Resources and Services Virtualization without Barriers
• Open-source technology to enable deployment and management of complex IT services across different administrative domains

Enhancing Grid Infrastructures with Cloud Computing
• Simplify and optimize Grid use and operation, providing a more flexible, dynamic computing environment for scientists
• Enhance existing computing infrastructures with "IaaS" paradigms

T.Strizh (LIT, JINR)

Grid Services over Cloud Resources: users → Grid Resource Center (Grid Services + Cloud API) → StratusLab Distribution → Private Cloud / Public Clouds

Worker Nodes on Demand Service (WNoDeS)
WNoDeS is in production at the INFN Tier-1 Computing Center (Bologna, Italy), fully integrated with its 7,000-core farm, for dynamic provisioning of integrated Grid/Cloud virtual environments. 2,000 VMs (as of 9/2010) can currently be instantiated on demand out of this common farm.
Main features:
• Grid/Cloud full integration
• Authentication gateway
• VM image selection
• VirtIO support
• Network throttling
• V-LAN support
• libguestfs support
• Multi-core VMs
• Cloud: OCCI API
• Cloud: Web console
• Integration with shared distributed file systems (INFN Tier-1: IBM GPFS)
http://web.infn.it/wnodes/ T.Strizh (LIT, JINR)

In 2009, CERN started to develop an Infrastructure-as-a-Service (IaaS) setup. In spring 2010, about 500 recent batch worker nodes were temporarily added to the system, which allowed large-scale tests of the new infrastructure to be performed. It has been demonstrated that the system can sustain 15,000 or more concurrent virtual batch worker nodes.

T.Strizh (LIT, JINR)

Parallel Session: Grid and Cloud Middleware (Convener: Dr. Markus Schulz, CERN)
• Clouds are used in production (but don't replace grids)
• Technology developed for grid scheduling is used to link grids and clouds T.Strizh (LIT, JINR)

• Scientific Grids now have a relatively long history of success with multi-domain, large-scale resource sharing.
• Cloud computing offers significant advantages for many uses (among them, pay-as-you-go models and simplified access).
• The key problem is integration between the different access interfaces (Grid, Cloud, or other).

T.Strizh (LIT, JINR)

Each project or person needs to make their own decision about whether to use grids or clouds or both. One thing is sure: both grids and clouds are rapidly evolving technologies that are helping scientists, businesses and individuals to achieve things never before possible.

What will happen to these technologies in 10 years? T.Strizh (LIT, JINR)

It’s a hybrid world - don’t stand still

Thank you for your attention !

John Barr : OGF28

T.Strizh (LIT, JINR)