Service-Oriented Science: Scaling eScience Impact - Open Science Grid

Service-Oriented Science: Scaling eScience Impact
Ian Foster
Computation Institute, Argonne National Laboratory & University of Chicago

Acknowledgements

- Carl Kesselman, with whom I developed many ideas (& slides)
- Bill Allcock, Charlie Catlett, Kate Keahey, Jennifer Schopf, Frank Siebenlist, Mike Wilde @ ANL/UC
- Ann Chervenak, Ewa Deelman, Laura Pearlman @ USC/ISI
- Karl Czajkowski, Steve Tuecke @ Univa
- Numerous other fine colleagues in NeSC, EGEE, OSG, TeraGrid, etc.
- NSF & DOE for research support

Context: System-Level Science

Problems too large &/or complex to tackle alone …

Two Perspectives on System-Level Science

- System-level problems require integration
  - Of expertise
  - Of data sources (“data deluge”)
  - Of component models
  - Of experimental modalities
  - Of computing systems
- The Internet enables decomposition
  - “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder)

Integration & Decomposition: A Two-Dimensional Problem

- Decompose across the network, along two dimensions: function and resource
- Clients integrate dynamically
  - Select & compose services
  - Select “best of breed” providers
  - Publish result as new services
- Decouple resource & service providers

[Figure: users, discovery tools, analysis tools, and data archives composed across the network. Credit: S. G. Djorgovski]

A Unifying Concept: The Grid

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”

1. Enable integration of distributed resources
2. Using general-purpose protocols & infrastructure
3. To deliver required quality of service

“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001

System-Level Problem → Decomposition → Implementation

[Figure: a system-level problem is decomposed into experimental models (U. Colorado, UIUC) and a computational model (NCSA), coordinated by Grid technology; the implementation spans facilities: computers, storage, networks, services, software, and people]

Service-Oriented Systems: Applications vs. Infrastructure

- Service-oriented applications
  - Wrap applications as services
  - Compose applications into workflows
- Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads

[Figure: users submit workflows; application services are composed and invoked atop provisioned infrastructure]

“The Many Faces of IT as Service”, ACM Queue, Foster, Tuecke, 2005

Scaling eScience: Forming & Operating Communities

- Define membership & roles; enforce laws & community standards
  - I.e., policy for service-oriented architecture
  - Addressing dynamics of membership & policy
- Build, buy, operate, & share infrastructure
  - Decouple consumer & provider
  - For data, programs, services, computing, storage, instruments
  - Address dynamics of community demand

Defining Community: Membership and Laws

- Identify VO participants and roles
  - For people and services
- Specify and control actions of members
  - Empower members → delegation
  - Enforce restrictions → federate

[Figure: the effective policy governing a user's access is the composition of the access granted by the community to the user and the site's admission-control policy toward the community, as sketched below]
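A minimal sketch of that composition, assuming rights can be modeled as per-resource numeric caps; all names and numbers here are illustrative, not from the talk:

```python
# Effective policy as an intersection: the site constrains the community,
# and the community constrains the user.
def effective_rights(community_grant: dict, site_policy: dict) -> dict:
    """Per-resource cap = min(community's grant to user, site's grant to VO)."""
    return {r: min(community_grant[r], site_policy[r])
            for r in community_grant.keys() & site_policy.keys()}

# A community may grant a user 16 units, but if the site admits only
# 10 units for the whole VO, the user's effective grant is 10.
print(effective_rights({"cpu_hours": 16}, {"cpu_hours": 10}))  # {'cpu_hours': 10}
```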

Evolution of Grid Security & Policy

1) Grid security infrastructure
   - Public key authentication & delegation
   - Access control lists (“gridmap” files)
   → Limited set of policies can be expressed

2) Utilities to simplify operational use, e.g.
   - MyProxy: online credential repository
   - VOMS, ACL/gridmap management
   → Broader set of policies, but still ad hoc

3) General, standards-based framework for authorization & attribute management
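For concreteness, a “gridmap” file from stage 1 is just a list mapping certificate distinguished names to local accounts; the entries below are illustrative, with hypothetical DNs and usernames:

```text
"/C=US/O=ANL/OU=MCS/CN=Jane Scientist" jscientist
"/C=UK/O=eScience/OU=Cardiff/CN=John Doe" jdoe
```

Each line grants the holder of that certificate the rights of the named local account, which is why only a limited set of policies can be expressed this way.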

Core Security Mechanisms

- Attribute assertions
  - C asserts that S has attribute A with value V
- Authentication and digital signature
  - Allows signer to assert attributes
- Delegation
  - C asserts that S can perform O on behalf of C
- Attribute mapping
  - {A1, A2, …, An}VO1 ⇒ {A′1, A′2, …, A′m}VO2
- Policy
  - Entity with attributes A asserted by C may perform operation O on resource R
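The mechanisms above can be pictured as plain data structures. A minimal sketch, with hypothetical class and field names (not the GT4 API):

```python
from dataclasses import dataclass

@dataclass
class AttributeAssertion:   # C asserts that S has attribute A with value V
    issuer: str             # C, the signing party
    subject: str            # S
    attribute: str          # A
    value: str              # V
    signature: bytes        # binds the assertion to the issuer's key

@dataclass
class DelegationAssertion:  # C asserts that S may perform O on C's behalf
    issuer: str             # C
    delegate: str           # S
    operation: str          # O
    signature: bytes

def map_attributes(attrs: set[str], mapping: dict[str, str]) -> set[str]:
    """Attribute mapping: {A1, ..., An} in VO1 => {A'1, ..., A'm} in VO2."""
    return {mapping[a] for a in attrs if a in mapping}
```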

Security Services for VO Policy

- Attribute Authority (ATA)
  - Issues signed attribute assertions (incl. identity, delegation & mapping)
- Authorization Authority (AZA)
  - Makes decisions based on assertions & policy

[Figure, built up over four slides: a VO resource admin issues a delegation assertion (“User B can use Service A”); the VO ATA issues VO-member attributes to Users A and B; a Mapping ATA translates attributes across VOs (VO-A Attr → VO-B Attr); the VO AZA combines these assertions with policy to authorize access to the VO-A service and, via mapping, to a VO-B service]
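An AZA decision then reduces to checking the subject's (possibly mapped) attributes against policy rules of the form “entity with attribute A may perform O on R”. A sketch with hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class PolicyRule:
    required_attribute: str   # A
    operation: str            # O
    resource: str             # R

def authorize(subject_attrs: set[str], operation: str, resource: str,
              rules: list[PolicyRule]) -> bool:
    """Grant iff some rule for (operation, resource) is satisfied."""
    return any(rule.operation == operation and rule.resource == resource
               and rule.required_attribute in subject_attrs
               for rule in rules)

# User B holds a VO-A member attribute; after cross-VO mapping it satisfies
# the rule guarding the VO-B service.
rules = [PolicyRule("vo-b:member", "invoke", "vo-b-service")]
mapped_attrs = {"vo-b:member"}   # result of mapping {"vo-a:member"}
assert authorize(mapped_attrs, "invoke", "vo-b-service", rules)
```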

Closing the Loop: GT4 Security Toolkit

[Figure: users obtain rights from a VO attribute authority (CAS or VOMS issuing SAML or X.509 attribute certificates) and credentials from MyProxy, Shibboleth, or KCA; services running on the user's behalf authenticate via SSL/WS-Security with proxy certificates; the compute center applies local policy on VO identity or attributes, with authorization callouts in SAML or XACML]

Security Needn’t Be Hard: Earth System Grid

- Purpose
  - Access to large data
- Policies
  - Per-collection control
  - Different user classes
- Implementation (GT)
  - PURSE: Portal-based User Registration Service, with optional review of registrations
  - PKI, SAML assertions
- Experience
  - >2,000 users
  - >100 TB downloaded

See also: GAMA (SDSC), Dorian (OSU)
www.earthsystemgrid.org

Scaling eScience: Forming & Operating Communities

- Define membership & roles; enforce laws & community standards
  - I.e., policy for service-oriented architecture
  - Addressing dynamics of membership & policy
- Build, buy, operate, & share infrastructure
  - Decouple consumer & provider
  - For data, programs, services, computing, storage, instruments
  - Address dynamics of community demand

Bootstrapping a VO by Assembling Services

1) Integrate services from other sources
   - Virtualize external services as VO services
2) Coordinate & compose
   - Create new services from existing ones

[Figure: content and capacity providers feeding community services]

“Service-Oriented Science”, Science, 2005

Providing VO Services: (1) Integration from Other Sources

- Negotiate service level agreements
- Delegate and deploy capabilities/services
- Provision to deliver defined capability
- Configure environment
- Host layered functions

[Figure: communities A … Z drawing on capabilities delegated and provisioned from providers]

Virtualizing Existing Services into a VO

- Establish a service agreement with the service
  - E.g., WS-Agreement
- Delegate use to VO users

[Figure: a VO admin negotiates agreements with the existing services of Users A and B, then delegates their use to a VO user]
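A toy rendering of that negotiate-then-delegate flow. WS-Agreement defines the real offer/accept protocol; this sketch only mirrors its shape, and all names are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class AgreementOffer:
    service: str
    terms: dict              # e.g., {"cpu_hours": 100, "valid_until": "..."}

@dataclass
class Agreement:
    offer: AgreementOffer
    accepted_by: str

def negotiate(offer: AgreementOffer, provider: str) -> Agreement:
    # A real provider would evaluate the terms; here we accept unconditionally.
    return Agreement(offer=offer, accepted_by=provider)

def delegate_to_vo_user(agreement: Agreement, vo_user: str) -> dict:
    """The VO admin re-grants its agreed capacity to a VO member."""
    return {"grantee": vo_user,
            "service": agreement.offer.service,
            "terms": agreement.offer.terms}
```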

Deploying New Services

- Client operations against a resource provider:
  - Allocate/provision
  - Configure
  - Initiate activity
  - Monitor activity
  - Control activity

[Figure: a client drives an activity hosted in an environment behind a provider interface, subject to the provider's policy]

WSRF (or WS-Transfer/WS-Man, etc.), Globus GRAM, Virtual Workspaces
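The five steps above form a natural abstract interface. A sketch only; GRAM and Virtual Workspaces each define their own concrete protocols for these operations:

```python
from abc import ABC, abstractmethod

class ResourceProvider(ABC):
    @abstractmethod
    def allocate(self, requirements: dict) -> str:
        """Provision resources; return a handle (EPR-like) for the activity."""

    @abstractmethod
    def configure(self, handle: str, environment: dict) -> None:
        """Set up the activity's execution environment."""

    @abstractmethod
    def initiate(self, handle: str) -> None:
        """Start the activity."""

    @abstractmethod
    def monitor(self, handle: str) -> dict:
        """Return the activity's current state."""

    @abstractmethod
    def control(self, handle: str, action: str) -> None:
        """E.g., suspend, resume, or terminate the activity."""
```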

Available in High-Quality Open Source Software: Globus Toolkit v4 (www.globus.org)

- Security: Authentication & Authorization, Credential Mgmt, Delegation, Community Authorization
- Data Mgmt: GridFTP, Reliable File Transfer, Replica Location, Data Replication, Data Access & Integration
- Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
- Info Services: Index, Trigger, WebMDS
- Common Runtime: Java Runtime, C Runtime, Python Runtime

“Globus Toolkit Version 4: Software for Service-Oriented Systems”, LNCS 3779, 2-13, 2005

http://dev.globus.org

- Guidelines (Apache)
- Infrastructure (CVS, email, bugzilla, Wiki)
- Projects include …

Virtual Workspaces (Kate Keahey et al.)

- GT4 service for the creation, monitoring, & management of virtual workspaces
- High-level workspace description
- Web Services interfaces for monitoring & managing
- Multiple implementations
  - Dynamic accounts
  - Xen virtual machines
  - (VMware virtual machines)
- Virtual clusters as a higher-level construct
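A “high-level workspace description” might carry roughly the following information; the real service uses an XML schema, and these field names are guesses:

```python
from dataclasses import dataclass

@dataclass
class WorkspaceDescription:
    image: str             # VM image (or account template) to instantiate
    cpus: int
    memory_mb: int
    lifetime_minutes: int  # how long the deployment should be held
    network: str           # e.g., "public" or "private"

request = WorkspaceDescription(image="osg-worker.img", cpus=2,
                               memory_mb=2048, lifetime_minutes=120,
                               network="public")
```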

How do Grids and VMs Play Together?

[Figure: a client asks the VM Factory to create a workspace from a new or existing VM image held in the VM Repository and receives an EPR in return; the client then inspects & manages the workspace through the VM Manager, which deploys or suspends the VM on a resource and starts programs inside it]
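The factory/manager split in the figure, as a skeletal sketch; the class and method names are hypothetical, and the real services speak WSRF:

```python
class VMFactory:
    def create(self, description: dict, image_ref: str) -> str:
        """Stage the image and return an EPR-like handle to the workspace."""
        return f"epr:{image_ref}"

class VMManager:
    def deploy(self, epr: str) -> None:
        """Instantiate the VM on a resource."""

    def suspend(self, epr: str) -> None:
        """Pause the VM, freeing the resource."""

    def start_program(self, epr: str, command: str) -> None:
        """Run a program inside the deployed VM."""

# Typical client flow: create, deploy, run.
factory, manager = VMFactory(), VMManager()
epr = factory.create({"cpus": 2}, "osg-worker.img")
manager.deploy(epr)
manager.start_program(epr, "condor_master")
```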

Virtual OSG Clusters

[Figure: an OSG cluster instantiated on Xen hypervisors atop a TeraGrid cluster]

“Virtual Clusters for Grid Communities,” Zhang et al., CCGrid 2006

Dynamic Service Deployment (Argonne + China Grid)

- Interface
  - Upload-push
  - Upload-pull
  - Deploy
  - Undeploy
  - Reload

“HAND: Highly Available Dynamic Deployment Infrastructure for GT4,” Li Qi et al., 2006
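The five operations of the interface above, as an abstract sketch (not HAND's actual signatures):

```python
from abc import ABC, abstractmethod

class DynamicDeployer(ABC):
    @abstractmethod
    def upload_push(self, archive: bytes, name: str) -> None:
        """Client pushes a service archive into the container."""

    @abstractmethod
    def upload_pull(self, url: str, name: str) -> None:
        """Container pulls the archive from a repository URL."""

    @abstractmethod
    def deploy(self, name: str) -> None:
        """Activate the uploaded service."""

    @abstractmethod
    def undeploy(self, name: str) -> None:
        """Deactivate and remove the service."""

    @abstractmethod
    def reload(self, name: str) -> None:
        """Refresh the container so deployment changes take effect."""
```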

Providing VO Services: (2) Coordination & Composition

- Take a set of provisioned services …
- … & compose to synthesize new behaviors
- This is traditional service composition
  - But must also be concerned with emergent behaviors, autonomous interactions
  - See the work of the agent & PlanetLab communities

“Brain vs. Brawn: Why Grids and Agents Need Each Other,” Foster, Kesselman, Jennings, 2004

The Globus-Based LIGO Data Grid

- LIGO Gravitational Wave Observatory
- Replicating >1 terabyte/day to 8 sites (incl. Birmingham, Cardiff, AEI/Golm)
- >40 million replicas so far
- MTBF = 1 month

www.globus.org/solutions

Data Replication Service

- Pull “missing” files to a storage system
- Built by composition (presented as a three-step build in the original slides):
  - Data movement: Reliable File Transfer Service driving GridFTP at source and destination
  - Data location: a Local Replica Catalog and Replica Location Index at each site
  - Data replication: the service takes a list of required files and orchestrates location and movement

“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005
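The pull-missing-files logic in miniature. The callables stand in for the real components (Replica Location Index, remote Local Replica Catalogs, RFT/GridFTP) and are hypothetical:

```python
def replicate(required_files: list[str], local_catalog: set[str],
              locate_replica, transfer) -> None:
    """Pull every required file that the local catalog does not yet hold."""
    for lfn in required_files:        # logical file names
        if lfn in local_catalog:      # already present locally
            continue
        source = locate_replica(lfn)  # RLI -> remote LRC -> physical location
        transfer(source, lfn)         # reliable transfer, e.g., RFT over GridFTP
        local_catalog.add(lfn)        # register the new local replica

# Example with stand-in functions:
catalog = {"fileA"}
replicate(["fileA", "fileB"], catalog,
          locate_replica=lambda lfn: f"gsiftp://remote/{lfn}",
          transfer=lambda src, lfn: print(f"fetch {src}"))
print(catalog)  # {'fileA', 'fileB'}
```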

Composing Resources … Composing Services

- Deployment stack, bottom-up:
  - Procure hardware (physical machine)
  - Deploy hypervisor/OS
  - Deploy virtual machine
  - Deploy container (e.g., JVM)
  - Deploy services (e.g., GridFTP, DRS, LRC), which together form VO services
- State exposed & accessed uniformly at all levels
- Provisioning, management, and monitoring at all levels
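One way to read the stack is as an ordered sequence of deployment steps, each exposing state for uniform monitoring. A schematic sketch only; the lambdas stand in for the mechanisms named on the slide:

```python
LAYERS = [
    ("hardware",        lambda: "physical machine procured"),
    ("hypervisor/OS",   lambda: "hypervisor deployed"),
    ("virtual machine", lambda: "VM image booted"),
    ("container",       lambda: "JVM started"),
    ("services",        lambda: "GridFTP + DRS + LRC deployed"),
]

def bring_up() -> dict:
    """Deploy bottom-up, recording each level's state for uniform monitoring."""
    return {name: deploy() for name, deploy in LAYERS}

for level, state in bring_up().items():
    print(f"{level}: {state}")
```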

Decomposition Enables Separation of Concerns & Roles

- User: “Provide access to data D at S1, S2, S3 with performance P”
- Service provider: implements this with a replica catalog, user-level multicast, …
- Resource provider: “Provide storage with performance P1, network with P2, …”

[Figure: data D replicated from its source to sites S1, S2, and S3 through the layered providers]

Another Example: Astro Portal Stacking Service

- Purpose
  - On-demand “stacks” of random locations within a ~10 TB dataset
- Challenge
  - Rapid access to 10–10K “random” files
  - Time-varying load
- Solution
  - Dynamic acquisition of compute & storage

[Figure: a web page or Web Service front end over the Sloan data; many cutouts are summed into a single stacked image]
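Stacking itself is just per-pixel co-addition; combining N aligned cutouts improves signal-to-noise roughly as √N. An illustrative NumPy sketch:

```python
import numpy as np

def stack(cutouts: list[np.ndarray]) -> np.ndarray:
    """Mean-combine equally sized image cutouts into one deeper image."""
    return np.mean(np.stack(cutouts), axis=0)

# Example: 100 noisy 64x64 cutouts of the same faint source.
rng = np.random.default_rng(0)
cutouts = [rng.normal(0.1, 1.0, (64, 64)) for _ in range(100)]
deep = stack(cutouts)  # per-pixel noise is ~10x lower than in a single cutout
```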

Astro Portal Stacking Performance (LAN GPFS)

[Figure: performance plot; data not preserved in this transcript]

Summary

- Community-based science will be the norm
  - Community infrastructure will increasingly become the scientific observatory
- Many different types of communities
  - Differ in coupling, membership, lifetime, size
- Scaling requires a separation of concerns
  - Providers of resources, services, content
- Must think beyond science stovepipes
  - Requires collaborations across sciences, including computer science
- Small set of fundamental mechanisms required to build communities

For More Information

- Globus Alliance: www.globus.org
- Dev.Globus: dev.globus.org
- Open Science Grid: www.opensciencegrid.org
- TeraGrid: www.teragrid.org
- Background: The Grid, 2nd Edition, www.mkp.com/grid2
- www.mcs.anl.gov/~foster
