Service-Oriented Science: Scaling eScience Impact Ian Foster Computation Institute Argonne National Lab & University of Chicago
Acknowledgements
• Carl Kesselman, with whom I developed many ideas (& slides)
• Bill Allcock, Charlie Catlett, Kate Keahey, Jennifer Schopf, Frank Siebenlist, Mike Wilde @ ANL/UC
• Ann Chervenak, Ewa Deelman, Laura Pearlman @ USC/ISI
• Karl Czajkowski, Steve Tuecke @ Univa
• Numerous other fine colleagues in NESC, EGEE, OSG, TeraGrid, etc.
• NSF & DOE for research support
Context: System-Level Science
Problems too large &/or complex to tackle alone …
Two Perspectives on System-Level Science
• System-level problems require integration
  - Of expertise
  - Of data sources (“data deluge”)
  - Of component models
  - Of experimental modalities
  - Of computing systems
• Internet enables decomposition
  - “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder)
Integration & Decomposition: A Two-Dimensional Problem
• Decompose across network
• Clients integrate dynamically
  - Select & compose services
  - Select “best of breed” providers
  - Publish result as new services
• Decouple resource & service providers
[Figure labels: Function; Resource; Users; Discovery tools; Analysis tools; Data Archives. Fig: S. G. Djorgovski]
A Unifying Concept: The Grid
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
1. Enable integration of distributed resources
2. Using general-purpose protocols & infrastructure
3. To deliver required quality of service
“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001
System-Level Problem → Decomposition → Implementation
[Figure labels: Experimental Model (U. Colorado); Experimental Model (UIUC); Computational Model (NCSA); COORD.; Grid technology; Facilities; Computers; Storage; Networks; Services; Software; People]
Service-Oriented Systems: Applications vs. Infrastructure
• Service-oriented applications
  - Wrap applications as services
  - Compose applications into workflows
• Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads
[Figure labels: Users; Composition; Workflows; Invocation; Appln Service (two instances); Provisioning]
“The Many Faces of IT as Service”, ACM Queue, Foster, Tuecke, 2005
Scaling eScience: Forming & Operating Communities
• Define membership & roles; enforce laws & community standards
  - I.e., policy for service-oriented architecture
  - Addressing dynamic membership & policy
• Build, buy, operate, & share infrastructure
  - Decouple consumer & provider
  - For data, programs, services, computing, storage, instruments
  - Address dynamics of community demand
Defining Community: Membership and Laws
• Identify VO participants and roles
  - For people and services
• Specify and control actions of members
  - Empower members → delegation
  - Enforce restrictions → federate
[Figure labels: sites A and B; Site admission-control policies; Policy of site to community; Access granted by community to user; Effective policy]
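
One way to read the "effective policy" in this figure is as the overlap between what a site grants to the community and what the community grants to a user. The sketch below illustrates that reading only. It assumes rights can be modelled as sets of (operation, resource) pairs, and every name in it is hypothetical.

```python
# Hypothetical model: each policy is a set of (operation, resource) rights.
def effective_rights(site_to_community, community_to_user):
    """Rights a user actually holds: granted by the community AND allowed by the site."""
    return site_to_community & community_to_user


# Illustrative data only.
site_to_community = {("read", "dataset1"), ("write", "dataset1"), ("read", "dataset2")}
community_to_user = {("read", "dataset1"), ("read", "dataset3")}

rights = effective_rights(site_to_community, community_to_user)
```

Under this assumption the user ends up with only the rights that both parties agree on, so a site never has to trust the community beyond its own admission-control policy.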
Evolution of Grid Security & Policy
1) Grid security infrastructure
  - Public key authentication & delegation
  - Access control lists (“gridmap” files)
  → Limited set of policies can be expressed
2) Utilities to simplify operational use, e.g.
  - MyProxy: online credential repository
  - VOMS, ACL/gridmap management
  → Broader set of policies, but still ad-hoc
3) General, standards-based framework for authorization & attribute management
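
To show why stage 1 can express only a limited set of policies, here is a minimal sketch of a gridmap-style access control list. It assumes the common layout of one quoted certificate subject name followed by one or more comma-separated local accounts per line. The entries and helper names are made up for illustration.

```python
import shlex


def parse_gridmap(text):
    """Map each quoted certificate subject (DN) to its list of local accounts."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            continue  # ignore malformed lines in this sketch
        dn, accounts = parts
        mapping[dn] = accounts.split(",")
    return mapping


def authorize(mapping, dn):
    """Return the first local account for an authenticated DN, or None."""
    accounts = mapping.get(dn)
    return accounts[0] if accounts else None


# Illustrative entries, not real identities.
SAMPLE = '''
# subject name                          local account(s)
"/O=Grid/OU=Example/CN=Alice Example" alice
"/O=Grid/OU=Example/CN=Bob Example" bob,bobgrid
'''
gridmap = parse_gridmap(SAMPLE)
```

The only question such a list can answer is "which local account, if any, does this identity map to", which is why roles, attributes, and community-level policy need the later stages.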
Core Security Mechanisms
• Attribute Assertions
  - C asserts that S has attribute A with value V
• Authentication and digital signature
  - Allows signer to assert attributes
• Delegation
  - C asserts that S can perform O on behalf of C
• Attribute mapping
  - {A1, A2… An}vo1 ⇒ {A’1, A’2… A’m}vo2
• Policy
  - Entity with attributes A asserted by C may perform operation O on resource R
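
The mechanisms on this slide can be written down as plain data structures. The sketch below is illustrative only: the class and function names are hypothetical, and it omits signatures, so it assumes every assertion has already been authenticated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeAssertion:
    issuer: str   # C, the asserting party
    subject: str  # S
    name: str     # A
    value: str    # V


@dataclass(frozen=True)
class Delegation:
    issuer: str     # C, who delegates
    delegate: str   # S, who may act on C's behalf
    operation: str  # O


@dataclass(frozen=True)
class Policy:
    issuer: str          # C whose assertions the policy trusts
    required: frozenset  # attribute (name, value) pairs the entity must hold
    operation: str       # O
    resource: str        # R


def map_attributes(assertions, table, issuer):
    """Attribute mapping: re-express assertions in another VO's vocabulary."""
    return [
        AttributeAssertion(issuer, a.subject, *table[(a.name, a.value)])
        for a in assertions
        if (a.name, a.value) in table  # unmapped attributes are dropped
    ]


def permits(policy, subject, assertions, operation, resource):
    """Policy: entity with the required attributes may perform O on R."""
    if (operation, resource) != (policy.operation, policy.resource):
        return False
    held = {(a.name, a.value) for a in assertions
            if a.subject == subject and a.issuer == policy.issuer}
    return policy.required <= held


def acts_for(delegations, delegate, principal, operation):
    """Delegation: has the principal asserted that the delegate may do O for it?"""
    return Delegation(principal, delegate, operation) in delegations


# Illustrative data only.
vo1 = [AttributeAssertion("vo1-ata", "alice", "role", "analyst")]
vo2 = map_attributes(vo1, {("role", "analyst"): ("group", "readers")}, "vo2-ata")
policy = Policy("vo2-ata", frozenset({("group", "readers")}), "read", "archive")
delegations = [Delegation("alice", "portal", "read")]
```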
Security Services for VO Policy
• Attribute Authority (ATA)
  - Issue signed attribute assertions (incl. identity, delegation & mapping)
• Authorization Authority (AZA)
  - Decisions based on assertions & policy
[Figure labels: VO User A; VO User B; VO ATA; VO AZA; VO A Service]
Security Services for VO Policy
• Attribute Authority (ATA)
  - Issue signed attribute assertions (incl. identity, delegation & mapping)
• Authorization Authority (AZA)
  - Decisions based on assertions & policy
[Figure labels: VO Resource Admin Attribute; Delegation Assertion “User B can use Service A”; User A; VO User B; VO ATA; Mapping ATA; VO Member Attribute; VO AZA; VO A Service; VO-A Attr → VO-B Attr; VO B Service]
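
The interplay of ATA, mapping ATA, and AZA can be sketched in a few lines. This is a toy model under loud assumptions: an HMAC with a shared key stands in for the public-key signatures a real deployment would use, and all class names, keys, and attribute strings are hypothetical.

```python
import hashlib
import hmac
import json


class AttributeAuthority:
    """Hypothetical ATA: issues and checks signed attribute assertions."""

    def __init__(self, name, key):
        self.name = name
        self._key = key  # assumption: HMAC key stands in for a PKI signing key

    def _sign(self, payload):
        message = json.dumps(payload, sort_keys=True).encode()
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def issue(self, subject, attribute):
        payload = {"issuer": self.name, "subject": subject, "attribute": attribute}
        return {"payload": payload, "signature": self._sign(payload)}

    def verify(self, assertion):
        expected = self._sign(assertion["payload"])
        return hmac.compare_digest(expected, assertion["signature"])


class MappingAuthority(AttributeAuthority):
    """Hypothetical mapping ATA: re-issues a verified VO-A attribute as a VO-B one."""

    def __init__(self, name, key, source, table):
        super().__init__(name, key)
        self.source = source
        self.table = table

    def translate(self, assertion):
        if not self.source.verify(assertion):
            return None
        mapped = self.table.get(assertion["payload"]["attribute"])
        if mapped is None:
            return None
        return self.issue(assertion["payload"]["subject"], mapped)


class AuthorizationAuthority:
    """Hypothetical AZA: decides from a signed assertion plus a one-attribute policy."""

    def __init__(self, trusted, required_attribute):
        self.trusted = trusted
        self.required = required_attribute

    def decide(self, subject, assertion):
        payload = assertion["payload"]
        return (self.trusted.verify(assertion)
                and payload["issuer"] == self.trusted.name
                and payload["subject"] == subject
                and payload["attribute"] == self.required)


vo_a_ata = AttributeAuthority("VO-A ATA", b"demo-key-a")
mapper = MappingAuthority("Mapping ATA", b"demo-key-map", vo_a_ata,
                          {"vo-a:member": "vo-b:guest"})
vo_a_aza = AuthorizationAuthority(vo_a_ata, "vo-a:member")
vo_b_aza = AuthorizationAuthority(mapper, "vo-b:guest")

member = vo_a_ata.issue("User B", "vo-a:member")
mapped = mapper.translate(member)
forged = {"payload": dict(member["payload"], subject="User A"),
          "signature": member["signature"]}
```

In this sketch User B's VO-A membership is accepted by the VO-A service directly, and by the VO-B service only after the mapping authority has re-issued it, which mirrors the "VO-A Attr → VO-B Attr" arrow in the figure.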
Closing the Loop: GT4 Security Toolkit
[Figure labels: Users; VO; Compute Center; Services (running on user’s behalf); SSL/WS-Security with Proxy Certificates; Authz Callout: SAML, XACML; Access; Rights; Rights’; CAS or VOMS issuing SAML or X.509 ACs; Local policy on VO identity or attribute authority; MyProxy; Shib; KCA]
Security Needn’t Be Hard: Earth System Grid
• Purpose
  - Access to large data
• Policies
  - Per-collection control
  - Different user classes
• Implementation (GT)
  - Portal-based User Registration Service (PURSE)
  - PKI, SAML assertions
• Experience
  - >2000 users
  - >100 TB downloaded
• See also: GAMA (SDSC), Dorian (OSU)
[Figure labels: PURSE User Registration; Optional review]
www.earthsystemgrid.org
Scaling eScience: Forming & Operating Communities
• Define membership & roles; enforce laws & community standards
  - I.e., policy for service-oriented architecture
  - Addressing dynamics of membership & policy
• Build, buy, operate, & share infrastructure
  - Decouple consumer & provider
  - For data, programs, services, computing, storage, instruments
  - Address dynamics of community demand
Bootstrapping a VO by Assembling Services
1) Integrate services from other sources
  - Virtualize external services as VO services
2) Coordinate & compose
  - Create new services from existing ones
[Figure labels: Content; Services; Capacity; Community Services Provider; Capacity Provider]
“Service-Oriented Science”, Science, 2005
Providing VO Services: (1) Integration from Other Sources
• Negotiate service level agreements
• Delegate and deploy capabilities/services
• Provision to deliver defined capability
  - Configure environment
  - Host layered functions
[Figure labels: Community A … Community Z]
Virtualizing Existing Services into a VO
• Establish service agreement with service
  - E.g., WS-Agreement
• Delegate use to VO user
[Figure labels: User A; User B; VO User; VO Admin; Existing Services]
Deploying New Services
• Allocate/provision
• Configure
• Initiate activity
• Monitor activity
• Control activity
[Figure labels: Policy; Client; Interface; Resource provider; Environment; Activity]
WSRF (or WS-Transfer/WS-Man, etc.), Globus GRAM, Virtual Workspaces
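
The five operations on this slide suggest a small lifecycle that a client drives through the resource provider's interface. The sketch below is one hypothetical way to model it as a state machine. The state names and the `Activity` class are assumptions for illustration, not the interface of GRAM or Virtual Workspaces.

```python
class Activity:
    """Hypothetical activity managed through a resource provider's interface."""

    # Allowed state changes: (operation, current state) -> next state.
    TRANSITIONS = {
        ("allocate", "new"): "allocated",
        ("configure", "allocated"): "configured",
        ("initiate", "configured"): "running",
        ("suspend", "running"): "suspended",
        ("resume", "suspended"): "running",
        ("terminate", "running"): "terminated",
        ("terminate", "suspended"): "terminated",
    }

    def __init__(self):
        self.state = "new"
        self.environment = {}
        self.history = []

    def _step(self, operation):
        key = (operation, self.state)
        if key not in self.TRANSITIONS:
            raise RuntimeError(f"cannot {operation} while {self.state}")
        self.state = self.TRANSITIONS[key]
        self.history.append(operation)

    def allocate(self, **resources):
        """Allocate/provision: reserve the requested resources."""
        self._step("allocate")
        self.environment["resources"] = resources

    def configure(self, **settings):
        """Configure the environment the activity will run in."""
        self._step("configure")
        self.environment["settings"] = settings

    def initiate(self, command):
        """Initiate the activity itself."""
        self._step("initiate")
        self.environment["command"] = command

    def monitor(self):
        """Monitor: expose current state without changing it."""
        return {"state": self.state, "history": list(self.history)}

    def control(self, action):
        """Control: suspend, resume, or terminate a started activity."""
        self._step(action)


activity = Activity()
activity.allocate(cpus=4, disk_gb=20)
activity.configure(container="java")
activity.initiate("start-service")
activity.control("suspend")
status = activity.monitor()
```

Keeping the allowed transitions in one table makes the provider's policy explicit: a client cannot, for example, initiate an activity before resources have been allocated and configured.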
Available in High-Quality Open Source Software …
Globus Toolkit v4 www.globus.org
• Security: Credential Mgmt; Delegation; Community Authorization; Authentication Authorization
• Data Mgmt: Data Replication; Replica Location; Data Access & Integration; Reliable File Transfer; GridFTP
• Execution Mgmt: Grid Telecontrol Protocol; Community Scheduling Framework; Workspace Management; Grid Resource Allocation & Management
• Info Services: WebMDS; Trigger; Index
• Common Runtime: Python Runtime; C Runtime; Java Runtime
Globus Toolkit Version 4: Software for Service-Oriented Systems, LNCS 3779, 2-13, 2005
http://dev.globus.org
• Guidelines (Apache)
• Infrastructure (CVS, email, bugzilla, Wiki)
• Projects Include …
Virtual Workspaces (Kate Keahey et al.)
• GT4 service for the creation, monitoring, & management of virtual workspaces
• High-level workspace description
• Web Services interfaces for monitoring & managing
• Multiple implementations
  - Dynamic accounts
  - Xen virtual machines
  - (VMware virtual machines)
• Virtual clusters as a higher-level construct
How do Grids and VMs Play Together?
[Figure labels: Client; request VM; VM EPR; VM Factory; create new VM image; use existing VM image; Create VM image; VM Repository; inspect & manage; deploy, suspend; VM Manager; Resource; VM; start program]
Virtual OSG Clusters
[Figure labels: OSG; OSG cluster; Xen hypervisors; TeraGrid cluster]
“Virtual Clusters for Grid Communities,” Zhang et al., CCGrid 2006
Dynamic Service Deployment (Argonne + China Grid)
• Interface
  - Upload-push
  - Upload-pull
  - Deploy
  - Undeploy
  - Reload
“HAND: Highly Available Dynamic Deployment Infrastructure for GT4,” Li Qi et al., 2006
Providing VO Services: (2) Coordination & Composition
• Take a set of provisioned services … & compose to synthesize new behaviors
• This is traditional service composition
  - But must also be concerned with emergent behaviors, autonomous interactions
  - See the work of the agent & PlanetLab communities
“Brain vs. Brawn: Why Grids and Agents Need Each Other,” Foster, Kesselman, Jennings, 2004.
The Globus-Based LIGO Data Grid
LIGO Gravitational Wave Observatory
[Figure labels: Birmingham; Cardiff; AEI/Golm]
• Replicating >1 Terabyte/day to 8 sites
• >40 million replicas so far
• MTBF = 1 month
www.globus.org/solutions
Data Replication Service
• Pull “missing” files to a storage system
[Figure labels: Data Movement; Reliable File Transfer Service; GridFTP (two instances)]
“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005
Data Replication Service
• Pull “missing” files to a storage system
[Figure labels: Data Replication: List of required Files, Data Replication Service; Data Location: Local Replica Catalog, Replica Location Index (two instances each); Data Movement: Reliable File Transfer Service, GridFTP (two instances)]
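
The "pull missing files" behaviour composes the three layers in the figure: data replication on top of data location on top of data movement. The sketch below illustrates that control flow only. Dictionaries stand in for the Local Replica Catalog and Replica Location Index, a callback stands in for the Reliable File Transfer Service, and all names are hypothetical rather than the actual DRS interface.

```python
def replicate(required, local_catalog, location_index, transfer):
    """Pull every required file that the local replica catalog lacks."""
    report = {"present": [], "copied": [], "unavailable": []}
    for name in required:
        if name in local_catalog:
            report["present"].append(name)
            continue
        # Data location: ask the index which sites hold a replica.
        for source in location_index.get(name, []):
            # Data movement: try each source until one transfer succeeds.
            if transfer(source, name):
                local_catalog[name] = f"local/{name}"  # register the new replica
                report["copied"].append(name)
                break
        else:
            # No source listed, or every transfer failed.
            report["unavailable"].append(name)
    return report


# Illustrative data: a fake transfer that fails from "siteB".
attempts = []


def fake_transfer(source, name):
    attempts.append((source, name))
    return source != "siteB"


local_catalog = {"f1": "local/f1"}
location_index = {"f2": ["siteA"], "f3": ["siteB", "siteC"]}
report = replicate(["f1", "f2", "f3", "f4"], local_catalog, location_index, fake_transfer)
```

Registering each copied file in the local catalog is what lets the next run skip it, so the same list of required files can be submitted repeatedly.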
Composing Resources … Composing Services
• Deploy service: GridFTP
• Deploy container: JVM
• Deploy virtual machine: VM
• Deploy hypervisor/OS: Hypervisor/OS
• Procure hardware: Physical machine
State exposed & accessed uniformly at all levels
Provisioning, management, and monitoring at all levels
Composing Resources … Composing Services
• Deploy service: DRS, GridFTP, LRC, GridFTP, VO Services
• Deploy container: JVM
• Deploy virtual machine: VM (two instances)
• Deploy hypervisor/OS: Hypervisor/OS
• Procure hardware: Physical machine
State exposed & accessed uniformly at all levels
Provisioning, management, and monitoring at all levels
Decomposition Enables Separation of Concerns & Roles
• User → Service Provider: “Provide access to data D at S1, S2, S3 with performance P”
• Service Provider → Resource Provider: “Provide storage with performance P1, network with P2, …”
• Replica catalog, User-level multicast, …
[Figure labels: data D; sites S1, S2, S3]
Another Example: Astro Portal Stacking Service
• Purpose
  - On-demand “stacks” of random locations within ~10TB dataset
• Challenge
  - Rapid access to 10 to 10K “random” files
  - Time-varying load
• Solution
  - Dynamic acquisition of compute, storage
[Figure labels: + + + + + + + =; S4; Web page or Web Service; Sloan Data]
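
The "+ + + =" in the figure suggests that a stack is a pixel-wise combination of many small cutouts. Here is a minimal sketch of that idea, assuming equally sized cutouts held as nested lists and a plain sum. The service's actual combination method is not stated on the slide, and the function name is hypothetical.

```python
def stack(cutouts):
    """Pixel-wise sum of equally sized 2-D cutouts (each a list of rows)."""
    if not cutouts:
        raise ValueError("need at least one cutout")
    rows, cols = len(cutouts[0]), len(cutouts[0][0])
    for cutout in cutouts:
        if len(cutout) != rows or any(len(row) != cols for row in cutout):
            raise ValueError("all cutouts must have the same shape")
    return [
        [sum(cutout[r][c] for cutout in cutouts) for c in range(cols)]
        for r in range(rows)
    ]


# Three illustrative 2x2 cutouts.
cutouts = [
    [[1, 0], [0, 2]],
    [[0, 1], [0, 2]],
    [[1, 1], [3, 0]],
]
stacked = stack(cutouts)
```

Each cutout comes from a different file, so the cost of a stack is dominated by fetching the inputs rather than by the arithmetic, which is why the challenge on this slide is rapid access under time-varying load.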
Astro Portal Stacking Performance (LAN GPFS)
Summary
• Community based science will be the norm
  - Requires collaborations across sciences, including computer science
• Many different types of communities
  - Differ in coupling, membership, lifetime, size
• Must think beyond science stovepipes
  - Community infrastructure will increasingly become the scientific observatory
• Scaling requires a separation of concerns
  - Providers of resources, services, content
• Small set of fundamental mechanisms required to build communities
For More Information
• Globus Alliance
  - www.globus.org
• Dev.Globus
  - dev.globus.org
• Open Science Grid
  - www.opensciencegrid.org
• TeraGrid
  - www.teragrid.org
• Background
  - www.mcs.anl.gov/~foster
  - 2nd Edition www.mkp.com/grid2