IBM India
From Grid Computing to Cloud Computing – The IBM Approach
Garuda Partner Meet, 4th March 2008, Bangalore, India
P. Sambath Narayanan, Ph.D.
India Systems & Technology Lab, IBM
© 2005 IBM Corporation
Agenda
I. Important Challenges
II. Important Technologies, Trends and Standards
III. Case Studies
IV. From Grid Computing to Cloud Computing
Terminology
Virtualization, Service Orientation, Data Management, Information, Data Service, Policy Management, Interoperable, Automation, Lifecycle
I. Some Important Challenges
Data Management – Right Time & Right Data
Network Bandwidth and Latency
Security
Software and Standards
Need for many Grid-based scientific and commercial applications
Enable smooth scaling in many dimensions
Integration with the physical world
A Few Important Challenges (contd.)
Data Management is a Challenge
Diverse usage scenarios
Volume of data: TBs
Right data at the right time
Format of data
Heterogeneity of systems at all levels
Bandwidth, transfer, manipulation and analysis of large volumes of data
A Few Important Challenges (contd.)
Network Bandwidth
Large volumes of data need to be transferred across the network
Ensuring the right data is available at the right time
Latency, bandwidth, transfer, manipulation and analysis of large volumes of data
Cost of bandwidth
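As a rough illustration of why bandwidth and latency dominate when large volumes of data cross the network, the sketch below estimates transfer time from data volume, link speed and a fixed start-up latency. The function name, the `efficiency` factor and the example figures (1 TB over a 1 Gbit/s link) are assumptions made for illustration; none of them come from the slides.

```python
def transfer_time_seconds(volume_bytes, bandwidth_bits_per_s,
                          latency_s=0.0, efficiency=1.0):
    """Estimate the time to move volume_bytes over a link.

    efficiency in (0, 1] is a hypothetical factor for protocol overhead;
    latency_s is added once as a fixed start-up delay.
    """
    if bandwidth_bits_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("bandwidth must be positive and efficiency in (0, 1]")
    return latency_s + (volume_bytes * 8) / (bandwidth_bits_per_s * efficiency)


# Assumed example: 1 TB (10**12 bytes) over a 1 Gbit/s link at full efficiency.
seconds = transfer_time_seconds(10**12, 10**9)
hours = seconds / 3600
```

Under these assumptions the transfer takes 8,000 seconds, a little over two hours, before any protocol overhead or contention is taken into account.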
Grid Security
Intrusion detection
Secure conversations
Credential & ID translation
Access control enforcement
Audit
Anti-virus management
Service/end-point policy
Mapping rules
Authorization policy
Privacy policy
Policy expression and exchange
Key management
Bindings security (transport, protocol, message security)
Secure logging
User management
Trust model
Policy management (Auth, privacy, federation)
II. Key Technologies
Virtualization
Storage / Data Management
Grid Security
Grid Software
Keywords
Virtualization, Service, SOA, Data Management, Information, Data Service, Policy Management, Interoperable, Automated, Lifecycle
Grid Technology Evolution
[Figure: timeline from 1990 to 2007 against Grid adoption and acceptance. Stages shown: Globus Toolkit, many deployments, scientific applications; OGSA standards, GT2, many deployments; managed, shared virtual system.]
Grid Open Standards
OASIS (Organization for the Advancement of Structured Information Standards)
WS – Resource Framework
WS – Notification
Open Grid Services Architecture-GGF
OGSA Basic Profile
OGSA Security Profile
Basic Execution Services (OGSA-BES)
Job Submission Description Language (JSDL)
Data Access and Integration Services (DAIS)
Configuration Description, Deployment, and Lifecycle Management (CDDLM)
OGSA Byte I/O (ByteIO)
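To make one of these standards concrete, the sketch below builds a minimal Job Submission Description Language (JSDL) document with Python's standard library. The helper name `build_jsdl` and the example job are hypothetical. The namespace URIs and element names are those of JSDL 1.0 as commonly published and should be checked against the specification before use.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespaces (verify against the specification before relying on them).
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(job_name, executable, arguments):
    """Return a minimal JSDL job definition as an XML string (illustrative helper)."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = job_name
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    return ET.tostring(job, encoding="unicode")


# Hypothetical job: run /bin/echo with one argument.
doc = build_jsdl("hello-grid", "/bin/echo", ["hello"])
```

A document of this kind describes what to run, independently of which scheduler or execution service eventually runs it.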
OGSA Design Principles
Service orientation to virtualize resources
Everything is a service
From Web services
Standard interface service mechanisms, multiple protocol bindings, local/remote transparency
From Grids
Service semantics, reliability and security models
Life cycle management, discovery and other services
Multiple hosting environments
C, J2EE, .NET
Technology Classification & Trends
[Figure: evolution of application technologies, infrastructure and storage.
Application technologies: serial applications; client/server (CORBA, COM/DCOM, .NET, J2EE); parallel applications (multi-threaded, MPI, home-grown work distribution); P2P; application integration and reliable messaging; service virtualization (Web services; service registration, discovery and invocation; location independence; lifting applications off the servers; reliable execution).
Infrastructure: monolithic mainframes; distributed open systems (Unix, Linux, Windows); clusters (DRM, Grid); virtualized infrastructure (OGSA, Data Grid, service provisioning).
Storage: DAS.]
Virtualization: Single System & Partitioning
[Figure: a server divided by the POWER Hypervisor into dynamically resizable partitions (LPARs) running AIX 5L V5.2, AIX 5L V5.3 and Linux, with micro-partitioning across the cores, a Virtual I/O Server partition providing storage sharing and Ethernet sharing over virtual I/O paths, the Integrated Virtualization Manager, and a Partition Load Manager (PLM) server with PLM agents in the managed partitions alongside unmanaged partitions.]
Features
Micro-partitioning
Share processors across multiple partitions
Minimum partition: 1/10 processor
AIX 5L V5.3 or Linux*
Virtual I/O Server
Shared Ethernet
Shared SCSI & Fibre Channel
Integrated Virtualization Manager
AIX 5L V5.3 & Linux partitions
Partition Load Manager
AIX 5L V5.2 & V5.3 supported
Balances processor & memory requests
Partition Mobility
POWER Hypervisor
* = SLES 9 or Red Hat v3 with update 3
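The 1/10-processor minimum can be illustrated with a little arithmetic. The sketch below is only that: the function names are hypothetical, the check covers processor entitlement alone, and it ignores any platform limit on the total number of partitions or on memory.

```python
from fractions import Fraction

MIN_PARTITION = Fraction(1, 10)  # minimum partition size from the slide: 1/10 processor


def max_micro_partitions(physical_cores):
    """Upper bound on partitions if each receives only the minimum entitlement."""
    return int(Fraction(physical_cores) / MIN_PARTITION)


def fits(entitlements, physical_cores):
    """Check that requested entitlements respect the minimum and the core count."""
    shares = [Fraction(str(e)) for e in entitlements]
    return all(s >= MIN_PARTITION for s in shares) and sum(shares) <= physical_cores


# Assumed example: a 6-core server.
upper_bound = max_micro_partitions(6)
ok = fits([0.1, 0.5, 2, 3.4], 6)
```

On the assumed 6-core server the upper bound is 60 minimum-size partitions, and the four requested entitlements (0.1, 0.5, 2 and 3.4 processors) add up to exactly 6, so they fit.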
Virtualization – Information/Storage Technology
Helps address data management challenges
Integrated, standards-driven view of storage, file systems and databases
Data transformation, security and replication
Virtualization - Workload Technology
Workload Management Challenge
Single logical view of workload scheduling
Different types of scheduling environments and domains
The workload virtualization strategy is to create a single, logical view of workload scheduling.
This enables users to accelerate the performance of multiple large application workloads across their organization, leveraging and orchestrating IT resources in a flexible and dynamic fashion.
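The idea of a single, logical view over several scheduling domains can be sketched as a thin front end that accepts every job and decides where to place it. This is a toy model under stated assumptions: the class name, the notion of "free slots" and the most-free-capacity placement rule are all hypothetical and do not describe any particular IBM product.

```python
class LogicalScheduler:
    """Single submission point in front of several scheduling domains."""

    def __init__(self):
        self.domains = {}     # domain name -> free slots (hypothetical capacity unit)
        self.placements = []  # (job, domain) pairs in submission order

    def add_domain(self, name, free_slots):
        self.domains[name] = free_slots

    def submit(self, job, slots_needed):
        """Place the job on the domain with the most free slots, or return None."""
        candidates = [d for d, free in self.domains.items() if free >= slots_needed]
        if not candidates:
            return None
        best = max(candidates, key=lambda d: self.domains[d])
        self.domains[best] -= slots_needed
        self.placements.append((job, best))
        return best


# Assumed example: two clusters behind one logical scheduler.
scheduler = LogicalScheduler()
scheduler.add_domain("cluster-a", 4)
scheduler.add_domain("cluster-b", 8)
first = scheduler.submit("job-1", 6)
second = scheduler.submit("job-2", 3)
third = scheduler.submit("job-3", 5)
```

Users submit to one place; the first job lands on the larger cluster, the second on the other, and the third is refused because no single domain has five free slots left.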
Who Does Virtualization
[Figure: products mapped by type (systems, information, workload, management) across the virtualization spectrum: single systems & partitioning; cluster, simple (2-4) to sophisticated (4+); virtualize like resources; virtualize unlike resources; virtualize the enterprise; virtualize outside the enterprise. Products named include CSM (Cluster Systems Manager), Integrated Cluster Environment, LSF, LoadLeveler, LoadLeveler MultiCluster, SAN Volume Controller, GPFS, SAN FS, NFS V4, Information Integrator, Symphony, Provisioning Manager, Intelligent Orchestrator, XD (Extended Deployment), GridServer, MP Enterprise and the IBM Grid Toolbox.]
Overcoming Network Challenges
Through efficient utilization of the network
IBM Download Grid example (explained in later slides)
Integrating with global research networks
National Research & Education Networks
Supporting research and education communities
Specialized ISP
Lambda Grid
Overcoming Grid Security Challenges
Three key attributes of the Grid security model
Enables integration and interoperability
Creation and management of dynamic trust domains
Supports dynamic creation of services
OGSA Security
Web services security standard
Grid Security Infrastructure (GSI)
The portion of the Globus Toolkit that implements security functions
Across the Spectrum: Real-Life References
[Figure: reference deployments placed along the virtualization spectrum: single systems & partitioning; cluster, simple (2-4) to sophisticated (4+); virtualize like resources; virtualize unlike resources; virtualize the enterprise; virtualize outside the enterprise. Named references: IBM; China Grid (Ministry of Education, People's Republic of China); National Digital Mammography Archive.]
Earth System Grid (ESG): Case Study
Overcoming Data Management Challenges
Service = Repository
Storage repository for model-generated atmospheric data
3,200 users
91,000 files
More than 150 TB of data downloaded
More than 300 research papers
[Figure: download volume in GB/day (0 to 600), daily and 7-day average, November 2004 to October 2006.]
ESG Architecture & Technologies
[Figure: ESG architecture. Storage sites: ORNL (HPSS), NCAR (cache, MSS), NERSC and LANL (cache), each fronted by a Storage Resource Manager (SRM) and a Replica Location Service (RLS), with an OPeNDAP-G disk cache. Services: ESG Web Portal (user registration, catalog browsing, data search, data download, subsetting, publishing, usage metrics); climate data metadata catalogs (NcML metadata schema); OPeNDAP-G (aggregation and subsetting); data management with DataMover-Lite (DML) and GridFTP; Globus Toolkit security infrastructure with access control and MyProxy user registration; monitoring and discovery services. A data provider publishes through a Web browser; a data user searches, browses and downloads through a Web browser or DML.]
MSS, HPSS: tertiary data storage systems
Energy Exploration: Case Study
Service = Seismic Computing
3-D seismic imaging is the most resource-intensive
Grid-enabled system for seismic imaging
Gulf of Mexico 3-D marine surveys
Estimated run times on a cluster (128 CPUs, 2.4 GHz Pentium)
Compute-intensive wave equation provides better accuracy
Slides are based on the work done by the 3DGeo team. See Reference Material.
Parallelization of PSDM on Multiple Clusters
Clusters from 3DGeo processing centres and clusters from SDSC
MPICH-G2 / MPICHGP (Kum Rye Park)
Globus Toolkit
DCs: Santa Clara, Houston, San Diego Supercomputing Centre
This slide is based on the work done by the 3DGeo team. See Reference Material.
Medical Education Over Access Grid
Work done by J. Silverstein, U. Chicago
National Digital Mammography Archive
Electronic medical record data grid and repository
Motivation
To help doctors and medical students learn more about breast cancer and related diseases
Challenges
Managing and storing huge files for fast retrieval
Annual NDMA volume could exceed 5.6 petabytes; image size is 160 MB per study
Minimum daily traffic estimated at 28 TB
Network bandwidth and response
Encryption of patient data and transmission across public networks
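The figures above can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and, for the bandwidth estimate, traffic spread evenly over 24 hours. Both are assumptions for illustration and are not stated on the slide.

```python
MB = 10**6
TB = 10**12
PB = 10**15

annual_volume = 5.6 * PB   # from the slide
study_size = 160 * MB      # from the slide
daily_traffic = 28 * TB    # from the slide

# How many 160 MB studies make up 5.6 PB.
studies_per_year = annual_volume / study_size

# How many days of 28 TB traffic add up to the annual volume.
days_at_min_traffic = annual_volume / daily_traffic

# Assumption: traffic spread evenly over 24 hours (86,400 seconds).
sustained_gbit_per_s = daily_traffic * 8 / 86_400 / 10**9
```

Under these assumptions the annual volume corresponds to about 35 million studies, 200 days at the minimum daily traffic, and a sustained rate of roughly 2.6 Gbit/s, which shows why network bandwidth appears among the challenges.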
Service Oriented Science – Cancer & Biology
caBIG: sharing of infrastructure, applications and data
Data integration!
Technology Evolution
[Figure: timeline from 1990 to 2007 against Grid adoption and acceptance. Stages shown: Globus Toolkit, many deployments, scientific applications; OGSA standards, GT2, many deployments; managed, shared virtual system.]
Business Challenges
With demand for IT resources hard to predict, service providers usually over-provision resources to support peak demand and ensure continuous service availability and quality, while other systems run at lower capacity.
Cloud Computing Defined
Large pools of systems are linked together to provide IT services
Service-based online economy
Resources and services are transparently provisioned and managed
The Need - Cloud Computing
Dramatic growth in connected devices
Real-time data streams
The adoption of service-oriented architectures
Web 2.0 applications
Open collaboration, social networking and mobile commerce
A massive increase in the scale of IT environments is driving the need to manage them as a unified cloud.
Business Solution After Cloud Computing
Cloud-computing-based technologies will enable the borderless delivery of IT services based on actual demand, keeping costs competitive.
Seamless delivery of services to consumers regardless of demand or available computing resources
Virtualization and Grid technologies
Cloud Computing Example
Delivery of online entertainment: distribution of television shows, movies and other videos is moving to the Web. Cloud computing technologies would enable a network of service providers to host the different media. Using cloud computing technology, broadcasters can join forces under a service cooperation contract that lets them tap into advanced services, including content distribution, load balancing and overlay networking, across different platforms in different countries. If there is large demand for a show hosted by a particular site, that site can dynamically 'hire' additional servers and services that other sites are not using.
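The dynamic 'hiring' described in this example can be sketched as a small allocation routine. This is a toy model and not IBM's mechanism: the function name, the site names and the rule of borrowing from the sites with the most idle servers first are all assumptions made for illustration.

```python
def hire_servers(demand, own_capacity, idle_by_site):
    """Cover demand beyond own_capacity by borrowing idle servers from partners.

    idle_by_site maps a partner site name to its count of unused servers.
    Returns (hired, unmet): hired maps site -> servers borrowed, and unmet is
    the shortfall left after every partner's idle servers are used.
    """
    shortfall = max(0, demand - own_capacity)
    hired = {}
    # Borrow from the sites with the most idle servers first.
    for site, idle in sorted(idle_by_site.items(), key=lambda kv: -kv[1]):
        if shortfall == 0:
            break
        take = min(idle, shortfall)
        if take > 0:
            hired[site] = take
            shortfall -= take
    return hired, shortfall


# Assumed example: a popular show needs 100 servers, the host site owns 60.
hired, unmet = hire_servers(100, 60, {"site-b": 10, "site-c": 25})
```

In the assumed example the host borrows 25 servers from one partner and 10 from another, leaving a shortfall of 5 that would have to be met some other way.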
Blue Cloud
Series of cloud computing offerings
Allows corporate data centers to operate more like the Internet
Enables computing across a distributed, globally accessible fabric of resources, rather than on local machines or remote server farms
Reference Material
A Virtualization Experience: IBM Worldwide Grid Implementation, Moon Kim et al., IBM Redbooks, IBM
Grid 2, edited by Ian Foster & Carl Kesselman, Elsevier, 2004
Grid Computing for Energy Exploration, D. Beve, S. E. Zarantonello, N. Kaushik, I. Musat
Summary
1) Storage, security, network and application availability are major Grid challenges
2) Virtualization is an important technology for the Grid
3) Many large Grid projects have been working successfully
4) Think of the Grid for a variety of services, not just for computing alone
5) Grid, virtualization and service orientation have many things in common