Distributed and cloud computing : from parallel processing to ... - GBV

Distributed and Cloud

Computing From Parallel

Processing Internet of

to the

Things

Kai

Hwang

Geoffrey C. Fox Jack J.

AMSTERDAM

•

NEW YORK

BOSTON •

SAN FRANCISCO

ELSEVIER

•

OXFORD •

HEIDELBERG •

PARIS

SINGAPORE

Morgan Kaufmann is

an

•

•

•

LONDON

SAN DIEGO

SYDNEY

•

imprint of Elsevier

TOKYO

Dongarra

M< MORGAN

KAUfMANM

Contents Preface

xv

About the Authors

xix

Foreword

xxi

SYSTEMS MODELING, CLUSTERING, AND VISUALIZATION

PART 1

CHAPTER 1

Distributed System Models and

Enabling Technologies

Scalable Computing 1.1.1

The

Age

over

of Internet

4

the Internet

4

Computing

Paradigms Cyber-Physical Systems

1.1.2 Scalable Computing Trends and New 1.1.3 The Internet of Things and 1.2

Technologies 1.2.1 1.2.2 1.2.3

for Network-Based

Systems Multithreading Technologies. GPU Computing to Exascale and Beyond Memory, Storage, and Wide-Area Networking Multicore CPUs and

1.2.4 Virtual Machines and Virtualization Middleware

1.2.5 Data Center Virtualization for Cloud

1.3

System

Computing

Models for Distributed and Cloud

Computing

8 11

13 14 17 20 22 25

27

1.3.1 Clusters of Cooperative Computers

28

1.3.2 Grid Computing Infrastructures

29

1.3.3 Peer-to-Peer Network Families

32

1.3.4 Cloud Computing

over

the Internet

1.4 Software Environments for Distributed Systems and Clouds

34 36

1.4.1 Service-Oriented Architecture

37

1.4.2

40

1.4.3

(SOA) Trends toward Distributed Operating Systems Parallel and Distributed Programming Models

1.5 Performance, Security, and Energy

Efficiency

42 44

1.5.1 Performance Metrics and

45

1.5.2

Scalability Analysis Fault Tolerance and System Availability Network Threats and Data Integrity Energy Efficiency in Distributed Computing

48

Notes and Homework Problems

Bibliographic Acknowledgments

55

References

56

Homework Problems

58

1.5.3 1.5.4

1.6

3 4

Summary 1.1

1

49 51

56

vii

viii

Contents

CHAPTER 2

Computer Clusters for Scalable Parallel Computing

65

Summary

66 66

2.1 Clustering for Massive Parallelism

Development Design Objectives of Computer Clusters Fundamental Cluster Design Issues Analysis of the Top 500 Supercomputers

66

2.2 Computer Clusters and MPP Architectures

75

2.1.1 2.1.2 2.1.3

2.1.4

Trends

Cluster

2.2.1 Cluster

Organization

and Resource

2.2.2 Node Architectures and MPP 2.2.3 Cluster

Sharing Packaging

System Interconnects

2.2.4 Hardware,

68 69 71

76 77 80

Software, and Middleware Support

83

2.2.5 GPU Clusters for Massive Parallelism

83

2.3 Design Principles of Computer Clusters

87

2.3.1 2.3.2

87

Single-System Image Features High Availability through Redundancy

95

2.3.3 Fault-Tolerant Cluster Configurations 2.3.4

Checkpointing

99

and Recovery Techniques

101

2.4 Cluster Job and Resource Management

104

2.4.1 Cluster Job Scheduling Methods

104

2.4.2 Cluster Job Management

107

Systems

2.4.3 Load Sharing Facility (LSF) for Cluster

Computing

2.4.4 MOSIX: An OS for Linux Clusters and Clouds

2.5 Case Studies of

Top Supercomputer Systems

CHAPTER 3

110 112

2.5.1 Tianhe-IA: The World Fastest

112

2.5.2

116

Cray

XT5 Jaguar: The

2.5.3 IBM Roadrunner. The

2.6

109

Supercomputer in 2010 Top Supercomputer in 2009 Top Supercomputer in 2008

119

Bibliographic Notes and Homework Problems Acknowledgments

120

References

121

Homework Problems

122

Virtual Machines and Visualization of Clusters and Data Centers

121

129 130

Summary 3.1 Implementation Levels of Virtualization

130

3.1.1 Levels of Virtualization

130

3.1.2 VMM

133

Implementation Design Requirements and Providers

3.1.3 Virtualization Support at the OS Level 3.1.4 Middleware

Support

for Virtualization

3.2 Virtualization Structures/Tools and Mechanisms 3.2.1

Hypervisor

and Xen Architecture

135 138

140 140

3.2.2 Binary Translation with Full Virtualization

141

3.2.3 Para-Virtualization with

143

Compiler Support

Contents

3.3 Virtualization of CPU, Memory, and I/O Devices 3.3.1

Hardware

Support

145

for Virtualization

145

3.3.2 CPU Virtualization

147

3.3.3 Memory Virtualization

148

3.3.4 I/O Virtualization

150

3.3.5 Virtualization in Multi-Core Processors

153

3.4 Virtual Clusters and Resource Management 3.4.1

Physical

3.4.2 Live VM 3.4.3 3.4.4

versus

155

Virtual Clusters

Migration Steps

156

and Performance Effects

Migration of Memory, Files, and Network Dynamic Deployment of Virtual Clusters

159

Resources

162 165

3.5 Virtualization for Data-Center Automation

169

3.5.1 Server Consolidation in Data Centers

169

3.5.2 Virtual

171

Storage Management

3.5.3 Cloud OS for Virtualized Data Centers

172

3.5.4 Trust

176

Management

in Virtualized Data Centers

3.6 Bibliographic Notes and Homework Problems

PART 2

179

Acknowledgments

179

References

180

Homework Problems

183

COMPUTING CLOUDS, SERVICE-ORIENTED ARCHITECTURE, AND PROGRAMMING

CHAPTER 4

ix

Cloud Platform Architecture

over

189

Virtualized Data Centers

4.1 Cloud Computing and Service Models 4.1.1

192

Public, Private, and Hybrid Clouds

4.1.2 Cloud

191 192

Summary

Ecosystem

and

196

Enabling Technologies

4.1.3 Infrastructure-as-a-Service 4.1.4 Platform-as-a-Service

192

200

(IaaS)

(PaaS)

and Software-as-a-Service

4.2 Data-Center Design and Interconnection Networks 4.2.1 Warehouse-Scale Data-Center

203

206 206

Design

4.2.2 Data-Center Interconnection Networks 4.2.3 Modular Data Center in

(SaaS)

208

Shipping Containers

211

4.2.4 Interconnection of Modular Data Centers

212

4.2.5

213

Data-Center

4.3 Architectural

Management

Design

of

Issues

Compute

4.3.1

A Generic Cloud Architecture

4.3.2

Layered Cloud

4.3.3

Virtualization

4.3.4 Architectural

Architectural

Support

and

Storage

Design

Development

and Disaster

Design Challenges

Recovery

Clouds

215 215 218 221 225

x

Contents

4.4 Public Cloud Platforms: GAE, AWS, and Azure

227

4.4.1 Public Clouds and Service

227

4.4.2

229

Offerings Google App Engine (GAE)

4.4.3 Amazon Web Services (AWS)

231

4.4.4 Microsoft Windows Azure

233

4.5 Inter-cloud Resource Management

234

4.5.1 Extended Cloud Computing Services

235

4.5.2 Resource Provisioning and Platform

237

4.5.3 Virtual Machine Creation and

243

Deployment Management

4.5.4 Global Exchange of Cloud Resources

246

4.6 Cloud Security and Trust Management

249

4.6.1 Cloud Security Defense

253

4.6.3 Data and Software Protection

255

4.6.4 4.7

249

Strategies

4.6.2 Distributed Intrusion/Anomaly Detection

Reputation-Guided

Techniques

Protection of Data Centers


Bibliographic

Acknowledgements

261

References

261

Homework Problems

265

CHAPTER 5 Service-Oriented Architectures for Distributed Computing

271 272

Summary 5.1

257

261

272

Services and Service-Oriented Architecture 5.1.1 REST and Systems of

273

Systems

5.1.2 Services and Web Services

277

5.1.3

282

Enterprise Multitier

Architecture

283

5.1.4 Grid Services and OGSA 5.1.5 Other Service-Oriented Architectures and

Systems

5.2 Message-Oriented Middleware 5.2.1

Enterprise

287

289 289

Bus

5.2.2 Publish-Subscribe Model and Notification

291

5.2.3

291

and

Queuing

5.2.4 Cloud

or

Messaging Systems Applications

Grid Middleware

5.3 Portals and Science 5.3.1 Science

Gateways

Gateway Exemplars

5.3.2 HUBzero Platform for Scientific Collaboration 5.3.3

Environments (OGCE)

Open Gateway Computing 5.4 Discovery, Registries, Metadata, and Databases 5.4.1 UDDI and Service

291

294

Registries

295 297 301

304 304

5.4.2 Databases and Publish-Subscribe

307

5.4.3 Metadata

308

Catalogs

5.4.4 Semantic Web and Grid

309

5.4.5 Job Execution Environments and Monitoring

312

Contents

5.5 Workflow in Service-Oriented Architectures 5.5.1

Basic Workflow

314 315

Concepts

5.5.2 Workflow Standards

316

5.5.3 Workflow Architecture and 5.5.4 Workflow Execution 5.5.5

5.6

CHAPTER 6

Scripting

Bibliographic

Workflow

317

Specification

319

Engine

System Swift

321


322

Acknowledgements

324

References

324

Homework Problems

331

Cloud

Programming

and Software Environments

335 336

Summary 6.1

Features of Cloud and Grid Platforms

336

6.1.1

336

Cloud

Capabilities and

Platform Features

6.1.2 Traditional Features Common to Grids and Clouds

336

6.1.3 Data Features and Databases

340

6.1.4

341

Programming

and Runtime

6.2 Parallel and Distributed

Support

343

Programming Paradigms

6.2.1 Parallel Computing and Programming Paradigms

344

6.2.2

MapReduce, Twister,

345

6.2.3

Hadoop Library from Apache

and Iterative

MapReduce

355

6.2.4 Dryad and DryadLINQ from Microsoft

359

6.2.5 Sawzall and Pig Latin High-Level Languages 6.2.6 Mapping Applications 6.3

to Parallel and Distributed

365

Systems

370

6.3.1 Programming the Google App Engine

370

6.3.2 Google File System (GFS)

373

6.3.3 BigTable, Google's NOSQL System

376

Programming 6.4.1

on

Programming

6.4.2 Amazon

379

Amazon AWS and Microsoft Azure on

Amazon EC2

Simple Storage

Service

6.4.3 Amazon Elastic Block Store 6.4.4 Microsoft Azure

6.5

Emerging

379 380 382

(S3)

(EBS)

and

SimpleDB

383

Programming Support

384

Cloud Software Environments

387

6.5.1 Open Source Eucalyptus and Nimbus 6.5.2 OpenNebula, 6.5.3 Manjrasoft 6.6

368

Programming Support of Google App Engine

6.3.4 Chubby, Google's Distributed Lock Service 6.4

xi

Sector/Sphere,

and

Aneka Cloud and

Bibliographic Notes Acknowledgement

'.

OpenStack

Appliances

and Homework Problems

387

389 393 399 399

References

399

Homework Problems

405

xii

Contents

PART 3

GRIDS, P2P,

CHAPTER 7

Grid

AND THE FUTURE INTERNET

Computing Systems and Resource Management

7.1 Grid Architecture and Service Modeling Grid

7.1.2 CPU

7.1.3

415 416

Summary 7.1.1

413

416

and Service Families

416

and Virtual

419

History Scavenging

Supercomputers

Grid Services Architecture (OGSA)

Open

422

7.1.4 Data-Intensive Grid Service Models

425

7.2 Grid Projects and Grid Systems Built

427

7.2.1

National Grids and International

Projects

430

7.2.3 DataGrid in the

431

7.2.4 The ChinaGrid

Union

European Design Experiences

434

7.3 Grid Resource Management and Brokering

435

7.3.2

437

7.3.4

Management and Job Scheduling Grid Resource Monitoring with CGSP Service Accounting and Economy Model Resource Brokering with Gridbus

7.4 Software and Middleware for Grid Computing 7.4.1

Source Grid Middleware Packages

Open

439

440 443 444

7.4.2 The Globus Toolkit Architecture (GT4)

446

7.4.3 Containers and Resources/Data

450

7.4.4 The ChinaGrid 7.5 Grid

Application

7.5.1 Grid

Support

Platform (CGSP)

Trends and

Applications

and

Management

Security

Technology

Measures

Fusion

452 455 456

7.5.2 Grid Workload and Performance Prediction

457

7.5.3 Trust Models for Grid

461

Security

Enforcement

7.5.4 Authentication and Authorization Methods 7.5.5 Grid

Security

Infrastructure (GSI)

Bibliographic Acknowledgments

470

471

Distributed and cloud computing : from parallel processing to ... - GBV

Distributed and cloud computing : from parallel processing to ... - GBV

Suggest Documents

Distributed and Cloud Computing From Parallel Processing to the ...

Cloud Computing - Parallel and Distributed Systems

Service-oriented Computing and Cloud computing - Distributed ...

Parallel Distributed Processing Approaches to ... - Semantic Scholar

DISTRIBUTED PARALLEL PROCESSING TECHNIQUES ... - CiteSeerX

From Cloud Computing to Fog Computing

Parallel and Distributed Computing Laboratory - Semantic Scholar

Parallel and Distributed Computing Laboratory - CiteSeerX

Parallel, distributed and GPU computing technologies ... - BioMedSearch

Parallel, distributed and GPU computing technologies ... - BioMedSearch

Parallel and Distributed Computing Using the Java

Carrier-Grade Distributed Cloud Computing

Parallel and Distributed Stream Processing: Systems ... - Hal

Parallel and Distributed Processing - Semantic Scholar

[Book] [pdf] Distributed and Cloud Computing: From ... - Google Sites

IRJET- Cloud Computing and Distributed System

Grid, Distributed and Cloud Computing Resources Primer

Distributed and Cloud Computing - Elsevier Store

Cloud Computing and Distributed Systems - Usc

Teaching Parallel and Distributed Computing to ... - Semantic Scholar

Parallel Computing and GPU Processing Approaches to GAIL Routines

Distributed Aggregation for Data-Parallel Computing ... - SIGOPS

Parallel Distributed Computing using Python - High-Performance ...

Distributed Aggregation for Data-Parallel Computing - CiteSeerX