• Advanced Operating Systems (Distributed Operating Systems ...

Jan 21, 2009 ... Basic concepts of computer architecture ... “Distributed Systems: Concepts and Design (4th ... edition)” by George Coulouris, Jean Dollimore,.
CS550
• Advanced Operating Systems (Distributed Operating Systems)
• Instructor: Xian-He Sun
  – Email: [email protected], Phone: (312) 567-5260
  – Office hours: 1:30pm-2:30pm Tuesday, Thursday at SB229C, or by appointment

• TA: TBA
  – Email: [email protected]
  – Office: xxx
  – Office hours: TBA

• Blackboard:
  – http://blackboard.iit.edu

• Class Web site – http://www.cs.iit.edu/~sun/cs550.html X.Sun (IIT)

CS550: Advanced OS

Lecture 1 Page 1

Outline
• Course information
• Key issues of distributed operating systems
• Hardware concepts
  – Multiprocessors
  – Multicomputers
  – Distributed systems
• Software concepts
  – Uniprocessor OS
  – Distributed OS
  – Network OS
  – Middleware


What This Course is About
• Understanding the fundamental concepts of distributed operating systems, and of distributed systems in general
• Learning distributed programming techniques
  – Multithreading, RPC, RMI, Sockets, MPI, etc.
• Understanding the general principles of distributed paradigms
  – MPI, JINI, NFS, Web Service, Grid, etc.


Prerequisite
• CS450 “Operating Systems”
• Familiar with
  – Programming in C/C++ or Java
  – UNIX tools and development environment
    • Command
    • Editors (vi, emacs), compilers (gcc), makefiles (GNU make)
  – Network programming
    • Sockets
    • Multithreading
    • RPC, Java RMI
  – Basic concepts of computer architecture


Course Materials
• Required:
  – “Distributed Systems: Principles and Paradigms (2nd edition)” by Tanenbaum and Van Steen, Pearson/Prentice Hall 2007
• Recommended:
  – “Distributed Systems: Concepts and Design (4th edition)” by George Coulouris, Jean Dollimore, and Tim Kindberg, Addison-Wesley, 2005
• Supplemental readings
  – “Virtual Machines: Versatile Platforms for Systems and Processes” by Jim Smith and Ravi Nair, Morgan Kaufmann, 2005


Misc. Course Details
• You are expected to attend all of the lectures and presentations
• Grading
  – Written and programming assignments (35%): individual work
  – One exam (35%)
  – Final project (30%): individual or group with 2-3 students
• Use the course blackboard
  – Announcements
  – Lecture notes
  – Assignments
  – Discussion
  – …


Policies
• Collaboration
  – Encouraged for high-level concepts and understanding the course materials
  – but …..
• Cheating
  – Copying all or part of another student's homework
  – Allowing another student to copy all or part of your homework
  – Copying all or part of code found in a book, magazine, the Internet, or other resource


Any Questions?


Personal Introduction
• Research interests
  – Middleware
  – High End Computing
  – Performance Analysis and Modeling
• Research group:
  – Scalable Computing Software Laboratory (SCS)
  – http://www.cs.iit.edu/~scs/
  – Weekly Research seminar


Distributed Computing at SCS
Many workstations are made available for graduate students


Scalable Computing Software (SCS) Lab.
[Figure panels: Distributed Optical Testbed (Grid); Parallel Computers at SCS; Pervasive Computing Environments at SCS. Labels on the testbed map: NU-C, NU-E, UIC, ANL, Star Tap, IIT, U of C, NCSA/UIUC, I-WIRE, OMNI]

Scalable Systems
• A computer system is called scalable if it can scale up to accommodate ever-increasing performance and functionality demands
• Software is called scalable if it can maintain its functionality and efficiency while the underlying computer system and problem scale up


Evolution of Computing
• Bigger becomes even bigger
• Smaller becomes ever smaller, & connected
Japan’s Earth Simulator
• 640 processor nodes (PNs)
• Each PN is a system with 8 vector-type arithmetic processors (APs)
• Peak performance 40 Tflops
[Figure labels: approx. 50m x 65m x 17m; 1.4m x 1m x 2m]
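The Earth Simulator figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: it takes the three numbers quoted on the slide (640 PNs, 8 APs per PN, a rounded 40 Tflops peak) and derives the total AP count and the per-AP peak those numbers imply. The function names are made up for this example.

```python
# Figures quoted on the slide for Japan's Earth Simulator.
PROCESSOR_NODES = 640      # processor nodes (PNs)
APS_PER_NODE = 8           # arithmetic processors (APs) per PN
PEAK_TFLOPS = 40.0         # rounded peak performance from the slide


def total_aps(nodes: int, aps_per_node: int) -> int:
    """Total number of arithmetic processors in the machine."""
    return nodes * aps_per_node


def peak_gflops_per_ap(peak_tflops: float, nodes: int, aps_per_node: int) -> float:
    """Per-AP peak (in Gflops) implied by the system-wide peak."""
    return peak_tflops * 1000.0 / total_aps(nodes, aps_per_node)


if __name__ == "__main__":
    print(total_aps(PROCESSOR_NODES, APS_PER_NODE))                        # 5120
    print(peak_gflops_per_ap(PEAK_TFLOPS, PROCESSOR_NODES, APS_PER_NODE))  # 7.8125
```

Because the 40 Tflops on the slide is a rounded value, the per-AP result is an implied figure, not a hardware specification.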

CS546

Lecture 1 Page 13

Scalable Computing
The way to high performance
• The fastest supercomputer in November 2007
• Scalable ultracomputer targeted for 106,496 compute nodes
• BlueGene/L is now running with 212,992 processors and 478.2 TFlops on LINPACK
• Peak performance: 596 TFlops
[Figure: BlueGene/L]
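The BlueGene/L numbers on this slide are enough to work out how much of the peak the LINPACK run achieves. The following is a minimal sketch using only the slide's figures. The helper names are hypothetical, and the "processors per node" value is simply the ratio of the two counts quoted above.

```python
# Figures quoted on the slide for BlueGene/L (November 2007).
LINPACK_TFLOPS = 478.2     # sustained LINPACK performance
PEAK_TFLOPS = 596.0        # theoretical peak performance
PROCESSORS = 212_992       # processors currently running
TARGET_NODES = 106_496     # compute nodes the design targets


def linpack_efficiency(sustained_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak achieved on LINPACK."""
    return sustained_tflops / peak_tflops


def processors_per_node(processors: int, nodes: int) -> float:
    """Ratio of the processor count to the node count."""
    return processors / nodes


if __name__ == "__main__":
    eff = linpack_efficiency(LINPACK_TFLOPS, PEAK_TFLOPS)
    print(f"LINPACK efficiency: {eff:.1%}")                   # 80.2%
    print(processors_per_node(PROCESSORS, TARGET_NODES))      # 2.0
```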


Multicore Adds Another Dimension
IBM Multicore
• Cell
  – 1 PPE and 8 SPEs
  – Shared L2 cache
  – EIB
• Power6
  – Dual core
  – 5 GHz


Multi-Core
• Motivation for Multi-Core
  – Exploits improved feature-size and density
  – Increases functional units per chip (spatial efficiency)
  – Limits energy consumption per operation
  – Constrains growth in processor complexity
• Challenges resulting from multi-core
  – Relies on effective exploitation of multiple-thread parallelism
    • Need for parallel computing model and parallel programming model
  – Aggravates memory wall
    • Memory bandwidth
      – Way to get data out of memory banks
      – Way to get data into multi-core processor array
    • Memory latency
    • Fragments L3 cache
  – Pins become strangle point
    • Rate of pin growth projected to slow and flatten
    • Rate of bandwidth per pin (pair) projected to grow slowly
  – Requires mechanisms for efficient inter-processor coordination
    • Synchronization
    • Mutual exclusion
    • Context switching
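The coordination mechanisms named on the multi-core slide (synchronization and mutual exclusion) can be illustrated with a small threaded counter. This is a sketch, not part of the course material: the function `run_counter` and its parameters are hypothetical. It uses Python's standard `threading` module, with a `Lock` for mutual exclusion and `join()` as the synchronization point.

```python
import threading


def run_counter(num_threads: int = 4, increments: int = 10_000) -> int:
    """Increment a shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()  # mutual exclusion for the shared counter

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may perform the read-modify-write.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # synchronization: wait until every worker has finished
    return counter


if __name__ == "__main__":
    print(run_counter())  # 40000
```

Without the lock, the `counter += 1` read-modify-write from different threads can interleave and lose updates. The lock serializes that critical section, at the cost of contention between threads.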

Distributed Computing: What Is New
• Supercomputers become ever more powerful
• Communities of “Virtual organizations” are formed
• No VO possesses all required skills and resources
• From “community sharing” to “information grid”


Integrated VOs: the Grid
Mimic the electrical power grid
• Increased Efficiency
• Higher Quality of Service
• Increased Productivity
• Reduced Complexity & Cost
• Improved Resiliency


The Challenge of Grid Computing
Virtualization and Resource Management
• Many sources of data, services, computation
• Registries organize services of interest to a community
• Data integration activities may require access to, & exploration/analysis of, data at many locations
• Security & policy must underlie access & management decisions
• Resource management is needed to ensure progress & arbitrate competing demands
• Exploration & analysis may involve complex, multi-step workflows
[Figure labels: resources (R), resource managers (RM), Discovery, Access, Security service, Policy service]

Cloud Computing
Mimic the electrical power grid
• Increased Efficiency
• Higher Quality of Service
• Increased Productivity
• Reduced Complexity & Cost
• Improved Resiliency


What is Cloud Computing?
• A computing paradigm in which tasks are assigned to a combination of connections, software and services accessed over a network
• The network of servers and connections is collectively known as the cloud
• Other terms
  – Mesh Computing
  – Elastic Cloud Computing
  – Network Computing


What are the differences between Cloud & Grid?
• A commercial version of Grid computing
• More likely under single management (VO)
• More likely to provide computing resources than resource sharing
• More likely to serve lots of modest-size jobs
• Billions of dollars being spent by the likes of Amazon, Google, and Microsoft to create real commercial grids
• The prospect of needing only a credit card to get on-demand access to 100,000+ computers in tens of data centers distributed throughout the world


Cloud: Integrated Resource
Provide virtual computing environments on demand


Virtualization and Virtual Machine
• Virtual service and virtual machine: the key for data center and Cloud computing
• Virtual machine (in distributed environment): A hosting platform where each user can create and operate in private machine(s), based on Grid/distributed infrastructure, achieving:
  – Virtualization
  – Isolation and Protection
  – Privacy
  – Accountability and QoS
  – On-demand creation and provisioning


Distributed (Dynamic) Virtual Machine
[Figure labels: DVM, DVM’, Virtual service node, DVM Host (physical)]


Virtualization: Key Technique
• Two-level OS structure
  – Host OS
  – Guest OS
• Strong isolation
  – Administration isolation
  – Installation isolation
  – Fault / attack isolation
  – Recovery, migration, and reconfiguration
• Virtual service node
  – DVM Service (DVMS)
  – Guest OS
  – Internetworking enabled
[Figure labels: One DVM host; Host OS; Guest OS … Guest OS; DVMS1 … DVMSn]

Embedded Systems: What Is New
• Devices become smaller and more powerful
• Devices are coordinated via network
• From “autonomous computing” to coordinated “human-centered computing”


Pervasive Computing MIT’s view of pervasive computing


Evolution of Computing
[Figure: evolution from Distributed Computing to Grid Computing, Mobile Computing, Pervasive Computing, and a Global Smart Space (Cloud). Labels in the figure: Federated Communities, Virtualization, Standardization, Uneven Conditions; Remote Comm., Coordination, High Availability, Security & Fault Tolerance; Mobility, Mobile Network, Adaptive & Reflective, Energy Aware system; Smart Space, Invisibility, Localized Scalability, Context Awareness]

Service Oriented Computing
Computing as a service
• Internet computing: Web service
• Grid computing: Grid service, which is merging with WS
• Pervasive computing: Human-centered service
• Mobile computing: Phone service
[Figure labels: Started far apart in applications & technology; Have been converging; WSRF; WS-I Compliant Technology Stack; Convergence of Core Technology Standards allows Common base for Business and Technology Services]

Future Computing: Human-centered Service
A new IT boom is coming
• Devices become smaller and more powerful
• A device is an entry point to the cyber world
• They are connected to form a “smart space”
• Grids link “smart spaces” to support “global smartness”


The Third Wave of Computing Revolutions
• Network, communication, and interconnectivity
• Began in the late 90s and continues today
• Machine/machine, software/software, people/people
• Anytime, anywhere, WWW
• The communications landscape is shifting
• Promising, but still a work in progress

How do we get there? – Many Challenges ahead!


Resource Management & Task Scheduling
• DVM provider selection:
  – Among a set of DVM providers, which one should be chosen to host a DVM?
• DVMS selection:
  – Among a set of potential tenants (DVMSes), which ones to host? (for QoS, resource utilization, security…)
• The Grid Harvest Service (GHS) System
  – A long-term application-level performance prediction and task scheduling system for nondedicated distributed (Grid) environments
  – Reservation-based versus shared resources
  – Good, but more issues, such as QoS and fault tolerance (FT), remain to be solved


Processor-memory performance gap
• Processor performance increases rapidly
  – Uni-processor: ~52% until 2004, ~25% since then
  – New trend: multi-core/many-core architecture
• Intel TeraFlops chip, 2007
  – Aggregate processor performance much higher
• Memory: ~9% per year
• Processor-memory speed gap keeps increasing
The Memory-wall problem requires a rethinking of the design of OS
[Figure: Performance (log scale, 1 to 100,000) versus Year (1980 to 2010) for multi-core/many-core processors, uni-processors, and memory. Curve annotations: 60%, 52%, 25%, 20%, 9%. Sources: Intel, OCZ]
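The growth rates on this slide compound, which is why the gap widens so quickly. The sketch below is an illustration under stated assumptions: it uses the slide's approximate rates (~52% per year for uni-processors, ~9% per year for memory) and assumes both start from the same baseline of 1. The function names are hypothetical.

```python
def relative_performance(annual_growth: float, years: int) -> float:
    """Performance relative to the starting year under compound growth."""
    return (1.0 + annual_growth) ** years


def processor_memory_gap(years: int,
                         cpu_growth: float = 0.52,
                         mem_growth: float = 0.09) -> float:
    """Ratio of processor to memory performance after `years` years.

    Assumes both start at the same baseline and grow at constant rates;
    the defaults are the approximate figures quoted on the slide.
    """
    return (relative_performance(cpu_growth, years)
            / relative_performance(mem_growth, years))


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(f"after {n:2d} years the gap is about {processor_memory_gap(n):,.0f}x")
```

Under these assumptions the ratio grows by a factor of 1.52 / 1.09 (roughly 1.39) every year, so even a modest difference in annual rates becomes orders of magnitude over two decades.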

Gordon Moore’s Law “the number of transistors that can be fabricated on a single integrated circuit at a reasonable cost doubles every year…”

• How? – Material techniques such as extreme ultraviolet lithography (