A Distributed Device Paradigm for Commodity Applications

Enrique Cauich, John Duselis, Richert Wang and Isaac Scherson
Dept. of Computer Science – Systems
Donald Bren School of Information and Computer Sciences
University of California, Irvine
Irvine, CA 92697

1


Observation (Motivation)

In most modern (corporate) networked system installations, a large number of computers are underutilized. Powerful machines are devoted to little work and/or a few non-demanding tasks, while other processing computers (used for data analysis) may require more resources than they normally have installed.

2


Conjectures

• Remote high-performance components/devices can be aggregated and presented to a system as local virtual devices.
• These virtual devices can deliver better performance than local physical devices by using the network links as a virtual system bus.
• Applications using virtual devices do not require any code modifications.

3


Novel Paradigm's Approach

[Figure: two architecture stacks. Left, the conventional library-level implementation (e.g. PVM): applications call a middleware layer above the system and network to reach remote service providers. Right, the new system perspective with device-level aggregation implemented at the driver level: applications see networked resources (CPU, RAM, storage devices, graphics devices) through the system itself.]

4


Device Level Aggregation / Driver Level Implementation

[Figure: applications run in user mode and call the operating system APIs; in kernel mode the kernel subsystem/microkernel dispatches to conventional drivers and to virtual drivers (VDrivers) that expose a virtual disk and a virtual graphics card. The VDrivers communicate through a middleware layer and a high-speed network with VComponents exported by remote hardware: CPU, RAM, flash drive, SAN, RAM and graphics card.]

5
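To make the VDriver-to-VComponent path above concrete, the following is a minimal sketch (not the paper's implementation) of how a virtual driver could frame a block I/O request and ship it through the middleware link to a lender. The opcode/block/length header, the helper names and the reply format are all assumptions made for illustration.

```python
# Hypothetical sketch: how a VDriver might frame a block I/O request and
# forward it over the middleware link to a remote VComponent.  The message
# layout (opcode, block number, length) is illustrative, not the paper's
# actual wire protocol.
import socket
import struct

OP_READ, OP_WRITE = 0, 1
HEADER = struct.Struct("!BQI")  # opcode, block number, payload length


def send_request(sock: socket.socket, opcode: int, block: int, payload: bytes = b"") -> bytes:
    """Send one framed request to a lender's VComponent and return its reply."""
    sock.sendall(HEADER.pack(opcode, block, len(payload)) + payload)
    reply_len = struct.unpack("!I", _recv_exact(sock, 4))[0]
    return _recv_exact(sock, reply_len)


def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since TCP delivers a byte stream, not messages."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("lender closed the connection")
        buf += chunk
    return buf
```

The unmodified application never sees this path: it issues ordinary file or device I/O through the operating system APIs, and the framing happens entirely inside the kernel-mode VDriver, with a matching VComponent service on the lender unpacking the same header.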


Resource Virtualization and Aggregation

• Resource Virtualization is divided into two phases
  – Lender side:
    • Implements a Component Abstraction Layer (CAL).
    • Provides basic Building Blocks.
  – Borrower side:
    • Implements a Device Abstraction Layer (DAL).
    • Creates a specific virtual device satisfying the needs of the application using the Building Blocks (a code sketch follows this slide).
• Resource Aggregation is done at the borrower side
  – Resource Discovery
  – Resource Management
  – Resource Consolidator

[Figure: the borrower's DAL and aggregation logic connect over the network to the lenders' CAL, which exports CPU, memory and graphics card; borrower virtualization on one side, lender virtualization on the other.]

6
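A minimal sketch of the lender/borrower split described on this slide, with invented class and function names (the paper does not define an API): lenders advertise building blocks through the CAL, and the borrower's DAL discovers them, selects enough capacity and consolidates them into one virtual device.

```python
# Illustrative sketch of the CAL/DAL split; all names are invented for the
# example and do not come from the paper.

class BuildingBlock:
    """Lender side (CAL): one exported slice of a physical resource."""
    def __init__(self, lender: str, kind: str, capacity: int):
        self.lender, self.kind, self.capacity = lender, kind, capacity


class VirtualDevice:
    """Borrower side (DAL): a device assembled from remote building blocks."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.capacity = sum(b.capacity for b in blocks)


def discover(network):
    """Resource discovery: collect the building blocks advertised by lenders."""
    return [block for lender in network for block in lender]


def build_device(network, kind: str, needed: int) -> VirtualDevice:
    """Resource manager + consolidator: pick enough blocks of the right kind."""
    chosen, total = [], 0
    for block in discover(network):
        if block.kind == kind and total < needed:
            chosen.append(block)
            total += block.capacity
    if total < needed:
        raise RuntimeError("not enough idle resources on the network")
    return VirtualDevice(chosen)


# Example: aggregate 3 GB of idle RAM from two lenders into one virtual disk.
network = [[BuildingBlock("lender-a", "ram", 2_000_000_000)],
           [BuildingBlock("lender-b", "ram", 2_000_000_000)]]
ramdisk = build_device(network, "ram", 3_000_000_000)
print(ramdisk.capacity)  # 4000000000 bytes visible to the borrower as one device
```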


Virtualization Paradigm

[Figure: borrower virtualization implements the DAL: the virtual device driver interface, consolidator, resource discovery, aggregation and resource manager. Across the network, lender virtualization implements the CAL, where software and hardware virtualizers export the physical CPU, memory and graphics card as virtual components.]

7


Virtual Device Architecture

• Exploits parallelism
• Avoids data hazards
• Maintains coherency (see the queueing sketch after this slide)

[Figure: the driver interface places I/O requests on the virtual driver's main queue; requests are dispatched to per-resource queues and their completions are gathered by the consolidator.]

8
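The queueing discipline on this slide can be sketched as follows. The modulo mapping from block numbers to resource queues is an assumption for illustration; the point is that requests to different resources proceed in parallel, requests to the same block keep their issue order (no data hazards), and the consolidator gathers the completions.

```python
# Sketch of the virtual-driver queueing idea: independent requests spread
# across per-resource queues for parallelism, while requests touching the
# same block stay on one queue in issue order.
from collections import deque


class VirtualDriver:
    def __init__(self, num_resources: int):
        self.resource_queues = [deque() for _ in range(num_resources)]

    def submit(self, op: str, block: int, data=None):
        """Main queue -> per-resource queue.  All requests for a given block
        land on the same queue, so their original order is preserved."""
        queue = self.resource_queues[block % len(self.resource_queues)]
        queue.append((op, block, data))

    def consolidate(self):
        """Consolidator: drain every queue and gather the completions."""
        completions = []
        for qid, queue in enumerate(self.resource_queues):
            while queue:
                op, block, _ = queue.popleft()
                completions.append((qid, op, block))
        return completions


driver = VirtualDriver(num_resources=3)
driver.submit("write", block=7, data=b"x")
driver.submit("read", block=7)   # same block: stays behind the write
driver.submit("read", block=8)   # different resource: can overlap
print(driver.consolidate())
```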


Two Experimental Case Studies

• High Performance Distributed RamDisk
  Improves I/O-intensive (disk) applications' performance by using idle semiconductor memory resources in the network.

• Distributed Graphics Device
  A codec device that enhances the system with ray-tracing capabilities by adding virtualized idle processor and graphics devices.

9


Distributed RamDisk

• Harnessing available resources from under-utilized systems in the network.
• Memory is an important resource.
• Generally used as temporary storage to speed up memory references.

Improves I/O-intensive (disk) applications' performance by using idle memory resources in the network. Storage bandwidth is increased and storage latency is decreased (a striping sketch follows this slide).

[Figure: the borrower's driver interface, consolidator, resource discovery and resource manager connect over a high-speed network to RAM VComponents on several lenders.]

10
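One way the RamDisk's address space could be striped across the memory lenders is sketched below; the stripe function and the 4 KB block size are assumptions, not the measured system. Consecutive blocks land on different lenders, so their network transfers overlap and aggregate bandwidth grows with the number of lenders.

```python
# Sketch of block placement for a distributed RamDisk: the virtual disk's
# address space is striped across the lenders' RAM.
BLOCK_SIZE = 4096  # assumed block size for the example


def locate(block: int, num_lenders: int):
    """Translate a virtual block number into (lender id, byte offset on that lender)."""
    lender = block % num_lenders
    local_block = block // num_lenders
    return lender, local_block * BLOCK_SIZE


# With 3 memory lenders, blocks 0, 1, 2 land on lenders 0, 1, 2 and block 3
# wraps back to lender 0, one stripe deeper.
for b in range(6):
    print(b, locate(b, num_lenders=3))
```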


Distributed RamDisk Read Throughput in MB/s

[Figure: read throughput (0 to 100 MB/s) versus number of outstanding requests (0 to 32) for request block sizes of 1 KB to 64 KB.]

11


Distributed RamDisk Write Throughput in MB/s

[Figure: write throughput (0 to 90 MB/s) versus number of outstanding requests (0 to 32) for request block sizes of 1 KB to 64 KB.]

12


Distributed RamDisk Comparison Using 8 Outstanding Requests

[Figure: throughput in MB/s versus request block size (1 KB to 512 KB), comparing the distributed RamDisk (DRD 1 read/write) against local 10k rpm and 7.2k rpm disks (read/write) and the Hinnes and Roussev approaches.]

13


Accordion Effect

[Figure: TCP segments flow from the resource lender's send buffer to the resource borrower's receive buffer. When the receive buffer fills, segments are dropped and repeated ACKs force the lender to retransmit them; only after the retransmitted segments are acknowledged is the next TCP segment sent, so throughput repeatedly expands and contracts.]

14


TCP receive window size

15

TCP receive window size

16


TCP receive window size (Cont.)

Distributed RamDisk throughput while using 1, 2 and 3 lenders

[Figure: throughput in MB/s (0 to 100) versus request block size (1 KB to 64 KB) for one memory lender with a 64 KB TCP receive window, two lenders with a 32 KB window, and three lenders with a 24 KB window.]

17
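The tuning implied by these measurements can be sketched with a standard socket option: SO_RCVBUF influences the advertised TCP receive window, and it must be set before the connection is established to take full effect. The per-lender sizes below simply restate the 64/32/24 KB values from the chart; the fallback rule for other lender counts is an assumption.

```python
# Sketch: shrink each connection's receive buffer as lenders are added so the
# combined in-flight data still fits the borrower, avoiding the accordion
# effect.  SO_RCVBUF is a standard socket option.
import socket

WINDOW_PER_LENDERS = {1: 64 * 1024, 2: 32 * 1024, 3: 24 * 1024}  # from the chart


def make_lender_socket(num_lenders: int) -> socket.socket:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    rcvwnd = WINDOW_PER_LENDERS.get(num_lenders, 64 * 1024 // max(num_lenders, 1))
    # Set before connect() so it influences the window advertised to the lender.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvwnd)
    return sock
```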


Database Applications' Performance Improvement

Given:
• a distributed RamDisk in a 1 borrower / 3 lenders configuration interconnected by a Gigabit switch
• SQL Server 2008
• TPC-H benchmark

[Figure: Index File Access.]

Conjecture: a database application can improve its performance by storing index files in a distributed RamDisk. The database does not require any code modification.

18


Database Performance Improvement

Time spent by the database performing the index search.

Request size distribution on the distributed RamDisk:
  Size     %
  64 KB    35
  128 KB   17
  192 KB   13
  256 KB   17
  512 KB   10

[Figure: index-search time in milliseconds (0 to 160,000) for TPC-H queries 2, 11, 16, 21 and 22, comparing the distributed RamDisk against the WD10 local disk.]

19


Image Processing Applications' Performance Improvement

Conjecture: an image processing application can improve its performance by using the distributed RamDisk as scratch space.

Given:
• a distributed RamDisk in a 1 borrower / 3 lenders configuration interconnected by a Gigabit switch
• Adobe Photoshop CS2

[Figure: Image File Access, an address trace with offsets 000 through 120.]

20


Image Processing Performance Improvement

21


Second Case Study: A Distributed Codec Device

[Figure: a Win32 application uses the Windows Imaging Component, which invokes the codec device; a virtual HAL forwards the work over a high-speed network to graphics hardware/software exposed as VComponents (CPU and graphics card).]

22


Second Case Study: A Distributed Codec Device

• The consolidator partitions the request.
• The resource manager calculates the suitability of each virtual component (see the sketch after this slide).

[Figure: the borrower's driver interface, consolidator, resource discovery and resource manager communicate over the high-speed network with VComponents exporting two CPUs and two graphics cards.]

23
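A sketch of the "intelligent distribution" idea: the resource manager assigns each virtual component a suitability score and the consolidator partitions the request, here counted in image rows, in proportion to those scores. The scoring values and the row-based unit of work are assumptions for illustration; the paper only states that a suitability value is computed per component.

```python
# Sketch of suitability-weighted partitioning for the distributed codec.

def partition(total_rows: int, suitability: dict) -> dict:
    """Split total_rows across components proportionally to their suitability."""
    weight = sum(suitability.values())
    shares, assigned = {}, 0
    for name, score in suitability.items():
        rows = int(total_rows * score / weight)
        shares[name] = rows
        assigned += rows
    # Give any rounding leftover to the most suitable component.
    best = max(suitability, key=suitability.get)
    shares[best] += total_rows - assigned
    return shares


# Example: a lender with a graphics card gets most of the decoding work.
print(partition(1080, {"lender-a (GPU)": 4.0, "lender-b (CPU)": 1.5, "lender-c (CPU)": 1.0}))
```

A naïve distribution would instead split the request evenly, which matches the slower curve on the next slide.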


Codec Results

Image decoding time using different numbers of lenders and different load distribution mechanisms.

[Figure: decoding time in milliseconds (0 to 700,000) for 0 to 7 lenders, comparing intelligent distribution against naïve distribution; borrower and lenders are homogeneous.]

24


Conclusions

• A new level of abstraction, namely the Device Level, and a new level of implementation, namely the Driver Level, are added to the virtualization taxonomy.
• The Device Abstraction Level increases system performance capacity by aggregating idle networked resources and binding them as a single device.
• The Driver Implementation Level provides support for legacy applications by avoiding code modifications.

25

