RI01006, April 2001
Computer Science

IBM Research Report

An Architecture for Virtual Server Farms

Vikas Agarwal, Girish Chafle, Neeran Karnik, Arun Kumar, Ashish Kundu, Johara Shahabuddin, Pradeep Varma

IBM Research Division
IBM India Research Lab
Block 1, I.I.T. Campus, Hauz Khas
New Delhi 110016, India

IBM Research Division: Almaden - Austin - Beijing - Delhi - Haifa - T.J. Watson - Tokyo - Zurich

LIMITED DISTRIBUTION NOTICE: This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). Copies may be requested from IBM T.J. Watson Research Center, Publications, P.O. Box 218, Yorktown Heights, NY 10598 USA (email: [email protected]). Some reports are available on the internet at http://domino.watson.ibm.com/library/CyberDig.nsf/home.
An Architecture for Virtual Server Farms

Vikas Agarwal, Girish Chafle, Neeran Karnik, Arun Kumar, Ashish Kundu, Johara Shahabuddin, Pradeep Varma
IBM India Research Laboratory, Block 1, Indian Institute of Technology, Hauz Khas, New Delhi 110016, INDIA
{avikas, cgirish, kneeran, kkarun, kashish, sjohara, pvarma}@in.ibm.com

Abstract

We generalize the traditional physical, customer-dedicated application server in server farms to a logical, distributed counterpart. The virtual servers thus defined break the traditional, static boundaries separating resources (such as physical servers or static partitions thereof) dedicated to customers, and open up new degrees of freedom for optimized and shared use of server farm resources. An additional benefit of shared resources is the substantial saving in redundant capacity required for handling the peak loads of independent customers of the farm - the probability of facing simultaneous peaks goes down multiplicatively. This benefit derives from the observation that, typically, the customer's load is determined dynamically, and that redundancy has to be built at the level of each static resource unit dedicated to a customer (a unit may individually get overloaded, even while other units are not). By co-allocating components of independent virtual servers on the same physical server, and by keeping the boundaries between the co-allocations dynamic, the redundancy requirement on the physical server is shared and brought down. We propose an architecture for a server farm based on this notion of virtual servers. Our system has the benefit of supporting customer-specific Service Level Agreements (SLAs) in terms of simple metrics of user-generated load. Customers need not know about the deployment and configuration details inside the server farm; they simply specify the functionality they seek and negotiate a (load-based) price. Resource deployment, management, and operation are then the responsibilities of the server farm, while the customer is billed for actual usage as per the SLA. We describe the architecture of the virtual server farm and discuss its management and operational details.

Keywords: Virtual Server, Resource Management, Server Farms, Quality of Service, Service Level Agreements, Application Service Providers, Load Balancing.
I. INTRODUCTION

Application Service Providers (ASPs) host applications that support multiple customers off a shared infrastructure. This model takes advantage of economies of scale (in terms of redundancy, peak-load handling, IT skills, etc.) and of the need for companies to focus on their core competencies rather than on increasingly complex IT management tasks. Therefore, there is a clear trend for corporations to outsource applications, driven by the ASP model. Moreover, since the infrastructure is rented, most of the hardware and software is off-site and shared, thus providing the ASP with ample scope for optimization of resource usage.

We propose an architecture for implementing an ASP's server farm such that a customer's application is supported using a virtual server - a set of distributed software resources, communicating with each other. A virtual server farm is a set of virtual servers coexisting on a set of shared physical machines. We describe mechanisms for automatically managing (allocating, reallocating, provisioning, load-balancing) the resources of such a farm, including dynamically reconfiguring the farm based on the current needs of various customers, governed by their Service Level Agreements (SLAs). This system improves resource utilization by reducing the resources required for handling the peak loads of multiple customers, while ensuring that SLAs are not violated.

The proposed system allows customer-friendly SLAs to be specified in terms of the request (hit) rate for the customer's application. It maps these onto actual resource requirements, using the benchmarked characteristics of the application being hosted. A difficulty of working with hit rates and benchmarks is that actual hits can vary significantly from benchmarking hits. Thus the actual load imposed can be significantly different from the expected load computed using benchmarking hits. Since our system is oriented towards handling of peak loads using shared redundancy, it is able to interpret deviations from expected load as just another variation to be handled using shared redundancy. This allows the use of simpler SLA definitions and reduces the IT awareness required from customers. A customer need not worry about suggesting detailed server configurations. He only specifies a high-level functionality and negotiates a hit-rate-based price function with the ASP. From the perspective of the ASP, our approach allows software/applications to be shared in addition to hardware sharing. This creates more degrees of freedom for optimization and reduces the extent to which redundancy needs to be provided. This results in significant savings for the ASP.

Most current ASPs provide simple services such as web hosting, with customer-dedicated physical servers. Others allow servers to be shared in a relatively static mode, with SLAs that are dependent on the farm implementation. What is lacking is the technology for dynamic, global, efficient resource allocation that can minimize the cost of hosting and handle peak loads intelligently. Moreover, manual administration costs are high in the running of such server farms, and the lack of automation compromises scalability and the resultant economies of scale. Océano [AFF01] improves upon current ASP technology. It provides SLA monitoring, automatic need-based server allocation, and high availability. This enables server farms to automatically handle traffic surges in multiple applications across customers. However, Océano handles allocation and de-allocation in units of entire machines. Besides being an inefficiency as far as redundant capacity is concerned, this approach rules out customers from the small and medium business (SMB) segment whose individual requirements for any given application do not add up to a whole machine. Moreover, only hardware resources are shared - applications are not.

A majority of ASPs that claim to offer shared servers merely use virtual hosts as offered by most HTTP servers (see, for example, http://www.mindsethosting.com/ and http://www.wwow.com). They do not guarantee quality of service, high availability, scalability or resource optimization of the hosting farms. Some exceptions like Ensim [ENSIM] do support the concept of machine fractions, which they call private servers (PSs). However, all system administration, say, migrating customers across PSs or increasing PS resources, must be done manually. In particular, there is no automatic sharing of redundant capacity for handling dynamic fluctuation, or for that matter balancing of load across different servers. Finally, servers are relatively statically partitioned, in that redundancy cannot be recovered automatically by pushing back the static minimum boundaries of a PS to support another customer. Our system scores over Ensim's and similar solutions by automating resource management, optimizing resource usage, distributing load dynamically based on available capacity, and sharing resources among different customers. The virtual servers in our solution are not static. A virtual server is distributed over multiple machines and its boundaries within any machine are kept dynamic and automatically adjustable. This is useful for reassigning resources to the customers who need them most.

We describe our system as follows. In section II we provide an overview. The system architecture is discussed in section III. System initialization, scheduling, and fault handling are discussed in section IV. Section V discusses related work. Finally, conclusions are presented in section VI.
II. OVERVIEW

Resources and Resource Classes: A server farm can be viewed as a collection of hardware as well as software resources. The hardware ranges from simple PC servers for hosting low-volume websites to high-end mainframes for performing complex financial or scientific analyses. Similarly, the software may vary from productivity tools (word processing, spreadsheets) to applications (payroll, online malls) to complex algorithms for specialized domains (games like chess, scientific computations like weather forecasting, etc.). We partition the applications typically needed by the ASP's customers into a set of resource classes. A resource class encapsulates the properties and functionality of a particular type of resource. For example, an IBM DB2 server is an instance of one resource class, Apache's HTTP server is an instance of another resource class, etc.

Virtual Servers and Virtual Server Farms: An application is composed of a set of resource classes. An "application server" comprises instances of the resource classes in that set. Such a server does not correspond to a hardware server in the physical world - we call it a virtual server to highlight this. For example, a web-based retail store application could be composed of a front-end, a middle-tier and a back-end. When this application is deployed in the server farm, multiple instances of each resource class may be created, to enable handling of all the incoming requests in a timely fashion. For example, Acme Inc.'s virtual server may be composed of, say, five instances of the Apache web server, two instances of the Websphere Commerce Server (see http://www.ibm.com/software/webservers/commerce/), and one DB2 database server. These numbers can vary dynamically, based on the current load on Acme's web-store. Thus, a customer's application requirements can be converted into the specification of a virtual server. Three such virtual servers are shown in Figure 1. The ASP must map the resource instances onto physical machines. Our approach is to dedicate machines to particular resource classes, thus creating sub-farms of machines with identical software resources (see Figure 1). This has the advantage of simplifying the ASP's maintenance chores, and eliminating the interference of other resources in the monitoring of a particular resource. Each resource instance can reside on a single machine, or on a set of machines. Conversely, one machine may host multiple resource instances assigned to different customers. The system treats the various instances of the same class as interchangeable with respect to a customer's application, assuming that resource classes support standard interfaces.

Figure 1. Sub-farms Comprising a Virtual Server Farm

Benchmarking: The capacity of a machine, with respect to a resource class, is measured in terms of the hit rate that the resource instance can support when deployed on that machine. All hosted applications are benchmarked to determine how an input hit rate on the application translates to hit rates on the different constituent resource classes. The ratios between these hit rates are termed correspondence ratios. Secondly, we measure the physical resources (such as CPU, RAM, etc.) required by a constituent resource class of an application for a request. This is done for all the machine types in the server farm. The benchmarking results in a k-tuple (one for each physical resource) of usage due to a reference hit on a resource class. A reference hit is a notional hit that corresponds to the measured average load. For many popular applications, such benchmarking information is available from the developers. For other applications, the ASP generates this information by experimentation. Alternatively, the application could be deployed based on liberal initial estimates of the required physical resources.
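For illustration, with hypothetical numbers that are not taken from any benchmark: if each hit on a web-store's front end is measured to generate on average 0.5 hits on the commerce-server class and 0.2 hits on the database class, the correspondence ratios are 1 : 0.5 : 0.2, and an input rate of 100 hits/sec on the application translates to expected rates of 100, 50 and 20 hits/sec on the three constituent classes. The per-reference-hit k-tuple measured on each machine type then converts these class-level hit rates into CPU, RAM and other physical resource requirements.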
Service Level Agreements: As discussed in Section I, we wish to isolate customers from the technical details of the ASP's resources. The customer specifies its requirements in terms of the rate of requests expected for the application. Different values of the hit rate may be specified for different time periods. Based upon this, the customer and ASP negotiate the charges to be levied. In further discussion, we assume that an SLA includes the following:
• The range of hit rates that the application is expected to support, and optionally, an average hit rate
• The price that the customer has agreed to pay for different sub-ranges of this range
• The penalty that the ASP must pay to the customer in case its minimum guarantees are not met
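As a purely illustrative rendering of these three elements (the field names and the use of a Java record are our assumptions, not the paper's data model), an SLA could be held as:

// Illustrative only: one way to hold the SLA elements listed above for one customer's application.
record Sla(
        double minHitRate, double maxHitRate,   // range of hit rates to be supported
        double averageHitRate,                  // optional expected average (0 if unspecified)
        double[] subRangeBoundaries,            // boundaries of the priced sub-ranges of hit rate
        double[] pricePerSubRange,              // price agreed for each sub-range
        double penaltyPerViolation) {}          // penalty if minimum guarantees are not met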
III. SYSTEM ARCHITECTURE

Figure 2 shows a high-level architecture of a virtual server farm. The ASP hosts several customers, whose users access the applications on the ASP infrastructure via the Internet. During initialization, a customer's application is split into Resource Classes as described earlier, and is mapped onto a virtual server. Detailed information about the application and the layout of the various resource instances on physical machines is maintained in a Configuration Repository (CR). The system has many sub-farms, each consisting of multiple instances of a resource class. Traffic to a sub-farm is fed through its Load Distributor (LD), which distributes the load amongst the different instances under it (see Figure 1). Each resource instance has an associated Monitor, which is responsible for collecting metrics on its usage. This information is made available to the Aggregator. Based on the current usage, the Aggregator determines the changes required in the resource allocation for each customer, and suggests these to the Global Decision Maker (GDM). The GDM re-computes the resource allocation to customers with a view to optimizing parameters such as revenue, resource utilization, or perturbation of the current allocation. The new allocation is provided to the Resource Manager (RM), which is responsible for actually implementing the changed allocation plan. Depending upon the scale of the server farm, one or more of these system components may be realized as a collection of interacting modules rather than as a single monolithic component.
Figure 2. Architecture of the Virtual Server Farm. (The figure shows users of customers such as www.rediff.com, www.olympics.com and www.bazee.com reaching sub-farms of resource classes through level-1 to level-n Load Distributors; Load Monitors and RM agents on each machine; the Aggregator and per-customer Decision Maker; the Global Decision Maker; the Resource Manager; and the Configuration Repository. Legend: F = Load Factor, L = Load Information, R = Resource Requirements.)

1. Load Monitors

These are used to monitor the usage of the various resource instances for management and billing purposes. There are two types of monitors for each resource instance:
• Hit rate monitors: They measure the number of hits on an instance per unit time.
• Machine load monitors: They measure the physical resources consumed, as a k-tuple consisting of parameters such as CPU, memory, etc.
Hit rate monitoring is done by the load distributor module (discussed later). It measures the hit rate for each resource instance by aggregating the number of hits forwarded. Machine load monitoring can be done using system programs such as ps or top in Unix, OS-level information, third-party monitoring tools, or load information provided by the application itself. In addition, fault monitoring of resources can be done using a heartbeat mechanism. Both load and fault monitoring interfaces can be standardized using standards like Java Management Extensions [JMX] for supporting a plug-and-serve approach. Also, well-known techniques such as [IBMSP] can be used in accumulating monitored data. Finally, simple rule-based decisions can be used which follow guidelines such as "declare a machine as faulty if X% of its resource instances are unavailable", etc.
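As a rough sketch (our own phrasing; the paper does not define these interfaces), the two monitor types could be exposed to the aggregation layer as follows, assuming a pull model in which the aggregator polls each monitor periodically:

// A minimal sketch, not the paper's actual interfaces.
interface HitRateMonitor {
    // Hits observed on the monitored instance for the given customer during the last interval.
    double hitRate(String instanceId, String customerName);
}

interface MachineLoadMonitor {
    // k-tuple of physical resource usage on the machine (e.g., CPU, memory, ...).
    double[] resourceUsage();
}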
2. Aggregators

Aggregators collect load information from the monitors and aggregate it for efficient decision making. There are two levels of aggregators, as shown in Figure 3.

Figure 3. Aggregation of Load Data. (The Level-1 Aggregator receives hit rates and load information from the Load Monitors and Load Distributor of each sub-farm; it sends the load factor per resource class instance per customer back to the Load Distributor, the hit rate per resource class per customer to the Level-2a Aggregator / per-customer decision maker, and the hit rate and corresponding hit weight per resource instance per customer to the Level-2b Aggregator; the Level-2a and Level-2b Aggregators send, respectively, the change in resource requirements per resource class per customer, and the hit rate and correction factor per resource instance per customer, to the Global Decision Maker.)

The Level-1 aggregator receives monitored data (the current hit rate and the resource usage k-tuple) for each resource instance and aggregates it on a per-resource-class, per-customer basis. The aggregated hit rate of each resource class of a customer is obtained by adding the hit rates of each instance of that class. It sends this information to the Level-2a aggregator. Secondly, it computes the hit weight per resource instance as the ratio of the actual resource usage per hit to the resource usage of a reference hit. The hit weight represents the deviation of the load generated by observed hits from the benchmarking information. It is sent to the Level-2b aggregator along with the corresponding hit rate. Finally, it computes the load factor for each resource instance as the actually used fraction of the allocated resources and sends it to the LD, which uses it to distribute the load proportionally.

The Level-2a aggregator acts as a per-customer decision maker and makes demand projections for each customer, in units of reference hits, to the GDM. Whenever a customer's load comes within a (configurable) factor of its current allocation, a (likewise configurable) increase in allocation is demanded for that customer. Similarly, two corresponding parameters govern the de-allocation of resources. These demand projections are compared against the customer's SLA and pruned down, if needed. All four parameters may be constants or functions of variables such as load, current resource allocation or SLA limits, and are configurable for each customer. A demand for the front-end resource class (which receives requests from the users) of the customer's application is treated differently. Such a request gets translated into a set of requests consisting of appropriate numbers of hits for all the resource classes of the application. To compute this, it uses the correspondence ratios, whose current values are obtained from the CR. If there is a significant change in the observed ratio over a period of time, the value in the CR is updated.

The Level-2b aggregator computes correction factors for each resource instance. These factors compensate for deviations of the actual hit load from the reference-hit load. The aggregator obtains the previous hit weight from the CR, and the current hit rate and hit weight from the Level-1 aggregator, to compute the correction factor as:

Correction factor = (current hit weight - previous hit weight) * number of reference hits allocated

It also computes the total deviation by aggregating these correction factors across all instances of each resource class for each customer, and sends it to the GDM along with the correction factors.
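The three per-instance quantities above are simple ratios and differences; the following sketch (our own code, with the assumption that usage, capacity and hit rates are plain doubles in consistent units) restates them directly:

// Sketch of the Level-1/Level-2b computations described above.
final class AggregatorMath {
    // Ratio of observed resource usage per hit to the usage of a reference hit.
    static double hitWeight(double observedUsagePerHit, double referenceHitUsage) {
        return observedUsagePerHit / referenceHitUsage;
    }

    // Fraction of the allocated capacity actually consumed; the LD uses this to pick
    // the least-loaded instance.
    static double loadFactor(double usedCapacity, double allocatedCapacity) {
        return usedCapacity / allocatedCapacity;
    }

    // Correction factor = (current hit weight - previous hit weight) * allocated reference hits.
    static double correctionFactor(double currentHitWeight, double previousHitWeight,
                                   double allocatedReferenceHits) {
        return (currentHitWeight - previousHitWeight) * allocatedReferenceHits;
    }
}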
3. Global Decision Maker (GDM)

The GDM accepts the set of latest demands as input and computes an allocation plan for the server farm, with the aim of maximizing the ASP's revenue. The plan computation matches the supply of machines to customer demands in the various resource classes. We model this as a mixed integer linear program [BHM77]. The global optimization problem is hard to solve in an exact manner, so we rely on a heuristic solution. There are many candidate approaches, such as standard LP techniques, LP relaxation of the problem, cutting planes, branch-and-bound, and column generation techniques, that can generate efficient, approximate solutions. We choose a simple heuristic comprising of solving using LP relaxation and then re-solving the simplified problem after setting the integral variables to rounded-off (but feasible) values. Appendix A contains details of our model. A preprocessing module simplifies variables and constraints so as to speed up the solution. The solution process is halted after a short time for the best solution obtained till then. The new allocation plan is given to the Resource Manager for deployment.
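One possible reading of this heuristic, written as a control-flow sketch around a hypothetical solver abstraction (the LpSolver interface below is our own, not an existing library API; the Z variables are the 0/1 instance indicators of Appendix A), is:

import java.util.HashMap;
import java.util.Map;

// Sketch only: solve the LP relaxation, round the 0/1 indicator variables to feasible
// values, fix them, and re-solve for the continuous allocations.
class RelaxAndRoundSketch {
    // Hypothetical solver abstraction: solves the relaxed LP with the given variables fixed.
    interface LpSolver {
        Map<String, Double> solve(Map<String, Double> fixedVariables);
    }

    static Map<String, Double> computePlan(LpSolver solver) {
        Map<String, Double> relaxed = solver.solve(Map.of());   // plain LP relaxation
        Map<String, Double> fixed = new HashMap<>();
        for (Map.Entry<String, Double> e : relaxed.entrySet()) {
            if (e.getKey().startsWith("Z")) {                   // 0/1 instance indicators
                // Round up to 1 whenever fractional, so the capacity constraints stay feasible.
                fixed.put(e.getKey(), e.getValue() > 0.0 ? 1.0 : 0.0);
            }
        }
        return solver.solve(fixed);                             // re-solve the simplified problem
    }
}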
4. Resource Manager

The Resource Manager (RM) translates the GDM's allocation plans into actions performed on physical machines and actual processes, to arrive at the desired configuration. It creates and deletes resource instances, interfaces with OS-level mechanisms to change resource allocations, and creates and deletes sub-farms. All these actions are transparent to the customers' applications.

The RM has an active component (RM Agent) on each machine in the farm, which provides an interface for managing the resource instances on that machine. When the machine boots up, the agent starts executing and registers its location in the Configuration Repository (CR). It then queries the CR to determine the low-level startup and shutdown commands for the resource classes on its machine.

A machine in a sub-farm may host multiple instances of one resource class, each having a unique identifier (instance ID). Each instance may support hits from multiple customers, and each customer may be allocated a different maximal hit rate on that instance. The RM Agent for the machine maintains a table for each instance, containing these hit-rate allocations per customer. The agent presents the following interface to the RM:

startup() - Start a resource instance and return its instance ID
shutdown(instanceID) - Shut down the specified resource instance
setAllocation(instanceID, customerName, hitRate) - Set a customer's allocation on the specified instance to the given maximal hit rate
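Expressed as a Java interface (a sketch: the operation names follow the table above, while the parameter and return types are our assumptions):

interface RmAgent {
    // Start a resource instance on this machine and return its instance ID.
    String startup();

    // Shut down the specified resource instance.
    void shutdown(String instanceId);

    // Set the customer's allocation on the specified instance to the given maximal hit rate.
    void setAllocation(String instanceId, String customerName, double hitRate);
}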
Figure 4. Resource Manager Interactions. (The figure shows the GDM passing a plan to the RM, which drives the RM Agents (A) on machines M1 through Mn; the machines are grouped into sub-farms hosting resource instances, and the RM also interacts with the CR.)
Implementing an allocation plan: The RM has a current plan reflecting the current status of the farm, and a new plan provided by the GDM. At farm startup time, the current plan indicates that no resources are allocated to any customer. The RM waits until a fresh allocation plan is generated by the GDM. This plan is a three-dimensional table. For each customer, it lists the allocation for each instance of every resource class that the customer requires. The RM compares the resource instance allocations of the new plan (new_alloc) with the ones in the current plan (curr_alloc). Several cases are possible:

Case I: curr_alloc = new_alloc. Because there is no change in allocation, no action needs to be taken.

Case II: curr_alloc = 0, new_alloc ≠ 0. A new instance may have to be created (if all customers' current allocations on it are zero). The RM invokes the startup operation on the appropriate agent. The agent creates the instance and either starts a new load monitor for it, or assigns the responsibility of monitoring it to an existing monitor. It then sends an acknowledgment to the RM, which subsequently uses the agent's setAllocation operation to initialize the customer's hit-rate allocation to the new value. The agent then sends a message to the Load Distributor of its sub-farm, informing it about the creation of the new instance, so that incoming requests can be forwarded to it.

Case III: curr_alloc ≠ 0, new_alloc = 0. The instance may have to be destroyed if it is no longer needed (i.e. if all other customers' allocations are also reduced to zero). The RM then invokes the shutdown operation on the appropriate RM agent. The agent first directs the sub-farm's Load Distributor to stop sending hits to that instance, and then uses the resource's shutdown script to stop the instance and its associated monitor. The instance can shut itself down cleanly after servicing any pending requests.

Case IV: curr_alloc ≠ 0, new_alloc ≠ 0. The RM uses the setAllocation operation on its agent to modify the allocation. The agent updates its internal tables accordingly and sends a message to the Load Distributor indicating the new allocation. The LD is responsible for enforcing this limit by throttling incoming requests if necessary. If new_alloc > curr_alloc, the allocation has been increased. An increase in a resource without a corresponding increase in the next resource to which it sends hits is not useful - it only results in throttling of the excess requests until that next resource has also been expanded. Because of the unpredictable delay involved in sending messages to the respective agents, the LD must wait until all dependent resource instances have also received their increases before allowing more hits through. For this purpose, RM agents send acknowledgment messages to the RM, which implements barrier synchronization. After all the agents have confirmed that their instances have been expanded, the RM issues a "switch plan" message to all the LDs, which then switch to the increased allocation.
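The four cases can be summarized by the following sketch (assumed types: RmAgent is the interface sketched earlier in this section, and LoadDistributor here is a minimal stand-in; barrier synchronization and the "switch plan" message are omitted):

class PlanReconciler {
    // Minimal stand-in for the LD operations these cases need.
    interface LoadDistributor {
        void instanceCreated(String instanceId);
        void stopSendingHits(String instanceId);
        void allocationChanged(String instanceId, String customer, double newAlloc);
    }

    // Applies one (customer, instance) entry of the plan difference.
    void reconcile(String instanceId, String customer, double currAlloc, double newAlloc,
                   RmAgent agent, LoadDistributor ld) {
        if (currAlloc == newAlloc) {
            return;                                          // Case I: no change
        } else if (currAlloc == 0) {                         // Case II: create (if not yet running)
            String id = agent.startup();                     // agent also arranges a load monitor
            agent.setAllocation(id, customer, newAlloc);
            ld.instanceCreated(id);                          // LD may now forward requests to it
        } else if (newAlloc == 0) {                          // Case III: destroy (if unused by all)
            ld.stopSendingHits(instanceId);
            agent.shutdown(instanceId);
        } else {                                             // Case IV: adjust allocation
            agent.setAllocation(instanceId, customer, newAlloc);
            ld.allocationChanged(instanceId, customer, newAlloc);
            // Decreases take effect at the LD immediately; increases wait for "switch plan".
        }
    }
}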
The RM agent is also responsible for interfacing with any available operating-system mechanisms for partitioning the machine amongst customers. Such partitioning mechanisms (e.g., [ENSIM, WLM]) typically enforce strict limits on each application in terms of CPU usage, disk space, network bandwidth and other OS-level resources, and also provide secure isolation of applications from each other. However, even if such a mechanism is not available, the system's throttling-based mechanism (performed by the LD) loosely enforces the allocation limits. When all the RM agents have performed the requested reallocations, the RM commits the new allocation plan to the Configuration Repository.
Sub-farm Creation and Removal: When a customer requests the use of a resource class that isn't currently available on the server farm, a new sub-farm must be created automatically to execute instances of that resource class. The RM uses the Configuration Repository to identify appropriate unused machines, designates them as part of a new sub-farm, and sets up a Load Distributor to manage the incoming requests. No resource instances are actually created at this stage. The instances get created as outlined earlier, when the GDM allocates non-zero hit-rates to them. A sub-farm may be removed if no customers require the resource class installed on it. The RM uses its agents to shut down all resource instances running on that sub-farm, as well as its Load Distributor. Once the sub-farm is made inactive, it updates the status of each machine in the CR, to indicate that it is part of the free pool.
5. Load Distributor

There is a Load Distributor (LD) associated with each sub-farm, which distributes load among the instances assigned to a customer. An LD has a queue for incoming requests, and data structures to store the allocated hits per instance per customer and the amount of capacity consumed during the current time interval. It removes a request at the head of the queue, identifies the customer C for which the request is meant, and finds the instance p assigned to C that has the least load factor. If the allocated capacity of p is fully consumed, the request is throttled, else it is forwarded to p. Similar load balancing techniques are described in [CCY99b, HGKM98]. The LD receives load and fault information about an instance from the aggregator, and new resource allocations from the RM agents. Any reduction in allocation for a customer is taken into account immediately. An increase in allocated capacity, however, is taken into account only after the corresponding "switch plan" message is received, as described earlier.
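The dispatch step can be sketched as follows (our own types and method names; a real LD would also maintain the per-interval consumed-capacity counters that this sketch only reads):

import java.util.List;

class LoadDistributorSketch {
    interface Instance {
        double loadFactor();          // fraction of allocated capacity currently in use
        double consumedCapacity();    // hits already served in the current interval
        double allocatedCapacity();   // hits allocated for the current interval
        void forward(Object request); // hand the request to the resource instance
    }

    // Pick the least-loaded instance of the requesting customer; throttle if it is saturated.
    // Returns true if the request was forwarded, false if it was throttled.
    boolean dispatch(Object request, List<Instance> customerInstances) {
        Instance best = null;
        for (Instance p : customerInstances) {
            if (best == null || p.loadFactor() < best.loadFactor()) best = p;
        }
        if (best == null || best.consumedCapacity() >= best.allocatedCapacity()) {
            return false;             // throttle: allocated capacity is fully consumed
        }
        best.forward(request);
        return true;
    }
}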
6. Configuration Repository

The Configuration Repository acts as a central storage for the server farm. It stores information on hardware availability, software dependencies and requirements, machine capacities, the allocation table, application benchmark characteristics, SLAs, etc.
IV. SYSTEM INITIALIZATION, SCHEDULING, AND FAULT-HANDLING

System Initialization: At startup time, the machines designated to run the system management components (Global Decision Maker, Aggregators, Load Monitors, Resource Manager, Load Distributors) are booted up with predefined scripts that start and initialize the respective components. All applications are installed on a distributed file system, and a uniform view of the file system is provided on all the machines. Each new customer who joins the server farm is first hosted on an experimental setup to benchmark his applications, if this information is not already available. This information is then added to the CR. Next, the sub-farms are configured for the customer. Expected resource demands are fed to the GDM, in order to generate an allocation plan. The Resource Manager then deploys the GDM's plan.

Scheduling: The system uses a simple schedule consisting of a monitoring sweep, followed by an aggregation sweep, a GDM plan computation, and finally a plan deployment by the Resource Manager. After this the cycle repeats. Since plan computation and deployment are relatively slow processes, a cycle like this has a relatively large time period. Monitoring and aggregation for feeding back the load information to the Load Distributors require a relatively smaller time period. To cater to this, multiple cycles of feedback to the Load Distributors are executed per cycle involving the GDM. Thus, the majority of events in the monitor output are oriented towards feedback to the Load Distributors, with only a small minority guiding re-computation of the allocation plan.
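Schematically (illustrative only; the ratio between the two cycle lengths below is an assumed, configurable deployment choice, not a number given in the paper), the schedule looks like this:

class SchedulerSketch {
    interface FarmControl {
        void monitorSweep();      // collect hit rates and machine load
        void aggregateSweep();    // compute load factors and feed them to the LDs
        void computeGdmPlan();    // demand projections -> new allocation plan
        void deployPlan();        // Resource Manager implements the plan
    }

    static final int LD_FEEDBACK_CYCLES_PER_GDM_CYCLE = 10;  // assumed ratio

    void run(FarmControl farm) {
        while (true) {
            for (int i = 0; i < LD_FEEDBACK_CYCLES_PER_GDM_CYCLE; i++) {
                farm.monitorSweep();
                farm.aggregateSweep();
            }
            farm.computeGdmPlan();
            farm.deployPlan();
        }
    }
}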
Handling Faults: Monitors pass a discovered fault to the aggregators for further action. The aggregators post a notice to the system administrator and inform the affected Load Distributors to stop directing hits to the faulty resource. It is then up to the LDs to distribute the hits to alternative resources, or throttle some hits if they cannot be handled. The aggregators also pass the fault data to the GDM so that the GDM can remove the faulty resource from its allocation plans. Prior to preparing an allocation plan, the GDM collects the latest status of the free pool of machines (there may be new or newly-fixed machines added to the pool which can be used) and takes into account all faulty machines that have to be removed.
V. RELATED WORK

During the 1990s, research in cluster computing focused mainly on providing computing environments with high availability and scalability [BSSDRP95, FGCBG97]. A survey and classification of server clustering techniques can be found in [SGR00]. More recently, research in this area has been directed towards using clusters for web-based applications [AYI97, RA99, SLR98]. Systems such as [RA99] aim to improve scalability through mirroring and migration, whereas [SLR98] and others provide fault tolerance. The Flex system [C99] performs locality-aware load balancing for server farms to improve performance, but applies only to static websites. All these systems focus more on scalability and reliability issues rather than on resource optimization, and mostly deal with static web content.

Mounties [FJNRV00] is similar to the presented work in that it focuses on the management of software and hardware resources by developing plans using a centralized optimizer, with an infrastructure for deploying the plans. The notion of a virtual server corresponds to the notion of resource groups in Mounties. However, in contrast to Mounties, a virtual server is generally distributed over multiple machines as opposed to residing on one machine. Finally, the present work discusses server farms, including issues like directing/throttling incoming load, which is not the subject of Mounties.

The MultiSpace [GWB99] system is similar to ours in that it provides a mechanism for offering transparently scalable services to clients. However, it requires the use of a client-side stub to perform dynamic load redirection, and is only useful for freshly developed Java applications.

[CCY99a] describes a technique for sharing load among replicated web servers. It relies on the static redirection mechanisms built into DNS and HTTP. In our system, in contrast, we perform load redirection at several levels (one per sub-farm), and the redirection is controlled by the latest available load information.

Darwin [CFKNSTZ98] provides resource management mechanisms for value-added network services. The network-centric view of Darwin percolates through the kind of flow graphs used to specify resource requests - nodes are services, edges are communication flows. Resource availability itself is cast as a discovery problem in the network. In contrast, our work is server-farm centric, with a pool of local, reusable resources for fast, dynamic allocation. This causes us to solve different kinds of optimization problems; for example, optimization of network communication among resources, or in choosing among resources, is not our focus. This also causes our optimization plans and the plan-implementing infrastructure to differ from the work in Darwin.
VI. CONCLUSION

Escalating costs of IT infrastructure, its management, and ever-changing technologies make it imperative to exploit economies of scale by sharing IT resources. In this paper, we have proposed an architecture that addresses this need by automated sharing of hardware and software resources at the intra- and inter-machine level. The sharing is carried out using global and local optimization/decision-making. The architecture does the decision-making and deployment automatically, in compliance with pre-defined SLAs that are expressive, yet simple and user-friendly. The architecture comprises a notion of virtual servers, which are sets of distributed software resources communicating with each other. A preliminary prototype of the architecture has been implemented and further work is planned.

ACKNOWLEDGEMENTS: We would like to thank Vikram Singh Beniwal, Abhay Chrungoo, Vishu Gupta, Abhinav Roongta, and Saurabh Sood for their work in building a preliminary prototype of the proposed architecture. We also thank Dr. Sugata Ghosal for his meticulous and insightful comments on the paper.
REFERENCES

[AFF01] Appleby, Fakhouri, Fong et al. Océano - SLA Based Management of a Computing Utility. To appear in Proceedings of the 7th IFIP/IEEE International Symposium on Integrated Network Management.
[AYI97] Daniel Anderson, Tao Yang, and Oscar H. Ibarra. Towards a Scalable Distributed WWW Server on Workstation Clusters. Journal of Parallel and Distributed Computing (JPDC), September 1997.
[BHM77] S.P. Bradley, A.C. Hax and T.L. Magnanti. Applied Mathematical Programming, Addison-Wesley, 1977, pp. 379-380.
[BSSDRP95] Donald J. Becker, Thomas Sterling, Daniel Savarese, John E. Dorband, Udaya A. Ranawak, and Charles V. Packer. Beowulf: A Parallel Workstation for Scientific Computation. In Proceedings of the International Conference on Parallel Processing, 1995.
[C99] Ludmila Cherkasova. FLEX: Design and Management Strategy for Scalable Web Hosting Service. Technical Report HPL-1999-64R1, HP Labs, 1999.
[CCY99a] Valeria Cardellini, Michele Colajanni, Philip S. Yu. Redirection Algorithms for Load Sharing in Distributed Web-server Systems. In Proceedings of the 19th IEEE International Conference on Distributed Computing Systems (ICDCS99), pp. 528-535, May/June 1999.
[CCY99b] Valeria Cardellini, Michele Colajanni, and Philip S. Yu. Dynamic Load Balancing on Web-server Systems. IEEE Internet Computing, May-June 1999.
[CFKNSTZ98] P. Chandra, A. Fisher, C. Kosak, E. Ng, P. Steenkiste, E. Takahashi, and H. Zhang. Darwin: Customizable Resource Management for Value-Added Network Services. In Proceedings of the 6th International Conference on Network Protocols, pp. 177-188, Oct. 1998.
[ENSIM] Ensim Corp. ServerXchange whitepaper, http://www.ensim.com/pdf/wp.pdf
[FJNRV00] Sameh A. Fakhouri, William F. Jerome, Vijay K. Naik, Ajay Raina, and Pradeep Varma. Active Middleware Services in a Decision Support System for Managing Highly Available Distributed Resources. In LNCS #1795, Middleware 2000, New York, NY, USA, April 2000.
[FGCBG97] Armando Fox, Steven D. Gribble, Yatin Chawathe, Eric A. Brewer, and Paul Gauthier. Cluster-Based Scalable Network Services. In Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles, October 1997.
[GWB99] Steven D. Gribble, Matt Welsh, Eric A. Brewer, David Culler. The MultiSpace: An Evolutionary Platform for Infrastructural Services. In Proceedings of the USENIX Annual Technical Conference, Monterey, USA, June 6-9, 1999.
[HGKM98] G. Hunt, G. Goldszmidt, R. King and R. Mukherjee. Network Dispatcher: A Connection Router for Scalable Internet Services. In Proceedings of the 7th International World Wide Web Conference, 1998.
[IBMSP] IBM Corp. RS/6000 SP Monitoring: Keeping It Alive, IBM Publication #SG24-4873, 1997.
[JMX] Java Management Extensions (JMX), http://java.sun.com/products/JavaManagement/, 2000.
[RA99] Michael Rabinovich and Amit Aggarwal. RaDaR: A Scalable Architecture for a Global Web Hosting Service. In Proceedings of the Eighth International World Wide Web Conference, 1999.
[RNC96] H. Raghav Rao, Kichen Nam and A. Chaudhary. Information Systems Outsourcing. Special section in Communications of the ACM, 39(7), July 1996.
[SGR00] Trevor Schroeder, Steve Goddard, Byrav Ramamurthy. Scalable Web Server Clustering Technologies. IEEE Network, pp. 38-45, May-June 2000.
[SLR98] Ashish Singhai, Swee-Boon Lim and Sanjay R. Radia. The SunSCALR Framework for Internet Servers. In Proceedings of the IEEE 28th International Symposium on Fault Tolerant Computing, 1998.
[WLM] AIX 5L Workload Manager (WLM), http://www.redbooks.ibm.com/redbooks/SG245977.html
APPENDIX A: Optimization Model

Inputs from the aggregators: (1) Resource demands, in terms of reference hits (2) Correction factor demands. Inputs from the configuration repository: (1) Correspondence ratios (2) Maximum allowable number of instances of each resource class for a customer (3) Cost of starting a new instance and shutting down an existing one (4) Machines available to each resource class (5) Machine capacities with respect to a resource class (computed after taking into account O/S and other overheads) (6) Limits on the number of instances allowed on each machine (7) Current allocation plan (8) Customer SLA information. The GDM computes the allocation plan, while aiming to maximize the ASP's revenues, using a mixed integer linear programming model of the system.

Problem formulation: The subscripts i, j, k vary from 1 to N_i, N_j, N_k respectively, where N_i is the total number of customers, N_j is the total number of resource classes, and N_k is the total number of machines.

D_ij denotes the total demand by customer i for resource class j. It is the sum of the demands from the Level-2a and Level-2b aggregators, in reference hit units. D_ij may be negative.

A_ijk denotes the current allocation, in terms of reference hits, for customer i on resource class j and machine k. We solve for the final allocations B_ijk.

P_ijk denotes the sum charged per unit of resource B_ijk allotted to the combination of customer i, resource class j and machine k. The value of P_ijk conforms to the customer's SLA.

T_ijk represents the system costs (second term, equation (1)) of adding (setup cost) or removing a customer ('cleaning' cost).

U_ij denotes the system cost associated with having multiple instances of a resource class for a customer (third term, equation (1)). Thus, the plan computation problem can be formulated as follows.

Maximize

    Σ_{i,j,k} (P_ijk * B_ijk) - Σ_{i,j,k} (T_ijk * |Y_ijk - Z_ijk|) - Σ_{i,j} (U_ij * Σ_k Z_ijk)    (1)

Here Y_ijk in {0,1} indicates if A_ijk > 0, and Z_ijk in {0,1} indicates if B_ijk > 0 (i.e. it indicates the allocation of an instance of resource class j to customer i on machine k). Let S_jk denote the capacity of machine k for resource class j. Then the following equations assign values to Y_ijk and Z_ijk:

    Y_ijk >= A_ijk / S_jk,    Z_ijk >= B_ijk / S_jk.

Only the second is an actual constraint of the program, since Y_ijk is determined directly by the known current allocation A_ijk.

The final allocation is bounded by the sum of the capacity demanded and the existing allocation. Also, in case the system does not have sufficient capacity, this allows dropping part of the demand due to customers with low priority.
The demand constraint, for each customer i and resource class j, is:

    Σ_k B_ijk <= Σ_k A_ijk + D_ij    (2)
B