Job-oriented VM State Synchronization in CloudStack


Kelven Yang, Citrix

Agenda
–  The sync problem
–  Legacy solution and its pain points
–  High-level principles of the new solution and change details
–  Future work



The Sync Problem
–  VM lifecycle in CloudStack
   •  Starting, Running, Stopping, Stopped, Migrating, Expunging
–  VM lifecycle in the hypervisor
   •  Powered-off, Powered-on, Suspended
–  Resource implications of a VM in CloudStack
   •  Hypervisor VM resource
   •  Network environment
   •  Storage environment
   •  Guest OS environment
–  Bringing things in sync
   •  In-band VM operations
   •  Out-of-band VM operations

Legacy VM state-sync implementation
–  Designed for in-band VM operations
–  The hypervisor resource agent participates in CloudStack VM lifecycle management
–  Full-sync/Delta-sync
   •  Set up the initial system state with a full-sync process
   •  Perform delta synchronization periodically



Legacy VM state-sync pain points
–  Having the resource agent participate in VM lifecycle management increases the complexity of writing a hypervisor agent, which must
   •  Maintain an in-memory cache
   •  Monitor in-band operations issued from CloudStack
   •  Generate the delta report
–  Full-sync chain of actions
   •  In-place sync triggers a chain of actions if a large number of VMs are out of sync
   •  In-place sync processing logic needs to cover all possible scenarios
–  Out-of-band changes are hard to incorporate into the process
–  The result is a very tightly coupled design

High-level principles of the new VM state-sync
–  Decouple the hypervisor resource agent from VM lifecycle management
   •  Report raw power state only
   •  Carry out hypervisor-specific low-level operations only
–  Serialize VM operations
   •  Jobs targeting the same VM are serialized and executed in order
   •  State transition is handled within the context of the job
–  Loosely couple interactions through a messaging bus
   •  Glue VM state reporting, VM state management, and VM HA management together through the in-memory bus facility
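The "serialize VM operations" principle can be illustrated with a minimal sketch. It assumes one single-threaded executor per VM; the class and method names (VmJobSerializer, submit, shutdownAndWait) are hypothetical and are not CloudStack APIs. It shows only the ordering guarantee, not how CloudStack's job manager actually implements it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one single-threaded "lane" per VM keeps that VM's jobs in order,
// while jobs for different VMs can still run in parallel.
class VmJobSerializer {
    private final Map<Long, ExecutorService> lanes = new ConcurrentHashMap<>();

    /** Queue a job behind all earlier jobs submitted for the same VM. */
    void submit(long vmId, Runnable job) {
        lanes.computeIfAbsent(vmId, id -> Executors.newSingleThreadExecutor()).execute(job);
    }

    /** Stop accepting jobs and wait (up to 5 seconds per lane) for queued jobs to finish. */
    void shutdownAndWait() {
        for (ExecutorService lane : lanes.values()) {
            lane.shutdown();
        }
        try {
            for (ExecutorService lane : lanes.values()) {
                lane.awaitTermination(5, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Keying the lane on the VM ID means a stop followed by a start on the same VM can never overlap, which is what lets the state transition be handled inside the job without extra locking in this sketch.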

VM State-Sync Interactions
[Diagram. Components: sync event source (resource agent report), PowerState SyncManager, VirtualMachine Manager, orchestration jobs, message bus with its driving thread. Labeled flows: raw reports; publish sync-change notifications; sync-change notification to out-of-band change processing; sync-change notification to in-band change processing.]

VM State-Sync interactions
–  VM power state sync manager
   •  Maintains VM power state
   •  Generates change events and publishes them to the message bus
–  In-band state transition handling
   •  A change notification only wakes up the job that is waiting for these change events
   •  Processing happens within the job context to complete the state sync
–  Out-of-band state transition handling
   •  Out-of-band changes can be detected easily by checking whether a pending job is working on the VM

Related schema changes for the new state-sync
–  VM power state management
   •  New fields in the vm_instance table: power_state, power_state_update_time, power_state_update_count, power_host
–  Job management
   •  New vm_work_job table
   •  New async_job_join_map table
   •  New async_job_journal table

VM power state-sync – server part
–  Power state change detection
   The initial base point is set when the host is connected. Changes are detected from the periodic reports sent by the resource agent.
–  Missing VM detection
   A VM present in the previous known-good state report may be missing from the next round of reports. This can happen when a VM is deleted out-of-band.
–  Performance consideration
   A VM stays in a stationary state most of the time, so the same power state update may arrive for a particular VM for a long time. Once the number of consecutive identical updates exceeds a threshold, the update no longer needs to be written.
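The three server-side behaviors above (change detection, missing-VM detection, and suppression of repeated identical updates) can be sketched together. This is an illustration only: the class PowerStateTracker, the in-memory maps, and the threshold value of 3 are assumptions and do not reflect the actual management server code or its database layer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory tracker; the real system persists this in vm_instance.
class PowerStateTracker {
    enum PowerState { PowerOn, PowerOff, PowerReportMissing }

    private static final int MAX_REPEAT_UPDATES = 3; // assumed threshold

    private final Map<Long, PowerState> lastState = new HashMap<>();
    private final Map<Long, Integer> repeatCount = new HashMap<>();

    /** Processes one periodic report; returns the VM IDs whose record was written this round. */
    Set<Long> processReport(Map<Long, PowerState> report) {
        Set<Long> written = new HashSet<>();
        // A VM known from an earlier round but absent now is treated as missing.
        Map<Long, PowerState> effective = new HashMap<>(report);
        for (Long vmId : lastState.keySet()) {
            effective.putIfAbsent(vmId, PowerState.PowerReportMissing);
        }
        for (Map.Entry<Long, PowerState> entry : effective.entrySet()) {
            Long vmId = entry.getKey();
            PowerState state = entry.getValue();
            if (state != lastState.get(vmId)) {
                // State changed (or first sighting): record it and restart the counter.
                lastState.put(vmId, state);
                repeatCount.put(vmId, 1);
                written.add(vmId);
            } else {
                int count = repeatCount.get(vmId);
                if (count < MAX_REPEAT_UPDATES) {
                    // Same state again: still written until the threshold is reached.
                    repeatCount.put(vmId, count + 1);
                    written.add(vmId);
                }
            }
        }
        return written;
    }

    PowerState stateOf(long vmId) {
        return lastState.get(vmId);
    }
}
```

With the assumed threshold, a VM that keeps reporting PowerOn is written for the first three rounds and skipped afterwards, until its state changes or it disappears from the report.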

VM power state-sync – resource part
–  Retirement of the resource agent VM state cache
   The new management server sync logic no longer needs the resource agent to maintain such a cache. Tracking of state transitions and detection of delta changes are also no longer needed.
–  Example of a resource agent composing the VM power-state report (pseudo code)

    private HashMap getHostVmStateReport() {
        foreach (vm on the host) {
            gather VM power state
            put it in the report
        }
        return the report
    }
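A runnable version of the pseudo code above might look like the following sketch. The HypervisorClient interface is a hypothetical stand-in for a hypervisor SDK, and both the raw state strings and the mapping of Suspended to PowerOff are assumptions made for illustration. PowerUnknown is likewise an assumed fallback value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a resource agent that reports raw power state only.
class HostVmStateReporter {
    enum PowerState { PowerOn, PowerOff, PowerUnknown }

    /** Hypothetical stand-in for a hypervisor SDK: VM name -> raw hypervisor state. */
    interface HypervisorClient {
        Map<String, String> listVmRawStates();
    }

    private final HypervisorClient client;

    HostVmStateReporter(HypervisorClient client) {
        this.client = client;
    }

    /** Builds the report from scratch each time; no cache and no delta tracking. */
    Map<String, PowerState> getHostVmStateReport() {
        Map<String, PowerState> report = new HashMap<>();
        for (Map.Entry<String, String> vm : client.listVmRawStates().entrySet()) {
            report.put(vm.getKey(), toPowerState(vm.getValue()));
        }
        return report;
    }

    /** Assumed mapping from hypervisor states to reported power states. */
    static PowerState toPowerState(String raw) {
        switch (raw) {
            case "Powered-on":
                return PowerState.PowerOn;
            case "Powered-off":
            case "Suspended": // assumption: a suspended VM is reported as powered off
                return PowerState.PowerOff;
            default:
                return PowerState.PowerUnknown;
        }
    }
}
```

Because the agent keeps no state between calls, the same method serves both the initial report and every periodic report.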

In-band change processing
–  The job performing an in-band change is responsible for orchestrating the process
–  The target state transition is monitored through the message bus, and completion is checked through the Predicate interface
–  Example orchestration flow (pseudo code)

    submit a worker job to carry on VM operation
    _jobMgr.waitAndCheck(
        new String[] { TopicConstants.VM_POWER_STATE, TopicConstants.JOB_STATE },
        3000L, 60000L,
        new Predicate() {
            @Override
            public boolean checkCondition() {
                VMInstanceVO instance = _vmDao.findById(vm.getId());
                if (instance.getPowerState() == VirtualMachine.PowerState.PowerOff)
                    return true;
                return false;
            }
        });

–  Predicate interface

    public interface Predicate {
        boolean checkCondition();
    }
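How a wait-and-check loop of this kind could work is sketched below. It assumes the two numeric arguments in the example are a check interval and an overall timeout in milliseconds, and it replaces the message bus subscription with a plain notifyChange() call. JobWaiter and notifyChange are hypothetical names; this is not the CloudStack job manager.

```java
// Hypothetical sketch: block the job until the predicate holds, re-checking on each
// change notification or after each interval, and give up at the timeout.
class JobWaiter {
    interface Predicate {
        boolean checkCondition();
    }

    private final Object signal = new Object();

    /** Called when a change event arrives (stand-in for a message bus callback). */
    void notifyChange() {
        synchronized (signal) {
            signal.notifyAll();
        }
    }

    /** Returns true once the predicate holds, false on timeout or interruption. */
    boolean waitAndCheck(long checkIntervalMs, long timeoutMs, Predicate predicate) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        synchronized (signal) {
            while (!predicate.checkCondition()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                try {
                    // Wake on notification, or after the interval to re-check anyway.
                    signal.wait(Math.max(1L, Math.min(checkIntervalMs, remaining)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```

Re-checking on a timer as well as on notification means a lost or early notification only delays completion by one interval instead of hanging the job.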

Out-of-band change processing

    @MessageHandler(topic = Topics.VM_POWER_STATE)
    private void HandlePowerStateReport(String subject, String senderAddress, Object args) {
        ….
        if (pendingWorkJobs.size() == 0 && !_haMgr.hasPendingHaWork(vmId)) {
            // there is no pending operation job
            VMInstanceVO vm = _vmDao.findById(vmId);
            if (vm != null) {
                switch (vm.getPowerState()) {
                case PowerOn:
                    handlePowerOnReportWithNoPendingJobsOnVM(vm);
                    break;
                case PowerOff:
                case PowerReportMissing:
                    handlePowerOffReportWithNoPendingJobsOnVM(vm);
                    break;
                default:
                    assert (false);
                    break;
                }
            }
        } else {
            _vmDao.resetVmPowerStateTracking(vmId);
        }
    }

–  The VM_POWER_STATE topic on the message bus triggers
   •  Determination of out-of-band changes
   •  Dispatch of handling for out-of-band changes
–  Else branch: consider this an in-band change and let it be handled within the job context. Reset tracking so that we won't lose triggering events.

Supporting facilities
–  Message bus
   •  In-memory
   •  Hierarchical subscriber management
   •  Annotation-based dispatching (@MessageHandler)
–  Job facility
   All system activities are now tied to associated jobs. The logging system has also been updated to help diagnose problems by tracking activity per top-level job.
   •  API job: gives a running context to an asynchronous API request
   •  Work job: carries the real orchestration process; its run is serialized on the target VM
   •  Pseudo job: gives a background thread a job context
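The message bus properties listed above (in-memory, hierarchical subscription, annotation-based dispatching) can be sketched in a few lines. MiniBus and PowerStateLogger are hypothetical names, the dot-separated topic prefix match is only one possible reading of "hierarchical", and delivery here is synchronous on the publisher's thread. The handler signature mirrors the one shown in the out-of-band example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory bus that dispatches to annotated handler methods via reflection.
class MiniBus {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MessageHandler {
        String topic();
    }

    private final List<Object> subscribers = new ArrayList<>();

    void subscribe(Object subscriber) {
        subscribers.add(subscriber);
    }

    /** Delivers to every matching handler and returns how many handlers ran. */
    int publish(String senderAddress, String subject, Object args) {
        int delivered = 0;
        for (Object subscriber : subscribers) {
            for (Method method : subscriber.getClass().getDeclaredMethods()) {
                MessageHandler handler = method.getAnnotation(MessageHandler.class);
                if (handler == null || !matches(handler.topic(), subject)) {
                    continue;
                }
                try {
                    method.setAccessible(true); // handlers may be private
                    method.invoke(subscriber, subject, senderAddress, args);
                    delivered++;
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("handler failed: " + method, e);
                }
            }
        }
        return delivered;
    }

    /** Assumed hierarchy rule: a handler on "vm" also receives "vm.powerstate". */
    private static boolean matches(String topic, String subject) {
        return subject.equals(topic) || subject.startsWith(topic + ".");
    }
}

// Example subscriber: records every subject published under the "vm" topic.
class PowerStateLogger {
    final List<String> seen = new ArrayList<>();

    @MiniBus.MessageHandler(topic = "vm")
    private void onVmMessage(String subject, String senderAddress, Object args) {
        seen.add(subject);
    }
}
```

Publishers and subscribers share only topic names, which is the loose coupling the deck relies on to connect state reporting, state management, and HA handling.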

Journey of getting there
–  Dual state reports
   •  The model is simple, but it is a model shift at a very low level
   •  Changes touch all hypervisor resources, all major orchestration flows, HA, etc., which makes unit testing very hard
   •  Dual state reports allow the old model and the new model to overlap
–  Compromises
   •  Synchronous in-band process for state transitions
   •  Thread starvation
   •  Result/exception propagation across boundaries

Next Step
–  In-memory power-state table
   Power state is stored for change detection and as a temporary store that in-band jobs can query, so storage persistence is not mandatory. Using a DB memory-based table would help scaling and improve performance.
–  General pluggable model to sync other types of information
   An immediate need could be sync support for storage DRS