
Using Multiple Feedback Loops for Object Profiling, Scheduling and Migration in Soft Real-Time Distributed Object Systems V. Kalogeraki, P. M. Melliar-Smith and L. E. Moser Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 93106 [email protected], [email protected], [email protected]

Abstract

Complex soft real-time distributed object systems require object profiling, scheduling and migration algorithms to respond to transient changes in the load or in the availability of the resources. We have developed a Resource Management System for a soft real-time distributed object system that is based on a three-level feedback loop, which employs a profiling algorithm that monitors the usage of the resources, a least laxity scheduling algorithm that schedules the tasks, and hot spot and cooling algorithms that allocate and migrate objects to balance the load on the resources. The Resource Management System consists of a single (but possibly replicated and distributed) Resource Manager, and Profilers and Schedulers located on each of the processors in the distributed system.

1. Introduction

As real-time distributed systems become larger and more complex, it becomes more difficult to predict the needs of the applications in advance, particularly because those needs are likely to change dynamically while the applications execute. Dynamic allocation of objects to processors is necessary, so that the system can respond to transient changes in the load or in the availability of the resources. When application objects are replicated for fault tolerance or high availability, the number of object replicas and the location of the object replicas must be determined as faults occur and are repaired. In real-time distributed systems, local scheduling of tasks does not suffice to ensure that real-time deadlines will be met; rather, a system-wide scheduling strategy must be employed.

We have developed a Resource Management System for a soft real-time distributed object system that is based on a three-level feedback loop in which information obtained from monitoring the behavior of the objects is used as feedback for the control of subsequent executions. The three levels of the feedback loop comprise: (i) a least laxity scheduling algorithm that dispatches method invocations on a timescale of a few milliseconds, (ii) a profiling algorithm that measures and records the usage of resources and the behavior of objects over a second or so, and (iii) hot spot and cooling algorithms that allocate and migrate objects to balance the load on the resources over many seconds.

The Resource Management System, which is part of our Realize system [8, 11], is structured as a single Resource Manager, and a Profiler and Scheduler located on each of the processors within the distributed system, as shown in Figure 1. The Resource Management System is based on the Common Object Request Broker Architecture (CORBA) [13], which is becoming a widely accepted standard for developing distributed object applications over heterogeneous platforms.

2. The Resource Management System

The Resource Management System is structured as a single Resource Manager for the system, and Profilers and Schedulers located on each of the processors. The objectives of the Resource Management System are as follows:

- To increase the probability of satisfying soft real-time response time requirements for each application task and to achieve steady flow of operation of the tasks
- To balance the load (utilization) of the resources (processors, memory, disk and network) by allocating the application objects to the processors and reallocating them as necessary
- To satisfy reliability requirements by determining the degree and type of replication of the objects, as well as the placement of the object replicas.

Figure 1: The Resource Management System. [Diagram: the Realize Resource Manager receives feedback (loads on resources, residual laxities) from the platforms and issues allocations of objects to platforms. Each platform runs a CORBA ORB with a Realize Profiler and a Realize Replication Manager beneath it; application objects exchange invocations and responses, and reliable ordered multicasts are used to invoke operations on other objects.]

Figure 2: A Method Invocation Graph Representing an Application Task. [Diagram: a graph of method invocations between task initiation and task completion.]

The Resource Manager works in concert with the Profilers and the Schedulers. The Profilers monitor the behavior of each application object and measure the current load on the processors. The Resource Manager maintains a global view of the application objects in the system by collecting data from the Profilers. The Resource Manager distributes the objects across the processors, migrating objects when necessary, to maintain a uniform load on the processors. The Schedulers exploit information collected by the Resource Manager to schedule tasks to meet soft real-time deadlines. The Resource Manager is implemented as a collection of CORBA objects that are allocated to various processors across the distributed system and that are possibly distributed and replicated to increase reliability; logically, however, there is only a single copy of the Resource Manager. The Profiler and the Scheduler on each processor are implemented in a layer between the CORBA ORB and the operating system.

2.1. Task Metrics

An application task t consists of a sequence of object invocations between an external input (or possibly a timer signal or completion of another operation) and the generation of an external result. With each task t, we associate:

- Deadline_t: the time interval, starting at task initiation, within which task t should be completed, specified by the application designer
- Importance_t: a metric that represents the relative importance of task t, specified by the application designer, which affects the decision of which task should be abandoned in the event of system overload
- Projected latency_t: the estimated amount of time from initiation to completion of task t, measured by the Profiler
- Laxity_t: the difference between Deadline_t and Projected latency_t, a measure of the urgency of task t, calculated by the Scheduler. The Scheduler schedules task t according to Laxity_t, which it dynamically adjusts as the task executes
- Residual laxity_t: the laxity remaining when task t completes, calculated by the Scheduler
- Mean invocations of task t on method m, x_tm: the mean number of invocations that task t makes on method m, calculated by the Resource Manager.

2.2. The Method Invocation Graph for a Task

To represent the method invocations for each application task t, the Resource Manager employs a Method Invocation Graph, Graph_t. Each node in Graph_t represents a method m, while each edge represents the invocation of one method by another. With each node of Graph_t, we associate:

- Method name_m: the name of the method m that is represented by the node.

With each edge connecting node m to node n, we associate:

- Method invoked_mn: the name of the method n that method m invokes.

Figure 2 shows an example of a Method Invocation Graph. The Method Invocation Graphs for the different tasks have different roots but are not necessarily disjoint.

2.3. Object and Object Replica Metrics

For each object i, the Resource Manager maintains:

- Object importance_i: the importance of object i to an application task
- Methods_i: the set of methods of object i
- Replication type_i: the type of replication (active/passive) of object i, determined by the Resource Manager
- Replication degree_i: the number of replicas of object i, determined by the Resource Manager
- Host names_i: the set of hosts where the replicas of object i may be located.

2.4. Method Invocation Metrics

For each method m, the Resource Manager maintains:

- Object_m: the object of which m is a method
- Mean local time_m: the mean time required, after receipt by a processor of a message invoking method m, for that processor to complete the invocation. This excludes communication time but includes queueing time and the time of embedded invocations of other methods
- Mean remote time_mn: the mean time required for method m to invoke method n remotely, including communication and queueing time and also the time of embedded invocations of other methods
- Mean communication time_mn: the mean time to communicate an invocation from method m to method n and to communicate the response back, computed as the difference between Mean remote time_mn and Mean local time_n
- Mean execution time_m: the mean time required for method m to execute locally, including queueing time but excluding the time required for embedded invocations of other methods
- Mean processing time_mp, τ_mp: the mean time required for method m to execute on processor p, excluding queueing time and the time required for embedded invocations of other methods
- Mean invocations_mn: the mean number of invocations that a single invocation of method m makes on method n.

3. Profiling Algorithm

Each Profiler operates on a continuous basis and reports its computed values to the Resource Manager. The Profiler for each processor computes the current load on the processor's resources (processing, memory, disk and communication) and periodically transmits this information to the Resource Manager. For each method m, the Profiler provides information about the invocations of method m, which the Resource Manager uses to construct the Method Invocation Graphs for the tasks, to calculate the projected latencies, and to estimate the laxities for the tasks.

3.1. Profiler Measurements

Each processor p is equipped with a profiler, Profiler_p, whose function is to report to the Resource Manager (i) the current utilization of processor p's resources, and (ii) information about the method invocations. A processor is characterized by its speed, the size of its local memory, and the size of its disk space. A communication link is characterized by the bandwidth of the link. Each resource is characterized by its maximum and current value, as obtained from the Profilers at runtime. Each Profiler_p periodically reports the following quantities to the Resource Manager:

- Load_p: the current load on processor p
- Memory_p: the memory in use on processor p
- Disk_p: the disk space in use on processor p
- Disk accesses_p: the number of disk accesses on processor p
- Bandwidth_pq: the bandwidth in use on the communication link connecting processor p to processor q.

Each method invocation or response, as monitored by the Profilers, is associated with the following information: (Action, local method m, remote method n, time of invocation T), where Action is determined by the Profilers and is one of: LOCAL START, LOCAL COMPLETE, REMOTE START, and REMOTE COMPLETE. The Profilers distinguish between a remote method invoking a local method (LOCAL START, LOCAL COMPLETE) and a local method invoking a remote method (REMOTE START, REMOTE COMPLETE), and also between the corresponding responses.

For each method m invoked locally on processor p by an incoming message, Profiler_p records the time of the invocation in Local start time_m. It zeroes the array Invocations_m,* that it uses to record the number of invocations made by method m on other methods. When a local method m terminates, the Profiler determines the time taken and averages it into Mean local time_m. Similarly, the Profiler averages the number of invocations by method m of each other method n' into Mean invocations_mn'.

For each method n invoked remotely by method m on processor p, Profiler_p records the time of invocation in Remote start time_mn and increments Invocations_mn. When an invoked method completes and returns its response to the invoking method, Profiler_p calculates the actual time for the remote invocation, Remote time_mn, from the time at which method m invoked method n to the time at which it received the response. The Profiler averages this time into Mean remote time_mn.

Profiling Algorithm executed by processor p
  initialize last report time to the current time S
  while (true)
    obtain (Action, local method m, remote method n, time of invocation T)
    case Action
      LOCAL START:
        set Local start time[m] to time of invocation T
        for all j (j = 1, ..., k): Invocations[m,j] = 0
      LOCAL COMPLETE:
        Local time[m] = T - Local start time[m]
        Mean local time[m] = Smoothing(Mean local time[m], Local time[m])
        for all n': Mean invocations[m,n'] = Smoothing(Mean invocations[m,n'], Invocations[m,n'])
      REMOTE START:
        set Remote start time[m,n] to time of invocation T
        Invocations[m,n]++
      REMOTE COMPLETE:
        Remote time[m,n] = T - Remote start time[m,n]
        Mean remote time[m,n] = Smoothing(Mean remote time[m,n], Remote time[m,n])
    if (current time S - last report time >= report time interval)
      set last report time to S
      measure Load[p] on processor p
      for all communication links connecting processors p and q
        compute Bandwidth[p,q] on link (p,q)
      measure Memory[p] on processor p
      measure Disk[p] and Disk accesses[p] on processor p
      report Resource utilization and Method invocations to Resource Manager

Smoothing(mean value, value)
  mean value = mean value * (1 - weight value) + value * weight value

Figure 3: Pseudocode for the Profiling Algorithm.

For the computation of the mean values, the Profilers use a smoothing function based on exponentially weighted averaging. The Profiling Algorithm, shown in Figure 3, operates on a timescale of seconds. Periodically, the Profiler generates a report to the Resource Manager, which constructs a profile for the entire system. The Resource Manager maintains a global view of the system and does not need to act individually on each measurement made by each Profiler.
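The smoothing step can be expressed as a one-line exponentially weighted moving average. The sketch below is ours, not taken from the Realize implementation, and the weight value is an illustrative choice.

```python
def smoothing(mean_value: float, value: float, weight: float = 0.2) -> float:
    """Exponentially weighted average: blend a new sample into the running mean."""
    return mean_value * (1.0 - weight) + value * weight

# Blending a stream of measured local times into Mean local time[m]:
mean_local_time = 10.0
for sample in [12.0, 11.0, 30.0, 12.0]:
    mean_local_time = smoothing(mean_local_time, sample)
```

A larger weight makes the mean track transient spikes more closely; a smaller weight suppresses them, which is why the Resource Manager need not react to every individual measurement.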

3.2. Resource Manager Computations

The Resource Manager estimates the Mean local time_m required to invoke method m (processing, embedded invocations, queueing) by calculating the mean of the means Mean local time_m obtained from the Profilers. Similarly, it estimates the Mean remote time_mn for

methods m and n (communication, processing, queueing, embedded invocations and overheads) and also the Mean invocations_mn. In addition, it calculates the Mean communication time_mn by taking the difference of the Mean remote time_mn and Mean local time_n that it calculated. Moreover, the Resource Manager estimates the Mean execution time_mp for method m on processor p, which includes queueing time but excludes embedded invocations of other methods. The estimate is derived from the Mean local time_m for method m, the Mean invocations_mn for each other method n invoked by method m, and the Mean remote time_mn for each other method n.

The Resource Manager then determines the mean processing time τ_mp for each method m on processor p. We let ρ_p be the load on processor p. In the absence of more specific information about the application's behavior, we assume an M/M/1 queueing model for the execution time. Thus, the mean processing time for method m on processor p is given by:

    τ_mp ≈ (1 − ρ_p) × Mean execution time_mp

Similarly, we let τ_mc be the mean transmission time on communication link c for invoking method m, excluding queueing delays, and ρ_c be the load on communication link c. The mean transmission time for the invocations and responses of method m on communication link c is then given by:

    τ_mc ≈ (1 − ρ_c) × Mean communication time_mc

From a traversal of the Method Invocation Graph for task t and a classical equilibrium flow analysis, the Resource Manager determines the mean number x_tm of invocations of method m for one execution of task t. Given the mean processing time and the mean transmission time for method m on processor p and the mean number x_tm of invocations of method m made by task t, the Resource Manager computes the projected latency for the entire task as:

    Projected latency_t ≈ min_p Σ_{m : m ∝ p} ( x_tm τ_mp / (1 − ρ_p) + x_tm τ_mc / (1 − ρ_c) )

where m ∝ p denotes that the object i of which m is a method executes on processor p. The minimum is taken over all processors p that host replicas of the object of which m is a method.
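Under the M/M/1 assumption, the projected-latency estimate can be sketched as follows, reading the minimum as taken per method over the processors hosting its object's replicas. The data structures and example values are hypothetical, and a full implementation would derive x_tm by equilibrium-flow analysis of the Method Invocation Graph.

```python
# Sketch of the projected-latency estimate for one task under M/M/1 queueing.
# x[m]: mean invocations of method m per task execution (x_tm)
# tau[(m, p)]: mean processing time of m on processor p (tau_mp, no queueing)
# tau_c[m]: mean transmission time for m's invocations and responses (tau_mc)
# rho[p]: load on processor p; rho_c[m]: load on m's communication link
# hosts[m]: processors holding a replica of the object of which m is a method

def projected_latency(x, tau, tau_c, rho, rho_c, hosts):
    total = 0.0
    for m, x_tm in x.items():
        # Queueing inflates each time by 1 / (1 - load); take the best replica host.
        per_host = [x_tm * tau[(m, p)] / (1.0 - rho[p])
                    + x_tm * tau_c[m] / (1.0 - rho_c[m])
                    for p in hosts[m]]
        total += min(per_host)
    return total
```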

The Resource Manager’s estimate of the projected latency for task t, based on the reports of the Profilers, represents the second level of the feedback loop, as shown by the light arrows in Figure 5. If the Resource Manager’s estimate of the projected latency for task t is accurate, the residual laxity when task t completes will be the same as the initial laxity for task t. The Resource Manager uses the ratio of the residual laxity to the initial laxity to adjust its estimates of the projected latency for the task.

Least Laxity Scheduling Algorithm()
  get (event, event info)
  case event
    TASK STARTS:
      if Projected latency[t] for task t is available
        Initial laxity[t] = Deadline[t] - Projected latency[t]
        Laxity[t] = Initial laxity[t]
      else
        Initial laxity[t] = k * Deadline[t]
        Laxity[t] = Initial laxity[t]
    METHOD INVOKED:
      get event info for method m of task t on processor p from invocation message
      get Laxity[t] for method m
      schedule method m on processor p according to Laxity[t]
    METHOD RESPONSE RECEIVED:
      get event info for response of method n to method m of task t on processor p
      Laxity[t] = Laxity[t] - Remote time[m,n] + Mean remote time[m,n]
    TASK COMPLETES:
      get event info for task t
      Residual laxity[t] = Laxity[t]

Figure 4: Pseudocode for the Least Laxity Scheduling Algorithm.

4. Least Laxity Scheduling

In a soft real-time distributed system that changes dynamically, it is not easy to determine a fixed preplanned schedule for the tasks to be performed on the various processors. Dynamic scheduling algorithms are more appropriate than long-term or worst-case scheduling algorithms. Dynamic scheduling algorithms take advantage of the dynamic resource needs of the applications and try to ensure that, at any time, the timing requirements of the applications are met. In [3] it was shown that, for two or more processors, no scheduling algorithm can be optimal without a priori knowledge of the deadlines, computation times, and start times of the tasks. Least laxity scheduling is, however, quite effective provided that the system is not overloaded. In least laxity scheduling, the laxity of task t represents a measure of the urgency of the task. The laxity is defined by:

    Laxity_t = Deadline_t − Projected latency_t

where Deadline_t is the time by which task t must be completed and Projected latency_t is the estimated time to complete task t. The Scheduler on each processor executes the least laxity scheduling algorithm, shown in Figure 4. It obtains the projected latency for task t from the formula in Section 3.2, and calculates the initial laxity, Laxity_t, of task t by subtracting Projected latency_t from Deadline_t.

Figure 5: Multiple Feedback Loops. [Diagram: the method processing and communication profile and the graph of method invocations form a projected task processing profile; combined with resource performance and load, these yield projected method latencies and a projected task latency. Task initiation and deadline give the initial laxity; measured processing and communication times adjust the remaining laxity during scheduling and processing; the residual laxity is recorded at task completion.]

If the projected latency for task t is not available, the Scheduler gives task t an initial laxity that is a proportion of the task's deadline. As task t executes, the Scheduler schedules the task according to its remaining laxity, Laxity_t. The message conveying the invocation of a method m of task t on another method n carries the laxity, Laxity_t, with it from one processor to another, yielding a system-wide scheduling strategy that requires only local computation. The Scheduler on the remote processor uses this laxity value to schedule the execution of method n to meet the soft real-time deadline of task t. There is no need to carry a task laxity in response messages.

If two methods have the same laxity value and the time margins are sufficient to run either method, we do not preempt the currently running method but, rather, complete its execution. If a new method arrives at the processor with a smaller laxity value, then the new method is scheduled first.

When a response is received by method m from a remote method n, the Scheduler subtracts from the remaining laxity, Laxity_t, the difference between the actual time Remote time_mn, measured by the Profiler, and the Mean remote time_mn, calculated by the Resource Manager and used in the calculation of the initial task laxity. If the invocation completes more quickly than was projected by the Resource Manager, the task laxity increases and the task's scheduling priority decreases. If the invocation completes more slowly, the task laxity decreases and the task's scheduling priority increases.

The adjustment of the laxity, Laxity_t, provides the feedback loop shown by the dark arrows at the lower right of Figure 5. This is the first level of the three-level feedback loop. All computations are simple and local, allowing the loop to operate on a millisecond-by-millisecond basis.
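The per-processor dispatch and the laxity adjustment on a response can be sketched as below, assuming a simple priority queue; the class and function names are ours, not from Realize.

```python
import heapq

class LeastLaxityScheduler:
    """Dispatch pending method invocations smallest-laxity first (most urgent)."""

    def __init__(self):
        self._queue = []   # min-heap of (laxity, seq, method)
        self._seq = 0      # insertion counter: FIFO among equal laxities

    def method_invoked(self, laxity: float, method: str) -> None:
        # The invocation message carries the task's current laxity with it.
        heapq.heappush(self._queue, (laxity, self._seq, method))
        self._seq += 1

    def next_method(self) -> str:
        """Return the most urgent pending invocation."""
        laxity, _, method = heapq.heappop(self._queue)
        return method

def adjust_laxity(laxity: float, remote_time: float, mean_remote_time: float) -> float:
    # A faster-than-projected remote invocation raises the remaining laxity
    # (priority drops); a slower one lowers it (priority rises).
    return laxity - (remote_time - mean_remote_time)
```

The FIFO tie-break mirrors the rule that a running method is not preempted by a newly arrived method with an equal laxity value.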
The rest of the information flow, shown by light arrows in the figure, is recorded by the Profilers and reported to the Resource Manager, and forms the second level of the feedback loop, which operates more slowly, on a second-by-second basis. When task t completes, the Scheduler records the task's remaining laxity as the residual laxity, Residual laxity_t.

Figure 6: Least Laxity Scheduling. [Diagram: tasks A (deadline 6, projected latency 3, laxity 3) and B, consisting of methods B1 and B2 (deadline 9, projected latency 8, laxity 1), arrive at processor P at t = 0. Under earliest deadline first, task A runs first on processor P and task B misses its deadline; under least laxity scheduling, method B1 runs first on processor P, task A then runs on P, and method B2 runs on processor Q, so both tasks meet their deadlines.]

Cooling Algorithm()
  for all objects i
    Check object[i] = 0
  find processor low with least Load[low]
  find processor high with highest Load[high]
  find processor high2 with second highest Load[high2]
  while (Load[high] > Load[high2] && there exists an object j with Check object[j] == 0)
    for all objects i of processor high with Check object[i] == 0
      for all methods m of object i and all tasks t that invoke m
        Object load[i] += Mean processing time[m,high] * r[t,m] * Mean invocations sec[t]
    find object i with max Object load[i]
    Check object[i] = 1
    if (Object load[i] + Load[low] < min(MAX LOAD, Load[high]))
      move object i from processor high to processor low
      update host names of object i
      Load[high] = Load[high] - Object load[i]
      Load[low] = Load[low] + Object load[i]
      find processor low with least Load[low]
      find processor high2 with second highest Load[high2]

Figure 7: Pseudocode for the Cooling Algorithm.

The effectiveness of the least laxity scheduling algorithm compared to the earliest deadline first algorithm in a distributed system can be seen by considering the following example. Assume that tasks A and B arrive independently at processor P, as shown in Figure 6. Task A consists of a single method, while task B consists of methods B1 and B2. The earliest deadline first algorithm is driven by the deadlines of the tasks and assigns the highest priority to the task with the earliest deadline. Consequently, task A is scheduled first on processor P, and task B starts executing when task A finishes, i.e., at time t = 3; thus, task B misses its deadline. In contrast, the least laxity scheduling algorithm takes into consideration the execution time (i.e., processing and queueing time) of the tasks on the processor and schedules each method of the task independently according to the task's remaining laxity. Thus, method B1 of task B is scheduled first on processor P. When method B1 finishes execution, both tasks have the same laxity values, task A is scheduled on processor P, and method B2 of task B is scheduled on processor Q. Therefore, both of the tasks meet their deadlines.

5. Object Migration

The Resource Manager maintains a global view of the application objects in the system and attempts to maintain a uniform load on all of the processors. It tries to meet the

requirements of all tasks in the system, but this cannot always be achieved. Moreover, the addition of a new task can result in degradation of the performance of existing tasks. To satisfy the requirements of a new task, the Resource Manager can migrate objects to different processors or reduce the degree of replication of objects [9]. Migration of objects may also be required when a processor fails, when the current load on a processor is too high, or when the latency of a task is too high. The Resource Manager uses a Cooling Algorithm to migrate objects when the load on a processor is too high, and a Hot Spot Algorithm to migrate objects when the latency of a task is too high.

5.1. Cooling Algorithm

Periodically, the Resource Manager checks whether the load on a processor exceeds a predefined load limit, MAX LOAD, determined by the application designer, or whether the load on the processor has increased considerably compared to the load on the other processors. If so, the Resource Manager determines whether any of the objects on that processor can be moved to the processor with the lowest load. The Resource Manager identifies the processor low with the lowest load, the processor high with the highest load, and the processor high2 with the second highest load. It then calculates the contributions to the load on high made by the various objects on high. This load is derived from the mean processing time for the various methods of the object and the number of invocations per second of those

methods by the various tasks. The objects are considered in the order of their loads. The Resource Manager then determines whether the candidate object i can be moved to the least-loaded processor low without increasing the load on that processor above the allowable load. This continues until the load on processor high is below that of the second most highly loaded processor high2 or until all of the objects have been considered. Pseudocode for the Cooling Algorithm is given in Figure 7.
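The cooling loop can be sketched in Python as below. This is our simplification, not the Realize code: per-object loads are supplied directly rather than derived from per-method processing times and invocation rates, the processor ranking is recomputed each iteration, and MAX_LOAD is an illustrative limit.

```python
MAX_LOAD = 0.9  # illustrative predefined load limit

def cool(loads, objects):
    """Migrate the heaviest unchecked objects off the most loaded processor.

    loads: {processor: total load}
    objects: {processor: {object: load contribution}}
    Returns the list of (object, source, target) migrations performed."""
    moves = []
    checked = set()
    while True:
        ranked = sorted(loads, key=loads.get, reverse=True)
        high, low = ranked[0], ranked[-1]
        high2_load = loads[ranked[1]] if len(ranked) > 1 else 0.0
        candidates = {o: l for o, l in objects[high].items() if o not in checked}
        # Stop when the hottest processor no longer exceeds the second hottest,
        # or when every object on it has been considered.
        if loads[high] <= high2_load or not candidates:
            return moves
        obj = max(candidates, key=candidates.get)  # heaviest remaining object
        checked.add(obj)
        obj_load = objects[high][obj]
        # Move only if the target stays under both limits.
        if obj_load + loads[low] < min(MAX_LOAD, loads[high]):
            objects[low][obj] = objects[high].pop(obj)
            loads[high] -= obj_load
            loads[low] += obj_load
            moves.append((obj, high, low))
```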

5.2. Hot Spot Algorithm

Periodically, the Resource Manager estimates the projected latency for each task t (as described in Section 3.2), to check whether the latency of any task is too high a proportion of its deadline. If it is, the Resource Manager searches for the object that is causing the largest delay for task t. This is the object i, executing on some processor p, whose methods m cause, in aggregate, the largest increase in the latency to the completion of task t because of queueing, computed as:

    Queueing latency_tip = Σ_{m ∈ i} ( x_tm τ_mp / (1 − ρ_p) − x_tm τ_mp ) = Σ_{m ∈ i} x_tm τ_mp ρ_p / (1 − ρ_p)

where m ∈ i indicates that m is a method of object i. The Resource Manager identifies the object i that is causing the largest queueing delay for task t and the processor p on which object i is executing. Once it has identified object i, it selects the processor low with the lowest load and attempts to move object i to processor low. If this object migration would cause the load on processor low to exceed either the predefined load limit, MAX LOAD, or the allowable load on processor p, it attempts to move the next candidate object (causing the next largest queueing delay for task t). This continues until the Resource Manager determines an appropriate object to move. Once it has identified such an object, the Resource Manager moves that object from processor p to processor low. Pseudocode for the Hot Spot Algorithm is given in Figure 8.
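Under the same M/M/1 model, the per-object queueing contribution reduces to the sum of x_tm · τ_mp · ρ_p / (1 − ρ_p) over the object's methods. A sketch with hypothetical inputs:

```python
def queueing_latency(methods_of_i, x, tau, rho_p):
    """Queueing delay that object i's methods add to a task on processor p.

    methods_of_i: methods of object i invoked by the task
    x[m]: mean invocations of m per task execution (x_tm)
    tau[m]: mean processing time of m on p (tau_mp)
    rho_p: load on processor p"""
    # Execution time with queueing is tau / (1 - rho); the excess over tau
    # is the queueing contribution.
    return sum(x[m] * tau[m] * rho_p / (1.0 - rho_p) for m in methods_of_i)
```

The Hot Spot Algorithm then considers objects in decreasing order of this quantity when choosing a migration candidate.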

Hot Spot Algorithm()
  find processor low with least load Load[low]
  for all processors p
    for all objects i of processor p
      compute Queueing latency[t,i,p]
      compute Object load[i]
    for all objects i of processor p in decreasing order of Queueing latency[t,i,p]
      if (Object load[i] + Load[low] < min(MAX LOAD, Load[p]))
        move object i from processor p to processor low
        update host names of object i
        break

Figure 8: Pseudocode for the Hot Spot Algorithm.

6. Experimental Results

We have investigated the effectiveness of our multiple feedback loops by considering the time required to balance the load on the resources and to achieve steady flow of operation for the tasks. In these experiments we concentrated on the load on the processors and ignored the network traffic produced by the application objects. Our results show that, if the Resource Manager delays in adjusting the resource allocations, the load may be unevenly distributed across the processors and the application's timing requirements may not be met. On the other hand, fast reactions by the Resource Manager may also be inappropriate.

6.1. Analysis

The load on a processor is affected by the following timing measurements, which are shown in Figure 9.

- Reporting time: the time required for the Profilers to report the resource utilizations to the Resource Manager
- Monitoring time: the time required for the Resource Manager to detect an overloaded processor, based on the Profilers' reports
- Migration time: the time required to migrate an object from the overloaded processor
- Adaptation time: the time required for the load on the overloaded processor to decrease, as a result of the object migration.

The reporting time required for the Profilers to report the resource utilizations to the Resource Manager depends on the communication between the Profilers and the Resource Manager. This is achieved through the use of CORBA's Internet Inter-ORB Protocol (IIOP) in association with TCP/IP sockets. The Profilers measure the utilization of the processors' resources, construct IIOP messages containing this information, and write the messages to the sockets to be read by the Resource Manager. Typically, constructing or parsing an IIOP message requires 100 microseconds, while reading or writing on the socket requires 5 microseconds.

Based on the Profilers' reports, the Resource Manager can determine whether a processor is overloaded. The monitoring time needed for the Resource Manager to detect an overloaded processor depends on the frequency with which the Profilers report their feedback information to the Resource Manager. The more frequently the Profilers report their measurements, the sooner the Resource Manager can determine load fluctuations.

Figure 9: Effect of Migration on Processor Load. [Diagram: the processor load rises when an object starts execution and exceeds the acceptable load limit; the overloaded processor is detected and the load falls after migration. The reporting, monitoring, migration and adaptation times are marked along the time axis.]

If the Profilers' reports are too frequent, this may drive the Resource Manager to act on each reported measurement individually and to decide to migrate objects more frequently than necessary. The migration time depends on the time required to move an object from the overloaded processor. Once the Resource Manager has selected a candidate object to move to a particular destination processor, it instantiates the object on the chosen processor and sets the state of the newly created object. The migration time depends on the size of the state of the object. The cost of moving an object can be justified by amortization over many invocations, and constrains the rate at which the Resource Manager moves objects. The adaptation time is the time period during which the load on the overloaded processor is reduced by the load imposed by the object that has been migrated. The more load the object imposes, the more time is required for the load to be reduced. The Profilers observe the load reduction and notify the Resource Manager through their reports.

Figure 10: The Load on a Processor for Different Profiler Frequencies. [Plot: processor load versus time for Profiler reporting frequencies of 1, 60 and 200 milliseconds, with the acceptable load limit marked.]

6.2. Measurements

The experimental platform for our measurements consisted of six 167 MHz Sun ULTRASparcs running Solaris 2.5.1 with the VisiBroker ORB over 100 Mbit/s Ethernet. Several application objects that imposed different loads on the processors for different lengths of time were used in the experiments. The experiments focused on the load fluctuations reported by a Profiler on a single processor, due to the application objects on that processor. Our experiments indicate that, although the load fluctuations on the processor are affected by all of the times given in Section 6.1, they are affected mainly by the frequency

at which the Profiler provides feedback information to the Resource Manager. Based on this feedback information, the Resource Manager determines whether it is appropriate to migrate an object. In our study, we investigated the effect of three different reporting frequencies for the Profiler. Figure 10 shows the load on the processor as a function of time, using different reporting frequencies of the Profiler. In the first set of experiments, the Profiler reported its resource utilization measurements every millisecond. The more often the Profiler reports its measurements, the more accurate are the measurements held by the Resource Manager. In this case, the Resource Manager might detect an increase in the load on the processor, and might proceed with an object migration even though the increase is only transient. This might result in a large number of object migrations, which are unwarranted. In the second set of experiments, the Profiler reported its measurements less frequently (every 200 milliseconds). In this case, the Resource Manager did not determine the load fluctuations on the processor correctly. In the third set of experiments, the Profiler reported its measurements every 60 milliseconds. This suffices for capturing the load fluctuations on the processor accurately, so that the Resource Manager can react appropriately when the load increases. Note that the Profilers monitor both the resource utilizations and the behavior of the application objects. Our experiments, however, addressed only the rate at which the Profiler reported resource utilizations and not the rate at which it reported application object behavior. The effectiveness of our multiple feedback loop structure can be further improved if the characteristics of the monitored applications are also considered. The application tasks in the

system define the flow of operation and, therefore, determine how frequently the Profilers should provide feedback information to the Resource Manager.
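The trade-off between the three reporting intervals can be illustrated with a small numerical sketch. This is not the paper's Profiler; the trace, the averaging scheme and the numbers are invented to show why a too-fine interval passes transient spikes through to the Resource Manager while a sufficiently coarse one smooths them out.

```python
# Illustrative sketch (not the paper's Profiler): how the reporting interval
# changes the load picture the Resource Manager sees.

def reported_loads(trace, interval):
    """Average the per-tick load over each reporting interval."""
    return [sum(trace[i:i + interval]) / len(trace[i:i + interval])
            for i in range(0, len(trace), interval)]

# Synthetic trace: a one-tick transient spike, then a sustained overload.
trace = [0.2] * 10 + [0.9] + [0.2] * 9 + [0.8] * 20

# Reporting every tick forwards the 0.9 spike, which could trigger an
# unwarranted migration:
print(max(reported_loads(trace, 1)))   # 0.9
# Reporting every 10 ticks smooths the spike (second report stays below 0.5)
# while the sustained 0.8 overload remains clearly visible:
print(reported_loads(trace, 10))
```

In the same spirit, the 1 ms interval in the experiments over-reacts to transients, the 200 ms interval blurs genuine fluctuations, and the 60 ms interval sits usefully between the two.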

7. Related Work

Several efforts have focused on extending CORBA with real-time and QoS capabilities. Zinky et al [19] have developed an architecture, called Quality of Service for CORBA Objects (QuO). QuO specifies an application's expected usage pattern and QoS requirements for each object connection, by extending the Interface Definition Language (IDL) of CORBA with a Quality of Service Description Language (QDL). Wolfe et al [18] have introduced timed distributed method invocations to enable the expression and enforcement of timing constraints in CORBA client/server interactions by attaching a separate object with timing characteristics to each method invocation. Schmidt et al [15] have developed a custom high-performance real-time ORB endsystem, called TAO, that provides end-to-end QoS specification and enforcement mechanisms and demultiplexing optimizations for efficient request dispatching. They have focused mainly on hard real-time systems using rate monotonic scheduling, but recently they have moved to maximum urgency first scheduling, which provides scheduling assurance for critical tasks while offering the flexibility to optimize the use of scarce resources [4]. Many researchers have realized the need for systems that can adapt to dynamic, unpredictable changes in the computing environment. Nett et al [12] have developed an adaptive object-oriented system using integrated monitoring, dynamic execution time prediction and scheduling to provide time-awareness for standard CORBA object invocations. Rosu et al [14] have introduced adaptive resource allocation mechanisms for complex real-time applications that can adjust resource allocation to changes in the applications' needs and have proposed a satisfiability-driven set of performance metrics for capturing the impact of those mechanisms on the performance of the applications. Sydir et al [17] have implemented an end-to-end QoS-driven resource management scheme within a CORBA-compliant ORB, called ERDoS.
They provide end-to-end QoS requirements corresponding to the resource demand requirements of each individual object and use an information-driven resource manager that enables applications to achieve their QoS requirements.

Recent efforts have focused on enhancing distributed systems with mechanisms for dynamic reallocation of tasks, resources or objects to processors within a distributed system. Hou and Shin [7] have studied the migration of tasks to other processors so that each task can meet its current real-time deadline. Chen et al [2] have developed object migration techniques in a reflective object-oriented distributed system, called RODS, oriented mainly towards distributed multimedia applications. Shrivastava et al [16] have investigated the dynamic reconfiguration of large-scale distributed applications, through the design of a workflow system based on CORBA, in order to reflect changes in the computing resources or the user requirements. Bettati et al [1] have developed mechanisms to ‘‘migrate’’ resources from node to node in the network by dynamically changing resource allocations, without affecting the timing guarantees provided to the end users.

Research into scheduling has been dominated by hard real-time systems, but some useful results are available for soft real-time distributed systems. Manimaran et al [10] have shown that efficient dynamic scheduling algorithms require knowledge of real-time task information, including the task's deadline, resource requirements and worst-case computation time. Several researchers [3, 6] have shown that least laxity scheduling is an effective strategy for real-time distributed systems. Gupta et al [5] have investigated scheduling algorithms based on compact task graphs. We extend that work to real-time distributed object systems by combining least laxity scheduling with profiling to construct method invocation graphs and projected task latencies.
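For readers unfamiliar with least laxity scheduling, its selection rule can be sketched in a few lines using the standard definition (laxity = deadline − current time − remaining execution time). The task representation below is illustrative, not the paper's Scheduler data structure.

```python
# Sketch of least laxity task selection, using the standard definition
# laxity = deadline - now - remaining execution time. The dict-based task
# representation is an illustrative assumption.

def least_laxity_task(tasks, now):
    """Pick the task with the smallest laxity; a negative laxity means the
    task can no longer meet its deadline."""
    return min(tasks, key=lambda t: t["deadline"] - now - t["remaining"])

tasks = [
    {"name": "A", "deadline": 100, "remaining": 30},  # laxity 50 at now=20
    {"name": "B", "deadline": 60,  "remaining": 35},  # laxity 5
    {"name": "C", "deadline": 90,  "remaining": 10},  # laxity 60
]
print(least_laxity_task(tasks, now=20)["name"])  # B
```

Task B is dispatched first because it has the least slack before its deadline, even though task C would finish sooner; the profiling in our system supplies the remaining-execution estimates that the rule needs.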

8. Conclusion

We have developed a Resource Management System based on object profiling, scheduling and migration algorithms for soft real-time distributed object systems. The Profilers enable the Resource Manager to estimate projected task latencies. The Schedulers employ a least laxity scheduling algorithm to schedule tasks in real-time across multiple processors with low overhead. The Resource Manager detects overloaded resources and tasks that cannot meet their deadlines, migrating objects to other processors to correct such conditions. A three-level feedback loop allows activities with different levels of temporal granularity, scheduling at the level of milliseconds, profiling over seconds, and object migration over many seconds. By profiling the behavior of the application objects at run time, the system is able to balance the load across processors, even if the application designer provides little information and even after faults have reduced the number of available resources. Although our work is based on CORBA, the object profiling, scheduling and migration algorithms are general and applicable to other soft real-time distributed object systems.

Acknowledgments

This research has been supported by DARPA and AFOSR, Contracts N00174-95-K-0083 and F3602-97-1-0248, and by Rockwell Science Center through the State of California MICRO Program, Grant 96-052.

References

[1] R. Bettati and A. Gupta. Dynamic resource migration for multiparty real-time communication. In Proceedings of the IEEE 16th International Conference on Distributed Computing Systems, pages 646--655, Hong Kong, May 1996.
[2] L. T. Chen, L. Xu, T. Suda, T. Yamamoto, and K. Obinata. A reflective object-oriented distributed system for heterogeneous multimedia environments. In Proceedings of the Fourth International Conference on Computer Communications and Networks, pages 186--193, Las Vegas, NV, September 1995.
[3] M. L. Dertouzos and A. K. Mok. Multiprocessor on-line scheduling of hard-real-time tasks. IEEE Transactions on Software Engineering, 15(12):1497--1506, December 1989.
[4] C. Gill, D. L. Levine, D. C. Schmidt, and F. Kuhns. Evaluating strategies for real-time CORBA dynamic scheduling. International Journal of Time-Critical Computing Systems, special issue on Real-Time Middleware, 1999. To appear.
[5] R. Gupta, D. Mosse, and R. Suchoza. Real-time scheduling using compact task graphs. In Proceedings of the IEEE 16th International Conference on Distributed Computing Systems, pages 55--62, Hong Kong, May 1996.
[6] J. Hong, X. Tan, and D. Towsley. A performance analysis of minimum laxity and earliest deadline scheduling in a real-time system. IEEE Transactions on Computers, 38(12):1736--1744, December 1989.
[7] C. J. Hou and K. G. Shin. Load sharing with consideration of future task arrivals in heterogeneous distributed real-time systems. IEEE Transactions on Computers, 43(9):1076--1090, September 1994.
[8] V. Kalogeraki, P. M. Melliar-Smith, and L. E. Moser. Soft real-time resource management in CORBA distributed systems. In Proceedings of the IEEE Workshop on Middleware for Distributed Real-Time Systems and Services, pages 46--51, San Francisco, CA, December 1997.
[9] V. Kalogeraki, L. E. Moser, and P. M. Melliar-Smith. Dynamic modeling of replicated objects for dependable real-time distributed object systems. In Proceedings of the Fourth International Workshop on Object-Oriented Real-Time Dependable Systems, pages 65--75, Santa Barbara, CA, January 1999.
[10] G. Manimaran and C. R. R. Murthy. An efficient dynamic scheduling algorithm for multiprocessor real-time systems. IEEE Transactions on Parallel and Distributed Systems, 9(3):312--319, March 1998.
[11] P. M. Melliar-Smith, L. E. Moser, V. Kalogeraki, and P. Narasimhan. The Realize middleware for replication and resource management. In Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing, Middleware '98, pages 123--138, The Lake District, England, September 1998.
[12] E. Nett, M. Gergeleit, and M. Mock. An adaptive approach to object-oriented real-time computing. In Proceedings of the IEEE 1st International Symposium on Object-Oriented Real-Time Distributed Computing, pages 342--349, Kyoto, Japan, April 1998.
[13] Object Management Group. The Common Object Request Broker Architecture, 2.2 edition, February 1998.
[14] D. I. Rosu, K. Schwan, and R. Jha. On adaptive resource allocation for complex real-time applications. In Proceedings of the IEEE 18th Real-Time Systems Symposium, pages 320--329, San Francisco, CA, December 1997.
[15] D. Schmidt, D. Levine, and S. Mungee. The design of the TAO real-time object request broker. Computer Communications, 21(4):294--324, April 1998.
[16] S. K. Shrivastava and S. M. Wheater. Architectural support for dynamic reconfiguration of large scale distributed applications. In Proceedings of the IEEE 4th International Conference on Configurable Distributed Systems, pages 10--17, Annapolis, MD, May 1998.
[17] J. J. Sydir, S. Chatterjee, and B. Sabata. Providing end-to-end QoS assurances in a CORBA-based system. In Proceedings of the IEEE 1st International Symposium on Object-Oriented Real-Time Distributed Computing, pages 53--61, Kyoto, Japan, April 1998.
[18] V. F. Wolfe, L. C. DiPippo, R. Ginis, M. Squadrito, S. Wohlever, I. Zykh, and R. Johnston. Real-time CORBA. In Proceedings of the IEEE 3rd Real-Time Technology and Applications Symposium, pages 148--157, Montreal, Quebec, Canada, June 1997.
[19] J. Zinky, D. Bakken, and R. Schantz. Architectural support for quality of service for CORBA objects. Theory and Practice of Object Systems, 3(1):55--73, 1997.
