Minimizing end-to-end response time in transactions

Enrico Bini
Scuola Superiore Sant’Anna, Pisa, Italy
[email protected]

Abstract


Many applications are today implemented by chains of tasks (called transactions) that must execute on several computational nodes. When more than one task is allocated on the same node, it is necessary to properly assign the available computational resource to each task. In this paper we show that the rule of thumb of assigning an amount of resource proportional to the task computation time does not minimize the end-to-end response time. We derive instead the optimal resource assignment.

As the tasks of the transactions are distributed over the available nodes, it is necessary to arbitrate the computational resources among the tasks. One commonly adopted solution is to assign a priority to each task in a node and then to dispatch the tasks by a scheduling algorithm. If priorities are assigned statically, a Fixed Priority (FP) scheduler is used; in this case the system can be analyzed by holistic schedulability analysis [11, 5]. Another possibility is to schedule the tasks by assigning them intermediate deadlines. In this case an Earliest Deadline First (EDF) scheduling algorithm is used, and the analysis involves the assignment of offsets and intermediate deadlines to the tasks [7, 8].

1 Introduction

The computation of modern applications is often divided into stages, where each stage consumes the output of the preceding one. The transaction task model has been introduced to capture the structure of this kind of application. A transaction is a sequence of tasks in which each task is activated by the completion of the preceding one [11, 6]. The first task, instead, is activated by an external event that can be generated periodically by a timer or triggered by some condition that occurs in the environment. This model is quite common in all the applications where a sequence of operations is required (in Figure 1 we show a graphical representation of a transaction).


Figure 1. A transaction of tasks.

Each task of the transaction is then mapped onto a computational node that may provide specific hardware acceleration or dedicated I/O devices. The problem of mapping the tasks onto the nodes is interesting; however, it is not investigated in this paper.

∗This work was partially supported by EU research project ACTORS ICT/216586.

However, the problem with both the FP and the EDF approaches is that a misbehavior in one task may also affect the other transactions, due to the interactions at the level of the node scheduler. In software design it is a recommended practice to isolate the transactions, so that if some task misbehaves the other transactions are not affected. This approach is called component-based because it establishes some degree of isolation between the transactions. When a component-based design is adopted, it is necessary to partition the computational resources provided by each node among the transactions requiring it. The allocation policy of the computational resources among the transactions requires evaluating the importance of the transactions, and it is typically decided at the user level: the user decides, based on some global utility function, which application requires more resource and which requires less. In Figure 2 we show two transactions mapped onto three nodes; transaction i is allocated a share Si,j of the resource j. However, one question remains unanswered: how do we further allocate the resource among the tasks


belonging to the same transaction?

Figure 2. Two transactions on three nodes.

If only one task of a transaction is mapped on some resource fraction Si,j, then it is reasonable to assume that this task should get all the allocated resource. In Figure 2, tasks τ1,1, τ2,1, and τ2,2 are in this condition. On the other hand, it may happen that several tasks compete for the share on the same node (tasks τ1,2, τ1,4 and τ1,3, τ1,5 in Figure 2). One intuitive rule of thumb could be to divide the allocated share among the tasks of the same transaction proportionally to their computation times. However, we will see in the next sections that this allocation strategy is not optimal: there is an assignment strategy that achieves a smaller end-to-end response time.
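As a quick numeric illustration (a sketch with made-up computation times, not an example taken from the paper), consider two tasks of the same transaction with C = 1 and C = 4 competing for a share S = 1 on one node. The proportional rule of thumb is already beaten by a split proportional to the square roots of the computation times:

```python
# Sketch: two ways of splitting a node share S among the tasks of one
# transaction. Computation times C are illustrative values.
import math

C = [1.0, 4.0]   # computation times of the two competing tasks
S = 1.0          # share of the node allocated to the transaction

def response_time(alphas):
    # Contribution of this node to the end-to-end response time: sum of C_i / alpha_i
    return sum(c / a for c, a in zip(C, alphas))

# Rule of thumb: bandwidth proportional to C_i
prop = [S * c / sum(C) for c in C]                           # [0.2, 0.8]
# Alternative: bandwidth proportional to sqrt(C_i)
sq = [S * math.sqrt(c) / sum(map(math.sqrt, C)) for c in C]  # [1/3, 2/3]

print(response_time(prop))  # 1/0.2 + 4/0.8 = 10.0
print(response_time(sq))    # 3.0 + 6.0 = 9.0 (smaller)
```

The square-root split slows the long task down slightly but speeds the short task up a lot, and the sum of the per-task delays shrinks.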

1.1 Related works

Tindell and Clark [11] proposed the holistic analysis for task transactions. In their analysis several iterations of the response time calculation are performed; at each iteration the task jitters are propagated until the analysis reaches stability. Palencia and González Harbour [6] added static and dynamic offsets to the holistic analysis. Mäki-Turja and Nolin [5] proposed to simplify the holistic analysis by approximating the interference. All the cited works use the FP scheduler. Pellizzoni and Lipari [7] investigated instead the analysis of task transactions scheduled by EDF. Rahni et al. [8] proposed a simplified test to account for the offsets in EDF-scheduled transactions.

Regarding the abstraction of the platform, many authors [1, 3, 9] independently proposed to extract a supply function that represents the minimum resource provided to the allocated application. Feng and Mok [2] introduced the bounded-delay abstraction of the platform. Stiliadis and Varma presented similar ideas in the field of networking [10]. Lorente et al. [4] proposed the holistic schedulability analysis when the computational nodes are abstracted by a bounded-delay server.

Finally, we stress that all the mentioned works analyze whether or not the specified timing requirements (deadlines) are met for a given platform abstraction. To the best of our knowledge, the selection of the platform parameters that minimize the end-to-end response time has not been addressed. Below we introduce the terminology and notation used in the paper.

2 Application and platform model

The system we consider is composed of n applications {Γ1, . . . , Γn} that need to execute, and m computational nodes {Π1, . . . , Πm} that are capable of executing instructions. Each application Γi is modeled by a transaction composed of a set of ki tasks {τi,1, . . . , τi,ki}. For each task τi,j we denote by ai,j its activation time, Ci,j its computation time, fi,j its finishing time, and Ri,j its response time (that is, Ri,j = fi,j − ai,j). Since each task is activated at the completion of the preceding one, we have ai,j = fi,j−1. Finally, we also define the end-to-end response time of the transaction as

$$R_i = f_{i,k_i} - a_{i,1} = \sum_{j=1}^{k_i} R_{i,j} \qquad (1)$$

Each task is also characterized by the node where it is allocated. Hence we set xi,j ∈ {1, . . . , m} equal to the index of the node where τi,j is allocated. In general a task may be mapped onto any of the nodes; however, at this stage we do not investigate this allocation problem and we assume that the tasks are already partitioned among the nodes.

The system also includes a resource manager that assigns the computational resources to the applications (transactions). This assignment responds to some user-defined policy and is beyond the scope of this paper. For each pair (i, j) of transaction Γi and computational node Πj, the resource manager allocates a share Si,j that describes the amount of the computational resource of node Πj allocated to transaction Γi.

Since we assume that the allocation of the resources to the transactions is given, we can focus our attention on one single transaction. Hence, for simplicity, from now on we drop the index of the transaction, and we study the generic transaction Γ, composed of tasks τ1, . . . , τk, that is allocated the fractions S1, . . . , Sm of the resources Π1, . . . , Πm. Notice that with this simpler notation, Equation (1) for the end-to-end response time becomes

$$R = \sum_{i=1}^{k} R_i \qquad (2)$$

where Ri is the response time of task τi.

Until now we have not clarified how the “share” Sj of the node Πj allocated to Γ is specified. In the literature we can find many valuable alternatives. Many authors [1, 3, 9] proposed to use a supply function Zj(t) that specifies the minimum amount of resource available to the transaction Γ in any interval of length t. Feng and Mok [2] proposed the (α, ∆) server model, where only the bandwidth α and the delay ∆ are extracted from a detailed supply function; a similar idea is also present in networking [10]. Finally, the simplest platform model is based on the bandwidth α alone, which represents the speed of a virtual processor on which the transaction is assumed to run. In the rest of the paper we call platform model the way in which Sj is specified.

3 Minimizing end-to-end response time

In this section we investigate how to make the best use of the allocated resource Sj so that the end-to-end response time of the transaction Γ is minimized. This is done under two possible platform models: the bandwidth α alone, and the (α, ∆) server model.

3.1 Modeling the platform by the bandwidth

In this case we assume that the share Sj of Πj allocated to Γ is expressed as a processor speed. Hence, we assume that each node implements a reservation mechanism capable of allocating a “fluid” fraction of its computational power to the mapped transactions. Although a fluid allocation cannot be implemented in practice, but only approximated, we consider this very basic model because the bandwidth indeed synthesizes the key feature of a platform abstraction.

We denote by αi the bandwidth allocated to the task τi. Since the goal is to find the bandwidth distribution that minimizes the end-to-end response time, the αi are our unknowns. The end-to-end response time can be written as

$$R = \sum_{i=1}^{k} R_i = \sum_{i=1}^{k} \frac{C_i}{\alpha_i} \qquad (3)$$

Notice that the great benefit of assuming the (very simple) bandwidth as the interface of the computational resource is the simplicity of writing the response time. The goal of the allocation problem is to minimize the end-to-end response time of Eq. (3) subject to the constraints

$$\sum_{i : x_i = j} \alpha_i \le S_j \qquad \forall j = 1, \ldots, m \qquad (4)$$

meaning that the bandwidth allocated to the tasks running on the same node Πj clearly cannot exceed the bandwidth allocated to the transaction on that node.

We now minimize the end-to-end response time by the standard technique of Lagrange multipliers. The Lagrange function is

$$L = \sum_{i=1}^{k} \frac{C_i}{\alpha_i} + \sum_{j=1}^{m} \mu_j \Big( \sum_{i : x_i = j} \alpha_i - S_j \Big) \qquad (5)$$

If we differentiate with respect to αi and set the partial derivative equal to zero, we find

$$\frac{\partial L}{\partial \alpha_i} = -\frac{C_i}{\alpha_i^2} + \mu_{x_i} = 0 \qquad (6)$$

from which it follows that all μj > 0. This means that the constraint of Eq. (4) holds with the equal sign (as expected, since it is not reasonable to allocate less resource than the available amount Sj). From Eq. (6) we find

$$C_i = \mu_{x_i} \alpha_i^2 \quad \Rightarrow \quad \alpha_i = \sqrt{\frac{C_i}{\mu_{x_i}}}$$

and from the constraint of Eq. (4), which holds with the equal sign, we have

$$\frac{\sum_{i : x_i = j} \sqrt{C_i}}{\sqrt{\mu_j}} = S_j \quad \Rightarrow \quad \alpha_i = S_{x_i} \frac{\sqrt{C_i}}{\sum_{\ell : x_\ell = x_i} \sqrt{C_\ell}}$$

Hence the value of αi that minimizes the end-to-end response time is proportional to √Ci, and not to Ci as one would “naturally” expect. In the next section we will see that the picture does not change if we assume a more refined platform model.
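The closed-form allocation above is easy to implement and check numerically. The sketch below (with an illustrative task set and shares, not taken from the paper) computes αi = S_{xi} √Ci / Σℓ √Cℓ on each node and verifies that no randomly perturbed feasible allocation does better:

```python
# Sketch: per-node optimal split alpha_i = S_j * sqrt(C_i) / sum_l sqrt(C_l),
# checked against random feasible perturbations of the split.
import math
import random

# Illustrative transaction: (computation time C_i, node index x_i)
tasks = [(2.0, 0), (8.0, 0), (3.0, 1), (3.0, 1), (6.0, 1)]
shares = [1.0, 0.5]   # share S_j of each node allocated to the transaction

def sqrt_allocation(tasks, shares):
    alphas = []
    for c, x in tasks:
        denom = sum(math.sqrt(cl) for cl, xl in tasks if xl == x)
        alphas.append(shares[x] * math.sqrt(c) / denom)
    return alphas

def end_to_end(tasks, alphas):
    # R = sum of C_i / alpha_i over all tasks of the transaction
    return sum(c / a for (c, _), a in zip(tasks, alphas))

opt = sqrt_allocation(tasks, shares)
R_opt = end_to_end(tasks, opt)

# Randomly re-split each node's share (keeping it fully used, hence feasible)
# and verify that no perturbation beats the closed form.
random.seed(0)
for _ in range(1000):
    trial = []
    for j, S in enumerate(shares):
        idx = [i for i, (_, x) in enumerate(tasks) if x == j]
        w = [random.random() for _ in idx]
        trial += [(i, S * wi / sum(w)) for i, wi in zip(idx, w)]
    alphas = [a for _, a in sorted(trial)]
    assert end_to_end(tasks, alphas) >= R_opt - 1e-9
```

Substituting the closed form back into Eq. (3) shows that each node Πj contributes (Σ_{i:xi=j} √Ci)² / Sj to the minimum end-to-end response time, which is what the check above converges to.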

3.2 Modeling the platform by (α, ∆)

The major drawback of modeling the platform by the bandwidth alone is that we do not take the time granularity into account. In fact, the abstraction of a fluid bandwidth allocation can only be approximated by a server whose period tends to zero; however, an arbitrarily small server period implies an arbitrarily large impact of the server overhead on the system. In this section we address this defect by using the (α, ∆) platform model [2].

Since the end-to-end response time is given by the sum of the response times experienced by the tasks on each node, without loss of generality we can assume that all the tasks are mapped onto one node. Each task τi runs on a platform characterized by the parameters αi and ∆i. This means that the supply function of the task platform is

$$Z_i(t) = \max\{0, \alpha_i (t - \Delta_i)\} \qquad (7)$$

In Figure 3 we draw the supply function of the platform.

Figure 3. Supply function and response time.

Starting from this supply function, it is easy to derive the task response time Ri. On this platform the response time of one task becomes

$$R_i = \Delta_i + \frac{C_i}{\alpha_i} \qquad (8)$$

and the end-to-end response time to be minimized is

$$R = \sum_{i=1}^{k} \Big( \Delta_i + \frac{C_i}{\alpha_i} \Big) \qquad (9)$$

Clearly the best solution is achieved when ∆i is zero. However, we know that in practice the impact of the overhead then becomes extremely significant. To account for this effect we introduce the quantity Csw, which represents the time overhead spent by the resource manager to switch from one task to another. The bandwidth required by a task must then also include the bandwidth wasted in context switches. Since ∆i is proportional to the server period, we account for the context switches by adding the constraint

$$\sum_{i=1}^{k} \Big( \alpha_i + \frac{C_{sw}}{\Delta_i} \Big) \le S \qquad (10)$$

We solve the problem using the same technique applied previously. If we define the Lagrange function as

$$L = \sum_{i=1}^{k} \Big( \Delta_i + \frac{C_i}{\alpha_i} \Big) + \mu \Big( \sum_{i=1}^{k} \Big( \alpha_i + \frac{C_{sw}}{\Delta_i} \Big) - S \Big) \qquad (11)$$

we have

$$\frac{\partial L}{\partial \alpha_i} = -\frac{C_i}{\alpha_i^2} + \mu = 0 \quad \Rightarrow \quad \alpha_i = \sqrt{\frac{C_i}{\mu}} \qquad (12)$$

$$\frac{\partial L}{\partial \Delta_i} = 1 - \mu \frac{C_{sw}}{\Delta_i^2} = 0 \quad \Rightarrow \quad \Delta_i = \sqrt{\mu C_{sw}} \qquad (13)$$

From Equations (12) and (13) it follows that μ > 0. Hence the constraint of Eq. (10) must hold with the equal sign, and from it we find that

$$\sqrt{\mu} = \frac{\sum_{i=1}^{k} \sqrt{C_i} + k \sqrt{C_{sw}}}{S}$$

which allows us to find the values of αi and ∆i. We can notice that the best bandwidth allocation is again proportional to √Ci, even under the more detailed platform abstraction (α, ∆).

4 Conclusions and future works

In this paper we showed that allocating the bandwidth proportionally to the square root of the computation times is optimal for minimizing the end-to-end response time of transactions of tasks. This paper is only an initial contribution in the field. Future research includes: the investigation of techniques for mapping tasks onto the computational nodes, the adjustment of the resources allocated at the transaction level to achieve a global goal, and the adoption of a more refined model of the platform.

References

[1] Luís Almeida, Paulo Pedreiras, and José Alberto G. Fonseca. The FTT-CAN protocol: Why and how. IEEE Transactions on Industrial Electronics, 49(6):1189–1201, December 2002.

[2] Xiang Feng and Aloysius K. Mok. A model of hierarchical real-time virtual resources. In Proceedings of the 23rd IEEE Real-Time Systems Symposium, pages 26–35, Austin, TX, U.S.A., December 2002.

[3] Giuseppe Lipari and Enrico Bini. Resource partitioning among real-time applications. In Proceedings of the 15th Euromicro Conference on Real-Time Systems, pages 151–158, Porto, Portugal, July 2003.

[4] José L. Lorente, Giuseppe Lipari, and Enrico Bini. A hierarchical scheduling model for component-based real-time systems. In Proceedings of the 20th International Parallel and Distributed Processing Symposium, Rhodes Island, Greece, April 2006.

[5] Jukka Mäki-Turja and Mikael Nolin. Efficient implementation of tight response-times for tasks with offsets. Real-Time Systems Journal, 40(1):77–116, February 2008.

[6] José Carlos Palencia and Michael González Harbour. Schedulability analysis for tasks with static and dynamic offsets. In Proceedings of the 19th IEEE Real-Time Systems Symposium, pages 26–37, Madrid, Spain, December 1998.

[7] Rodolfo Pellizzoni and Giuseppe Lipari. Holistic analysis of asynchronous real-time transactions with earliest deadline scheduling. Journal of Computer and System Sciences, 73(2):186–206, March 2007.

[8] Ahmed Rahni, Emmanuel Grolleau, and Michael Richard. Feasibility analysis of non-concrete real-time transactions with EDF assignment priority. In Proceedings of the 16th Conference on Real-Time and Network Systems, pages 109–117, Rennes, France, October 2008.

[9] Insik Shin and Insup Lee. Periodic resource model for compositional real-time guarantees. In Proceedings of the 24th IEEE Real-Time Systems Symposium, pages 2–13, Cancun, Mexico, December 2003.

[10] Dimitrios Stiliadis and Anujan Varma. Latency-rate servers: A general model for analysis of traffic scheduling algorithms. IEEE/ACM Transactions on Networking, 6(5):611–624, October 1998.

[11] Ken Tindell and J. Clark. Holistic schedulability analysis for distributed hard real-time systems. Microprocessing and Microprogramming, 50:117–134, April 1994.
