Priority Inversion at the Network Adapter when Scheduling Messages with Earliest Deadline Techniques

Antonio Meschi, Marco Di Natale, Marco Spuri
Scuola Superiore S.Anna, via Carducci 40, 56100 Pisa, Italy
tel. +39-50-883284, fax +39-50-883215, e-mail [email protected]

Abstract

In this paper we present a novel approach to the study of the predictability of real-time message transmission and its relationship with the design of network adapters for real-time distributed systems. The aim is to limit the occurrence of large priority inversions among messages, so as to achieve a better degree of predictability. We show that when proper network adapters are used in conjunction with earliest deadline message scheduling, the loss in processor utilization is minimized and predictable.

1 Introduction

The growing interest of industry in robust and cost-effective control systems is pushing the real-time research community towards deeper investigations in the distributed area. As a consequence, the literature on distributed real-time computing is expanding rapidly. One of the main goals in designing a real-time system is predictability, and the scheduling of real-time messages in communication networks is aimed at this goal as well. So far, mostly static approaches have been followed to tackle the problem. Among them we find analyses of existing standards, such as the Token-Ring [7] or the Controller Area Network [8], which are both based on extensions of uniprocessor fixed-priority analysis, and of timed-token networks [4]. A different approach, although still based on a static technique, is found in the Time Triggered Protocol (TTP) [2], where the assignment of the network is done by TDMA (Time Division Multiple Access). As for processor scheduling, static techniques are proven to be less efficient than their dynamic counterparts in scheduling computer networks, especially when the number of messages varies and when highly urgent messages must be granted immediate access. Thus we are strongly interested in investigating dynamic techniques, namely EDF scheduling [3], on real-time message communication systems. Most of the results presented in the literature assume an ideal processor interface on the communication channel. However, when dealing with more realistic models, we have to face additional problems, such as priority inversions, which cannot be neglected in the analysis of the system [8, 9].

In this paper we extend our previous work on EDF message scheduling [6] by including a more realistic model of the network adapter. Section 2 contains a brief summary of some of our previous results. In Section 3 we present an optimized design of network adapters, which strongly limits the possible priority inversions when scheduling messages. Finally, in Section 4 we state our conclusions.

2 Applicability of Earliest Deadline Techniques to Network Scheduling

In [5] we analyzed three families of protocols: the EDF/Token-Ring, the EDF/Binary Countdown and the EDF/CSMA-CD. A characteristic common to all these protocols is the following: the contention phase is resolved on the basis of a priority attached to each message, which is a function of the message deadline. In all cases the priority assigned to each message $M_i$ in the competition for the shared channel depends on the value of $(d_i - t_{START})$, where $d_i$ is the absolute deadline of message $M_i$ and $t_{START}$ is a reference value computed every time there is a contention for the channel, and assumed as the starting time of the contention itself. The priority assigned to messages is inversely proportional to the value of $(d_i - t_{START})$. Messages are divided into packets and each packet is inserted in the data field of a frame, which is the elementary unit of transmission. The shorter the transmission unit, the higher the cost of protocol management, but the lower the blocking experienced by urgent messages.

Since a node must complete the transmission of the current packet before being allowed to contend for the channel, it can happen that a message with the highest priority is not transmitted immediately. This causes a priority inversion phenomenon, which we proved to be limited to a time interval at most as large as $B$, where $B$ is the worst-case remaining time for the current packet to leave the network. The actual value of $B$ depends on the protocol used and on the frame format. If we assume that all the frames of a message get queued at the same time, then the priority inversion can happen only at the time a message is ready to be sent and on the transmission of the first frame. Once the first frame has gained access to the network, it is not possible that any other lower priority message gets the channel. This situation is similar to the one arising in a uniprocessor environment when tasks share critical sections according to the SRP policy [1]. In that case Baker proves the sufficiency of a simple schedulability test. We will show how his work can be applied to the network scheduling problem, where an EDF technique is used to schedule messages on a single shared channel. Assume an infinite number of priorities and an ideal interface. We take the proof by Baker [1] as a starting point to obtain the following guarantee test (due to lack of space the proof of the theorem has been omitted; the interested reader can refer to [6]).

Theorem 2.1 Given a set of $N$ periodic messages $M_1, M_2, \ldots, M_N$ ordered by increasing deadlines, they can be sent within their relative deadlines $D_1, D_2, \ldots, D_N$ if

$$\sum_{i=1}^{j} \frac{C_i}{D_i} + \frac{B}{D_j} \le 1 \qquad \forall j = 1, \ldots, N \qquad (1)$$

where $D_i$ is the relative deadline of message $M_i$ in each period, $B$ is the worst-case blocking time defined earlier, and $C_i$ corresponds to the time the channel is occupied by the transmission of message $M_i$, including the protocol overheads associated with the transmission of the message (in Section 2.1 we show how these values can be computed for the EDF/Binary Countdown protocol). Please note that the hypothesis that higher priority messages can suffer priority inversion only at their arrival time and for a limited period of time is fundamental for the applicability of the theorem; it will be recalled later, when we discuss the buffer requirements for a real-time network adapter.
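For concreteness, the guarantee test of eq. (1) translates into a few lines of code. The sketch below is ours, not the paper's (the function name, data layout and numeric values are illustrative): messages are given as $(C_i, D_i)$ pairs and $B$ is the blocking term defined above.

```python
# Minimal sketch of the guarantee test of Theorem 2.1 (eq. 1).
# Messages are (C_i, D_i) pairs; the layout and names are illustrative.

def edf_guarantee(messages, B):
    """messages: list of (C, D), with C the channel occupancy time (including
    protocol overheads) and D the relative deadline; B: worst-case blocking.
    Returns True if every prefix satisfies sum(C_i/D_i) + B/D_j <= 1."""
    msgs = sorted(messages, key=lambda m: m[1])   # order by increasing deadline
    utilization = 0.0
    for C_j, D_j in msgs:
        utilization += C_j / D_j                  # cumulative sum up to message j
        if utilization + B / D_j > 1.0:
            return False                          # message M_j cannot be guaranteed
    return True

# Example: three periodic messages and a blocking term
print(edf_guarantee([(2.0, 10.0), (3.0, 20.0), (5.0, 50.0)], B=1.5))  # True
```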

2.1 The EDF/BC case

The EDF/Binary Countdown (EDF/BC) is a real-time network protocol we developed [5]. The nodes are linked to a single shared channel and time is divided into contention and transmission phases. The time axis is divided into slots, which must be larger than or equal to $T_p$, where $T_p$ is the time a bit takes to traverse the channel (we assume perfect synchronization between station clocks). If a node wishing to transmit finds the channel idle, it waits for the start of the next slot and begins a contention phase. Each node having a message to send can contend for the channel by transmitting the priority bits of the message during the contention slots, one bit per slot. The collisions among the priority bits are resolved with a logical AND. If a node reads its priority bits back from the channel without any change, it realizes it is the winner of the contention and gets the right to transmit. To grant every node the possibility to preempt, each frame transmission is preceded and followed by a time slot during which any node can signal a preemption.
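The bitwise arbitration just described can be sketched as follows. The encoding of $(d_i - t_{START})$ into $n_p$ priority bits, the slot width and the choice of eight priority bits are our illustrative assumptions; the paper only fixes that the priority is inversely proportional to $(d_i - t_{START})$ and that collisions are resolved with a logical AND.

```python
# Sketch of one EDF/BC contention phase. Smaller priority words are more urgent,
# which matches the wired-AND resolution (a 0 bit dominates a 1 bit).

N_P = 8  # number of priority bits / contention slots (an assumption for the example)

def priority_word(d_i, t_start, slot_width):
    """Quantize the time to deadline; a smaller word means a more urgent message."""
    value = int((d_i - t_start) / slot_width)
    return min(value, 2 ** N_P - 1)            # saturate to the available bits

def contend(words):
    """Return the indices of the nodes that win the bitwise (AND) arbitration."""
    alive = set(range(len(words)))
    for bit in reversed(range(N_P)):           # most significant bit first
        sent = {i: (words[i] >> bit) & 1 for i in alive}
        channel = 0 if any(b == 0 for b in sent.values()) else 1   # wired AND
        # a node that sent a recessive 1 but reads 0 has lost this contention
        alive = {i for i in alive if sent[i] == channel}
    return alive

# Example: the node whose message has the earliest deadline wins the channel
words = [priority_word(d, t_start=0.0, slot_width=1.0) for d in (40.0, 12.0, 25.0)]
print(contend(words))   # {1}
```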

To apply test (1) we need to define $C_i$, the channel occupancy time of a message, and $B$, the worst-case blocking. These parameters can be evaluated by analysing the timing properties of the EDF/BC protocol. We define $V_p$ as the propagation speed of the channel, $B_r$ the transmission bit-rate, $L$ the channel length, $d$ the number of data bits in a packet and $m_i$ the number of bits in message $M_i$. The time a bit takes to traverse the channel is $T_p = L / V_p$. For our calculations, we suppose the limiting condition in which each slot is exactly $T_p$ wide. The value of $C_i$ is obtained by adding to the raw data transmission time $T_{tr} = \lceil m_i / d \rceil \, d / B_r$ all the protocol overheads associated with the message transmission. The protocol overheads can be divided into frame overheads and message overheads. The frame overheads are represented by the time to send the protocol bits of each frame (the starting delimiter, the destination and source address fields, the frame control sequence and the ending delimiter), the preemption slot associated with each frame transmission and one more slot to account for the propagation delay. Let $Ov_t$ be this frame overhead; then each message $M_i$ has an additional overhead time $Ov_1 = \lceil m_i / d \rceil \, Ov_t$ to be added to $T_{tr}$.

Now we must consider the message overhead caused by the contention time. The contention does not necessarily happen before each frame transmission: it starts when the channel is idle and a node wants to transmit, or when a node signals a preemption.


In the worst case a message pays a contention overhead on its first frame, assuming it preempts another lower priority message, and another overhead when its transmission is over and a preempted message (or another one) gets back the channel. Since each message preempts and ends only once, these two contributions account for all the contention overheads. If EDF/BC grants $n_p$ priority bits (each of them corresponding to a contention slot), the message overhead is given by the contention time $n_p T_p$ multiplied by 2 (the message overhead also includes the message propagation time):

$$Ov_2 = 2 n_p T_p = 2 n_p L / V_p$$

At this point we can evaluate $C_i$ as the sum of the previous factors:

$$C_i = T_{tr} + Ov_1 + Ov_2 = \lceil m_i / d \rceil (d / B_r + Ov_t) + 2 n_p L / V_p$$

To evaluate $B$, the maximum time the highest priority station must wait before being able to send a message, we consider the worst case in which a message becomes ready immediately after the beginning of a contention. In this case it has to wait a whole contention time and a frame transmission time before being able to start a new contention. Then we have:

$$B = T_{tr} + Ov_t + Ov_2 / 2$$

Once the values $C_i$ and $B$ are known, the schedulability of the set of messages can be checked using eq. (1).
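As an illustration, the formulas above can be turned into a small helper that computes $C_i$ and $B$ and feeds them into the guarantee test of Section 2. The parameter values in the example are invented. Note also that in the blocking term we read $T_{tr}$ as the transmission time of a single frame, consistent with the definition of $B$ as the worst-case residual time of the packet currently on the channel; this is an interpretation on our part.

```python
# Sketch of the C_i and B computations for EDF/BC; names mirror the formulas above,
# and the numeric values are only an example.
import math

def edf_bc_parameters(m_i, d, B_r, Ov_t, n_p, L, V_p):
    """m_i: message bits, d: data bits per frame, B_r: channel bit-rate (bit/s),
    Ov_t: per-frame overhead time (s), n_p: priority bits, L: channel length (m),
    V_p: propagation speed (m/s). Returns (C_i, B)."""
    T_p = L / V_p                          # time for a bit to traverse the channel
    frames = math.ceil(m_i / d)
    T_tr = frames * d / B_r                # raw data transmission time of the message
    Ov_1 = frames * Ov_t                   # frame overheads
    Ov_2 = 2 * n_p * T_p                   # two contention phases per message
    C_i = T_tr + Ov_1 + Ov_2
    # B = T_tr + Ov_t + Ov_2/2, with T_tr taken as one frame's data time (see text)
    B = d / B_r + Ov_t + n_p * T_p
    return C_i, B

C_i, B = edf_bc_parameters(m_i=4096, d=512, B_r=1e6, Ov_t=1e-4, n_p=8, L=100.0, V_p=2e8)
print(C_i, B)   # channel occupancy of the message and worst-case blocking, in seconds
```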

3 Transmission time between processor and interface

Our calculations assume that all the frames of every message are instantly available in an unlimited priority queue at the network interface and periodically ready to be sent. This is an approximation that does not account for the finite copy time between memory and network interface and for the limited number of buffers at the interface. We will show how the time to copy the frames to the interface can produce priority inversion, and how a limited number of buffers can make the situation even worse.

We suppose that messages arrive periodically and are inserted in a higher-level priority queue in memory, or memory queue. We also suppose that the time spent to move a frame from the memory queue to the interface is lower than the frame transmission time on the channel. Let $B_{pr}$ be the bit-rate between memory and interface, and $B_r$ the bit-rate on the channel; we then have $B_{pr} \ge B_r$.

Let us assume that the interface has a single buffer holding one frame. If a highest priority message arrives, its first frame is placed on top of the memory queue and immediately copied to the interface. In the worst case, during the copy of this frame to the interface, a lower priority message can get the channel. This first kind of priority inversion happens at the time the message is placed in the memory queue (message arrival or ready time) and is similar to the worst-case blocking time $B$ of equation (1). A different case arises when a message gets ready and is not the highest priority message, so it is not copied into the network interface buffer. When the first frame of the message gets to the top of the memory queue and is being copied into the interface, another station with a lower priority message can acquire the channel, producing a different priority inversion, this time not coincident with the message arrival time. This kind of priority inversion invalidates our schedulability test. Another situation that does not allow the use of the schedulability formula occurs when a frame is ready to be sent at the interface buffer and a highest priority message is signalled to the processor. The first frame of the high priority message is copied into the interface with an internal preemption. During this copy time, the channel is available to a message with a priority lower than that of the message being internally preempted. The preempted message suffers a priority inversion that, again, does not happen at its ready time. The message that performs the internal preemption suffers a priority inversion as well, but at its arrival time, which allows the use of the schedulability test. We will see how to prevent the situations that invalidate the schedulability test.
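To make the second scenario concrete: with a single-frame buffer, the dangerous window is the copy interval of the frame at the head of the memory queue. If the channel becomes free inside that window, a lower-priority node wins the contention and the inversion does not coincide with the message arrival time. A minimal sketch of this timing condition, with illustrative names:

```python
# Timing condition behind the "late" priority inversion with a single-frame buffer.

def late_inversion_possible(copy_start, copy_time_Tf, channel_free_at):
    """True when a lower-priority message can acquire the channel while the frame
    is still being copied, i.e. the inversion does not occur at arrival time."""
    return copy_start <= channel_free_at < copy_start + copy_time_Tf

print(late_inversion_possible(copy_start=10.0, copy_time_Tf=0.4, channel_free_at=10.3))  # True
```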

3.1 Dual Port Memory

Suppose our network interfaces have a dual port memory, with a buffer that contains two frames ordered by priority. The first two frames in the memory queue get copied into this dual port memory.

Consider first the case in which a new message arrives and its first frame is not placed in the first two positions of the memory queue. During the time the frame is sent to the interface there will not be a priority inversion, since in the dual port memory there is at least another frame with a higher priority (the frame at the top of the processor queue) and the time to send a frame from the processor to the interface is lower than the time to send a frame on the channel. In this case there is no priority inversion for the frames following the first one. If the first frame of the newly arrived message is placed in one of the first two positions of the memory queue, it is immediately copied to the interface. At this point the message can suffer a priority inversion, no matter whether it gets the first buffer (a lower priority message gets the channel during the copy) or the second buffer (the frame holding the first buffer can end its transmission during the copy and a lower priority message can get the channel). This kind of priority inversion happens only for the first frame of the message, and we will show how to account for it in the schedulability test.

A different case happens when there is an internal preemption (see Figure 1). A buffer containing two frames leaves the possibility of a late priority inversion. This is due to the fact that when a frame is internally preempted, there is no guarantee that the other buffer position is occupied by a higher priority frame. In fact, it might happen that the highest priority frame in the dual port memory (not the one being preempted) ends its transmission, and during the time necessary to copy the incoming frame into the buffer a priority inversion occurs. For example, in the upper half of Figure 1 the interface buffer contains two frames: frameA with priority 1 and frameB with priority 3 (position A). Later, frameD with priority 2 arrives and preempts frameB (position B). While frameD is being copied into the second position, frameA ends its transmission, a new contention starts on the channel and a message with a priority lower than 3 can win the contention. As a result frameB suffers a late priority inversion. The problem is solved when the dual port memory contains at least three frames. Since only the lowest priority frame can be internally preempted, there will always be a frame at the interface having a priority higher than that of the preempted one. For example, in the lower half of Figure 1 we show how frameB prevents a frame with a priority lower than 5 from getting the channel and causing a priority inversion on frameC (the preempted frame).

Figure 1: Priority inversion with two or three available buffers.

In conclusion, a dual port memory with a three-frame buffer limits the priority inversion to the instant a message arrives. As we have seen, a highest priority message is immediately copied to the interface, and during the time needed to copy its first frame into the dual port memory a lower priority frame can get the channel.
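The argument can be mirrored in a toy model of the priority-ordered interface buffer. The class below is our illustration, not the description of an actual adapter; as in the figure, a smaller priority value means a more urgent frame. With three slots, an internal preemption evicts only the lowest-priority frame, so a more urgent frame always remains available to the channel contention.

```python
# Sketch of a priority-ordered dual port buffer with three frame slots.
import heapq

class DualPortBuffer:
    def __init__(self, slots=3):
        self.slots = slots
        self.frames = []                 # min-heap on priority (smaller = more urgent)

    def insert(self, priority, frame):
        """Insert a frame; if the buffer is full, internally preempt the
        lowest-priority frame and return it (it goes back to the memory queue)."""
        if len(self.frames) < self.slots:
            heapq.heappush(self.frames, (priority, frame))
            return None
        worst = max(self.frames)         # lowest-priority frame currently buffered
        if priority < worst[0]:          # new frame is more urgent: evict the worst
            self.frames.remove(worst)
            heapq.heapify(self.frames)
            heapq.heappush(self.frames, (priority, frame))
            return worst
        return (priority, frame)         # not urgent enough: stays in the memory queue

    def best(self):
        """Frame offered to the channel contention."""
        return self.frames[0] if self.frames else None

buf = DualPortBuffer()
for p, f in [(1, "frameA"), (3, "frameB"), (5, "frameC")]:
    buf.insert(p, f)
evicted = buf.insert(2, "frameD")        # internal preemption of frameC (priority 5)
print(evicted, buf.best())               # frameC is evicted while frameA still contends
```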

Let $T_f$ be the transmission time from the processor to the network interface, $f$ the number of bits in a frame and $B_{pr}$ the bandwidth of the internal bus; then

$$T_f = f / B_{pr} + T_{ovh}$$

where $T_{ovh}$ is the transmission overhead due to the handshake between processor and interface needed to start and manage the transfer. If a message $M_i$ is inserted in the processor queue at time $t_{ap}$, then in the interval $[t_{ap}, t_{ap} + T_f]$ its first frame cannot be sent on the channel. With a dual port memory containing at least three frames, the priority inversion can only happen at the beginning, when $M_i$ gets ready, and for a time equal to the transmission time $T_f$ of the first frame. That means that after time $t_{ap} + T_f$ there can no longer be any priority inversion due to the message copy. We can therefore study the worst-case scheduling of the message $M_i$ assuming it is ready to be sent at time $t_{as} = t_{ap} + T_f$ instead of time $t_{ap}$. If $M_i$ is periodically inserted in the processor queue, we suppose that in each period the message ready time is delayed by $T_f$ (see Figure 2). Since $t_{as} = t_{ap} + T_f$ and since the instants $t_{ap}$ (arrival times at the processor) are periodic, the instants $t_{as}$ are periodic as well. Then we have a message that is periodically ready to be sent, this time at the network interface, with a relative deadline $D_i'$ whose value is:


Figure 2: Worst-case message arrival time.

$$D_i' = d_i - t_{as} = d_i - (t_{ap} + T_f) = (d_i - t_{ap}) - T_f = D_i - T_f \qquad (2)$$

Since test (1) assumes that the messages are ready to be sent at the network interface, we must shorten the deadlines according to (2). The schedulability test becomes:

$$\sum_{i=1}^{j} \frac{C_i}{D_i - T_f} + \frac{B}{D_j - T_f} \le 1 \qquad \forall j = 1, \ldots, N$$
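A sketch of the adjusted test, obtained from the earlier one simply by shortening each deadline by $T_f$ (as before, names and numeric values are illustrative):

```python
# Schedulability test of Section 2 with deadlines shortened by the copy time T_f.

def edf_guarantee_with_copy(messages, B, T_f):
    """messages: list of (C, D) pairs; B: worst-case blocking; T_f: frame copy time."""
    msgs = sorted(messages, key=lambda m: m[1])
    utilization = 0.0
    for C_j, D_j in msgs:
        if D_j <= T_f:
            return False                           # deadline shorter than the copy time
        utilization += C_j / (D_j - T_f)
        if utilization + B / (D_j - T_f) > 1.0:
            return False
    return True

print(edf_guarantee_with_copy([(2.0, 10.0), (3.0, 20.0), (5.0, 50.0)], B=1.5, T_f=0.5))  # True
```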

4 Conclusion

In this paper we analyzed the effects of the finite transmission time between a processor and its network interface when a dynamic algorithm is used to schedule a set of periodic messages on a shared communication channel. We showed how this non-zero transmission time produces priority inversions: during a frame copy between a processor and its interface, the channel is left available to lower priority messages. With a dual port memory containing at least three frames, we demonstrated that the priority inversion is limited to the instant a message is signalled as ready to the processor, and lasts for a time equal to the transmission time $T_f$ of the first frame. We expressed the effects of this priority inversion in the schedulability test that can be used to guarantee the delivery of messages within their deadlines.

References

[1] Baker T.P., "Stack-Based Scheduling of Realtime Processes," The Journal of Real-Time Systems, 3, pp. 67-99, 1991.
[2] Kopetz H. and Grunsteidl G., "TTP - A Time-Triggered Protocol for Real-Time Systems," Proc. of the 23rd Symposium on Fault-Tolerant Computing, Toulouse, June 1993.
[3] Liu C.L. and Layland J.W., "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," Journal of the ACM, 20(1), pp. 40-61, January 1973.
[4] Malcolm N. and Zhao W., "The Timed-Token Protocol for Real-Time Communications," IEEE Computer, January 1994.
[5] Meschi A., "Progetto di protocolli di comunicazione in sistemi real-time," Thesis, Università di Pisa, November 1995.
[6] Meschi A., Di Natale M. and Spuri M., "Earliest Deadline Message Scheduling with Limited Priority Inversion," submitted to the Workshop on Parallel and Distributed Real-Time Systems, 1996.
[7] Strosnider J.K. and Marchok T.E., "Responsive, Deterministic IEEE 802.5 Token-Ring Scheduling," The Journal of Real-Time Systems, 1, pp. 133-158, 1989.
[8] Tindell K.W., Hansson H. and Wellings A.J., "Analysing Real-Time Communications: Controller Area Network (CAN)," Proc. of the IEEE Real-Time Systems Symposium, 1994.
[9] Tindell K.W., Burns A. and Wellings A.J., "Analysis of Hard Real-Time Communications," The Journal of Real-Time Systems, 9, pp. 147-171, 1995.
