RTNoC: A Simulation Tool for Real-Time Communication Scheduling on Networks-on-Chips
Mingsong Lv, Ying Guo, Nan Guan, Qingxu Deng
Institute of Computer Software and Theory, Northeastern Univ., China
Abstract: Networks-on-Chips (NoC) is accepted as the most promising on-chip communication infrastructure for solving the communication bottleneck of MPSoCs. Research on real-time communication scheduling on NoCs is still immature, and no existing simulation tool satisfies the requirements of simulating real-time communication scheduling on NoCs. Such a simulator is highly desirable for research on NoC-based real-time systems. In this paper, we present RTNoC, a real-time communication scheduling simulator for wormhole-switched NoCs. We also use this simulator to evaluate the real-time performance of different scheduling algorithms with periodic task sets. Experimental results show that RTNoC is a useful tool for research on real-time communication scheduling.
I. INTRODUCTION

With the evolution of semiconductor technology into the deep sub-micron era, MPSoC [1] is becoming the inevitable trend for future chip architectures of real-time embedded systems. Since tens or even hundreds of cores will be integrated on one chip, traditional bus-based on-chip communication has become the performance bottleneck. Networks-on-Chips (NoC) [2], which borrows ideas from computer networks, is accepted as a scalable architecture that provides high-performance on-chip communication with low power and high reliability.

The emergence of NoC poses a great challenge to the analysis of real-time systems. The real-time performance of NoC-based applications is greatly affected by on-chip communication, which has become a critical factor in the timing behavior of the system.

In both the real-time and communication communities, simulators are a powerful tool for system analysis. A network simulator [3] is designed to predict the behavior of a network by simulating the topology, the protocols and building blocks such as nodes and links. Although network simulators can be extended to express timing behavior, simulation performance remains a problem because they are not fundamentally designed for the analysis of real-time systems and are not optimized for it. On the other hand, existing real-time system simulators [4] cannot analyze real-time communication scheduling on NoCs, since they were not originally designed to support network-based real-time communication. An NoC-based real-time communication scheduling simulator is therefore highly desirable to promote research on NoC-based real-time systems. This is the primary motivation for designing RTNoC.

The contributions of this paper are: (1) the design of RTNoC, a real-time communication scheduling simulator for wormhole-switched NoCs, which is not only capable of expressing the timing behavior of a real-time system but is also optimized for simulating real-time communication over an NoC; (2) the simulation of three different real-time scheduling algorithms on NoC-based systems, with a comparison and evaluation of their real-time performance.

The rest of this paper is organized as follows. Section II defines the network model and the task model used throughout this paper. The design of RTNoC is detailed in Section III. Section IV presents the implementation results and the analysis of three scheduling algorithms using RTNoC. Related work is introduced in Section V, and the paper is concluded in Section VI.

This work was partially sponsored by the National High Technology Research and Development Program of China (863 Program) under Grant No. 2007AA01Z181 and the Cultivation Fund of the Key Scientific and Technical Innovation Project of Ministry of Education of China under Grant No. 706016.

II. THE PROBLEM MODEL

In this section, we introduce the NoC topology, the routing algorithm and the real-time task model used in the design of RTNoC, and then state the assumptions made. Together these form the problem model of this paper.

A. The NoC Topology

The NoC topology selected in RTNoC is the 2-D Mesh, which is very popular in NoC research because of its desirable properties such as regularity, concurrent data transmission, and controlled electrical parameters [5]. As illustrated in Figure 1, a 2-D Mesh is composed of N^2 switches, and each switch is connected to four or fewer neighboring switches via links. An IP core, where tasks are executed, is attached to each switch. For convenience, in the rest of this paper we do not distinguish an IP core from the switch to which it is attached. Real-time tasks on different IP cores communicate via a sequence of switches, which we call a route.
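To make the notion of a route concrete, the following minimal sketch enumerates the switches a flit would visit on a 4-by-4 mesh under the X-Y routing assumed in Section II-B. The row-major numbering of switches from 1 to 16, the function name `xy_route`, and the choice to travel along the row before the column are illustrative assumptions, not details taken from RTNoC itself.

```python
def xy_route(src, dst, n=4):
    """Return the switch ids visited from src to dst (both inclusive).

    Assumption: switches are numbered 1..n*n in row-major order.
    Assumption: the flit moves along its row first, then along the column.
    """
    if not (1 <= src <= n * n and 1 <= dst <= n * n):
        raise ValueError("switch id out of range")
    row, col = divmod(src - 1, n)
    dst_row, dst_col = divmod(dst - 1, n)
    route = [src]
    # Phase 1: move along the row until the destination column is reached.
    while col != dst_col:
        col += 1 if dst_col > col else -1
        route.append(row * n + col + 1)
    # Phase 2: move along the column until the destination row is reached.
    while row != dst_row:
        row += 1 if dst_row > row else -1
        route.append(row * n + col + 1)
    return route
```

Under these assumptions, `xy_route(1, 16)` returns `[1, 2, 3, 4, 8, 12, 16]`, a route that crosses six links. Because the path depends only on the source and destination, it can be computed once per communication before simulation starts.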
Fig. 1. The 2-D Mesh topology (a 4-by-4 example with switches numbered 1 to 16)

B. The Switching and Routing Policies

Wormhole switching is used in the design of RTNoC. In wormhole switching, large data packets are broken into small pieces called flits. The first flit, which carries the address information, sets up the route for all subsequent flits of the packet. The flits of a single packet may occupy multiple consecutive switches, like a worm. Wormhole switching offers a predictable, distance-insensitive communication delay, low buffer requirements and high throughput, which makes it suitable for on-chip communication [6].

We assume X-Y routing [7], which is very popular in NoCs. Given the source node and the destination node, a flit first travels along the row and then along the column towards the destination, or vice versa. X-Y routing is suitable for NoCs because it is simple to implement in on-chip switches and has low overhead.

C. The Real-Time Task Model

We define the task set allocated on an NoC as a set of n periodic tasks, Γ = {τi | 1 ≤ i ≤ n}. Each periodic task is defined as a tuple τ = <C, T, D, P, S, R>, where C is the execution time of τ, T is the period, D is the relative deadline, P is the priority, S is the set of all data sending operations within the task period, and R is the set of all data receiving operations within the task period. Note that the time consumed by sending and receiving packets is counted as part of the execution time of the task.

A task set is schedulable if none of its tasks misses its deadline, which implies that all communications finish before the task deadlines; otherwise, the task set is non-schedulable. Given a set of n periodic tasks, it suffices to check whether the task set is schedulable within the hyperperiod, which is defined as the least common multiple of all the task periods.

When multiple tasks communicate simultaneously, link collisions may occur. We formally define a link collision as follows: in a given time unit, at least two tasks send flits on the same link. Without loss of generality, we assume that when no link collision occurs, a flit can be transferred from one switch to one of its adjacent switches within one time unit.

III. THE DESIGN OF RTNoC
In this section, we give the details of the design of RTNoC. First, the fundamental principles and the transition system of the simulation engine are presented; then communication scheduling is described in detail; finally, the optimization technique applied in the simulation engine is introduced.

A. Basic Principles of the Simulation Engine

RTNoC is fundamentally a discrete-time simulator: time advances by a fixed interval (one time unit) in each simulation step, and the system state is updated accordingly within that step. Figure 2 shows the transition system of the simulation engine.
Fig. 2. The transition system of the simulation engine (states: Initialize, Prepare, Comm. Scheduling, Update Task, Check Task, Advance Time, Return Result)
The simulator starts in the Initialize state, in which time is set to 0 and task set information, such as task periods, deadlines and communication information, is loaded into the simulator. The simulator then enters the Prepare state and checks whether there is communication to be scheduled and whether there are ready or executing tasks. If there is communication, the simulator enters the Comm. Scheduling state; if there is no communication but there are ready or executing tasks, it enters the Update Task state directly; otherwise it enters the Advance Time state. In the Comm. Scheduling state, the transmission of flits on the NoC is simulated according to the selected communication scheduling policy; this is the kernel of the simulator. The simulator then enters the Update Task state, where the information of running tasks, such as elapsed time, is updated, and after that it goes to the Check Task state. In the Check Task state, the task set is checked to see whether any task has missed its deadline. If so, the simulator goes to the Return Result state with the return value "Task set non-schedulable" and terminates; otherwise, it goes to the Advance Time state. In the Advance Time state, the system time advances by one time unit. If the system time is less than the hyperperiod, the simulator goes back to the Prepare state; otherwise, it enters the Return Result state, returns "Task set schedulable", and the simulation terminates.
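The transition system described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the function names `hyperperiod` and `run_engine`, the string state labels, and the five callbacks standing in for the simulator's internal logic are hypothetical and do not reflect RTNoC's actual code.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """Least common multiple of all task periods (positive integers)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def run_engine(periods, has_comm, has_active_tasks,
               schedule_comm, update_tasks, deadline_missed):
    """Walk the transition system once and return the verdict string.

    All five callbacks take the current time; they are placeholders
    (assumptions) for the simulator's internal state queries and updates.
    """
    horizon = hyperperiod(periods)
    time = 0                      # Initialize: time is set to 0
    state = "Prepare"
    while True:
        if state == "Prepare":
            if has_comm(time):
                state = "CommScheduling"
            elif has_active_tasks(time):
                state = "UpdateTask"
            else:
                state = "AdvanceTime"
        elif state == "CommScheduling":
            schedule_comm(time)   # simulate flit transmission for this step
            state = "UpdateTask"
        elif state == "UpdateTask":
            update_tasks(time)    # e.g. update elapsed time of running tasks
            state = "CheckTask"
        elif state == "CheckTask":
            if deadline_missed(time):
                return "Task set non-schedulable"   # Return Result
            state = "AdvanceTime"
        elif state == "AdvanceTime":
            time += 1
            if time >= horizon:
                return "Task set schedulable"       # Return Result
            state = "Prepare"
```

Keeping the scheduling policy behind a callback mirrors the text's remark that communication scheduling is the kernel of the simulator: the surrounding loop stays the same when the policy changes.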
Fig. 3. Screenshots of RTNoC
B. Communication Scheduling Simulation

In communication scheduling simulation, we first model the switches of the NoC. For each switch, an input buffer is associated with each link connected to the switch. The buffer size affects the throughput and the real-time performance of the system, but this issue is beyond the scope of this paper, and we assume that enough buffers are allocated to the links. The communication scheduler schedules the transmission of the flits and modifies the contents of the buffers accordingly.

The key issue in communication scheduling is how to order flit transmissions when a link collision occurs. In RTNoC, each flit is assigned the priority of its sending task. If a link collision occurs, the flit with the highest priority is transmitted on the link, and the other flits, which have lower priorities, are not transmitted and remain in their original buffers. This flit priority assignment is compatible with most real-time scheduling policies.

As stated before, the communication time is counted as part of the execution time of a task. When a link collision occurs, the communication times of the lower-priority tasks grow, which may cause those tasks to miss their deadlines. Classical analytical schedulability analysis cannot express link collisions well, so it has to assume the worst case, which introduces unnecessary pessimism. A simulator, in contrast, can reproduce the exact communication behavior and thus yield better analysis results. This is why a simulator is highly desirable for NoC-based real-time communication scheduling analysis.

C. Simulation Optimization

Simulation performance is a key issue in any simulator design. In RTNoC, early identification of deadline misses is applied to reduce unnecessary simulation work. In wormhole switching, let F and L denote the number of flits of a data packet and the length of its route, respectively; the communication delay is then (F + L − 1) when no link collision occurs. This is the minimum value of the communication delay. Before simulation starts, this minimum delay is precalculated for each communication task. If there exists a task τi with (Fi + Li − 1) > Di, then τi must miss its deadline, so we can immediately conclude that the task set is non-schedulable. In this way, analysis efficiency is improved by eliminating unnecessary simulation work.

IV. EXPERIMENT RESULTS

RTNoC is implemented in Visual C++ 6.0, and Figure 3 shows screenshots of some of the functionality of the tool. The task set can either be entered manually by the user or be generated automatically by RTNoC. The simulation engine runs in the background and returns a result stating whether the task set is schedulable. Currently, the Rate Monotonic (RM), Earliest Deadline First (EDF) and Least Laxity First (LLF) algorithms are supported in RTNoC. The final scheduling result is recorded in an XML file, which can be read back by RTNoC to replay the scheduling process as an animation. Users can also inspect the scheduling result of an individual task.

We also use RTNoC to analyze the scheduling quality of the RM, EDF and LLF algorithms. A 4-by-4 2-D Mesh topology is used in our experiments. We generate 100 random task sets, each containing at most 6 tasks. The 100 task sets are classified into five groups according to task set utilization, as shown in Figure 4. Note that the utilization is calculated for the ideal situation in which no link collision occurs. The real utilization of each task may be larger, since the execution time may grow due to link collisions.
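How far real communication times depart from this ideal case is governed by the two mechanisms of Section III: the collision-free minimum delay (F + L − 1) used for the early deadline-miss check, and priority-based arbitration when several flits request the same link. The sketch below illustrates both under explicit assumptions: the function names, the tuple layouts, the convention that a smaller number means a higher priority, and the tie-breaking rule (first request wins) are all hypothetical choices made for illustration.

```python
def min_comm_delay(flits, route_length):
    """Collision-free wormhole delay F + L - 1, in time units."""
    return flits + route_length - 1


def trivially_unschedulable(comms):
    """Early check before simulation.

    comms: iterable of (flits, route_length, relative_deadline) tuples.
    Returns True if some communication cannot meet its deadline even
    when no link collision ever occurs.
    """
    return any(min_comm_delay(f, l) > d for f, l, d in comms)


def arbitrate(requests):
    """Resolve link collisions for one time unit.

    requests: list of (link, priority, flit_id) tuples with unique flit ids.
    Assumption: a smaller priority value means a higher priority, and
    ties are broken in favor of the earlier request.
    Returns (granted, blocked): granted maps each link to the winning
    flit id; blocked lists the flit ids that stay in their buffers.
    """
    best = {}
    for link, priority, flit in requests:
        if link not in best or priority < best[link][0]:
            best[link] = (priority, flit)
    granted = {link: flit for link, (_, flit) in best.items()}
    blocked = [flit for link, _, flit in requests if granted[link] != flit]
    return granted, blocked
```

For instance, a 4-flit packet on a route of length 3 needs at least `min_comm_delay(4, 3) == 6` time units, so a relative deadline of 5 rules the task set out before any simulation step is executed. Each flit that `arbitrate` reports as blocked waits at least one extra time unit, which is how collisions lengthen the effective execution time of lower-priority tasks.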
Fig. 4. The 100 random task sets (number of task sets per utilization group: 1%-15%, 15%-30%, 30%-45%, 45%-60%, 60%-75%; series: 2, 3, 4, 5 and 6 tasks)

RTNoC runs three passes over the 100 task sets, applying a different real-time scheduling algorithm in each pass; the scheduling results are shown in Figure 5. One fact reflected in this figure is that the number of schedulable task sets decreases as task set utilization increases. This is because, in our model, higher utilization usually implies more communication, and more communication leads to higher communication delay, which in turn results in a higher deadline miss ratio. Another fact visible in the figure is that every task set that can be scheduled by RM or LLF can also be scheduled by EDF, while some task sets that can be scheduled by EDF cannot be scheduled by RM or LLF. This result gives the intuitive impression that EDF has better scheduling quality than RM and LLF on wormhole-switched NoCs. New real-time scheduling algorithms can be added, and more comprehensive analyses and comparisons can be performed with RTNoC.

Fig. 5. The scheduling results (number of schedulable task sets per utilization group for RM, EDF and LLF)

V. RELATED WORK

ns-2 [8] is a discrete event simulator targeted at networking research. It provides substantial support for the simulation of TCP, routing, and multicast protocols over wired and wireless networks. NetSim [9] is a commercial network simulator developed by Tetcos to assist network lab experimentation, research and development. NetSim's strength is its support for a variety of popular network protocols. GloMoSim [10] is a scalable simulation environment for large-scale hybrid networks. This tool effectively utilizes parallel execution to reduce the simulation time of detailed high-fidelity models of large communication networks. None of the above tools is suitable for simulating real-time communication scheduling, because they inherently cannot express timing properties.

RTSIM [4] is a real-time system simulator. It has been used primarily for experimenting with new scheduling algorithms and solutions. RTSIM cannot be used for analyzing NoC-based real-time systems because it lacks NoC support. UPPAAL [11] is an integrated environment for the modeling, validation and verification of real-time systems modeled as networks of timed automata. UPPAAL is capable of expressing a wide range of systems, but when analyzing large and complex systems, the model checker may suffer from the state explosion problem.

VI. CONCLUSION
In this paper we presented RTNoC, a real-time communication scheduling simulator for wormhole-switched NoCs. We used this simulator to evaluate the real-time performance of the RM, EDF and LLF scheduling algorithms with periodic task sets. Experimental results show that RTNoC is a useful tool for evaluating real-time scheduling algorithms on NoCs. We hope that RTNoC can pave the way for research on NoC-based real-time communication scheduling.

REFERENCES

[1] W. Wolf, “The future of multiprocessor systems-on-chips,” in DAC ’04: Proceedings of the 41st annual conference on Design automation. New York, NY, USA: ACM, 2004, pp. 681–685.
[2] T. Bjerregaard and S. Mahadevan, “A survey of research and practices of network-on-chip,” ACM Computing Surveys, 2006.
[3] “http://en.wikipedia.org/wiki/network simulator.”
[4] “http://rtsim.sssup.it/.”
[5] B. Towles and W. J. Dally, “Route packets, not wires: On-chip interconnection networks,” in DAC, 2001, pp. 684–689.
[6] Z. Shi and A. Burns, “Real-time communication analysis for on-chip networks with wormhole switching,” in NOCS, 2008, pp. 161–170.
[7] C. J. Glass and L. M. Ni, “The turn model for adaptive routing,” J. ACM, vol. 41, no. 5, pp. 874–902, 1994.
[8] “http://www.isi.edu/nsnam/ns/.”
[9] “http://www.tetcos.com.”
[10] L. Bajaj, M. Takai, R. Ahuja, R. Bagrodia, and M. Gerla, “Glomosim: A scalable network simulation environment,” Tech. Rep. 990027, 13, 1999. [Online]. Available: citeseer.ist.psu.edu/225197.html
[11] “http://www.uppaal.com/.”