
Applied Mechanics and Materials Vol. 550 (2014) pp 178-186 Online available since 2014/May/07 at www.scientific.net © (2014) Trans Tech Publications, Switzerland doi:10.4028/www.scientific.net/AMM.550.178

ENERGY EFFICIENT SCHEDULING USING SIMULATED ANNEALING ALGORITHM FOR MULTI-CORE PROCESSORS

Mrs. M. Poongothai (1,a), Dr. A. Rajeswari (2,b), Ms. V. Sanmukapriya (3,c)

Department of Electronics and Communication Engineering, Coimbatore Institute of Technology, Coimbatore, Tamil Nadu, India.

(a) [email protected], (b) [email protected]

Keywords- Multi-core processor, Dynamic Power Management (DPM), Dynamic Voltage Scaling (DVS), Simulated Annealing Algorithm (SA), Leakage Aware algorithm (LA), Largest Task First algorithm (LTF), First Fit algorithm (FF), penalty value, reward value.

Abstract: Multi-core processors are gaining importance in recently developed real-time systems because of the energy and thermal constraints these systems face. The major requirements to be met while designing multi-core processors are low heat dissipation, low energy consumption and long battery life, which also help to reduce system cost. This paper aims to achieve better system performance on a multi-core processor platform by adjusting the trade-off between system performance and power dissipation. Dynamic Power Management (DPM) and Dynamic Voltage Scaling (DVS) are the two run-time techniques used to adjust this trade-off. The Simulated Annealing (SA) algorithm is implemented in order to find a good approximation to the global optimum. The idea behind the Simulated Annealing algorithm is to iteratively improve the solution by investigating neighbour solutions.

INTRODUCTION

The power consumption of battery-driven systems has become one of the most important design concerns in recent years, and multi-core environments face the same problem. Low energy consumption, long battery life and low heat dissipation are the major requirements for reducing the operating cost at system level. The two run-time techniques used to achieve this are Dynamic Power Management (DPM) and Dynamic Voltage Scaling (DVS). DVS and DPM can be applied on either full-chip platforms or per-core platforms, as in [7]. A set of tasks is considered for partitioning, which is done using the Simulated Annealing (SA) algorithm. The partitioned tasks are then re-allocated to a new processor or a new core based on certain criteria. After re-allocation, the re-allocated tasks are assigned a new frequency, and finally scheduling is done. Thus energy efficient scheduling is performed on multi-core platforms.
ENERGY EFFICIENT SCHEDULING

Existing methods:

There has been extensive research on energy efficient scheduling in multi-core platforms. The authors of [2] have proposed a rate-harmonized task schedule, where an artificial task period is introduced and all tasks are only eligible to execute at the new period boundaries. As a result, some ready tasks may be delayed in order to prolong the current idle time and maximize the usage of Dynamic Power Management (DPM). The authors of [3] have introduced a dynamic counter approach to determine the number of upcoming events and therefore bound the future workload. Based on this information, shutdown can be done safely. The drawback is that a task should never run below a pre-defined speed, known as the critical speed; otherwise power consumption increases. The authors of [6] have proposed an offline optimal device scheduling algorithm for hard real-time systems based on pruning techniques. A heuristic search algorithm is proposed to find the optimal solution, but this approach does not give a near-optimal solution.

Proposed Method:

In order to eliminate the drawbacks of the existing systems and to improve the system performance in multi-core platforms, "Energy Efficient scheduling on multi-core platforms for hard real-time systems using Simulated Annealing algorithm" is proposed. It uses the concepts of Dynamic Power Management (DPM) and Dynamic Voltage Scaling (DVS). The Simulated Annealing (SA) algorithm is used to find a good approximation to the global optimum.

Dynamic Power Management and Voltage Scaling:

Dynamic Power Management (DPM) and Dynamic Voltage Scaling (DVS) can be applied either to full-chip platforms or to per-core platforms. Since a full-chip implementation lacks flexibility, we have chosen a per-core implementation in spite of its high implementation cost. The idea behind the DPM technique is to complete the tasks as fast as possible. This gives the system more idle time, during which unused system components can be selectively shut down and woken up only when required. The main idea behind the DVS technique is to execute the tasks as slowly as possible, i.e., to slow down the components in the active state in order to save power and energy.

Task Parameters:

Fig. 1: Sample task

Fig. 1 shows the parameters to be considered for a task. Consider a sample task τi. The parameters which influence the execution of the task are:
Ti: period of task τi
Ci: Worst Case Execution Time (WCET) of task τi
Di: relative deadline of task τi, i.e., the maximum allowed time between the arrival and the completion of an instance of task τi
The execution of every task depends on these parameters.

Processor Model:

The number of cores and the number of processors considered for the problem are specified according to the Advanced Configuration and Power Interface (ACPI) recommendations. The 'hyper period' of the task set is defined as the least common multiple (LCM) of all task periods. The following assumptions are made for working on a multi-core platform:
1. The processors in the same core have the same power model, and the processors in different cores have different power models.
2. The processors within the same core have to operate at the same frequency.
3. During task execution, memory access is rare.
4. The Worst Case Execution Time (WCET) of a task increases as the processor speed decreases.
5. A core can be switched to a low power state only if all the processors in that core are idle.

SIMULATED ANNEALING ALGORITHM

The Simulated Annealing (SA) algorithm, which is based on a heuristic approach, is used to solve the problem of scheduling in a multi-core environment. SA is analogous to the annealing process in materials science, which is defined as "heating the system to a high temperature and then cooling it down slowly". In general, the Simulated Annealing algorithm is problem-independent and applicable to a large variety of problems. In this paper, SA is used to solve an optimization problem.

Optimization problem:

The optimization problem plays a major role in the Simulated Annealing algorithm. It consists of three parts:
1. A set of instances
2. A finite set of candidate solutions for each instance
3. A cost function that assigns a positive integer to each candidate solution of each instance

Fig. 2: Multi-core architecture

Working of Simulated Annealing Algorithm:

The main idea of the Simulated Annealing algorithm is to iteratively improve the solution by investigating neighbor solutions. The neighbors are selected randomly with uniform probability. If the neighbor solution is better than the present solution, a move is made to the neighbor solution. Otherwise, the move is accepted only with a certain probability. This process is repeated until an acceptable solution is obtained or the number of iterations reaches a pre-defined threshold value. With a large number of iterations, SA gradually converges to a global optimum.

Generation of Neighbor solution:

As already discussed, the neighbor solution is selected with uniform probability. The steps involved in generating a neighbor solution are:
1. Selecting a task
2. Re-allocating it to a new core
3. Re-assigning it a new frequency
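The search loop described under "Working of Simulated Annealing Algorithm" can be sketched as follows. This is a minimal illustration and not the authors' implementation: the paper does not state the acceptance probability or a cooling schedule, so the Metropolis rule exp(-delta / temperature), the geometric cooling factor, and all function and parameter names (`simulated_annealing`, `cost`, `neighbour`, `is_feasible`, `t_start`, `cooling`) are assumptions made for this example.

```python
import math
import random


def simulated_annealing(initial, cost, neighbour, is_feasible,
                        t_start=10.0, cooling=0.95, max_iter=1000, seed=0):
    """Generic SA loop that minimises `cost` (hypothetical interface).

    neighbour(solution, rng) returns a candidate solution;
    is_feasible(solution) plays the role of the schedulability check.
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t_start
    for _ in range(max_iter):
        candidate = neighbour(current, rng)
        if is_feasible(candidate):
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Better (or equal) neighbours are always accepted; worse ones
            # are accepted with probability exp(-delta / temperature).
            # This Metropolis rule is an assumption, not taken from the paper.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        # Geometric cooling (assumed); the floor avoids division by zero.
        temperature = max(temperature * cooling, 1e-12)
    return best, best_cost
```

In the scheduling setting, `neighbour` would carry out the three steps listed above (select a task, re-allocate it, re-assign its frequency), and `cost` would return the energy of the resulting schedule. Keeping both as parameters leaves the loop itself problem-independent, in line with the remark that SA applies to a large variety of problems.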


For selecting a task, a penalty value pen(τi) is assigned to each task τi. The penalty value of a task indicates the energy wasted by the task. The higher the penalty value of a task, the higher the probability that this particular task is selected for a configuration change. The penalty value of the task is given by Eq. (1), which comprises three parts. The term "critical speed" refers to the pre-defined minimum speed below which a task should not be executed, since power consumption increases.

$$\mathrm{pen}(\tau_i) = \alpha_1 \cdot \left| F(\mathrm{assign}(\tau_i)) - F(S_{sc}) \right| + \alpha_2 \cdot t_{\mathrm{unbalanced}}(\tau_i) + \alpha_3 \qquad (1)$$

The first part, |F(assign(τi)) − F(Ssc)|, denotes the wasted active energy during task execution; α1 is a constant which can be adjusted accordingly. The second part, tunbalanced(τi), is the time taken by the core to execute the task τi. As tunbalanced(τi) increases, the task becomes more likely to be selected for a configuration change. The third part, α3, is an arbitrary parameter added to prevent the penalty value from being zero. From the penalty values, the selection probability prob(τi) of each task is calculated using Eq. (2), and the task is selected accordingly.
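The penalty of Eq. (1) and the penalty-based task selection can be sketched as below. Eq. (2) is not reproduced in the text, so dividing each penalty by the sum of all penalties is an assumption here, chosen because it keeps every probability at or below 1. The function names, the default values of the three constants, and the reading of F(·) as a frequency value are likewise assumptions made for illustration.

```python
import random


def penalty(f_assigned, f_critical, t_unbalanced, a1=1.0, a2=1.0, a3=0.1):
    """Eq. (1): a1 * |F(assign) - F(S_sc)| + a2 * t_unbalanced + a3.

    a1, a2, a3 stand for the three constants of Eq. (1); the defaults
    are arbitrary illustrative values.
    """
    return a1 * abs(f_assigned - f_critical) + a2 * t_unbalanced + a3


def selection_probabilities(penalties):
    """Assumed form of Eq. (2): each penalty divided by the total penalty."""
    total = sum(penalties)
    return [p / total for p in penalties]


def select_task(penalties, rng):
    """Draw a task index with probability proportional to its penalty."""
    return rng.choices(range(len(penalties)), weights=penalties, k=1)[0]


# Hypothetical task data: (assigned frequency, critical frequency, t_unbalanced)
example_tasks = [(1.0, 0.6, 2.0), (0.6, 0.6, 0.0), (0.8, 0.6, 1.0)]
example_penalties = [penalty(*t) for t in example_tasks]
example_probs = selection_probabilities(example_penalties)
chosen = select_task(example_penalties, random.Random(0))
```

Because the third constant is strictly positive, every task keeps a non-zero chance of being selected, which matches the stated purpose of the third part of Eq. (1).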

(2)

After task selection, the second step is to re-allocate the task to a new core. To do this, a core to which the task can be re-allocated has to be selected. For this purpose, each core Dx,y is associated with a reward value rew(Dx,y). The reward value is the utilization of the core that is still free to be used and is given by Eq. (3):

$$\mathrm{rew}(D_{x,y}) = \begin{cases} U_{ub} - U_{D_{x,y}}, & \text{if } \mathrm{alloc}(\tau_k) \neq D_{x,y} \\ U_{ub} - U_{D_{x,y}} + U_k, & \text{otherwise} \end{cases} \qquad (3)$$

The term Uub is the upper bound of the total task utilization. The value of Uub is 1 when the Earliest Deadline First (EDF) algorithm is used and 0.69 when the Rate Monotonic (RM) algorithm is used. EDF and RM are commonly used real-time scheduling algorithms, both in theory and in practice. The term UDx,y is the total task utilization of the core, whereas Uk is the utilization of task τk, the task selected in the previous step. Based on the calculated reward values, the probability for each core is defined using Eq. (4). Using this probability value, a core with high available utilization can be selected.
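The reward of Eq. (3) and the reward-based core selection can be sketched as below. Eq. (4) is not reproduced in the text, so selecting a core with probability proportional to its reward is an assumption, as are the function names and the choice to give cores without free utilization a weight of zero.

```python
import random


def reward(u_ub, u_core, u_task, task_on_core):
    """Eq. (3): utilisation of a core that is still free for the selected task."""
    if task_on_core:
        # The selected task already runs on this core, so its own share
        # of the utilisation counts as available.
        return u_ub - u_core + u_task
    return u_ub - u_core


def select_core(rewards, rng):
    """Assumed form of Eq. (4): probability proportional to the reward.

    Cores with no free utilisation get weight zero; returns None when
    no core has any free utilisation.
    """
    weights = [max(r, 0.0) for r in rewards]
    if sum(weights) == 0:
        return None
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

As a usage example with hypothetical numbers: under EDF (Uub = 1), a core that is 70% utilized offers a reward of about 0.3 to a task allocated elsewhere, and about 0.55 to a task of utilization 0.25 that already runs on it.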

(4)

After the selection of a new core, the task obtained from step 1, i.e., τk, is allocated to the selected core. The final step is to assign a new frequency to the re-allocated task. This new frequency is drawn from a uniform distribution over all frequencies and assigned to the task. Thus the scheduling is done.

ALGORITHM AND FLOWCHART

Algorithm:

The proposed algorithm is as follows:
Step 1: Generate an initial solution.
Step 2: Get the value of the initial solution.
Step 3: Generate a new neighbor solution using penalty and reward values.
Step 4: If the new solution is schedulable, get the value of the new solution.
Step 5: Accept the new solution and move to it with probability 'p'.
Step 6: Repeat this process until a solution of a certain quality is obtained or the number of iterations reaches a pre-defined threshold.

Flowchart:

Fig. 3: Flow chart

RESULTS AND DISCUSSION

Real-time scheduling is implemented in MATLAB with the help of an additional toolbox. The "TORSCHE Scheduling Toolbox for MATLAB" (Time Optimisation, Resources, SCHEduling) is a freely available, open-source toolbox which can be downloaded and added to the MATLAB path. The TORSCHE scheduling toolbox is specially designed for solving scheduling problems.


Here, we consider a single core with 3 processors. Each processor holds 1 task with its own deadline and execution time. The following sample problem is considered:

Table 1: Sample scenario for execution

PROCESSOR      TASK      RUNTIME (ms)   DEADLINE (ms)
PROCESSOR_1    TASK_1    3              7
PROCESSOR_2    TASK_2    3              12
PROCESSOR_3    TASK_3    5              20

From this table, task_1 on Processor_1 has an execution time of 3 ms and must complete within a deadline of 7 ms. If there is a deadline miss, the task will not be serviced. The priority of the processors is given in ascending order, i.e., highest priority for processor_1 and lowest priority for processor_3. The MATLAB output for this scenario is:

    task "t1"   Processing time: 3   Weight: 3   Period: 7
    task "t2"   Processing time: 3   Weight: 2   Period: 12
    task "t3"   Processing time: 5   Weight: 1   Period: 20
    Set of 3 tasks
    taskset schedulable
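As a quick cross-check of the sample scenario, the hyper period and the total utilization of the three tasks can be computed directly. The sketch below assumes that the deadline column of Table 1 equals the task period, as the MATLAB output suggests, and that all three tasks are tested together against the EDF bound Uub = 1; the function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# (runtime, period) pairs in ms, taken from Table 1 with deadline = period (assumed)
tasks = [(3, 7), (3, 12), (5, 20)]


def hyperperiod(periods):
    """Least common multiple of all task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def total_utilization(task_list):
    """Sum of runtime / period over all tasks, kept exact with fractions."""
    return sum(Fraction(c, t) for c, t in task_list)


hyper = hyperperiod([t for _, t in tasks])  # 420 ms
utilization = total_utilization(tasks)      # 13/14, about 0.93
edf_ok = utilization <= 1                   # True: passes the EDF utilization test
```

The total utilization of 13/14 is below the EDF bound of 1, which is consistent with the "taskset schedulable" result. It is above the 0.69 bound quoted for RM, so that sufficient test alone would be inconclusive for this task set.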

Fig. 4: MATLAB output

The same scenario, when run with the TORSCHE scheduling toolbox, generates a graph as shown:

Fig. 5: TORSCHE scheduler output

A further analysis, implemented in C, gives a more detailed understanding of the Simulated Annealing algorithm. Executing the same concept in the "TurboC" software gives the following analysis:


Fig. 6: Input data from user

Fig. 6 shows the data input by the user. This includes the number of cores, processors and tasks, along with the execution time of each task.

Fig. 7: Penalty value calculation

Fig. 7 shows the data obtained from the user in step 1 and the penalty value calculated for each task using Eq. (1). Fig. 8 shows the probability values calculated from the penalty values for each task. A probability value should not exceed 1. The task with the maximum probability value is chosen for change and is interchanged with the task with the minimum probability value.


Fig. 8: Probability value calculation

ADVANTAGES AND APPLICATIONS

Real-time scheduling using the Simulated Annealing algorithm proves to be efficient when compared to other conventional algorithms for the following reasons: it provides flexibility and global optimality, and it is robust and general. SA is versatile and does not depend upon the properties of the model. The algorithm is used to model heuristic-based problems, to determine adaptive neighborhoods, and in multi-core environments.

CONCLUSION

Due to the constant advancement of technology and the increasing system complexity of multi-core platforms, energy efficiency has become a major concern. In this paper, the DVS and DPM techniques are applied together in order to achieve efficient power management. From the experimental results, we conclude that power management is more efficient than with the existing algorithms. This work can be further extended by considering a larger number of clusters and cores.

ACKNOWLEDGEMENT

I am thankful to Dr. S.R.K. Prasad, Correspondent, Dr. Prabhakar, Secretary, and Dr. V. Selladurai, Principal, CIT. I am highly indebted to my research guide Dr. A. Rajeswari, HOD, Department of ECE, and co-guide Mrs. M. Poongothai, for their support and constructive criticism in doing my project.

REFERENCES

[1] Da He, Wolfgang Mueller, "A heuristic energy-aware approach for hard real-time systems on multi-core platforms", Microprocessors and Microsystems, University of Paderborn, Germany, Elsevier, May 2013.
[2] A. Rowe, K. Lakshmanan, H. Zhu, R. Rajkumar, "Rate-harmonized scheduling for saving energy", in: Proceedings of the 2008 Real-Time Systems Symposium, RTSS '08, IEEE Computer Society, Washington, DC, USA, 2008, pp. 113–122.
[3] K. Lampka, K. Huang, J.-J. Chen, "Dynamic counters and the efficient and effective online power management of embedded real-time systems", in: Proceedings of the Seventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, ACM, New York, NY, USA, 2011.
[4] J. Chen, H. Hsu, T. Kuo, "Leakage-aware energy-efficient scheduling of real-time tasks in multiprocessor systems", in: Proceedings of the 12th IEEE RTAS, Washington, DC, USA, 2009, pp. 408–417.
[5] L. Niu, "System-level energy-efficient scheduling for hard real-time embedded systems", in: Design, Automation Test in Europe Conference Exhibition (DATE), 2011, pp. 1–4.
[6] X. Zhong, C.-Z. Xu, "System-wide energy minimization for real-time tasks: lower bound and approximation", in: IEEE/ACM International Conference on Computer-Aided Design, ICCAD '06, 2006, pp. 516–521.
[7] D. Zhu, R. Melhem, B.R. Childers, "Scheduling with dynamic voltage/speed adjustment using slack reclamation in multiprocessor real-time systems", IEEE Trans. Parall. Distrib. Syst. 14 (7) (2003) 686–700.

Mechanical and Power Research 10.4028/www.scientific.net/AMM.550

Energy Efficient Scheduling Using Simulated Annealing Algorithm for Multi-Core Processors 10.4028/www.scientific.net/AMM.550.178
