Scheduling Policy Costs on a JAVA Microcontroller

Leomar S. Rosa Jr., Flávio R. Wagner, Luigi Carro, Alexandre S. Carissimi, André I. Reis

Instituto de Informática – Universidade Federal do Rio Grande do Sul (UFRGS)
PO Box 15.064 – 91.501-970 – Porto Alegre – RS – Brazil
{leomarjr, flavio, carro, asc, andreis}@inf.ufrgs.br

Abstract. This paper presents the implementation of different scheduling policies on a Java microcontroller. Seven new instructions were added to the architecture to support context switching and scheduler implementation. By using these instructions, four schedulers following the POSIX standard were developed for the specific architecture. These schedulers were used in a study about the impact of different scheduling policies for embedded systems applications. Several design costs are discussed, including the hardware cost of the extended instructions, ROM and RAM capacity used, the number of cycles to run the chosen scheduler and the application, and also the power consumption overhead. Experiments show that the exploration of different scheduling alternatives as well as careful scheduler implementation may play an important role in performance optimization.

1 Introduction

Embedded systems are fundamental parts of modern appliances. Such equipment performs several different tasks using a limited set of resources. The complexity of applications and of the underlying hardware, tight performance and power budgets, and an aggressive time-to-market design schedule require application developers to use run-time software support. This support usually takes the form of an operating system [12] [13] that manages four system aspects: file systems, memory access, the I/O system, and processor use. Processor use is managed by an operating system routine called the scheduler. A scheduler is needed when a single processor must handle different tasks. The scheduler may be implemented in software [11] or in hardware [8].

The power consumption of complete embedded operating systems was investigated in [1] [2] [3] [15]. The work of [2] investigates the power consumption of the PalmOS operating system. The power consumption of µC/OS running on a Fujitsu SPARClite platform is analyzed in [3]. The energy consumption of eCos, focusing in particular on the relationship between energy consumption and processor frequency, was characterized in [1]. The work of [15] compared the consumption of the Linux OS running on a StrongARM-based platform and of µC/OS running on a Fujitsu SPARClite platform. The work of [16] investigates the performance and silicon area requirements of a multithreaded Java microcontroller. However, none of these works studied the influence of different scheduling policies in terms of area, power consumption, and performance together.

In this paper we investigate the implementation cost of different scheduling policies implemented in software. Operating system support for file systems and memory management is therefore neither considered nor implemented. Many applications do not need this kind of support, and these aspects will be studied in future work. Our study concentrates on a dedicated, minimal OS consisting only of scheduling support. The goal of this paper is to investigate the cost of implementing software schedulers for embedded applications in terms of area, performance, and power overhead. All experiments were carried out on top of a Java microcontroller [5] [6].

This paper is organized as follows. Section 2 details the processor architecture. New instructions to support scheduler implementation and context switching are presented in Section 3. The implemented scheduling policies are discussed in Section 4. Results and conclusions are presented in Sections 5 and 6, respectively.

2 The FemtoJava Microcontroller

The FemtoJava Microcontroller [5] [6] is a stack-based microcontroller that executes Java bytecodes. Its major characteristics are a reduced bytecode instruction set, Harvard architecture, orthogonality of execution, small size, and easy insertion and removal of instructions depending on the target application. FemtoJava implements in hardware an execution engine for Java, through a stack machine compatible with the Java Virtual Machine (JVM) specification, like PicoJava [10].

In general, the JVM has three major components: class loader, class verifier, and execution engine. The class loader and verifier act at runtime and are only necessary if one wants a multi-application platform and has to download code over a network. A compiler that follows the JVM specification is being used and will synthesize an ASIP version of FemtoJava. Only the execution core and some tools to extract the software at design time are really necessary.

An immediate advantage of native execution of Java bytecodes is software compatibility, which makes cross-platform software development possible. On a conventional Java platform (for example, a PC or workstation) one can simply implement and run a Java program. Running the program there is equivalent to simulating the behavior of the application on the target microcontroller, with all the resources and convenience of a desktop environment during the development phase.

The Java Virtual Machine is an abstract machine, based on a stack architecture, with Java bytecode execution capability [9]. Small applications such as smart cards, building access control, and active badge location are not a good target for powerful superscalar processors with large caches and register files running a full Java execution environment. For this kind of application a small, low-cost, cacheless microcontroller can be an optimal solution.

In addition, the stack machine has characteristics and limitations distinct from those of CISC and RISC machines. Zero-operand stack machines do not encode operand information in the instruction word. Therefore, stack code is more compact and portable, because it needs fewer bits to encode one instruction and makes no assumption about register file organization [7].

The simplified schematic of Figure 1 illustrates the microarchitecture of the FemtoJava microcontroller. The FemtoJava implementation uses a subset of the JVM bytecodes, with only 68 instructions. The implementation was made in VHDL, and synthesis and analysis were performed with the Leonardo Spectrum and Quartus II environments from Mentor Graphics and Altera, respectively. The supported instructions are basic integer arithmetic and bitwise operations, conditional and unconditional jumps, load/store instructions, stack operations, and two extra bytecodes for arbitrary load/store. In this core all implemented instructions execute in 3, 4, 7, or 14 cycles, because the microcontroller is cacheless and several instructions are memory bound.

[Figure 1: block diagram of the FemtoJava datapath. Recoverable block labels: PC, MAR, IR, SP, VAR, FRM, IMM, Const, registers A and B, ALU, MUX, RAM, ROM, Timer, Interrupt Handler, Input Ports, Output Ports, Control, and the instruction, address, and data buses.]

Fig. 1. The FemtoJava microarchitecture

Table 1. Description of the instructions for scheduling support

| Instruction | # of µInstructions | Bytecode | Example | Meaning |
|---|---|---|---|---|
| INIT_VAL | 9 | f4 | f4 $s1,$s2 | Mem[$s2] ← $s1 |
| INIT_STK | 9 | f5 | f5 $s1,$s2 | Mem[$s2] ← $s1 - 2 |
| REST_CTX | 11 | f6 | f6 $s1 | SP ← Mem[$s1] |
| SAVE_CTX | 7 | f7 | f7 $s1 | Mem[$s1] ← SP |
| SCHED_THR | 12 | f8 | f8 $s1,$s2 | if (Mem[$s1]=0) PC ← PC + $s2 ; else PC++ |
| SCHED_PRT | 11 | f9 | f9 $s1,$s2 | A ← $s1 , B ← $s2 |
| GET_PC | 7 | fa | fa $s1 | PC ← Mem[$s1] |

3 New Instructions for Scheduling Support

The scheduler is responsible for granting CPU access to one of multiple processes that share the same CPU. Implementing a scheduler requires hardware support. The original version of the FemtoJava microcontroller has no instructions dedicated to process scheduling and context switching, as it is derived from the JVM, which is a stack machine. Seven new instructions were therefore added to those already existing in the architecture. Table 1 presents these new instructions. The number of microinstructions varies from 7 to 12. Each instruction is described below.

INIT_VAL - This instruction saves the value of a register (s1) in the memory position pointed to by another register (s2). It is used to save information needed to schedule the processes.

INIT_STK - This instruction initializes a stack for each process handled by the scheduler. As the push instruction pre-increments the Stack Pointer (SP) and the memory is byte-addressed, the saved address must be decremented by two positions.

REST_CTX - This instruction restores the SP from the memory position pointed to by register s1. As the FemtoJava microcontroller saves the context on the stack before handling interrupts, the remaining context is recovered automatically from the stack.

SAVE_CTX - This instruction saves the SP in the memory position pointed to by s1. As the FemtoJava microcontroller saves the context on the stack before handling interrupts, the remaining context is saved automatically on the stack.

SCHED_THR - This instruction redirects the execution flow to the process that the scheduler has granted CPU access.
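The effect of the context instructions can be illustrated with a toy model. The sketch below is plain Python, not FemtoJava code: the `Machine` class, its dictionary memory, and the concrete addresses and values are hypothetical, and only the register-transfer meanings (Mem[$s2] ← $s1, Mem[$s1] ← SP, SP ← Mem[$s1]) are taken from Table 1.

```python
class Machine:
    """Toy model of the state touched by the context instructions (assumption)."""

    def __init__(self):
        self.mem = {}   # sparse memory: address -> value
        self.sp = 0     # stack pointer

    def init_val(self, s1, s2):
        # INIT_VAL: Mem[$s2] <- $s1
        self.mem[s2] = s1

    def save_ctx(self, s1):
        # SAVE_CTX: Mem[$s1] <- SP
        self.mem[s1] = self.sp

    def rest_ctx(self, s1):
        # REST_CTX: SP <- Mem[$s1]
        self.sp = self.mem[s1]


# Hypothetical process-table slots and stack pointer values.
SLOT_P1, SLOT_P2 = 0x00F1, 0x00F2

m = Machine()
m.init_val(0xE67A, SLOT_P2)  # initial SP of process 2 stored in its slot
m.sp = 0xCCFA                # process 1 is currently running
m.save_ctx(SLOT_P1)          # scheduler entry: park the SP of process 1
m.rest_ctx(SLOT_P2)          # dispatch process 2 by reloading its SP
print(hex(m.sp))             # prints 0xe67a
```

Because the remaining registers are already on each process stack, swapping the SP is the only explicit step in this model; everything else is restored when the interrupt handler returns.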

Table 2. Hardware overhead and number of logic cells (LC) for implementing the new instructions with Altera MaxplusII on a FLEX10K EPF10K70RC240-2 device

| Schedulers | # of New Instructions | LC | % Area |
|---|---|---|---|
| None | 0 | 2057 | 100% |
| Without Priority | 6 (f4, f5, f6, f7, f8, fa) | 2173 | 105.6% |
| With Priority | 7 (f4, f5, f6, f7, f8, f9, fa) | 2175 | 105.7% |

SCHED_PRT - This instruction transfers two priority values to the ALU registers A and B so that a subsequent arithmetic instruction can compare them. It is used to compare the priorities of two processes and determine which one has the higher priority. Only schedulers that handle processes with different priorities use it.

GET_PC - This instruction is an unconditional branch used to jump to a memory position pointed to by register s1. It is used to jump to specific subroutines of the scheduler.

The hardware overhead for implementing the new instructions with Altera MaxplusII on a FLEX10K EPF10K70RC240-2 device is shown in Table 2. The original FemtoJava processor without the new instructions uses 2057 logic cells (LC). This number increases to 2173 logic cells when the six instructions for scheduling without priority (f4, f5, f6, f7, f8, fa) are implemented. Implementing all the new instructions, including SCHED_PRT, which is used only by schedulers that consider process priorities, takes 2175 logic cells. Extending the FemtoJava instruction set to support context switching and a software scheduler therefore costs on the order of 120 logic cells, an area increase of between 5% and 6% with respect to the original microcontroller.
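The area figures follow directly from the logic cell counts in Table 2. The short Python calculation below reproduces them; the function name `overhead` is only illustrative.

```python
BASE_LC = 2057  # original FemtoJava, no scheduling instructions (Table 2)


def overhead(lc, base=BASE_LC):
    """Return (extra logic cells, area relative to the original in percent)."""
    return lc - base, 100.0 * lc / base


for name, lc in [("without priority", 2173), ("with priority", 2175)]:
    extra, pct = overhead(lc)
    print(f"{name}: +{extra} LC, {pct:.1f}% of the original area")
```

The two variants differ by only two logic cells, so in this implementation the extra cost of SCHED_PRT is small compared with the cost of the six basic instructions.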

4 Implemented Schedulers

This section discusses the implemented scheduling policies. An operating system must allocate computer resources among the potentially competing requirements of multiple processes. In the case of the processor, the resource to be allocated is execution time and the means of allocation is scheduling. The scheduler is thus the operating system component responsible for granting CPU access to one of several processes that are ready to execute. This idea is illustrated in the five-state diagram of Figure 2 [14]. The meaning of each state is as follows:

New: a new process may be created and admitted to the list of ready processes.

Ready: the list of processes that are ready to execute. Once the processor is free, the scheduler must choose one of them to become active on the CPU. This is called dispatch. The process is chosen according to a scheduling policy.

Fig. 2. Five-State process model

Running: the process currently being run on the processor. A process may lose processor access in three ways: the scheduler may time it out, the process may request an external operation or execute a synchronization primitive and become blocked until the request is granted, or the process may finish and be released for exit.

Blocked: the list of processes waiting for pending events (I/O or synchronization).

Exit: finished processes.

Four schedulers following the POSIX standard [4] for operating systems [14] were implemented: FIFO and Round-Robin, each with and without support for weighted process priorities. The implemented schedulers are discussed in the following subsections.

4.1 FIFO

The FIFO scheduling policy dispatches the ready processes on a first-in-first-out basis. Each process executes until it finishes or until it requests an external operation. In the case of an external request, the process is blocked to wait for an event and, once the request is granted, it is inserted at the end of the list of ready processes. The implementation of this scheduling policy is straightforward and requires mainly context switching, performed by saving and restoring special registers such as the Stack Pointer and the Program Counter. This policy was implemented directly in bytecodes using the new instructions.

4.2 FIFO with Priority

FIFO with priority scheduling is very similar to the FIFO policy. Each process has an associated priority, and the scheduler visits all the processes in the ready state and dispatches the one with the highest priority. If several processes have the same priority, the FIFO policy is adopted among them. The implemented FIFO is non-preemptive.

4.3 Round-Robin

The Round-Robin scheduling policy works with the concept of a time quantum. Each process receives a time quantum to use the processor. A process may be deactivated for three reasons: its granted time ends, it finishes, or it requests an external event. The time quantum is implemented through a timer interrupt programmed by the scheduler. The scheduler sets a timer to count the number of cycles corresponding to the quantum and dispatches a process. When the timer reaches the quantum count, it raises a timer interrupt that transfers processor control back to the scheduler, which then dispatches the next process in the list of ready processes. In the implemented scheduler, the user may configure the time quantum. In Section 5, four different time quanta are used to measure the scheduling costs.

4.4 Round-Robin with Priority

Round-Robin with priority scheduling is very similar to the Round-Robin policy. Each process has an associated priority, and the scheduler adopts a Round-Robin policy among the ready processes with the highest priority. The implemented scheduler is non-preemptive.

4.5 Memory Overhead

The memory overhead introduced by the scheduler code is presented in Table 3. There is a small RAM overhead due to internal scheduler data. The code of each scheduler introduces a ROM overhead that is quite small compared to the code of usual applications.
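The four policies of Sections 4.1 to 4.4 differ only in how the next ready process is chosen and in whether a quantum limits each run. The following Python sketch simulates that choice under simplifying assumptions that are not from the paper: tasks never block, scheduler overhead is ignored, and the function `run` and its parameters are hypothetical names.

```python
from collections import deque


def run(work, priorities=None, quantum=None):
    """Simulate non-blocking tasks and return the dispatch order as task ids.

    work       -- cycles each task needs
    priorities -- optional priority per task (higher number runs first)
    quantum    -- None for FIFO (run to completion), else cycles per slice
    """
    remaining = list(work)
    if priorities is None:
        priorities = [0] * len(work)
    ready = deque(range(len(work)))  # ready list in arrival order
    order = []
    while ready:
        top = max(priorities[t] for t in ready)
        # First ready task with the highest priority (FIFO among equals).
        task = next(t for t in ready if priorities[t] == top)
        ready.remove(task)
        order.append(task)
        if quantum is None:
            used = remaining[task]
        else:
            used = min(quantum, remaining[task])
        remaining[task] -= used
        if remaining[task] > 0:
            ready.append(task)  # quantum expired: back of the ready list
    return order


print(run([5, 5, 5]))                                # FIFO
print(run([5, 5, 5], priorities=[1, 3, 2]))          # FIFO with priority
print(run([5, 3], quantum=2))                        # Round-Robin
print(run([4, 4, 2], priorities=[1, 1, 0], quantum=2))  # Round-Robin with priority
```

Every entry in the returned list corresponds to one dispatch, so a smaller quantum lengthens the list and, on real hardware, multiplies the context-switch cost measured in Section 5.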

Table 3. Memory overhead introduced by the schedulers

| Scheduler | RAM overhead due to internal data | ROM overhead due to scheduler code |
|---|---|---|
| FIFO | 41 bytes | 1567 bytes |
| FIFO with Priority | 42 bytes | 1111 bytes |
| RR | 41 bytes | 1596 bytes |
| RR with Priority | 42 bytes | 1161 bytes |

Table 4. Time quanta used for Round-Robin schedulers

| Time Quantum | # of cycles |
|---|---|
| Q1 | 7,000 cycles |
| Q2 | 5,000 cycles |
| Q3 | 3,000 cycles |
| Q4 | 2,500 cycles |

Table 5. Process priorities used in the schedulers considering processes with different priorities

| Mode | Process Priority |
|---|---|
| S1 | [5] [5] [5] [5] [5] [5] [5] [5] [5] [5] |
| S2 | [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] |
| S3 | [9] [8] [7] [6] [5] [4] [3] [2] [1] [0] |
| S4 | [7] [6] [8] [2] [1] [7] [5] [3] [4] [9] |

4.6 Implementation of the Schedulers

Some implementation details are presented in the appendices and illustrate the use of the new FemtoJava instructions. Appendix I details the implementation of process initialization, while Appendix II describes the code for process scheduling.

5 Results

Four different schedulers following the POSIX standard [4] were implemented. As described previously, these schedulers introduce a small hardware overhead due to the new instructions and a small memory overhead due to the scheduler code itself (ROM overhead) and to some internal data (RAM overhead). In this section we discuss the relative performance of the different schedulers, considering the number of cycles they use to complete a set of tasks as well as the power consumption involved.

First, consider the different schedulers in use. The Round-Robin schedulers are characterized by a time quantum, and the time quanta used in our experiments are shown in Table 4. The schedulers that consider priorities may handle different sets of tasks with different priorities. Table 5 shows the priority of each task used in our experiments, in the order in which the tasks are placed in the program ROM. A higher number means a higher priority. Each task is a sorting algorithm that processes a 10-element vector. Set S1 has ten tasks of equal priority. Sets S2 to S4 have tasks with different priorities, and they differ in the order in which the tasks are placed in the instruction ROM: set S2 has tasks in increasing priority order, set S3 in decreasing priority order, and set S4 in random order.

The number of cycles and the power consumption needed to compute the set of 10 sorting tasks are shown in Table 6 and Table 7, respectively. Table 7 shows the estimated power consumption of the schedulers as a number of switched gate capacitances (SGC). This power consumption was obtained by simulation, using a high-level simulator that estimates the number of gate capacitances switched by every processor microinstruction, as described in [17].

According to the data in Table 6, the FIFO scheduling policy is always the fastest one. The execution time penalty of the Round-Robin schedulers increases as the time quantum decreases. However, a smaller time quantum gives better service quality, because the processes execute more frequently. This is an important point to consider in a real-time operating system (RTOS). FIFO or Round-Robin with a large time quantum will consume less power, but the quality of service may be degraded by the excessive time that a process may have to wait for the CPU.

Table 6. Number of cycles for different schedulers on the same tasks

| Scheduler | Quantum | Priority | Cycles |
|---|---|---|---|
| FIFO | - | - | 66,514 |
| FIFO with Priority | - | S1 | 66,797 |
| FIFO with Priority | - | S2 | 72,953 |
| FIFO with Priority | - | S3 | 66,797 |
| FIFO with Priority | - | S4 | 68,207 |
| RR | Q1 | - | 66,517 |
| RR | Q2 | - | 73,069 |
| RR | Q3 | - | 76,328 |
| RR | Q4 | - | 82,880 |
| RR with Priority | Q1 | S1 | 66,800 |
| RR with Priority | Q2 | S1 | 73,207 |
| RR with Priority | Q3 | S1 | 76,432 |
| RR with Priority | Q4 | S1 | 82,839 |
| RR with Priority | Q1 | S2 | 72,956 |
| RR with Priority | Q2 | S2 | 88,638 |
| RR with Priority | Q3 | S2 | 95,118 |
| RR with Priority | Q4 | S2 | 109,085 |
| RR with Priority | Q1 | S3 | 66,800 |
| RR with Priority | Q2 | S3 | 73,207 |
| RR with Priority | Q3 | S3 | 76,432 |
| RR with Priority | Q4 | S3 | 82,839 |
| RR with Priority | Q1 | S4 | 68,210 |
| RR with Priority | Q2 | S4 | 76,003 |
| RR with Priority | Q3 | S4 | 79,837 |
| RR with Priority | Q4 | S4 | 87,588 |
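The penalties discussed above can be read off Table 6 with a little arithmetic. The Python snippet below computes the extra cycles of each Round-Robin quantum relative to FIFO, and of task set S2 relative to S3 for Round-Robin with priority at quantum Q4; the helper name `penalty` is only illustrative.

```python
FIFO = 66_514
RR = {"Q1": 66_517, "Q2": 73_069, "Q3": 76_328, "Q4": 82_880}
RRP_Q4 = {"S2": 109_085, "S3": 82_839}  # Round-Robin with priority, quantum Q4


def penalty(cycles, baseline=FIFO):
    """Extra cycles relative to the baseline, in percent."""
    return 100.0 * (cycles - baseline) / baseline


for q, cycles in RR.items():
    print(f"RR {q}: {penalty(cycles):.1f}% more cycles than FIFO")

s2_vs_s3 = penalty(RRP_Q4["S2"], RRP_Q4["S3"])
print(f"RR with priority, Q4: S2 needs {s2_vs_s3:.1f}% more cycles than S3")
```

With the 7,000-cycle quantum Q1 the Round-Robin run costs only three cycles more than FIFO, while the 2,500-cycle quantum Q4 adds roughly a quarter to the total.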

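Table 7. Power consumption for different schedulers on the same tasks (estimated, in switched gate capacitances, SGC)

| Scheduler | Quantum | Priority | RAM | ROM | Core | Total |
|---|---|---|---|---|---|---|
| FIFO | - | - | 223,606,000 | 4,378,510 | 119,062,403 | 347,046,913 |
| FIFO with Priority | - | S1 | 223,169,000 | 4,397,830 | 119,738,075 | 347,304,905 |
| FIFO with Priority | - | S2 | 243,225,000 | 4,901,300 | 130,453,446 | 378,579,746 |
| FIFO with Priority | - | S3 | 223,169,000 | 4,397,830 | 119,750,409 | 347,317,239 |
| FIFO with Priority | - | S4 | 227,700,000 | 4,512,140 | 122,212,857 | 354,424,997 |
| RR | Q1 | - | 224,342,000 | 4,379,200 | 119,202,709 | 347,923,909 |
| RR | Q2 | - | 243,064,000 | 4,874,390 | 130,823,661 | 378,762,051 |
| RR | Q3 | - | 252,379,000 | 5,121,180 | 136,601,747 | 394,101,927 |
| RR | Q4 | - | 271,101,000 | 5,616,370 | 148,243,343 | 424,960,713 |
| RR with Priority | Q1 | S1 | 223,905,000 | 4,398,520 | 119,878,654 | 348,182,174 |
| RR with Priority | Q2 | S1 | 241,477,000 | 4,886,810 | 131,398,505 | 377,762,315 |
| RR with Priority | Q3 | S1 | 250,332,000 | 5,133,830 | 137,205,995 | 392,671,825 |
| RR with Priority | Q4 | S1 | 267,904,000 | 5,622,120 | 148,749,841 | 422,275,961 |
| RR with Priority | Q1 | S2 | 243,961,000 | 4,901,990 | 130,592,209 | 379,455,199 |
| RR with Priority | Q2 | S2 | 291,778,000 | 6,152,960 | 158,220,834 | 456,151,794 |
| RR with Priority | Q3 | S2 | 311,443,000 | 6,666,780 | 169,694,899 | 487,804,679 |
| RR with Priority | Q4 | S2 | 353,855,000 | 7,776,070 | 194,361,976 | 555,993,046 |
| RR with Priority | Q1 | S3 | 223,905,000 | 4,398,520 | 119,890,781 | 348,194,301 |
| RR with Priority | Q2 | S3 | 241,477,000 | 4,886,810 | 131,421,625 | 377,785,435 |
| RR with Priority | Q3 | S3 | 250,332,000 | 5,133,830 | 137,238,112 | 392,703,942 |
| RR with Priority | Q4 | S3 | 267,904,000 | 5,622,120 | 148,799,574 | 422,325,694 |
| RR with Priority | Q1 | S4 | 228,436,000 | 4,512,830 | 122,352,506 | 355,301,336 |
| RR with Priority | Q2 | S4 | 250,516,000 | 5,115,200 | 136,301,339 | 391,932,539 |
| RR with Priority | Q3 | S4 | 261,441,000 | 5,411,900 | 143,175,839 | 410,028,739 |
| RR with Priority | Q4 | S4 | 283,429,000 | 5,616,370 | 157,072,502 | 446,512,092 |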
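Two properties of Table 7 are easy to check numerically: the Total column is the sum of the RAM, ROM, and core contributions, and the total tracks the cycle counts of Table 6 closely. The Python sketch below does this for three representative rows; the selection of rows and the name `sgc_per_cycle` are choices made here, not part of the paper.

```python
# (RAM, ROM, core, total) in SGC from Table 7, plus cycles from Table 6
ROWS = {
    "FIFO": (223_606_000, 4_378_510, 119_062_403, 347_046_913, 66_514),
    "RR, Q4": (271_101_000, 5_616_370, 148_243_343, 424_960_713, 82_880),
    "RR with Priority, S2, Q4": (
        353_855_000, 7_776_070, 194_361_976, 555_993_046, 109_085),
}


def sgc_per_cycle(total, cycles):
    """Average switched gate capacitances per executed cycle."""
    return total / cycles


for name, (ram, rom, core, total, cycles) in ROWS.items():
    assert ram + rom + core == total  # Total is the sum of the components
    print(f"{name}: {sgc_per_cycle(total, cycles):.0f} SGC per cycle")
```

For these rows the average stays near 5,100 to 5,200 SGC per cycle, which suggests that the extra power of a scheduler comes almost entirely from the extra cycles it executes.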

Notice that in the implementation of the schedulers dealing with different task priorities, the scheduler saves some information of the process with highest priority, and this way the order of the processes in the program ROM will have a significant impact on the performance and power consumption. This may be observed in Tables 6 and 7 by comparing the Round-Robin with priority and quantum Q4 running for sets S2 and S3. Set S2 has an unfavorable order, and the number of cycles is around 109k cycles. Set S3 has a favorable order, and the number of cycles is around 83k cycles. This represents a difference of 30% due to a better scheduler implementation. The power consumption overhead follows a similar pattern and it is shown in Table 7. This way, the exploration of different scheduling alternatives as well as careful scheduler implementation may play an important role in performance and power consumption optimization.
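One plausible reading of this effect is a linear scan over the process table in ROM order that stores the data of a new candidate each time a strictly higher priority is found. The paper does not list that scan, so the Python sketch below is an assumption: `scan_updates` simply counts how many times the best-so-far candidate would be replaced for each task set of Table 5.

```python
def scan_updates(priorities):
    """Count how often a linear scan replaces its best-so-far candidate."""
    best = None
    updates = 0
    for p in priorities:  # tasks visited in program ROM order
        if best is None or p > best:
            best = p
            updates += 1  # new candidate: its information must be saved
    return updates


SETS = {
    "S1": [5] * 10,
    "S2": list(range(10)),
    "S3": list(range(9, -1, -1)),
    "S4": [7, 6, 8, 2, 1, 7, 5, 3, 4, 9],
}

for name, prios in SETS.items():
    print(name, scan_updates(prios))
```

Under this assumption S1 and S3 each need a single update, S4 needs three, and S2 needs ten, which matches the ranking in Table 6, where S1 and S3 have identical cycle counts, S4 is slightly slower, and S2 is the slowest.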

6 Conclusions and Future Work

Four different schedulers following the POSIX standard [4] were implemented. As described previously, these schedulers introduce a small hardware overhead due to the new instructions added to the FemtoJava architecture and a small memory overhead due to the scheduler code itself (ROM overhead) and to some internal data (RAM overhead). Simulations of the schedulers indicate that the exploration of different scheduling alternatives as well as careful scheduler implementation may play an important role in performance and power optimization.

Future work will analyze the impact of other scheduling policies. This analysis will be used to create an automatic tool to synthesize embedded schedulers according to particular system requirements. This linker tool is being developed and will allow the user to configure one of the implemented schedulers following a given policy to be linked and manage several applications sharing the same FemtoJava CPU.

References

1. Acquaviva, A.; Benini, L.; Ricco, B. Energy Characterization of Embedded Real-Time Operating Systems. In Proc. Workshop Compilers & Operating Systems for Low Power. (2001)
2. Cignetti, T.L.; Komarov, K.; Ellis, C.S. Energy Estimation Tools for the Palm. In Proceedings of the ACM MSWiM. (2000)
3. Dick, R.P.; Lakshminarayana, G.; Raghunathan, A.; Jha, N.K. Power Analysis of Embedded Operating Systems, 37th Design Automation Conference, pp. 312-315. (2000)
4. Gallmeister, B.O. POSIX.4 Programming for the Real World. First Edition. O'Reilly & Associates, Inc. (1995)
5. Ito, S. et al. System Design Based on Single Language and Single-Chip Java ASIP Microcontroller, In Proceedings of Design Automation and Test in Europe, pp. 703-707, Paris, France. IEEE Computer Society Press. (2002)
6. Ito, S.A.; Carro, L.; Jacobi, R. Making Java Work for Microcontroller Applications, IEEE Design & Test, vol. 18, no. 5, pp. 100-110, Sep-Oct. (2001)
7. Koopman, P. Why stack machines? http://www.cs.cmu.edu/koopman/forth/whystack.html. (2002)
8. Kreuzinger, J.; Brinkschulte, U.; Pfeffer, M.; Uhrig, S.; Ungerer, Th. Real-time Event-Handling and Scheduling on a Multithreaded Java Microcontroller, Microprocessors and Microsystems, vol. 27, pp. 19-31. (2003)
9. Lindholm, T.; Yellin, F. The Java Virtual Machine Specification. The Java Series, Addison-Wesley. (1997)
10. McGhan, H.; O'Connor, M. PicoJava: A Direct Execution Engine for Java Bytecode, Computer, vol. 31, no. 10, pp. 22-30, Oct. (1998)
11. Memik, S.O.; Bozorgzadeh, E.; Kastner, R.; Sarrafzadeh, M. A Super-Scheduler for Embedded Reconfigurable Systems, IEEE/ACM International Conference on Computer Aided Design, pp. 391-394. (2001)
12. Ortiz, S. Jr. Embedded OSs Gain the Inside Track. IEEE Computer, vol. 34, no. 11, pp. 14-16. (2001)
13. Schlett, M. Trends in Embedded-Microprocessor Design, IEEE Computer, vol. 31, no. 8, pp. 44–49. (1998)
14. Silberschatz, A.; Galvin, P.; Gagne, G. Applied Operating System Concepts. First Edition. Wiley. (2000)
15. Tan, T.K.; Raghunathan, A.; Jha, N.K. Embedded Operating System Energy Analysis and Macro-Modeling, IEEE International Conference on VLSI in Computers and Processors, pp. 515-522. (2002)
16. Kreuzinger, J.; Zulauf, R.; Schulz, A.; Ungerer, T.; Pfeffer, M.; Brinkschulte, U.; Krakowski, C. Performance Evaluations and Chip-Space Requirements of a Multithreaded Java Microcontroller. To be published. Available in the Scientific Literature Digital Library, CiteSeer: http://citeseer.nj.nec.com/384138.html
17. Beck Filho, A.C.S.; Mattos, J.C.B.; Wagner, F.R.; Carro, L. CACO-PS: A General Purpose Cycle-Accurate Configurable Power Simulator. In 16th Symposium on Integrated Circuits and Systems Design Proceedings, IEEE Computer Society Press. (2003)

Appendix I: Initialization Module for the Round-Robin Scheduler

The instructions at addresses 2dH to 37H initialize the timer and the interrupt system for the schedulers based on fixed time quanta. The code starting at address 38H implements the stack initialization for each process, as well as the initialization of some information in the process table (stack pointer, process priority, process status). This piece of code uses the new instructions INIT_STK and INIT_VAL. In the listing below, the bytes of each instruction are grouped on one line (address range : bytes -- comment).

    2d-2e : 10 22           -- bipush -- ENABLE INT
    2f-30 : 10 00           -- bipush
    31    : f2              -- store_idx
    32-34 : 11 00 14        -- sipush -- SET TIMER
    35-36 : 10 0c           -- bipush
    37    : f2              -- store_idx
    38-3c : f5 e6 7f 05 1d  -- INIT_STK  PC 2 value initialized in e67dH
    3d-41 : f4 00 f2 e6 7a  -- INIT_VAL  SP 2 initialized in f2H
    42-46 : f5 cc ff 05 84  -- INIT_STK  PC 3 value initialized in ccfdH
    47-4b : f4 00 f3 cc fa  -- INIT_VAL  SP 3 value initialized in f3H
    . .
    92-96 : f4 01 01 ff ff  -- INIT_VAL  state of process 1 saved in 0101H (ffff = ready, 0000 = blocked)
    97-9b : f4 01 02 ff ff  -- INIT_VAL  state of process 2 saved in 0102H
    . .
    c4-c8 : f4 01 11 00 05  -- INIT_VAL  priority of PROC 1 saved in 0111H
    c9-cd : f4 01 12 00 05  -- INIT_VAL  priority of PROC 2 saved in 0112H
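A byte listing like this one can be split into instructions once the operand length of each opcode is known. The Python sketch below decodes the first bytes of the initialization module; the operand lengths in `OPERAND_BYTES` are inferred from how the listing's comments line up with its bytes, so they are assumptions rather than documented FemtoJava encodings, and `decode` is a hypothetical helper.

```python
# opcode -> (mnemonic, number of operand bytes); lengths inferred from the listing
OPERAND_BYTES = {
    0x10: ("bipush", 1),
    0x11: ("sipush", 2),
    0xF2: ("store_idx", 0),
    0xF4: ("INIT_VAL", 4),
    0xF5: ("INIT_STK", 4),
}


def decode(code, base):
    """Split a byte sequence into (address, mnemonic, operand bytes) tuples."""
    out = []
    i = 0
    while i < len(code):
        name, n = OPERAND_BYTES[code[i]]
        out.append((base + i, name, bytes(code[i + 1:i + 1 + n])))
        i += 1 + n
    return out


# Bytes at addresses 2dH to 41H of the initialization module.
ROM = bytes.fromhex("1022 1000 f2 110014 100c f2 f5e67f051d f400f2e67a")

for addr, name, ops in decode(ROM, 0x2D):
    print(f"{addr:02x}: {name:<9} {ops.hex()}")
```

Decoding this way places INIT_STK at 38H and INIT_VAL at 3dH, in agreement with the address at which the appendix says stack initialization begins.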

Appendix II: Task Scheduling for the Round-Robin Scheduler

The following piece of code illustrates the use of the new instructions for task scheduling. The instructions SCHED_THR, SAVE_CTX, SCHED_PRT and REST_CTX are used. The first actions taken by the code are to disable the timer (addresses 133 to 137) and then to save the context of the current process (addresses 138 to 16b). This guarantees that no other interrupt is handled while the scheduler code executes, and it preserves the integrity of the data belonging to the process being removed from the CPU. The scheduling policy code begins at address 199H. This code chooses the next process to be scheduled. Context restoring for the process gaining access to the CPU starts at address 471H. Finally, the timer is enabled by the code starting at address 4aaH. This action is performed before CPU control is transferred to the process being scheduled. Note that this piece of code varies according to the scheduling policy being used. The code shown here implements a Round-Robin policy. In the listing below, the bytes of each instruction are grouped on one line (address range : bytes -- comment).

    133-134 : 10 00           -- bipush -- STOP TIMER
    135-136 : 10 0d           -- bipush
    137     : f2              -- store_idx
    138-13b : f8 01 21 25     -- SCHED_THR
    13c-13f : f8 01 22 27     -- SCHED_THR
    . .
    160-162 : f7 00 f1        -- SAVE_CTX PROC 1
    163-165 : a7 00 34
    166-168 : f7 00 f2        -- SAVE_CTX PROC 2
    169-16b : a7 00 2e
    . .
    199-19d : f4 01 1f 00 00  -- INIT_VAL
    19e-1a1 : f8 01 01 45     -- SCHED_THR
    1a2-1a6 : f9 01 11 01 1f  -- SCHED_PRT
    1a7-1a9 : a2 00 3d        -- IF_ICMPGE
    . .
    471-473 : f6 00 f1        -- REST_CTX PROC 1
    474-476 : a7 00 34        -- GOTO SET_TIMER, START_TIMER, RETURN
    477-479 : f6 00 f2        -- REST_CTX PROC 2
    47a-47c : a7 00 2e        -- GOTO SET_TIMER, START_TIMER, RETURN
    . .
    4aa-4ab : 10 14           -- bipush -- SET TIMER
    4ac-4ad : 10 0c           -- bipush
    4ae     : f2              -- store_idx
    4af-4b0 : 10 03           -- bipush -- START TIMER
    4b1-4b2 : 10 0d           -- bipush
    4b3     : f2              -- store_idx