M.O.N.S.T.E.R.: A NEW BEHAVIOR-BASED MICROKERNEL FOR MOBILE ROBOTS

Dirk Spenneberg, Martin Albrecht, Till Backhaus
Robotics Lab, Faculty of Mathematics and Computer Science, University of Bremen
Bibliotheksstr. 1, D-28359 Bremen, Germany
{dspenneb, malb, backhaus}@informatik.uni-bremen.de
http://www.informatik.uni-bremen.de/robotik

ABSTRACT

This article describes a concept for programming behavior-based robots which combines properties of real-time operating systems with behavior-based programming to solve problems of scalability and reactivity. The concept features hard and soft periodic processes, preemptive reflexes, behavior processes running at different execution frequencies, and a background process.

1. INTRODUCTION

In the field of behavior-based robotics, a multitude of well-known architectures have been developed. Typical of these approaches is the principle of simultaneously active behavior processes (BPs). By behavior processes we mean the internal mechanisms/processes in an agent's program which are responsible for a certain observed behavior of the agent. In order to port these architectures one-to-one to robots using sequential processors, certain real-time demands have to be fulfilled. In particular, all behaviors, reflexes and I/O routines, and thus all processes, have to be processed under specific time constraints.

In approaches like the Behavior Language [1] (an extension of the subsumption architecture), the Process Description Language (PDL) [2] or Dual Dynamics (DD), it is necessary that at least a part of the processes is processed with a constant frequency. The Behavior Language provides the user with the whenever expression, which is mostly used in its periodic form. This means that the expression has to be processed with a fixed period. According to [1], a fundamental period of 40 ms (25 Hz) is used in most of their experiments. In PDL and DD, a fixed frequency is applied to all processes.
Behaviors are understood as dynamical systems which are described with differential equations. For a correct calculation of the difference, the current point in time must be known. In PDL, to keep this problem simple, all processes are processed in a single-loop system running at a fixed frequency [2]. A typical frequency of PDL systems is 40 Hz. All known implementations of DD are also based on a fixed-period control loop, e.g. [3].

2. CONSEQUENCES OF SINGLE-LOOP SYSTEMS

Guaranteeing the periodic single-loop behavior described above leads to the consequences listed in Table 1.

1. The number of processes which can be processed in a fixed-period loop is limited. If the execution time of all processes in a loop exceeds the given maximum loop time, the behavior of the whole system changes.
2. Direct reactions to spontaneous events are not provided for. The reactivity of the system is limited by the loop period.
3. All processes are clocked with the same frequency; processes running at different frequencies are not allowed.
4. Time which is left after finishing all processes in a loop is idle time. A single-loop system does not provide for time-uncritical, preemptible processes, such as a background process, that could use this time.

Table 1. Consequences of a Fixed Processing Period

Point 1 of Table 1 addresses the problem that, if we describe behavior processes (e.g. an obstacle avoidance) with difference equations as in DD or PDL, it is important to keep the time basis constant. For example, we could program an avoidance behavior in such a way that a robot reliably avoids obstacles at a given fixed behavior-loop frequency. Now, if someone adds more and more processes to the program loop, it is only a matter of time until the execution of these processes takes longer than the given fixed loop period. In a PDL implementation, this would lead to the effect that the period is prolonged until all behaviors are executed. Thus the behavior processes, including the avoidance behavior, now run at a lower frequency, which results in a different behavior. On average, each behavior will react later and the observed reaction will be less strong because of the use of difference equations. Thus at least an adjustment of the difference equation would be necessary to get a reliable obstacle avoidance behavior again. It is likely that all other implemented behaviors would have to be re-adjusted too. An automatic adjustment process is provided neither in the Behavior Language nor in PDL or DD.

Point 2 mentions the problem that, to respond to certain disturbances, an agent must be able to interrupt the normal periodic behavior processing loop and initiate a reflex reaction, e.g. a stumbling correction reflex in a walking robot. In the worst case the reflexes might be triggered often and start to consume the time needed to execute the behavior processes in time. Another example of this problem of interrupting processes are low-level interrupt service routines (ISRs), as commonly used for the I/O operations of embedded systems.

Point 3 of Table 1 addresses the problem that it would be preferable to have behavior processes running at different frequencies. For example, many behavior processes compute their influence on the overall behavior on the basis of sensory data. Some sensors generate new values faster than others. Therefore, it might not make sense to execute a behavior process before a new sensor value has been generated. A solution to this problem in PDL would be to use a high base frequency and to add to each behavior process an internal counter which is incremented every loop cycle. If the counter reaches a certain process-specific threshold value, the process is executed and the counter is reset. To ensure that this works, the programmer has to take care that in every loop all processes reaching their counter threshold can be executed in time. None of the above-mentioned behavior programming languages provides an automatic mechanism to handle this problem.

The disadvantage of point 4 in Table 1 is obvious.
If we use fixed time slices and the execution of the behavior loop is finished before the next time step, some time is unused. The implications of this problem have already been explored for periodic processes under rate-monotonic scheduling in real-time systems [4]. For real-time systems based on such processes, all processes are guaranteed to be executed within their given period if the workload of the whole system is smaller than 69% [5]. This theorem also holds when the processes are allowed to run at different frequencies. It should be noted that in the majority of published behavior-based implementations most of these problems do not occur, because their computational demands are very low in relation to the powerful hardware used.

3. CONCEPT OF A REAL-TIME MICROKERNEL FOR BEHAVIOR-BASED PROGRAMMING

These reasons were the motivation for a new microkernel approach, in which we added real-time (RT) properties to a behavior-based approach to cope with the above problems. Table 2 states which components a behavior-based program has and how they differ regarding periodicity. To also be able to implement hybrid architectures, an optional deliberative or learning component (e.g. a planner) is added to this list.

1. Reflexes on the basis of exceptions (non-periodic)
2. Service routines on the driver level (non-periodic)
3. Actuator control (periodic, hard periodicity)
4. Sensor value acquisition (periodic, hard periodicity, or non-periodic)
5. Reflexes triggerable by (periodically) measured sensor values (periodic, hard periodicity)
6. Behavior processes (periodic, soft periodicity)
7. Deliberative/learning component (non-periodic, background)

Table 2. Typical Components of a Behavior-Based Program

Table 2 is ordered by the priority of the components with regard to their role in system preservation (survivability). Reflexes on the basis of exceptions are used in the case of emergency. The service routines on the driver level ensure correct functioning of the system, especially by processing critical I/O. The actuator control and the sensor value acquisition are the interfaces between the behavior level and the sensor and actuator drivers. Non-interrupt-based reflexes might be triggered on the basis of the sensor value acquisition. All known behavior-based approaches can be described with the components listed above. PDL, for example, uses the following components: service routines on the driver level, actuator control, sensor value acquisition and behavior processes.

We divide the processes into those with strong, hard and soft periodicity (see Table 3).

• Strong Periodicity: The process must run exactly at its given target frequency.
• Hard Periodicity: The process runs in a recurrent way and, whenever possible, at its given target frequency, but at least as fast as its given minimum frequency.
• Soft Periodicity: The process runs at its given target frequency whenever possible, but its target frequency can be changed. There is no given minimum frequency.

Table 3. Three Types of Periodicity

Strong periodicity is not needed for our behavior-based control system, although it would be optimal and is required by systems like PDL. We propose to handle all system-critical events with preemptive interrupt routines, which is the fastest and thus preferable way to respond to such events; therefore no strong periodicity is needed. Hard periodic processes are expected to run at their given target frequency. Only when preemptive interrupt routines cause too much load may their frequency be reduced. The user has to ensure, by using interrupts wisely, that in the worst case the hard periodic reflexes run at least at their necessary minimum frequency. The frequencies of the behavior processes depend on the remaining time. If enough time resources are left, the behavior processes will run at their target frequency. If there is less time, the behaviors will run at a reduced frequency. If the behaviors are running at their target frequency, the system will use the remaining time resources to switch to the deliberative or learning component, which is realized as a background process.

The most surprising point in the above discussion is that we are rarely confronted with hard real-time constraints. The few processes that do have hard real-time demands can be served by a simple system consisting of a cyclic executive (loop) and additional asynchronous interrupts.
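The priority scheme described above can be illustrated with a small simulation. The following C sketch is not M.O.N.S.T.E.R. code: the names sim_process, slice_result and run_slice, the abstract time units and the cost model are all assumptions made for illustration. It only shows the intended ordering within one time slice: interrupt load comes off the top, hard-periodic processes always run, soft-periodic processes run while time remains, and any leftover time goes to the background process.

```c
#include <stddef.h>

/* The two periodicity classes of Table 3 that the concept actually uses. */
enum periodicity { HARD_PERIODIC, SOFT_PERIODIC };

/* Hypothetical process model for the simulation. */
struct sim_process {
    enum periodicity kind;
    unsigned cost;   /* assumed execution time in abstract time units */
    unsigned runs;   /* number of executions so far */
};

struct slice_result {
    unsigned soft_executed;    /* soft-periodic processes run in this slice */
    unsigned background_time;  /* leftover time for the background process */
};

/* Simulate one time slice of length `budget`, of which `irq_load`
 * units are consumed by preemptive interrupt routines. */
struct slice_result run_slice(struct sim_process *procs, size_t n,
                              unsigned budget, unsigned irq_load)
{
    struct slice_result r = {0, 0};
    unsigned left = (irq_load < budget) ? budget - irq_load : 0;

    /* Hard-periodic processes always run, even if time is short. */
    for (size_t i = 0; i < n; i++) {
        if (procs[i].kind != HARD_PERIODIC)
            continue;
        procs[i].runs++;
        left = (procs[i].cost < left) ? left - procs[i].cost : 0;
    }

    /* Soft-periodic processes run in order until the time is used up. */
    for (size_t i = 0; i < n; i++) {
        if (procs[i].kind != SOFT_PERIODIC)
            continue;
        if (procs[i].cost > left)
            break;   /* the rest of the behavior loop is delayed */
        procs[i].runs++;
        left -= procs[i].cost;
        r.soft_executed++;
    }

    r.background_time = left;   /* only nonzero if the whole loop finished */
    return r;
}
```

With low interrupt load, all soft-periodic processes run and the background process receives the remainder. With high interrupt load, the later behavior processes are skipped in that slice, which is exactly the situation in which their effective frequency drops.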

A simple cyclic executive, however, has the disadvantage that the execution frequency of each process in the cycle depends on the execution time of the whole cycle. As an easy improvement, the executive cycle can be clocked with a periodic interrupt generated by an internal timer module. This results in the simplest version of a real-time system, with very low overhead. It is called an interrupt control loop or a clocked cyclic executive (see Figure 1).

Fig. 1. Clocked Cyclic Executive

Adding extra interrupts (in our case for emergency responses) results in a foreground/background system (see Figure 2), where the background is realized by the clocked executive cycle and the foreground by our preemptive interrupts. If one can ensure that in the worst case the load of the preemptive interrupts will not cause the cyclic loop to fail (i.e. not all processes being finished at the end of a loop), this system works very well.

Fig. 2. Foreground/Background System

To enhance it, we extend it with a "background of the background", which results in a 2-level background/foreground microkernel system (see Figure 3). This means that after all processes have been processed and the system still has time until the next loop starts, the microkernel makes a context switch to a background process in which any uncritical task can be processed. With this extension the processor is utilized as much as possible, but a major problem remains. It is necessary to calculate the time the preemptive ISRs can consume in the worst case and the time the executive cycle will use. This is crucial for choosing the right frequency for the microkernel. To do this in advance is very difficult, because one of the convenient features of behavior-based software development is its bottom-up approach. Thus, one does not know exactly in advance how many behaviors and ISRs will be added in the future.

Fig. 3. 2-Level-Background/Foreground System

Therefore, we break with the idea of a fixed period for the processes in our loop and examine the consequences of a non-fixed period. The consequences are the same as with a loop which cannot keep its given target frequency. Thus, the developer can no longer implement behavior processes for a certain optimal frequency only, but has to design behavior processes which can adapt to varying frequencies. The major consequences for the microkernel are:

1. Behavior processes have to know their period.
2. The merging of the influences of the behavior processes on the same quantity (e.g. a motor value) has to take into account that not all behaviors might have exerted their influence when the next update of the actuator or internal quantities is executed.

The first consequence can be resolved easily by just measuring the time (in time slices) between the current and the last execution of the behavior. This value is passed to the behavior process as its actual period value. The behavior can use this as an estimate of its future period value and adapt its output. Another option to estimate the future period is to compute the median of the latest n execution periods.

Coping with the second consequence is more difficult. Therefore we should take a look at the example in Figure 4. It shows the effects of interrupts. When interrupts are rarely generated, the behavior loop is processed easily in one time slice (from one time step (timer interrupt) to the next). The more interesting behavior loop is the second one. When the timer interrupt occurs, not all behavior processes will have been processed. But to ensure an up-to-date behavior of the actuator control, all influences already exerted by the behavior processes should be incorporated.

Fig. 4. Example of a run in our new framework

Therefore, we thought about a new way to merge the influences exerted by the behavior processes. In PDL, each behavior process can exert an additive influence on a quantity. Eventually, all these additive values and the old value of the quantity are summed up. Then the sum is checked as to whether it stays within certain bounds. If not, it is clipped accordingly. We modify this in the following way: the influence $\mathit{inf}_{b,i}$ of each behavior process $b \in B$ (the set of all BPs) on quantity $q_i$ is weighted with a weight $w_{b,i}$, and the following new value for $q_i$ is computed:

\[
q_i(t) = q_i(t-1) + \frac{\sum_{b=0}^{n} w_{b,i}(t) \cdot \mathit{inf}_{b,i}(t)}{\sum_{b=0}^{n} w_{b,i}(t)} \tag{1}
\]

with

\[
w_{b,i}(t) =
\begin{cases}
w^{set}_{b,i}(t), & \text{if } b \text{ active in step } t \\
w_{b,i}(t-1) \cdot dec(b,i), & \text{else}
\end{cases} \tag{2}
\]

The function dec is used to decay the weight of the influence of behavior process b on quantity $q_i$. This decay function can be defined by the developer of the behavior process. Note that in principle each behavior can have its own, different dec function. If behavior b exerted influence in time slice t, no decaying is performed and the weight $w^{set}_{b,i}$ is used instead. This value is set anew in every cycle in which the behavior process is processed. Thus, as long as the behavior process does not get processed, the weight will decay, and the influence of the behavior process on the quantity will decrease. This merging process is done in a hard periodic process. Furthermore, at the start of every new time slice, a short management process is executed to store and switch the machine context; all sensor values are acquired, the new sensor quantities and internal values are computed, the non-interrupt-triggered reflex processes are processed, and the motor values are updated.

4. IMPLEMENTATION

Our M.O.N.S.T.E.R.¹ implementation runs on a Motorola MPC565 board clocked at 40 MHz. It is built around the idea that a behavior process programmer does not need to know about concurrent behavior processes trying to access the same piece of hardware "at the same time", since to him the whole hardware appears to be his own. Furthermore, he need not care about merging his values with others, as this is handled by the microkernel. On the other hand, a driver developer should not need to care about concurrent behavior processes trying to interact with his driver, or have to think about his driver being executed on time.

Data Structures

The microkernel does not know behavior processes or drivers, but only soft- or hard-periodic processes. Each such process must provide the microkernel with a couple of functions which have to fulfill some formal requirements to enable the microkernel to handle most of the required work automatically. Basically, a developer has to fill out the C structure process (see below).

    struct process {
        char *name;
        struct atom *export_io;               /* exported inputs/outputs */
        void *private_data;
        process *(*install)(char *name);      /* registers the process */
        atom *(*init)(process *p);
        void (*merge)(process_data *pd);      /* optional custom merge */
        void (*loop)(process *p, uint freq);  /* run whenever scheduled */
        void (*terminate)(process *d);
    };

The only function call the developer has to handle "by hand" is install, which registers the process with the microkernel. From there on everything is handled automatically. To make this happen, every process needs to provide a function which is executed whenever it is scheduled (loop) and some housekeeping functions such as init and terminate. When a process is initialized, it needs to connect to all initially required other processes, to export its I/O data by constructing the export_io list if it is a connectable process itself, and to request an execution period by setting the per-process variable period. An initialized process is added to the run queue, and its loop function will be executed regularly until it is terminated by invoking its terminate function. Each regularly executed function receives its last period as the parameter real_period, so this value can be used to estimate the next execution period. If the process is hard-periodic, it is guaranteed that this is the period requested by the process. If the process is soft-periodic, it must use this value to adapt its calculations to its real execution period as reported by the microkernel.

¹ Microkernel for scabrous terrain exploring robots
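How a soft-periodic process might use real_period can be sketched as follows. This is a minimal illustration, not the actual M.O.N.S.T.E.R. interface: struct demo_process is a simplified, hypothetical stand-in for struct process, and make_avoid_process, avoid_loop and AVOID_BASE_DELTA are invented names. The scaling rule itself follows the suggestion made in Section 5 of this paper, where the influence is weighted with the ratio of the base period to the measured period.

```c
/* Simplified, hypothetical stand-in for the paper's struct process. */
struct demo_process {
    const char *name;
    unsigned period;   /* requested period in time slices */
    double output;     /* influence written by the last loop() call */
    void (*loop)(struct demo_process *p, unsigned real_period);
};

/* Assumed steering correction when running at the requested period. */
#define AVOID_BASE_DELTA 8.0

/* Soft-periodic behavior: weight the influence with
 * requested period / measured period (t_base / diff_t in Section 5). */
static void avoid_loop(struct demo_process *p, unsigned real_period)
{
    if (real_period == 0)
        real_period = 1;   /* guard against division by zero */
    p->output = AVOID_BASE_DELTA * (double)p->period / (double)real_period;
}

/* Hypothetical helper playing the role of install/init:
 * it names the process, requests a period and sets the loop function. */
struct demo_process make_avoid_process(void)
{
    struct demo_process p = { "avoid", 2, 0.0, avoid_loop };
    return p;
}
```

If the process is executed on time, real_period equals period and the influence is unchanged. If it is executed only half as often, the influence written per execution is halved.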

Interprocess Communication

Whenever a process b needs to interact with another process d, it requests this from the microkernel by calling a connect("d") function. All initialization of the required process d is handled by the microkernel. If process d accepts inputs (e.g. an actuator driver) or provides outputs (e.g. a sensor driver), it has to export these I/O data via export_io in the given struct atom form to allow automatic processing.

    struct atom {
        unsigned int type;      // type of data in v
        t_value v;              // actual data
        unsigned int weight;    // influence on "d"
        struct atom *next;      // next exported atom in the list
    };

Process d can export as many atoms as desired as its inputs and outputs, thereby offering an unlimited number of input and output channels. The connecting process b receives a copy of these exported I/Os. To process b, it looks as if it had complete control over process d, but in fact it is working on a virtual version of process d's I/Os. Neither process b nor process d needs to handle merging or value propagation. They write their influences, assign weights to them, and let the microkernel do the calculations. Before a process d is executed, all values written by each connected process b are merged by the microkernel, using either the mechanism described in this paper² or a custom-provided merge function. The result is placed into the process's export_io list. If the process exports read-only data (e.g. the process is a sensor), it must write new data to its export_io so that the new values can be propagated afterwards.
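The default merging step can be sketched in C as follows. This is a minimal sketch of equations (1) and (2), not the microkernel's actual merge function: struct influence and the function names are hypothetical, the decay function dec(b, i) is assumed to be a constant factor per behavior, and the guard for a zero weight sum is an added assumption, since the equations do not define that case. The exact ordering of weight decay and merging within a time slice is also simplified here.

```c
#include <stddef.h>

/* One behavior's pending influence on a single quantity q_i (hypothetical). */
struct influence {
    double inf;     /* inf_{b,i}: influence value */
    double weight;  /* w_{b,i}: current weight */
    double w_set;   /* w^set_{b,i}: weight assigned when the behavior runs */
    double decay;   /* dec(b,i), assumed here to be a constant factor */
    int active;     /* nonzero if the behavior ran in this time step */
};

/* Equation (2): refresh the weight of active behaviors, decay the others. */
void update_weights(struct influence *v, size_t n)
{
    for (size_t b = 0; b < n; b++) {
        if (v[b].active)
            v[b].weight = v[b].w_set;
        else
            v[b].weight *= v[b].decay;
    }
}

/* Equation (1): add the weighted mean of all influences to the old value. */
double merge_quantity(double q_old, const struct influence *v, size_t n)
{
    double num = 0.0;
    double den = 0.0;

    for (size_t b = 0; b < n; b++) {
        num += v[b].weight * v[b].inf;
        den += v[b].weight;
    }
    if (den == 0.0)
        return q_old;   /* assumption: no weight left, keep the old value */
    return q_old + num / den;
}
```

For example, with one active behavior (influence 4, weight 1) and one inactive behavior (influence -2, previous weight 1, decay factor 0.5), the merged change is (4 - 1) / 1.5 = 2, so the stale influence still counts but with reduced weight.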

² See Section 3.

Loop / Scheduler

Our time steps are driven by a hardware timer interrupt and are currently clocked to a period of 10 ms. Whenever there is a timer interrupt, there are two possible contexts the machine might be in:

1. The complete loop could be processed in one time slice (see loop 1 in Figure 4), and the background process consumes the spare CPU cycles until the next time slice starts. Therefore, the robot is in the background context when the interrupt occurs. The hard-periodic processes are then run, all influence weights are decayed, and finally the behavior loop is restarted.

2. The loop could only partially be processed in one time slice (see loop 2 in Figure 4), and the machine is in the behavior loop context when the interrupt occurs. Therefore, when the hard-periodic processes are running, there will be influences written by behaviors in the last time slice as well as older ones. Those older influences have been decayed and will have less effect on the actual behavior of the system. After running the hard-periodic processes, all influence weights are decayed, and the behavior loop is resumed. When this loop is finished, the next one is started directly, and the background process will not be active until the next tick. Therefore, all available resources are used to renormalize the behavior of the system (see time slice 3 in Figure 4).

Behaviors and drivers running at larger periods than the system itself are handled by a simple process-specific counter variable dec, which is decremented at each time step. This is performed by the microkernel for each process. If the counter is less than or equal to 0, M.O.N.S.T.E.R. executes the process and resets the counter to its original starting value period. As this counter variable is passed as real_period to the loop function, processes know whether they have been executed on time or have been delayed. It is possible to monitor the counter values of a given set of processes and, after evaluation, to readapt these processes to new frequencies. Readaptation is no more than altering period, which is part of every process's internal data struct and can be accessed by library functions.

Preemptive reflexes are not bound directly into the loop but are handled at IRQ level and look like disturbances to the microkernel. Therefore no microkernel overhead increases the reaction time of reflexes.
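The counter mechanism described above can be sketched as follows. The field names period and dec follow the text, but struct sched_entry, sched_tick and sched_visit are hypothetical. The text does not spell out how real_period is derived from the counter, so this sketch assumes one plausible reading: the counter keeps being decremented while a due process waits, and the elapsed number of time steps is reported as period minus the counter value.

```c
/* Hypothetical per-process scheduling record. */
struct sched_entry {
    int period;      /* requested period in time steps */
    int dec;         /* countdown counter, starts at period */
    int executions;  /* how often the process has been run */
};

/* Called by the microkernel for every process at each time step. */
void sched_tick(struct sched_entry *e)
{
    e->dec--;
}

/* Called when the behavior loop reaches this process.
 * Returns 0 if the process is not yet due; otherwise resets the counter
 * and returns the number of time steps since the last execution, which
 * would be handed to loop() as real_period (assumed interpretation). */
int sched_visit(struct sched_entry *e)
{
    if (e->dec > 0)
        return 0;

    int real_period = e->period - e->dec;  /* dec == 0 means on time */
    e->dec = e->period;
    e->executions++;
    return real_period;
}
```

A process with period 3 that is visited in every time step is executed on the third step with a reported period of 3. If the loop is delayed and the process is reached one step late, the reported period is 4, which tells the process that it has been delayed.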

Tests and Limitations

M.O.N.S.T.E.R. is successfully used in our four-legged, 16-degree-of-freedom walking robot AIMEE [6], with which we participated in the RoboCup German Open 2005 Rescue League, achieving 3rd place. We implemented a bio-inspired behavior-based control concept. At the moment 37 processes are installed on our robot, of which 16 are hard-periodic, and interrupts occur regularly, caused by the serial communication to and from the operator's station.

However, there are some known limitations. There is currently no load balancing mechanism to tackle the problem that if, for example, ten behavior processes need to be executed every tenth loop cycle, they will likely all be executed in the same loop. This might lead to the situation that only a partial loop can be processed in one time slice. While our microkernel is designed to handle such situations, it would be better if they could be avoided. Additionally, no reordering of the run queue is done at the moment, which gives processes at the top of the queue a slightly higher influence. We do not provide the ability to define a minimum frequency for hard-periodic processes, but assume that their target frequencies can be met. Solutions (e.g. scheduling algorithms) for all of the above-mentioned limitations already exist and will be incorporated in the future.

5. DISCUSSION/OUTLOOK

This new approach has some very interesting properties, summarized in Table 4.

• Behavior processes are not fixed to a certain frequency.
• Behavior processes know their period and can adapt to it.
• Behavior processes lose their influence on the overall behavior if they are processed less often than others.
• The frequency of the vital processes is independent of the frequency of the behavior processes; vital processes have absolute priority, ensuring high system preservation.
• Idle time no longer exists; it is used for a background process (e.g. a planner).

Table 4. Properties of the New Concept

Crucial for this approach is the behavior's knowledge of its own execution frequency and its ability to adapt to that frequency. The adaptation may not be trivial, of course, but for standard behaviors like homing, wandering, obstacle avoidance etc., it is easy. In the future, we will run tests on our walking robots with more complicated behaviors to explore whether there are general rules for tackling this problem. Describing behaviors in the way it is done in PDL or Dual Dynamics already helps to solve the problem because of the differential equations used. If we know the time between the last execution and the current execution ($\mathit{diff}_t$) and if we define a base period ($t_{base}$) for the behavior process, we can compute an adapted output of the behavior process. For example, an obstacle avoidance behavior programmed by means of difference equations (as mentioned in Section 2) can adapt its influence on the actuators (delta) by weighting it with $t_{base}/\mathit{diff}_t$, and therefore still avoid obstacles in a less smooth but still effective way. This would be a simple but already usable adaptation of a behavior to different frequencies.

By dividing our processes into hard and soft periodic processes and by allowing preemptive interrupts, we can ensure very high system reliability. We are confident that the presented ideas make the scaling up of behavior-based programs substantially easier. Of course, scalability is still limited by the available computational resources of the system. But porting the software to new hardware will be easier because of the frequency-adaptive behaviors. In summary, this approach seems very promising, and the additional mechanisms do not introduce much overhead. On the contrary, they actually make better use of the existing resources.

What is new in our approach, compared to other approaches in which an RTOS is used for behavior-based programming, is the way the problems are solved. Instead of trying to implement a behavior-based approach like PDL by using an already existing RT microkernel or RTOS, we asked what we need to add to a PDL-like behavior-based approach to cope with the problems described above. As we saw, adding certain RT concepts solved or reduced most of the problems and resulted in a microkernel concept which integrates behavior-based concepts like the "merging of quantities" at the kernel level.

The performance of M.O.N.S.T.E.R. in comparison to existing RT microkernels will be investigated in the future. Because behavior-based concepts are integrated at the kernel level, we believe that our microkernel should compete very well. It is much better adapted to the special needs of programming concepts like PDL than a standard RT microkernel. In the future, we will increase the number of behavior processes and do further testing of the system's stability. Moreover, different merging techniques will be investigated.

6. REFERENCES

[1] R. A. Brooks, "The behavior language; user's guide," Tech. Rep., MIT AI Lab Memo 1227, April 1990.
[2] Luc Steels, Peter Stuer, and Danny Vereertbrugghen, "Issues in the physical realisation of autonomous robotic agents," in Proc. of SAB'96, 1996.
[3] Hans-Ulrich Kobialka and Herbert Jaeger, "Experiences using the dynamical system paradigm for programming robocup robots," in Proc. of AMiRE 2003, 2003, pp. 193–202.
[4] J. Zalewski, "What every engineer needs to know about rate-monotonic scheduling: A tutorial," Real-Time Magazine, vol. 1/95, pp. 6–24, 1995.
[5] C. L. Liu and J. W. Layland, "Scheduling algorithms for multiprogramming in a hard real-time environment," J. ACM, vol. 20, no. 1, pp. 46–61, 1973.
[6] M. Albrecht, T. Backhaus, S. Planthaber, H. Stoeppeler, D. Spenneberg, and F. Kirchner, "Aimee: A four legged robot for robocup rescue," in Proceedings of CLAWAR 2005. 2005, Springer.