
Törngren M., Garbergs B., Berggren H., A Distributed Computer Testbed for Real–Time Control of Machinery. In Proc. of the 5th Euromicro Workshop on Real–Time Systems, Oulu, Finland, June 1993, IEEE Computer Society Press, ISSN 1068–3070.

A Distributed Computer Testbed for Real–Time Control of Machinery

Martin Törngren*, Bengt Garbergs**, Hans Berggren**

* DAMEK mechatronics group, Dept. of Machine Elements, Royal Inst. of Tech., S–10044 Stockholm.
** Dept. of real–time computer systems at the Univ. of Västerås, Sweden.
Email: [email protected]

Abstract

This paper describes a control application model and a distributed computer testbed, developed for the purpose of studying distributed real–time control of machinery. Different control approaches are considered depending on the support provided by the distributed computer system. The importance of considering the real–time cooperation constraints of control components, within a control application, is emphasized and communication subsystem properties needed to fulfill them are proposed.

Introduction

We believe that a distributed control approach is a viable alternative for many mechanical applications, including automobiles, industrial robots, production machines and rock drilling machines. With a distributed control approach, hardware and different levels of control are spatially distributed to actuators and sensors in the mechanical system. Benefits of a distributed control approach include:
– Modularity (mechanical, hardware and software related).
– Improved functionality and performance (due to local processing capabilities).
– Reduced cabling.

These and other advantages are further investigated by Wikander and Törngren [1]. The term real–time machinery is proposed by the authors to refer to this kind of machine, which incorporates an embedded and distributed computer control system. However, there are some serious problems which need to be solved in order to facilitate the design of real–time machinery:
– The design of real–time machinery is an interdisciplinary area where knowledge in mechanics, electronics, control and computer engineering sciences is required.


– There is a gap between modelling, control synthesis and verification on the one hand, and real–time computer implementation on the other. The mapping of a controller onto a distributed computer system naturally requires that all important assumptions made in controller design are known, and furthermore that the real–time computer implementation fulfills these assumptions. A distributed computer implementation may otherwise violate, e.g., assumptions made about the timing behaviour of the controller.
– Practical problems, including machine design in order to fully exploit the possibilities of distributed control, and the design of small compact computer modules which can be integrated with actuators and withstand harsh environmental conditions.

Related work

The potential benefits of distributed control have prompted research and standardization in several areas. As the problem area is multidisciplinary, related research work exists in several technical fields. In the Prometheus program, car manufacturers in Europe are now in the phase of specifying and designing vehicle internal architectures of distributed computer systems for time and safety related tasks, such as vehicle dynamics control and engine control [2]. Real–time systems research includes work on scheduling theory, communication protocols, clock synchronization and fault–tolerance. Work in the MARS project, Kopetz et al. [3], is unique in the sense that most of these issues have been considered. Real–time communication network protocols have been treated by Ramamritham [4], Arvind et al. [5], Kopetz et al. [6], Törngren and Backman [7], and by industry [8]. Not much work has considered the design of real–time machinery from a control engineering point of view. Two exceptions are work by Ray et al. [9] and Törngren [10]. In research at Trätek in Sweden and VTT in Finland, distributed control has been applied in hydraulic systems [11], [12].

This paper describes a control application model and a distributed computer testbed. Experiments and gained experiences are presented and discussed.


Control application model

A distributed control system is formed by combining a control application with a run–time system. We use the term run–time system to denote distributed hardware, a communication network and a distributed real–time executive. The executive includes a local operating system kernel and a communication subsystem (CSS). The run–time system is designed explicitly to manage resources in order to meet application timing requirements and need not include a file system, general purpose communication facilities, etc.

The control application model is built using control components and logical channels, through which the components are connected and communicate. There are three types of control components, all of which are sequential and execute periodically:
– Computation components (of global or local servo type).
– Actuator and sensor components, which have a direct interface to a physical device.

Logical channels (LC’s) are unidirectional broadcast channels which guarantee ordered transmission of data and an upper delay bound. A logical channel originates from one control component only and is uniquely identified by a source name. New data written to an LC overwrites previously written data. Transmitted data is associated with a timestamp, reflecting its creation time.

A control application may have a number of modes of operation, referring to the modes of its control components. For instance, a servo computation may use different control algorithms and may use locally stored reference values or externally received ones. Mode shifts are established via separate LC’s, termed mode channels.

A small control application model is exemplified in figure 1. It involves three degrees of freedom (DOF). The reference and mode channels may be sourced by, for example, the global servo or an operator (through a node with man–machine interaction). The latter possibility is indicated by the M symbol in the bottom right corner of figure 1.

Control components are associated with the following timing requirements:
– Computation components – {R, C, D, T}
– Actuator and sensor components – {R, C, T}

R is the release time, which specifies the earliest allowed start time of a component each period. The "start time jitter", if defined for a component, additionally specifies that the actual start time must not deviate more than the specified jitter from the stipulated release time. C is the execution time of a component each period, D is the deadline and T is the period.

A description of a control application according to figure 1, together with the attributes of the involved control components, provides information about the allocation of control components to nodes, logical channel connections, node processing load and the minimum network bandwidth required.

[Figure 1: diagram not reproduced. Labels in the figure: DOF 1, DOF 2, DOF 3; Node 1, Node 2, Node 3; sampling & actuator component (three instances); servo computation (three instances); global servo; mode channel; reference value channel; sensor output channel; communication subsystem; M.]

Figure 1. Description of a control application using the control model.
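As an illustration of how such a description yields node processing load and minimum network bandwidth, the following Python sketch may help. It is not taken from the testbed software (which is written in C); the class names, component names and all numeric values are hypothetical, and the bandwidth figure counts payload bits only, ignoring frame overhead and token passing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Component:
    """A periodic control component with the attributes {R, C, D, T} in ms."""
    name: str
    node: str
    R: float                    # release time within each period
    C: float                    # execution time per period
    T: float                    # period
    D: Optional[float] = None   # deadline; only computation components have one


@dataclass
class LogicalChannel:
    """A broadcast channel identified by its source component."""
    source: str
    period_ms: float   # assumed equal to the period of the source component
    size_bytes: int    # payload written each period


def node_load(components):
    """Sum C/T per node: the fraction of processor time each node needs."""
    load = {}
    for c in components:
        load[c.node] = load.get(c.node, 0.0) + c.C / c.T
    return load


def min_bandwidth_bps(channels):
    """Payload bits per second needed to carry every channel once per period."""
    return sum(ch.size_bytes * 8 * 1000.0 / ch.period_ms for ch in channels)


# Hypothetical one-node excerpt of an application like the one in figure 1.
components = [
    Component("dof1.sampling", "node1", R=0.0, C=0.25, T=1.0),
    Component("dof1.servo", "node1", R=0.25, C=0.5, T=1.0, D=1.0),
]
channels = [LogicalChannel("dof1.sampling", period_ms=1.0, size_bytes=20)]

load = node_load(components)
bandwidth = min_bandwidth_bps(channels)
print(load, bandwidth)
```

With these illustrative numbers, node1 needs three quarters of its processor time and the single channel needs 160 kbit/s of payload bandwidth.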



The testbed

The testbed is a distributed computer system which has been developed for the purpose of studying distributed real–time control of machinery. It has the following overall characteristics:
– Each node has a local clock.
– Control components are statically allocated to nodes.
– The control components are locally statically scheduled in each node.
– Nodes are connected through a broadcast serial bus where a dynamic communication scheduling policy (token passing) guarantees an upper bounded access delay.

Transputer modules are the basic hardware building blocks in nodes. The transputer was chosen because it provides support for distributed systems, including configuration and debugging support, as well as parallel execution and communication capabilities. All testbed software has been written in the C programming language, with library support for transputer communication and concurrency [13].

The run–time system is designed to support the application model. The implementation of logical channels is based on the concept of periodic broadcasting of data in the form of datagrams. The approach is suitable because the bulk communication load involves time–critical periodic data and most communication patterns are of one to many type. Similar approaches have previously been advocated by, among others, Rodd et al. [14], Guth et al. [15], and Kopetz et al. [3]. A somewhat similar application model has also been presented by Lawson [16].

The local kernel in each node performs scheduling of application components and provides the logical channel interface to components. The "lower layer" of the CSS provides physical and datalink layer functionality and executes on a dedicated transputer module. The "higher layer" of the CSS executes on the same transputer as the kernel and application components. Each node incorporates a local database, which is the interface between the kernel and the higher CSS layer. I/O interfacing is provided by a separate transputer module.

In the testbed a time delay estimation feature, which provides an estimate of the actual delays occurring during communication, has been introduced. The component to component communication delay depends on a number of factors, including network transmission time, medium access delay and the scheduling in a node. Some of the delays are constant and some are time–varying, but bounded in the testbed, as discussed in the section on timing aspects. Therefore the run–time system measures the most significant time–varying delay, the medium access delay, and timestamps the message accordingly, just before transmission. The receiving CSS in turn adds the known delays to the timestamp and subtracts the sum from the current local time of the receiving kernel, yielding an estimate of the creation time of the data–item. A more detailed description of the testbed can be found in [10].
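The time delay estimation can be sketched as follows. This Python fragment is only one reading of the mechanism, under the assumption that the timestamp carried in a message is the delay measured by the sender (the medium access delay) rather than an absolute clock value, so that no common time base is needed. The function name, field names and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    source: str          # source name of the logical channel
    value: float         # transmitted data
    sender_delay: int    # delay measured by the sender just before transmission


def estimate_creation_time(msg, known_delays, local_now):
    """Estimate, in the receiver's local time, when the data item was created.

    The receiver adds the known (constant) delays to the delay reported by
    the sender and subtracts the sum from its own current local time.
    """
    total_delay = msg.sender_delay + known_delays
    return local_now - total_delay


# Hypothetical numbers, all in the same time unit.
msg = Message(source="dof1.sampling", value=0.73, sender_delay=300)
created_at = estimate_creation_time(msg, known_delays=200, local_now=10_000)
print(created_at)
```

Because only delays are exchanged, the estimate is expressed in the receiving node's own clock and does not require the local clocks to agree.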

Test experiences

Logical channels and operator interface

The testbed has so far been applied in controlling a two degree of freedom mechanical system. The application model has been easy to apply to this control application, and we expect it to be suitable for more complex applications as well. System LC’s, which provide connections between kernels, have provided means to remotely control execution of control components and to request diagnostics information from kernels. Through mode change LC’s the operator can easily change the modes of one or several control components and can also, for example, source the reference value channel. Another useful property of the testbed is that state information, describing the dynamic mechanical interaction between actuators, is available on the network for maintenance and logging purposes.

The communication subsystem

Because the testbed is implemented mainly in software, running on standard transputer modules, we have had some problems with efficiency and the determination of program execution time. The lower CSS layer carries out the following tasks:
– The token passing protocol.
– The associative receiver function, which selects appropriate messages out of all incoming messages according to configuration information. Note that all messages are broadcast on the network.
– The provision of accepted messages to the higher CSS layer and the reception of outgoing messages.

The three mentioned activities are highly time critical. After each token hold time a new frame transmission starts. During the frame arrival period (token hold plus frame transmission time) all three activities need to be performed. The transputer module dedicated to the lower CSS layer incorporates an Ethernet controller, which performs DMA to and from memory in parallel with the execution of the transputer. In contrast to transputer link operation, this concurrent DMA severely degrades program execution speed and, for instance, approximately doubles the time needed for transputer link communication. One solution to this is to set the token hold time equal to the maximum time needed for the associative receiver function, plus the time for the corresponding communication to the higher CSS layer [17]. Thus all nodes, regardless of the load on their receiver, will have a constant token hold time and a new frame will be transmitted after this time. The net result is lower effective communication bandwidth.

The token passing protocol allows a single, constant length frame to be sent by each node every turn of the token. There are three main drawbacks with a token passing mechanism: the risk of losing the token means that a token recovery scheme is needed; extensibility requires extra mechanisms; and finally, the worst case medium access delay is directly proportional to the number of nodes in the system. One should note that there is no support for aperiodic messages with a deadline shorter than the token turnaround time.

A major problem with the current implementation is due to the interface between the higher and lower layers of the CSS, today based on a transputer link. The higher CSS layer needs to ensure that communication with the lower CSS layer is performed each frame arrival period. Thus, the current solution causes interference between the kernel and the communication subsystem.
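The associative receiver function carried out by the lower CSS layer can be illustrated with a short Python sketch. It assumes that the configuration information is simply the set of source names a node wants to receive and that accepted messages end up in the node's local database, where new data overwrites old data as for logical channels. The function name, message layout and source names are hypothetical and not taken from the testbed.

```python
def associative_receive(frame, accepted_sources, database):
    """Keep only messages whose source name the node is configured to accept.

    Accepted messages are written to the local database keyed by source name,
    so a newer message from the same source overwrites the previous one.
    Returns the source names accepted from this frame, in arrival order.
    """
    accepted = []
    for msg in frame:
        if msg["source"] in accepted_sources:
            database[msg["source"]] = msg
            accepted.append(msg["source"])
    return accepted


# A broadcast frame as seen by every node (hypothetical content).
frame = [
    {"source": "global_servo.ref", "value": 1.25},
    {"source": "dof2.sensor", "value": 0.40},
    {"source": "operator.mode", "value": "run"},
]

database = {}
accepted = associative_receive(
    frame, {"global_servo.ref", "operator.mode"}, database
)
print(accepted)
```

In this sketch the filtering cost grows with the number of messages in a frame, which is in line with the text's choice of a token hold time that covers the maximum time needed by the receiver function.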

Timing aspects

In judging the testbed, one important aspect is the consideration of the baseline assumptions made in application design. Sampled data theory and ordinary discrete–time control theory usually assume synchronous sampling, zero jitter and constant and known control delays [18]. Considering that coordination between several actuators requires that control loops can be closed over the communication network, and that it is necessary to synchronize the use of reference values even if a global loop is not closed, the following system aspects are of interest:
– Closed Global Servo Loop (CGSL): The delays of actual values from sensor components to the global servo.
– Open Global Servo Loop (OGSL) and CGSL: The delays of reference values to local servos.

Broadcasting of reference values provides synchronization between local servos to some degree. If local servos use pre–calculated, locally stored reference values which refer to a common time base, their use needs to be synchronized because local clocks will drift apart.

In order to test the applicability of these control approaches, the characteristics of the testbed with respect to data delay variation and local jitter were measured, see table 1. Measurements have been done using transputer high priority timers (with a resolution of 1 µs) on a three node system. The figures can be compared with the sampling periods of the local and global servo, respectively, used at the time of measurement: T_LS = [value missing in source] ms and T_GS = 10 ms.


Table 1. Measured quantities (µs).

Measured quantity                 Symbol    Value (µs)
Medium access control delay       t_MAC     160–437
Frame transmission time           t_TR      82–92
Token hold time                   t_THT     90
Transputer link communication     t_LINK    22
Local sampling jitter             e         0–33

The component to component delay is composed of the following parts:
– Communication from higher to lower CSS layer.
– Medium access delay.
– Transmission delay.
– Lower CSS layer delay in the receiving node.
– Communication from lower to higher CSS layer.

The medium access delay, t_MAC, is the time a message is waiting in the transmission queue before being transmitted on the network. The minimum value of t_MAC includes the time from the instant the lower CSS layer reads the message until the next token frame arrives, plus the token hold time and transmission setup time. The lower CSS layer polls the higher CSS layer each frame arrival period. The maximum value of t_MAC occurs when the message misses the poll which precedes the arrival of the token, thus necessitating a complete turn of the token before the message can be transmitted. The theoretical bound on t_MAC is n(t_TR + t_THT), where n is the number of nodes. The obtained maximum value of t_MAC is less because of the polling procedure. The frame size is currently large enough to fit all messages generated in a node during a turn of the token.

The transmission time, t_TR, includes DMA to and from memory, which is the reason for its time–variation. t_LINK concerns the transmission of one logical channel message (currently 20 bytes) when no external DMA is active. The communication from the higher to the lower CSS layer introduces a time–varying delay due to the polling mechanism described above. The maximum polling delay is approximately 250 µs. This delay is not currently measured by the run–time system. The lower CSS layer delay in a receiving node equals the token hold time.

Because control components execute asynchronously, the current distance between their sampling instants, referred to as execution skew, will cause an additional delay. Consider the broadcasting of reference values from one node to the other two nodes in the system. At approximately the same time, the reference values will be written to the local databases in the two nodes. However, the usage of the reference values may be delayed up to one sampling period. Thus, the message based synchronization resolution for reference values equals T_LS. The delay of actual values from sensor components to the global servo depends on the end–to–end communication delay, the execution skew, and the relation between local and global servo sampling periods, further discussed in [10].

The local sampling jitter is due to the interference between the kernel and the CSS. Its maximum value equals the execution time of the receiving higher CSS layer. The occurrence of interference depends on the ratio of the local servo sampling period and the frame arrival period.
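The theoretical bound on the medium access delay quoted above can be checked against the measurements with a few lines of Python. The sketch uses the worst-case figures reported in table 1 for the three node system, in the time unit of that table; the function and variable names are ours.

```python
def mac_delay_bound(n_nodes, t_tr_max, t_tht):
    """Theoretical bound n * (t_TR + t_THT) on the medium access delay."""
    return n_nodes * (t_tr_max + t_tht)


# Figures from table 1 (three node system).
N_NODES = 3
T_TR_MAX = 92          # longest measured frame transmission time
T_THT = 90             # constant token hold time
T_MAC_MEASURED_MAX = 437

bound = mac_delay_bound(N_NODES, T_TR_MAX, T_THT)
slack = bound - T_MAC_MEASURED_MAX
print(bound, slack)
```

The measured maximum stays below the bound of 546, consistent with the statement that the observed worst case is reduced by the polling procedure. The bound also makes the cost of extensibility visible: each added node raises it by one frame arrival period.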

Conclusions

Real–time cooperation constraints

An important conclusion drawn from the work with the testbed and associated theoretical studies is the importance of considering the real–time cooperation constraints (RT–CC’s) which the application needs to fulfill. RT–CC’s constitute baseline timing assumptions made in application design, which are particularly important to consider when a distributed computer implementation is at hand. Important RT–CC’s of control applications are as follows:
– Synchronous application execution is required.

A number of control components execute synchronously if the distance in time between related sampling instants is always smaller than a known synchronization accuracy constant. Synchronism here refers to the actual meaning of the word, i.e. events occurring simultaneously according to a common time base. This is contrasted with the use of the word in classical (non real–time) distributed systems, where synchronism refers to logical event–ordering.
– Jitter.

By jitter we refer to time variations in actual start times of a component, as opposed to the stipulated release time. It is very important for sensor and actuation components that a maximum allowed jitter is guaranteed. Jitter depends on clock accuracy, scheduling algorithms and computer architecture.
– Delays in communication between control components.

Delays between control components should be constant. The delays of interest are the end–to–end delays [10], [19]. A natural minimum requirement in time–critical systems is that bounded delays can be guaranteed by the communication subsystem.
– Consistency.

Consistency in the context of RT–CC’s concerns time consistency of data, i.e. delays of data and differences in delays between nodes, but also atomicity for discrete state variables which are not updated periodically. As an example, consider a mode change commanded by an operator. Atomicity may then be required to avoid the case that just a few nodes change mode. Consistency constraints are further elaborated in [20].

We believe that other sets of RT–CC’s can be defined for applications other than those investigated by us. Hopefully, knowledge about application related RT–CC’s can be used to improve academic application models.

If synchronous execution is not provided, control delays will be time–varying and depend on the sampling periods, communication delays and computational delays. As the resulting control system is time–variant, ordinary discrete–time theory cannot be directly applied. However, considering the characteristics of our testbed, the following approaches are possible [10]:
– To select sampling periods considering both the dynamics of the controlled process and the characteristics of the time–varying control delays. This means that the system should be overdimensioned by selecting small enough sampling periods so that the time–varying control delays become negligible. The approach may or may not be possible depending on the controlled system and the resulting increase in communication and execution load. The approach could be complemented by robust controller design.
– To utilize information about the actual control delays in order to either estimate the values of state variables at sampling instants or to apply non–standard control theory (e.g. non–periodic sampling).

It is also necessary to synchronize the use of reference values. In the testbed we have not yet closed the global servo loop. Synchronization of reference values has been based on broadcasting of reference values. It is also rather straightforward to implement a synchronization algorithm (of distributed or master–slave type), which periodically checks that the use of reference values is synchronized. The periodicity can be determined based on a pessimistic estimation of possible clock drifts, e.g. the algorithm could be executed when it is potentially possible that a number of local servos start using reference values which are one sampling period apart.
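One way to turn the pessimistic drift estimate into a check period is sketched below in Python. The drift model is our assumption, not the paper's: every local clock deviates from real time by at most a relative rate rho, so two clocks can move apart by at most 2·rho per unit of time, and they may differ by one sampling period T after T / (2·rho). The function name and the numeric values are hypothetical.

```python
import math


def resync_interval(sampling_period_s, max_drift_rate):
    """Longest interval between synchronization checks, in seconds.

    Assumes each clock drifts at most `max_drift_rate` (relative, e.g. 1e-4
    for 100 ppm) from real time, so two clocks diverge at most twice as fast.
    After this interval two local servos could, in the worst case, be using
    reference values that are one sampling period apart.
    """
    if sampling_period_s <= 0 or max_drift_rate <= 0:
        raise ValueError("sampling period and drift rate must be positive")
    return sampling_period_s / (2.0 * max_drift_rate)


# Hypothetical figures: 1 ms local servo period, clocks within 100 ppm.
interval = resync_interval(1e-3, 1e-4)
print(interval)
```

With these illustrative figures the check would have to run roughly every five seconds. A shorter sampling period or poorer clocks shorten the interval proportionally.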

Necessary improvements of the testbed

In the context of our testbed, we propose architectural features which should be provided by a communication subsystem in order to be able to fulfill the RT–CC’s of control applications.

Global clock vs local clocks: Our testbed uses local clocks. As it turns out, a distributed real–time computer system based on local clocks becomes much more complicated to manage than if a global clock were available. This fact has been pointed out earlier from the viewpoint of scheduling and handling of replica determinism [3]. Using local clocks, synchronous execution must be based on synchronization through message passing, which is liable to introduce larger jitter and dependence on a single synchronizing process. Time stamping mechanisms also become more complicated.

A robust, predictable and extensible CSS: The CSS should provide a suitable interface to applications and should operate without disturbing the execution of applications (i.e. the implementation should eliminate interference between the kernel and the CSS as far as possible). Furthermore, the CSS should be able to select the information really needed in a node, out of all state based information which is broadcast on the communication network. Appropriate resource management policies and mechanisms (for medium access control and local node scheduling) are required in order to achieve upper bounded end–to–end delays, which together with synchronous execution can provide constant control delays. Fault detection mechanisms are required to ensure that a very high proportion of potential communication errors is detected and reported as early as possible. This is a sufficient requirement in applications which have the possibility to do a safe shut down if an error occurs.

In order to provide this functionality we believe that a CSS hardware implementation is needed. Resource adequate hardware solutions are in general easier to make predictable than software solutions. CSS performance problems known from software implementations can be effectively solved by pipelined hardware architectures and a dual–port memory interface of the CSS. Accurate clock synchronization can be achieved by a hardware implementation [21]. Synthesis of the CSS from VHDL descriptions will make it possible to create accurate and efficient simulation models for overall system behaviour, including timing properties. In this approach it is also possible to integrate a hardware real–time kernel [22].

References

[1] Wikander J., Törngren M., A Mechatronic Perspective for the Design of Future Real–Time Machinery. To appear in Proc. of the International Workshop on Mechatronical Computer Systems for Perception and Action, Halmstad, Sweden, June 1993.
[2] Lawson H.W., Lindgren M., Strömberg M., Lundquist T., Lundbäck K., Johansson L., Torin J., Gunningberg P., Hansson H., Guidelines for basement: A Real–Time Architecture for Automotive Systems, Mecel, Göteborg, Sweden, May 1992.
[3] H. Kopetz, A. Damm, C. Koza, M. Mulazzani, W. Schwabl, C. Senft, R. Zainlinger, Distributed fault–tolerant real–time systems: The MARS approach, IEEE Micro, vol. 9, No. 1, 1989, pp. 25–40.
[4] Ramamritham K., Channel Characteristics in Local–Area Hard Real–Time Systems, Computer Networks and ISDN Systems 3–13, 1987. North–Holland, Amsterdam.
[5] Arvind K., Ramamritham K., Stankovic J.A., A Local Area Network Architecture for Communication in Distributed Real–Time Systems, Journal of Real–Time systems, 3, 115–147, (1991), Kluwer Academic Publishers.
[6] H. Kopetz, G. Grundsteidl, TTP – A Time Triggered Protocol for Automotive Applications, Research report nr. 16/1992, Oct. 1992, Inst. für Technische Informatik, Tech. Universität Wien.
[7] Törngren M., Backman U., Evaluation of Real–Time Communication Systems for Machine Control. To appear in Proc. of the Swedish National Real–Time Association Conference on Real–Time Systems, Stockholm, Sweden, August, 1993.
[8] Fredriksson L–B., A CAN Kingdom, Kvaser edition, Revision 2.2, 921231, Kvaser AB.
[9] Ray A., Halevi Y., Integrated Communication and Control Systems: Part I – Analysis, and Part II – Design Considerations, Trans. of ASME, Vol. 110, Dec. 1988, pp. 367–381.
[10] Törngren M., Distributed Control of Mechanical Systems. Licentiate thesis, Trita–Mae 1992:6, ISSN 0282–0048, Dept. of Machine Elements, Royal Inst. of Tech.
[11] Uusijärvi R., Distribuerad Styrning av Hydraulik. Licentiate thesis (in Swedish), Trita–Mae 1992:7, ISSN 0282–0048, Dept. of Machine Elements, The Royal Inst. of Tech.
[12] T. Virvalo, Distributed Motion Control in Hydraulics and Pneumatics, Mechatronics Journal, Vol. 2, No. 3, pp. 277–288, 1992, Pergamon Press.
[13] ANSI C toolset, D7214C, INMOS Limited, 1990.
[14] Rodd M. G., Zhao G. F., Izikowitz I., An OSI–Based Real–Time Messaging System, Journal of Real–Time systems, 2, 213–234, 1990, Kluwer Academic Publishers.
[15] Guth R., Lalive d’Epinay T., The Distributed Data Flow Aspect of Industrial Computer Systems, Proc. of the 5:th IFAC Workshop on Distributed Computer Control Systems, Sabi–Sabi, South Africa, 1983.
[16] Lawson H.W., Engineering Predictable Real–Time Systems, Lecture notes at the NATO Advanced Study Inst. on Real–Time Computing, Oct. 5–18, 1992. Course book to be publ. by Springer–Verlag.
[17] Backman U., Aspects on the Communication Structure of a Distributed Control System, Licentiate thesis, 1989, Dept. of Machine Elements, The Royal Inst. of Tech., Sweden. ISSN 0282–0048.
[18] Åström K. J., Wittenmark B., Computer controlled systems, theory and design, Second edition, 1990, Prentice Hall, ISBN 0–13–172784–2.
[19] Tindell K.W., Burns A., Wellings A.J., Guaranteeing Hard Real Time End–to–End Communications Deadlines, submitted for publication to the Journal of Real–Time systems.
[20] Kopetz H., Kim K.H., Temporal Uncertainties in Interactions among Real–Time Objects, Proc. IEEE Computer Society 9th Symp. on Reliable Distributed Systems, Oct. 1990, Huntsville, AL.
[21] H. Kopetz, W. Ochsenreiter, Clock Synchronization in Distributed Real–Time Systems, IEEE Trans. on Computers, Vol. 36, No. 8, Aug. 1987, pp. 933–940.
[22] Lindh L., A Fast Deterministic Hardware Based Real–Time Kernel, IEEE Euromicro Workshop, Athens, 3–5 June 1992.
