Building distributed run-control in UNIX
Ambrosini G.1, Bee C.P.2, Buono S.2, Ferrari R.1, Ferrara D.3, Fumagalli1, Ganev P.2, Hellman S.2, Jones R.2, Khodabandeh A.2, Mapelli L.2, Mornacchi G.2, Polesello G.1, Prigent D.2, Randot C.3, Tamburelli F.2
1: Dipartimento di Fisica dell'Universita' e Sezione INFN di Pavia, Italy
2: CERN, Geneva, Switzerland
3: Centre de Physique des Particules de Marseille, IN2P3, France
We have developed a run-control package as part of the data acquisition system for the RD13 project at CERN, which runs on RISC-based UNIX workstations and front-end processors. We have used a commercial toolkit, called Isis, which aids the implementation of distributed applications by offering facilities to perform interprocess communication and computation over multiple processes, and to provide fault-tolerance, synchronization and automatic recovery from software and hardware crashes. Isis and its use in the run-control system are described. We relate our experiences with the toolkit and explain its impact on the programming model used.
1 Introduction
Modern data acquisition systems can no longer be implemented as a single program; they involve many programs running on a network of computers. Such distributed systems demand specialized software in order to control all the aspects of the data acquisition system. Processes need to cooperate to perform processing functions; they need to share data and information. It must be possible to synchronize processes and monitor their progress. Operating systems, such as UNIX, do not provide all the services and facilities required to implement such applications, and so we have to look elsewhere to find tools that can provide the missing features. One such tool is Isis [1], a toolkit for distributed programming. Isis started life as a research project at Cornell University and has since become a commercial product distributed by Isis Distributed Systems Inc.
2 Overview of Isis
Applications that use Isis are organized into process groups. The membership of a process group can change dynamically, as new processes join the group or as existing members leave, either out of choice or because of a failure of some part of the system. A process can be a member of many process groups. Messages can be sent, or broadcast, from a process to a process group so that all the members receive a copy, without explicitly addressing the current membership. A process broadcasting a message can indicate that it wants to wait for replies from the recipients of the message. Process groups provide a convenient way of giving an abstract name to the service implemented by the membership of a group.
A process can instruct Isis to notify it when certain types of events occur. For example, a process can be notified if a new process joins a process group or if Isis fails on a particular machine.
Isis supports a set of broadcast primitives for sending messages to all members of a group. Depending on consistency requirements, the programmer may choose between atomic broadcast primitives that guarantee a global order on all messages and the more efficient causal broadcast protocol, which guarantees only that the delivery order of messages is consistent with causality.
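As an illustration, the following fragment sketches how a component might join a process group and issue a causal broadcast through the Isis C interface. The entry number, group name, command string and status replies are invented for the example, and the call signatures are indicative only; the authoritative definitions are given in the Isis manual [1].

    /* Indicative sketch only: entry number, group name and call signatures
     * are illustrative and should be checked against the ISIS manual [1]. */
    #include <stdio.h>
    #include "isis.h"

    #define CMD_ENTRY 1     /* hypothetical entry number for command messages */

    /* Handler, run as an Isis task for every broadcast delivered to the group. */
    void cmd_handler(message *mp)
    {
        char *cmd;

        msg_get(mp, "%s", &cmd);    /* unpack the command string              */
        /* ... component-specific processing of the command goes here ...     */
        reply(mp, "%d", 0);         /* answer the broadcaster with a status   */
    }

    /* Initial task: join the group and issue a causal broadcast. */
    void main_task(void *arg)
    {
        address *gaddr;
        int      nresp;
        int      answers[32];       /* reply collection, shown schematically  */

        gaddr = pg_join("rd13_components", 0);   /* join (or create) the group */

        /* Causal broadcast to the current membership, waiting for a reply
         * from every member (ALL); abcast() would be used instead where a
         * global order on the messages is required. */
        nresp = cbcast(gaddr, CMD_ENTRY, "%s", "CONFIGURE", ALL, "%d", answers);
        printf("%d components answered\n", nresp);
    }

    int main()
    {
        isis_init(0);                                /* connect to the Isis daemon */
        isis_entry(CMD_ENTRY, cmd_handler, "cmd_handler");
        isis_mainloop(main_task, NULL);              /* hand control to Isis       */
        return 0;
    }

Replacing cbcast by abcast in the same fragment would request the atomic, globally ordered protocol at the cost of extra communication.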
2.1 Tasks
A task mechanism is required for a process to be capable of controlling several sources of input and output simultaneously. A task mechanism allows several threads of control to exist within a single process. Isis provides a task system so that, for example, if a task has broadcast a message and is waiting for the replies when a new message arrives, another task may be started to handle the new message even though the first task has not yet terminated. Non-Isis sources of input and output can be incorporated into the task mechanism. The use of a tasking mechanism means the application must give control to Isis and accept being called back only when an Isis event occurs or when Isis is idle. If it is not possible to allow Isis to take control of the application, then a timer should be established to pass control to Isis periodically.
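The following sketch indicates the two ways of coexisting with the task system mentioned above: registering a non-Isis input source so that a handler runs as a task when data is pending, and, for an application that keeps its own main loop, passing control to Isis periodically. The descriptor, handler and flag values are invented, and the exact calling conventions should be taken from the Isis manual [1].

    /* Indicative sketch: incorporating a non-Isis input source into the task
     * system, or yielding control to Isis periodically.  The handler argument
     * conventions and flag values should be checked against the manual [1]. */
    #include <unistd.h>
    #include "isis.h"

    void readout_ready(void *arg)
    {
        /* Runs as a task whenever the registered descriptor has input pending;
         * a real component would read and process the data here. */
    }

    void setup_io(int readout_fd)
    {
        /* Ask Isis to watch the descriptor alongside its own message traffic. */
        isis_input(readout_fd, readout_ready, NULL);
    }

    void private_main_loop(void)
    {
        /* For an application that cannot hand its main loop to Isis: give
         * control to Isis at regular intervals so pending events are handled. */
        for (;;) {
            isis_accept_events(0);   /* flag value indicative: poll, do not block */
            sleep(1);                /* application work would go here            */
        }
    }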
3 Run Control Facility
We have used Isis to implement a run-control system as part of a data acquisition system built by the RD13 [2] project at CERN. We run Isis on a network of UNIX workstations and specialized embedded computer systems that run an operating system from Control Data Corp. which implements UNIX with real-time extensions.
The components of the data acquisition system are modelled as finite state automata which are controlled by a run-control program. The run-control program sends commands to the components in order to change their states. Starting from this model, we defined a set of process groups and message formats to support the manipulation of finite state automata. A library of routines, called rcl, was implemented to support the model and provide a framework in which component-specific code could be inserted. The finite state automata for the components are defined in a simple declarative language. A LEX and YACC based translator is then used to transform the declaration into a list of SED commands that are applied to a C code template. This code is compiled and linked with the rcl library and the code specific to the data acquisition component to produce an executable program. When the programs run, they join the predefined process groups and establish tasks that handle the messages. The library also provides a programming interface by which the run-control program can send commands to components and be informed of the result and of their current state.
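Neither the declarative language nor the generated code is reproduced here; purely as an illustration, the generated C might amount to a transition table of the following kind, with the component-specific code attached as action routines (all state, command and routine names below are invented).

    /* Hypothetical illustration of the kind of C transition table such a
     * generator could produce; names are invented and do not reproduce the
     * actual rcl template. */
    #include <stdio.h>
    #include <string.h>

    typedef enum { ST_IDLE, ST_CONFIGURED, ST_RUNNING } state_t;

    typedef struct {
        state_t     from;            /* state in which the command is legal  */
        const char *command;         /* command received from run control    */
        state_t     to;              /* state reached if the action succeeds */
        int       (*action)(void);   /* component-specific code              */
    } transition_t;

    /* Stubs standing in for the component-specific code linked with rcl. */
    static int do_configure(void) { puts("configuring"); return 0; }
    static int do_start(void)     { puts("starting");    return 0; }
    static int do_stop(void)      { puts("stopping");    return 0; }

    static const transition_t table[] = {
        { ST_IDLE,       "CONFIGURE", ST_CONFIGURED, do_configure },
        { ST_CONFIGURED, "START",     ST_RUNNING,    do_start     },
        { ST_RUNNING,    "STOP",      ST_CONFIGURED, do_stop      },
    };

    /* Applies a command to the automaton: returns the new state, or the
     * current one if the command is illegal or its action fails. */
    state_t fsm_step(state_t current, const char *command)
    {
        size_t i;
        for (i = 0; i < sizeof table / sizeof table[0]; i++) {
            if (table[i].from == current &&
                strcmp(table[i].command, command) == 0)
                return table[i].action() == 0 ? table[i].to : current;
        }
        return current;
    }

The run-control program drives such automata by sending the command strings; the same table also defines which commands are legal in which state.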
The run-control program is itself modelled as a finite state machine. To interact with the run-control program, we have developed a Motif-based graphical user interface from which the user can send commands and interrogate any process using the library. The graphical interface is independent of the number and function of the DAQ components in the system, as well as of their finite state automata.
4 Error Message Facility
We have also used Isis as the basis for an error message reporting facility. Data acquisition components report error conditions by calling a routine in the rcl library that broadcasts the condition as a message to a dedicated process group.
The members of the process group are processes that wish to capture error reports. A filter mechanism, based on UNIX regular expressions, allows members of the group to capture subsets of all the errors reported. The graphical user interface is also a member of this process group and displays all reported error messages in a window on the screen. Another process writes error reports, with a time stamp and the identification of the sender, to a log file. Filters can be downloaded to data acquisition components in order to suppress error reporting at the source. An application using suppression filters has been developed to automatically suppress the reporting of error messages that occur too frequently.
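As an indication of how a capturing process can apply such a filter, the sketch below matches an incoming report against a UNIX regular expression using the POSIX regex interface and logs it with a time stamp and the sender's identification. The report structure is hypothetical; the actual rcl message layout is not shown here.

    /* Illustrative only: filtering error reports with a regular expression.
     * The report structure is hypothetical; the rcl message layout differs. */
    #include <regex.h>
    #include <stdio.h>
    #include <time.h>

    /* A report as a capturing process might see it after unpacking the message. */
    struct error_report {
        const char *sender;     /* identification of the reporting component */
        const char *text;       /* the error text itself                     */
    };

    /* Return 1 if the report matches the filter pattern, 0 otherwise. */
    int report_matches(const struct error_report *rep, const char *pattern)
    {
        regex_t re;
        int match;

        if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
            return 0;                          /* bad pattern: capture nothing */
        match = (regexec(&re, rep->text, 0, NULL, 0) == 0);
        regfree(&re);
        return match;
    }

    /* A logging member of the error group might then record the report. */
    void log_report(FILE *logf, const struct error_report *rep)
    {
        time_t now = time(NULL);
        fprintf(logf, "%.24s %s: %s\n", ctime(&now), rep->sender, rep->text);
    }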
5 Future Work
The run-control and error message systems described above are implemented and will be used in a test beam for the RD2 [3] project at CERN in November 1992. Further developments, described below, are being made using other facilities, including tools based on Isis.
5.1 Implementing the run control library using Meta
Rather than using Isis process groups to implement the run-control library, we could use Meta [4], which is a toolkit built on top of Isis. The components of a Meta world are a control program and the applications which form an environment to be controlled. The control program observes the applications' behavior through interrogating sensors, which are functions that return values of an application's state. Similarly, the behavior of an application can be altered by using procedures called actuators. The control program, which is written in a reactive, rule-based style, uses a low-level postfix language called NPL, which is executed by a fault-tolerant distributed interpreter.
If we were to use Meta instead of Isis to implement the run-control library, we could replace the definition of Isis process groups and messages by sensors and actuators that manipulate finite state automata. The run-control program could then be written in terms of NPL statements that reference the defined sensors and actuators. This offers the advantage of simplifying the implementation of the rcl library and means the run-control program can be seen as a set of rules, written to handle particular situations, which are always ready to fire.
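Meta's sensor and actuator interface and the NPL language are documented in [4] and are not reproduced here; the fragment below is only a conceptual illustration, in C, of what a sensor and an actuator amount to for a run-control component (all names are invented and do not correspond to the Meta programming interface).

    /* Conceptual illustration only: this is NOT the Meta interface.
     * A sensor reports part of a component's state; an actuator changes it. */
    #include <string.h>

    static const char *current_state = "IDLE";   /* maintained by the component */

    /* Sensor: a function returning a value of the component's state. */
    const char *sensor_fsm_state(void)
    {
        return current_state;
    }

    /* Actuator: a procedure through which the control program alters the
     * component's behaviour, here by requesting a state transition. */
    int actuator_fsm_command(const char *command)
    {
        /* In the real library this would drive the generated transition table;
         * here we only illustrate the idea. */
        if (strcmp(command, "START") == 0)      current_state = "RUNNING";
        else if (strcmp(command, "STOP") == 0)  current_state = "IDLE";
        else return -1;                          /* unknown command */
        return 0;
    }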
5.2 Using the Resource Manager for job control
Another important part of the run-control system is the ability to start components of the DAQ system. This is typically needed at the start of a physics run or on request from the operator. In the current system we start components (executable programs) using the UNIX remote shell facility, rsh, but this is not flexible enough and does not offer much control over the programs launched. An alternative is a tool built on top of Isis called the Resource Manager [5]. The Resource Manager pools together heterogeneous workstations with spare capacity, onto which it schedules jobs, monitors their progress and deals with events such as crashes, the need to release a workstation when its owner resumes active use, and the introduction of new machines or software services. The characteristics (estimated amount of CPU time required, command line parameters, environment variables, etc.) and the resources required (file systems that need to be mounted, machine architecture type, software services, etc.) of each job are described to the Resource Manager. When a request is received to start a particular job, the
Resource Manager tries to match the job characteristics and requirements against the machines in the processor pool.
As an example, consider a monitor program which is a component of the DAQ system. We may want to run this program, and we have compiled two versions of it: one that runs on SUNs and another that runs on the front-end processors. When we submit a request to start the job, the Resource Manager selects the most appropriate available machine and runs the corresponding version of the monitoring program. Our software can be informed of the program's progress, or of its crash. We intend to create job specifications for all the components of the DAQ system in this manner and to include calls in the run-control program to launch the jobs as they are required.
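The actual job description syntax is defined by the Resource Manager [5]; the structure below merely illustrates, schematically and with invented field names, the kind of information such a description carries.

    /* Schematic illustration only: not the Resource Manager's actual job
     * description format, which is defined in [5]. */
    struct job_description {
        const char  *name;              /* e.g. "monitor"                         */
        const char  *command_line;      /* program and its parameters             */
        const char **environment;       /* environment variables to be set        */
        long         cpu_seconds;       /* estimated amount of CPU time required  */
        const char  *architecture;      /* workstation or front-end processor type */
        const char **file_systems;      /* file systems that need to be mounted   */
        const char **services;          /* software services the job depends on   */
    };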
6 Conclusions
We have found Isis to be a reliable and robust product that simplifies interprocess communication. Its notion of location independence removes an obstacle to distributing applications. It has a rich programming interface and offers a paradigm, that of process groups, for organizing distributed applications.
Its adherence to standard UNIX facilities has helped us to break down the traditional barrier between the back-end workstations and the front-end embedded electronic systems of a data acquisition system.
Meta provides a higher level of abstraction than Isis, and its rule-based control language appears to be a more flexible and realistic approach to implementing a control system. We found that the principles of sensors and actuators provide a convenient means of supporting finite state automata.
The Resource Manager is typical of a number of tools (both commercial and freely available) appearing on the market that allow the management of jobs over a number of machines. We found it to be flexible, allowing a large variety of job characteristics and requirements to be taken into account, and well suited to the needs of a run-control system.
Isis and its related tools offer features which one might hope to find in future operating systems; even if these tools are not actually used in future experiments, they offer many of the facilities we will require and are hence a step in the right direction.
For further information on Isis and related tools, contact Kenneth P. Birman, 4105 Upson Hall, Dept. of Computer Science, Cornell University, Ithaca, NY 14853 (USA). Email:
[email protected]
7 References
1. K.P. Birman et al., The ISIS System Manual, Version 2.1.
2. L. Mapelli et al., A Scalable Data Taking System at a Test Beam for LHC, CERN/DRDC/91-1.
3. A. Clark, C. Goessling et al., A Proposal to Study a Tracking/Preshower Detector for the LHC, CERN/DRDC/90-27.
4. M.D. Wood, The Meta Toolkit Version 2.1 Functional Description.
5. T. Clark, K. Birman, Using the ISIS Resource Manager for Distributed, Fault-Tolerant Computing, Dept. of Computer Science, Cornell Univ., USA.