Controlling remote systems on the Web. Pascal Ogor, Philippe Le Parc, Jean Vareille and Lionel Marcé
Project Languages and Interfaces for Intelligent Machines, Computer Science Department - EA2215, Université de Bretagne Occidentale, 20, Av. Victor Le Gorgeu, BP 809 - 29285 BREST Cedex. Tel: (+33) 298 016 487. Fax: (+33) 298 016 252. Email:
[email protected] or
[email protected]
Abstract
The remote control of machines or robots is common in industry. Different techniques and tools exist, but they all rely on the network connection between the machines and the users being private and safe. In this paper we relax this constraint and use the Internet as the means of communication. As the Internet is neither private nor safe, we propose an architecture and mechanisms to monitor and estimate the quality of the network, together with procedures that react according to these measurements. In the first part, we describe the concepts of tele-manipulation, then the architecture we propose, and after that we show how such an architecture may be validated. We conclude by presenting an experiment from our laboratory.
Keywords: remote control, network, modelling, verification, tele-manipulation.
1 Introduction
The tele-manipulation of a manufacturing system (or tele-manufacturing) is the control, by a distant user, of a system composed of different tools such as a machine-tool, a robot or a camera. The objective is to allow the user to manipulate this distant system as if he were close to it. In our approach, we consider that the manufacturing system (figure 1) can communicate its state to a computer and receive orders via the same computer, called the Local Control Unit (LCU). External sensors (such as a camera or a microphone) are also placed around the system to give another point of view on it. The distant client (also called Remote Control Unit - RCU) has all this information at his disposal in order to control the system safely. These two parts (manufacturing system and distant client) are connected by a network and depend heavily on the quality of this medium. In our approach, several RCUs may be connected to the LCU, but only one RCU at a time has control of an operative part. The others act as observers of the system, which may be useful in many domains such as tele-teaching.
Figure 1: Global architecture of a tele-manipulation system (several RCUs, each operated by a human, connected through the network to the LCU and its operative part)
Choosing a network support such as ATM guarantees a good Quality of Service (QoS), but has the drawbacks of its cost and of its limited deployment compared to a world-wide network like the Internet. Nevertheless, the Internet can be used for tele-manufacturing, and more generally for tele-surgery, tele-medicine, tele-teaching or tele-maintenance, provided that this lack of quality is taken as a main parameter when building the architecture of the system. The objective of our work is to build such an architecture, which cannot prevent changes in the quality of the network but behaves safely when the bandwidth of the network decreases or when the connection breaks down. When designing every part of the system, we have to take into account that we know neither the minimum available bandwidth nor the maximum delay. Using a communication protocol like TCP only makes it possible to know whether or not a piece of information has been transmitted over the network.
A common tele-manipulation system is presented in the first part of this paper. In the second part, we describe some strategies used in tele-robotics, and in the third, we explain how to model each component of the proposed architecture. In the fourth part, we discuss validation: to ensure a safe system, we have to verify our architecture using, if possible, formal methods. To illustrate this discussion, the fifth part presents the platform built at the LIMI (Languages and Interfaces for Intelligent Machines) laboratory.
2 The tele-robotics evolution
The first real tele-robotics systems appeared in the 1960s, during the conquest of space [She92] [She93]. The main question was "How can we teleoperate vehicles on the moon through a three-second round-trip delay due to the distance?". The answer was the "Move and Wait strategy": the operator commits only a small incremental movement (as large as reasonable without risking a collision or another error) without feedback, then stops and waits one delay period for the feedback to "catch up", and repeats the process step by step until the task is completed.
A pure "Move and Wait strategy" is not usable in the context of tele-manipulation because, for a certain amount of time, the controlled system works on its own without any means for the remote user to interact with it or even to know what is happening. Furthermore, this strategy is not economically viable because of the waiting time between orders.
A suitable solution is to automate the system. The user is no longer the operator but the supervisor. The automation consists of a loop between the control unit (local to the operative part), the operative part, its sensors and its actuators. Depending on the objective, the automation can be used to produce (tele-manufacturing) or only to ensure the security of the system (tele-maintenance). This automation strategy is now implemented on satellites, to provide a rescue procedure in case of communication failure. Using such a procedure, NASA saved the SOHO satellite in 1998 [NAS98]. Such rescue procedures have to be implemented in the context of tele-manipulation. They depend on the operative part being controlled and also on the failures observed.
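To make the cost of a pure "Move and Wait strategy" concrete, the following minimal sketch estimates how long a task takes when every blind increment is followed by one round-trip delay. The function name, its parameters and the numerical values are hypothetical and only serve the illustration; they do not come from the cited works.

```python
import math

def move_and_wait_duration(total_distance, step_size, step_duration, round_trip_delay):
    """Time needed to cover total_distance in blind increments of step_size,
    waiting one round-trip delay for feedback after each increment."""
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    steps = math.ceil(total_distance / step_size)
    return steps * (step_duration + round_trip_delay)

# Assumed figures: 10 m to cover, 0.5 m per step, 1 s per step,
# and the three-second round-trip delay quoted for the moon.
total = move_and_wait_duration(10.0, 0.5, 1.0, 3.0)
waiting = 20 * 3.0  # 20 steps, each followed by a 3 s wait
print(total, waiting / total)
```

With these assumed values, three quarters of the total time is spent waiting for feedback, which is the economic argument made above against the pure strategy.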
3 System specification
3.1 General architecture
The general architecture of the system we propose is shown in figure 2.
Figure 2: Global architecture of a tele-manipulation system (Physical devices: Robot, Camera. Local Control Unit (LCU): Tool Interfaces, Device Manager, Connection Manager, Groom, Local Client (LC) Manager, LC Sender, LC Receiver, Pinger. Internet. Remote Control Unit (RCU): RC Receiver, RC Sender, Ponger, Remote Client (RC) Manager, Cmd. Tool Interfaces. Legend: socket sender, socket receiver, static thread, dynamic thread (associated to each client), communication channel.)
Physical devices are the devices a distant user may control. They belong to different classes of devices.
The Server part is composed of two different types of processes: permanent processes, shown in grey, and dynamic processes, shown in white. The two main processes are the "Device Manager", which is in charge of the communication between the devices and the system, and the "Connection Manager", which takes care of the different connected clients. On the figure, the LCxxx processes are created to manage a connection between one client and the main system. The Client part is composed of a main process called "Remote Client Manager", which communicates with the user through the Tool Interfaces and with the system through the RC Receiver and RC Sender processes. The Pinger and Ponger processes are used to get information about the quality of the network connection.
In the following subsections, these different parts are described more precisely.
3.2 Operative part specification
The control of the devices is performed by the modules called "Tool Interface". These modules play an important role because unsafe actions made by the devices can damage the tool itself but also its environment. The behavior of these modules is modelled by specifying the stop and go modes of the devices using the Gemma methodology [ADE81]. This model represents all the possible states and evolutions of an operative part, from initialization to normal production through emergency stops. Figure 3 shows a general Gemma which may be instantiated depending on the machine to be controlled. Transitions between states are not represented, to simplify the scheme.
Figure 3: Operative part specification (states shown: Stop initial state, Start procedure, End procedure, Normal production, Stop for security, Deteriorated production)
The stop and go modes are:
Stop initial state: the state of the system when it has been stopped in the normal way.
Start procedure: the procedure to start the system, that is, its initialization.
End procedure: the procedure to stop the system. For example, an oven has to be cooled down.
Normal production: the state in which everything is working fine and the system can be used normally.
Stop for security: the state of the machine when a problem occurs and a stop is performed to put the system immediately in a safe position. Depending on the operative part, this stop can involve an action. For example, on a milling machine the mill has to be withdrawn from the workpiece.
Deteriorated production: the state in which some problem occurs (for example a large delay in the network communication) but the system can still be used in a degraded mode (reduced manipulation speed, limited movements...).
Transitions between the first four states correspond to the normal behavior of the system and depend on the status of the system and on the orders sent by the user. Transitions between these states and the last two denote a problem on the system. They are managed automatically by the "Tool Interface" processes, depending on the information coming from the tool and on the network information coming through the "Device Manager".
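The stop and go modes listed above can be sketched as a small state machine. The six modes are those of the text, but the transition table and the event names (`start`, `fault`, `network_degraded`...) are assumptions made for the illustration, since the paper deliberately leaves the transitions out of figure 3. A real instantiation would depend on the machine being controlled.

```python
from enum import Enum, auto

class Mode(Enum):
    STOP_INITIAL = auto()
    START_PROCEDURE = auto()
    END_PROCEDURE = auto()
    NORMAL_PRODUCTION = auto()
    STOP_FOR_SECURITY = auto()
    DETERIORATED_PRODUCTION = auto()

# Assumed transition table: (current mode, event) -> next mode.
TRANSITIONS = {
    # Normal behavior, driven by user orders and by the status of the system.
    (Mode.STOP_INITIAL, "start"): Mode.START_PROCEDURE,
    (Mode.START_PROCEDURE, "started"): Mode.NORMAL_PRODUCTION,
    (Mode.NORMAL_PRODUCTION, "stop"): Mode.END_PROCEDURE,
    (Mode.END_PROCEDURE, "stopped"): Mode.STOP_INITIAL,
    # Problems detected automatically (hypothetical event names).
    (Mode.NORMAL_PRODUCTION, "network_degraded"): Mode.DETERIORATED_PRODUCTION,
    (Mode.DETERIORATED_PRODUCTION, "network_restored"): Mode.NORMAL_PRODUCTION,
    (Mode.DETERIORATED_PRODUCTION, "stop"): Mode.END_PROCEDURE,
    (Mode.STOP_FOR_SECURITY, "acknowledged"): Mode.STOP_INITIAL,
}

def next_mode(mode, event):
    """Return the mode reached from `mode` on `event`.
    A "fault" leads to the security stop from any mode (assumption);
    unknown events leave the mode unchanged."""
    if event == "fault":
        return Mode.STOP_FOR_SECURITY
    return TRANSITIONS.get((mode, event), mode)

mode = Mode.STOP_INITIAL
for event in ["start", "started", "network_degraded", "fault"]:
    mode = next_mode(mode, event)
print(mode.name)
```

Keeping the table as data rather than as nested conditions makes it easier to instantiate the general scheme for a particular machine.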
3.3 Local Control Unit
The "Local Control Unit" is composed of three permanent processes and four dynamic processes for each client. All these processes communicate with each other through specific communication channels. Most of the time, these processes are sleeping: they may be awakened by another process, and they then consult their communication channels to react to the stimuli.
The goal of the "Device Manager" is to manage the different physical devices through the "Tool Interface" processes. It provides a uniform way to send orders to the devices and to receive information from them. Adding a new device should only require changing some configuration options of this process. It is connected to the "Connection Manager" to get information about the quality of the connection of the controlling client. The controlling client is also connected to the "Device Manager" and is the only one which may send orders to the devices.
The "Connection Manager" manages the life cycle of the connections between the different RCUs and the LCU: starting, observing, having the control, leaving the system. It is also responsible for giving the control of the devices to one and only one client, according to a specific algorithm which may vary with the application (priority between connections, best quality of the connection, time quantum per client...). To do its job, it is connected to the different "Local Client Manager" processes and also to the "Pinger" processes (one per client), which provide information about the network quality. This process also has to manage the available bandwidth, that is to say to raise or to decrease the flow of information according to the network quality, for example the number of images sent by a camera.
The "Groom" process listens for requests coming from new RCUs. When a request arrives, it creates a new "Local Client Manager". The latter then opens its communication channels with the "Connection Manager" and afterwards with the "Device Manager". It also starts a "Local Client Sender" process and a "Local Client Receiver" process to be able to communicate in both directions with the RCU. Finally, it creates the "Pinger" process.
Different priority levels are associated with the different processes. A high level is given to the "Device Manager", to the "Connection Manager" and to the "Local Client" which has the control of the system. A low level is associated with the "Groom" and with all the "Local Client" processes which act as observers.
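As an illustration of the two responsibilities of the "Connection Manager" described above, the sketch below grants control to exactly one client according to a selectable policy, and lowers the image rate of a camera when the measured round-trip time grows. The `Client` record, the policy names, the reference RTT and the rate formula are all assumptions made for this example; the paper only states that the algorithm may vary with the application.

```python
from dataclasses import dataclass

@dataclass
class Client:
    name: str
    priority: int   # assumed convention: higher value = more important
    rtt_ms: float   # latest round-trip time reported by this client's Pinger

def choose_controller(clients, policy="priority"):
    """Return the single client that gets control, or None if nobody is connected."""
    if not clients:
        return None
    if policy == "priority":
        # Highest priority wins; ties are broken by the best connection.
        return max(clients, key=lambda c: (c.priority, -c.rtt_ms))
    if policy == "best_quality":
        return min(clients, key=lambda c: c.rtt_ms)
    raise ValueError(f"unknown policy: {policy}")

def images_per_second(rtt_ms, max_rate=25.0, reference_rtt_ms=100.0):
    """Reduce the camera flow when the RTT exceeds a reference value (assumed rule)."""
    if rtt_ms <= reference_rtt_ms:
        return max_rate
    return max(1.0, max_rate * reference_rtt_ms / rtt_ms)

clients = [Client("a", 1, 40.0), Client("b", 3, 120.0), Client("c", 3, 80.0)]
print(choose_controller(clients, "priority").name)
print(choose_controller(clients, "best_quality").name)
print(images_per_second(200.0))
```

The clients that are not selected stay connected as observers. A time quantum policy, also mentioned in the text, would need an additional notion of elapsed time that is left out here.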
3.4 Remote Control Unit
The "Remote Control Unit" is composed of two different parts connected to a "Remote Client Manager". The first part is a set of graphical interfaces called "Command Tool Interface", which allow the user to get information from a specific device and to send it orders. One such interface is associated with each device and is designed to be as plug and play as possible. The second part mirrors the "Local Control Unit", with a "Receiver" and a "Sender" to communicate with it and a "Ponger" which answers the "Pinger" requests.
3.5 Network specification
The Internet is a strongly connected network of computers, communicating with each other using packet-switched protocols. The delay affecting the data packets depends on the route of each packet, on the different handling policies at each node traversed, and on the network congestion. Thus, it is almost impossible to determine an exact analytical model of an Internet communication. [OF98] use a black-box model to specify the Internet. To build this model, they make a simplifying assumption: the forward and feedback paths of an Internet-based tele-robotics system are characterized by different delays, T1(t) and T2(t), with RTT = T1(t) + T2(t), and by different packet losses.
Figure 4: Internet description (the Master sends X(t) and the Slave sends Y(t) across the Internet, through the delays T1 and T2)
To model the network layer, we use the same tools as for the operative part: sensors, which allow us to capture the state of the communication. The delay sensor can be implemented by sending a packet and measuring its round-trip time (RTT). This is done by the "Pinger" and "Ponger" processes. Secondly, the use of TCP (Transmission Control Protocol) lets us know whether or not each control instruction has been correctly transmitted [Com95]. TCP includes reliability features such as error recovery: if a transmission does not succeed, the packet is retransmitted after a time-out. Retransmission is repeated up to a limit that can be set by the user (in the standard, the default value is 5 seconds [Pos81]). To ensure that an order either arrives in time or does not arrive at all, this time-out has to be reduced to the maximal delay tolerated by the application. Once this characterization is made, the Internet parameters needed for a tele-manipulation application can be modelled.
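The delay sensor described above can be sketched as follows: a "Pinger" sends a probe, a "Ponger" echoes it, and the elapsed time gives the RTT. This is a minimal illustration and not the implementation of the platform. The function names and the probe payload are hypothetical, a local `socket.socketpair()` stands in for the TCP connection between the LCU and the RCU, and the receive time-out set with `settimeout` is an application-level stand-in for the maximal delay tolerated by the application, not a change to the retransmission limit of TCP itself.

```python
import socket
import threading
import time

def ponger(conn):
    """Echo back every probe received until the peer closes the connection."""
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)

def measure_rtt(sock, max_delay):
    """Send one probe and return the round-trip time in seconds,
    or None if no valid answer arrives within max_delay seconds."""
    sock.settimeout(max_delay)
    start = time.monotonic()
    try:
        sock.sendall(b"ping")
        reply = sock.recv(64)
    except socket.timeout:
        return None
    if reply != b"ping":
        return None
    return time.monotonic() - start

# Local demonstration: both ends live in the same process.
pinger_end, ponger_end = socket.socketpair()
threading.Thread(target=ponger, args=(ponger_end,), daemon=True).start()
rtt = measure_rtt(pinger_end, max_delay=0.5)
pinger_end.close()
print(rtt)
```

Returning `None` rather than a late value follows the idea stated in the text that information should arrive in time or not at all; the caller can then treat a missing measurement as a degraded or broken connection.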
4 Verification
As the proposed architecture is intended to perform safe tele-manipulation, it has to be verified using formal methods. This consists first in modelling it and then in proving some properties on the model. Verifying such a distributed system is not an easy task, even with a good quality network, and it may be carried out at several levels of abstraction.
4.1 High level verification
As a first approach, the "Local Control Unit" and the "Remote Control Units" may be seen and modelled as black boxes. We can use a typical approach based on finite state automata, one per modelled system, and describe their communications through messages. A model checker like Mec [Arn92] may then be used to verify the whole system. The main problem with this technique is that it does not take temporal constraints into account. The delay generated by the network is variable and, after a while, the system has to change its state to take this parameter into account, for example if the communication has failed.
With timed automata [NSY92], a variable delay can be represented in the model. A clock can be introduced whose value determines the state of a system subject to a variable delay. The way to do it is to classify the delay between limits. These limits may be derived from the constraints of the operative part. For example, a certain type of machine requires a delay below a certain limit to allow normal production. This delay is the communication time between the user and the operative part. In the scheme below (figure 5), the system sends a packet to the user and then waits for the response. Then, depending on the time elapsed between the emission and the response, the delay type of the user can be defined.
Start Measure
X0>Limit2; DelayType=3; Reset{}
X0