The SharC Cognitive Control Architecture
Deliverable D x I3-[SharC]; Workpackage 2
Robert Ross
26 May 2004
Version: 0.8
SFB/TR 8: I3-[SharC]
Collaborative Research Center for Spatial Cognition, Universität Bremen
http://www.sfbtr8.uni-bremen.de/project.html?project=I3
e-mail: robertr AT uni-bremen dot de
Abstract

In this report we present the first draft of the SharC Cognitive Control Architecture. The SharC architecture is a hybrid design based on the premise that Multi-Agent Systems (MAS), with their benefits in distributing control of complex systems, can provide an abstraction for robot architecture design that overcomes many of the limitations of current monolithic approaches. The architecture, implemented with the AgentFactory Framework, uses a society of intentional agents within each robot. By using societies of intentional agents rather than dumb components, we can provide a robust, scalable and open robot architecture, which retains the key benefits of deliberative and reactive control, while providing an infrastructure for more natural human-robot dialog. Details of a variant of the architecture, specifically implemented for the Bremen Autonomous Wheelchair, Rolland, are presented.
Acknowledgements

The Collaborative Research Center for Spatial Cognition (Sonderforschungsbereich/Transregio SFB/TR 8) of the Universities of Bremen and Freiburg is funded by the Deutsche Forschungsgemeinschaft (DFG), whose support we gratefully acknowledge.
Document History

Version   Date        Author    Comments
0.8       30 May 04   robertr   Initial Version
Contents

1 Introduction
2 Robot Control Architectures
  2.1 Evolution of Robotic Control Paradigms
    2.1.1 Sense-Plan-Act Robotics
    2.1.2 Reactive & Behavioural Robotics
    2.1.3 Hybrid Architectures
    2.1.4 Humanoid Robotics
    2.1.5 Collective Robotics
    2.1.6 Social Robotics
    2.1.7 Safe Robotics
  2.2 Case Studies in Social and Service Robot Architectures
    2.2.1 Atlantis
    2.2.2 Saphira
    2.2.3 The Social Robotic Architecture
    2.2.4 Intelligent Machine Architecture
  2.3 Comparison of Case Study Architectures
  2.4 Discussion
3 SharC Architecture Description
  3.1 Architectural Approach
  3.2 The SharC Architecture for Rolland
  3.3 Architectural Framework Implementation
  3.4 Anatomy of a SharC Agent
    3.4.1 Components
    3.4.2 Agents
4 The SharC Architecture for Rolland
  4.1 Ontologies
  4.2 Software Agents/Components
  4.3 Main Component Relationships
Chapter 1
Introduction

The Service Robot is a mobile robot intended for use in everyday environments, including domestic, office, and light industrial domains. Unlike industrial robots, the service robot must be capable of safely performing a wide variety of goal-oriented tasks in a dynamic environment. Interestingly, these tasks might often need to be performed with, or under the supervision of, technically naive users.

Operation in a dynamic environment, with the possibility of multiple users, introduces the potential for conflicting knowledge and instructions. Service robots will need to arbitrate between conflicting goals and derive appropriate courses of action. All necessary reasoning, along with task execution, will have to be performed without perturbing the underlying reactivity of the robot.

Interaction with technically naive users requires the development of technologies that allow information to be exchanged between a user and the robot's internal state. Users must be allowed to instruct and query the robot in a natural manner. Similarly, robots must be capable of reporting their own state to users, particularly in erroneous situations. Natural language has long been acknowledged as the potentially most fruitful, but notoriously difficult, modality in human-machine interaction. Despite the difficulties encountered, customer preference will inevitably require that the Service Robot make full use of this medium. Therefore, users must be allowed to express instructions and receive information about a robot's state using natural language. This requires not only the development of natural language synthesis and recognition technologies, but also the provision of methodologies for dialog analysis, and for translating between a robot's internal state and natural language.

The provision of robotic systems that can exchange information with a user in a natural way, and arbitrate between conflicting goals, will require incorporating a large body of diverse software and hardware components. Such diversity, together with deployment in end-user environments, imposes additional integration and safety requirements.

To address all of the above requirements we must look at how a system's components are put together. A traditional approach to systems integration, common in consumer electronics design, is to build the complete control system into one monolithic architecture. Such an approach is suitable for devices of limited computational power, but does nothing to address the special requirements of Service Robotics. Robot Control Architectures, or more generally Cognitive Architectures, are an area of research that looks
at the development of intelligent control systems with different layers of intelligence.

In this document we present the first draft of a cognitive control architecture for Service Robots. Our architecture, based on a more abstract Multiagent Architecture for Robot Control (?), decomposes the control stack into a community of intentional, deliberative agents. The agent community is based around a well-formed Agent Oriented (AO) programming language, and uses formally defined ontologies to allow the agents to reason about their own state as well as their environment. In Chapter 3 the SharC architecture is presented, describing the architectural principles and the tools used to implement such an architecture. This is followed in Chapter 4 with a description of all agents to be used in the application of the SharC Cognitive Architecture to Rolland, the Bremen Autonomous Wheelchair. First, however, Chapter 2 reviews the history of robot control architectures, presents a number of robot control examples, and analyses the limitations of each of these approaches.
Chapter 2
Robot Control Architectures

Robot architectures have been studied for over three decades. Before we consider the development of a new architecture for robot control, we must consider what has already been done in the field and, more importantly, what are seen as the failings of these approaches. A review of the robot architecture schools and an analysis of a number of case studies now follows.

Robotic control systems have evolved through a series of radically different approaches over the past 30 years. The development of these systems has been highly influenced by research in other areas of computer science and electronic engineering. Although no single winning architecture has yet been developed, recent research has been pushing towards the use of hybrid open architectures, which allow for learning and sophisticated control in complex environments. The fundamental approaches to robotic control systems are described in Section 2.1, followed by a detailed comparison of a number of different hybrid architectures in Section 2.2.
2.1 Evolution of Robotic Control Paradigms
Robotic control development has taken a two track approach as seen in Figure 2.1. Typically initial development is done in the field of single robot implementations, and ideas from this work are then incorporated into multi-robot designs. Each of the main architecture types will now be discussed, along with some of the better known implementations of these architectures.
2.1.1 Sense-Plan-Act Robotics
Given that the classical execution of an algorithm was typically to (a) gather data, (b) process this data, and (c) output a result, the first generation of robotic control systems centred around a similar notion of sensing or gathering data, planning new actions based on this data, and acting out these plans. These Sense-Plan-Act (SPA) architectures aimed to execute all functionality of a robotic system, from map building to basic motor control, in this algorithmic style. The approaches taken were highly influenced by classical AI techniques, typically involving the construction of a world model and symbolic reasoning on this model.

Figure 2.1: An Overview of Robot Control Architecture Schools

One of the most famous of the SPA robots was Shakey, developed in what was then the Stanford Research Institute (?). Equipped with a vision system, bumpers, and a triangulation range finder, Shakey was able to perform basic navigation tasks. Shakey used a high level representation to plan and take action. Unfortunately, when situated in the real world, Shakey's planning system, like the planning systems of other SPA based robots, was unable to perform its job in a timely fashion. The system would produce a plan, but before this plan could be executed in full, it often became invalidated by changes in the real world.
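As a purely illustrative sketch of the SPA control style (hypothetical Java interfaces, not code from Shakey or any other cited system), the whole control problem is reduced to a single serial loop that rebuilds a world model and replans on every cycle before acting:

/* Hypothetical Sense-Plan-Act loop: each cycle senses, replans over a world model, then acts. */
class SensePlanActRobot {

    interface Sensor   { WorldModel sense(); }
    interface Planner  { Plan plan(WorldModel model, String goal); }
    interface Executor { void execute(Plan plan); }

    record WorldModel(String symbolicFacts) {}
    record Plan(java.util.List<String> steps) {}

    void run(Sensor sensor, Planner planner, Executor executor, String goal) {
        while (true) {
            WorldModel model = sensor.sense();        // (a) gather data
            Plan plan = planner.plan(model, goal);    // (b) symbolic planning over the model
            executor.execute(plan);                   // (c) act out the plan
            // Weakness noted above: by the time the plan finishes executing,
            // the world may have changed enough to invalidate it.
        }
    }
}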
2.1.2 Reactive & Behavioural Robotics
Frustrated with the inability of SPA architectures to perform even the simplest real world operations, researchers searched for a robotic control method that did not rely on high level reasoning. In 1986 Rodney A. Brooks published a ground breaking article which detailed what would become the most famous of a new class of reactive or behavioural architectures. The Subsumption Architecture (?) was characterised by (a) a lack of representation of the outside world, (b) the analysis of the architecture on a task rather than a functional basis, (c) the subsuming of behaviours by higher level behaviours, (d) a tight coupling between sensors and actuators. Brooks claimed that where SPA architectures had to be redesigned substantially to allow for the inclusion of new abilities in the robotic system, the Subsumption Architecture allowed new abilities to be added by simply adding a new behavioural module to the
system. The basic premise was that the system's design should be such that it allowed for a new behaviour to be added to the system without destructively interfering with behaviours already present. The original design approach (?) allowed higher level behaviours to override or subsume lower level behaviours when necessary. Such an implementation was effective for simple tasks like navigation, but failed on anything more complex.

In the years that followed the Subsumption Architecture, the school of reactive and behavioural robotics emerged. Architectures from this school are typified by their rejection of a symbolic representation of the outside world, along with a bottom-up approach to achieving complex behaviours (?, ?). One area in which architectures from the behavioural school differed was on the question of how multiple behaviours could be effectively combined. The subsumption of behaviours as advocated by Brooks was one approach; others included the use of fuzzy logic as seen in the behaviour layer of Saphira (?), and the dynamic systems approach advocated by Luc Steels (?).

The execution of behaviours in parallel is only one of two behaviour selection problems. The other problem is how to sequence behaviours to perform complex tasks. One approach immediately taken was the development of behaviour sequencing languages such as RAPs (?) and PRS (?). These behaviour sequencing languages allowed complex tasks to be modelled as behaviour sequences and sub-sequences, and have been used in the construction of multi-robot mission specification systems (?) and as the primary sequencer layer in some hybrid architectures (?). These behaviour sequencing languages became more complex in an attempt to scale the behavioural approach to complex tasks. In addition, work has progressed on robot learning techniques (?, ?). Through the development of self-taught, human-taught, and robot-taught learning strategies, researchers hope to circumvent the need to explicitly develop robot behaviours on a drawing board prior to the robot's deployment. Such flexibility in design would give robot designs a longer lifetime, since new tasks could easily be integrated into the system.

Despite advances in behaviour sequencing and learning, it became clear to many that a purely behaviourist approach would not scale to complex intelligent behaviour (?). Even Brooks acknowledged that the move to human level intelligence would require more than the original behaviourist approach (?). Due to the lack of symbolic reasoning and representation, purely behaviourist architectures could not plan future actions in any way. Despite a general trend away from the purely behaviourist approach, some researchers were of the opinion that many gave up on the purely behaviourist approach too early, and were unwilling to make the commitment needed to develop complex behavioural systems (?). In either case, elements from the behavioural approach are still key components in most robotic control systems, and their success on platforms such as the Sony Aibo artificial pet (?) is a testament to their importance in developing robots that can cope effectively in complex dynamic environments.
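To make the behaviour arbitration question concrete, the following minimal sketch shows a much simplified, fixed priority arbiter in the spirit of subsumption, where the highest priority active behaviour suppresses those below it. The sketch is in Java and purely illustrative; the types and behaviour names are hypothetical and are not taken from any of the systems cited above.

import java.util.List;

/* A behaviour proposes a motor command whenever its trigger condition holds. */
interface Behaviour {
    boolean isActive(Sensors s);      // trigger condition
    MotorCommand propose(Sensors s);  // command issued while active
}

record Sensors(double frontDistance, boolean bumperPressed) {}
record MotorCommand(double translation, double rotation) {}

/* Fixed priority, subsumption-style arbiter: behaviours are ordered from highest
   to lowest priority, and the first active behaviour suppresses all those below it. */
class SubsumptionArbiter {
    private final List<Behaviour> prioritised;

    SubsumptionArbiter(List<Behaviour> prioritised) {
        this.prioritised = prioritised;
    }

    MotorCommand select(Sensors s) {
        for (Behaviour b : prioritised) {
            if (b.isActive(s)) {
                return b.propose(s);          // higher layer subsumes lower layers
            }
        }
        return new MotorCommand(0.0, 0.0);    // no behaviour active: stand still
    }
}

An "avoid obstacle" behaviour placed above a "wander" behaviour would, for example, take over whenever the bumper is pressed; fuzzy logic combination, as used in Saphira, would instead blend the proposed commands rather than selecting a single winner.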
2.1.3 Hybrid Architectures
SPA robotics failed to produce good experimental results because the level of planning and other cognitive tasks attempted was too complex to cope with in a real world
environment. Reactive robotics suffered in another way, in that only immediate reactive actions could be performed, allowing for the implementation of some fast insect-like behaviours, but incapable of performing complex tasks which needed to be 'thought about' ahead of their execution. The natural progression was therefore the development of architectures comprising both reactive and deliberative components. These hybrid architectures may be characterised by a layering of capabilities, where low level layers provide reactive capabilities, and high level layers provide the more computationally intensive deliberative capabilities. The most popular variant of these hybrid architectures is the Three Layered Architecture (TLA), as reviewed by Gat (?). Gat characterises the TLA anatomy in terms of a three layered software architecture, where the amount of computation performed by components increases as we move further up the layers.

The bottom layer, or Controller, implements low level functionality that is used to service sensors and drive motors. In terms of software engineering the Controller would be a series of drivers with basic responses, while from a biological point of view the Controller is the set of nerve connections to muscles and other organs. Controller elements should have low computational complexity to allow them to react quickly to stimuli and execute basic behaviours speedily. Since the Controller is effectively a collection of drivers, its elements are typically written as C modules.

Above the Controller layer sits the Sequencer, which has to decide which primitive behaviours are to be executed and in what order. Naturally, this ordering cannot be a simple linear list, since the environment being operated in can change unexpectedly, and primitive behaviours in the Controller can fail. The most common method used to implement the Sequencer layer is to use a conditional sequencing language such as RAPs, PRS or ESL, which have been specifically designed for the purpose of sequencing low level behaviours. In biological terms, the Sequencer can be thought of as the execution of a routine task, or set of tasks, by an animal, such as movement or a reflex reaction.

The Planner or Deliberative layer can be implemented in standard programming languages. The Planner contains the heaviest computational components, traditionally containing an exponential or high polynomial state space searcher. The Planner interacts with the Sequencer in two basic ways: (a) the Planner makes plans for the Sequencer to execute directly; (b) the Planner answers information requests from the Sequencer.

The most notable of the TLA style architectures are Atlantis by Gat (?), SSS by Connell (?), and 3T by Bonasso et al. (?). Some other variations on hybrid architectures include Firby's two layer architecture based on RAPs (?), the Saphira architecture (?), and Arkin's AuRA (?). Although reactive robotics rejected the need for symbolic representation of the outside world, hybrid architectures have reintroduced symbolic reasoning as a necessity for planning complex task execution. It should also be noted that symbolic reasoning and grounding (?) have become research subjects of their own from a robotics perspective (?), rather than a purely Cognitive Science discipline.
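As a minimal sketch of this three layer decomposition (again illustrative Java with hypothetical interface names, not code from Atlantis, SSS or 3T), the Controller can be run on every cycle, the Sequencer at a slower rate, and the Planner in its own thread so that slow deliberation never blocks the reactive layers:

/* Minimal three layer (TLA) skeleton: a fast Controller, a slower Sequencer,
   and a Planner that deliberates asynchronously and only advises. */
class ThreeLayerRobot {

    interface Controller { void step(); }                   // services sensors and motors
    interface Sequencer  { void dispatchNextBehaviour(); }  // orders primitive behaviours
    interface Planner    { void deliberate(); }             // heavy search; produces advice

    private final Controller controller;
    private final Sequencer sequencer;
    private final Planner planner;

    ThreeLayerRobot(Controller c, Sequencer s, Planner p) {
        controller = c;
        sequencer = s;
        planner = p;
    }

    void run() {
        // The Planner runs in its own thread so that deliberation cannot
        // block the reactive layers below it.
        Thread deliberation = new Thread(() -> {
            while (true) {
                planner.deliberate();
            }
        });
        deliberation.setDaemon(true);
        deliberation.start();

        long tick = 0;
        while (true) {
            controller.step();                               // every cycle
            if (tick % 10 == 0) {
                sequencer.dispatchNextBehaviour();           // less frequently
            }
            tick++;
        }
    }
}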
2.1.4 Humanoid Robotics
It could be argued that one of the main driving goals of robotics has always been the development of robots that look and act like humans. There have been two limiting factors in the pursuit of such a goal, namely the physical implementation of a humanoid body, and the development of control systems with human-like intelligence. The design of the physical robot has focused firstly on the construction of body parts like limbs, eyes and the face, and secondly on the design of hardware which acts as a container for the robot's control system, i.e. the physical brain. Research in the development of body parts has progressed steadily over the past 20 years. Even when processor speeds were comparatively slow, and memory prices high, work could still be carried out on the design of objects like artificial hands, which were more dependent on motor technology than processor speed. On the other hand, the design of the robot's physical brain has often stuck rigidly to the path set out by processor technology. Expensive custom processors, which were limited in speed, were originally used as the CPU for many robot implementations. Cheap fast architectures based on the PC have since allowed for the use of far more complicated computation in robot control systems than was previously possible.

As the processing ability of robotic systems has improved, so too has the amount of computation seen in the higher levels of robot control systems. Humanoid robot control architectures (?, ?), like other autonomous robot architectures in recent years (?, ?, ?), have been hybrids based on the original TLA approach. These architectures differ mainly in the amount and type of deliberative or cognitive processing performed by the robot. In all cases, the basic premise of the three layer architecture is that both low level reactive behaviour and high level cognitive behaviour are needed if the system is to cope with complicated tasks in the real world.

A core theme in humanoid robotics is the need for a human to be able to interact with a robot in a natural way, using gesture and natural language. Original work on integrating natural language capabilities focused on adding basic audio command interpretation behaviours to the robot (?). Such systems were very similar to voice command and basic conversational dialog interaction managers (?, ?). Interesting uses of voice interaction have been attempts to teach robots using basic verbal instructions (?). These dialog systems for robots are characterised by the disjunction of the audio and natural language processes from the core of the control architecture. The notion of integrating the natural language processing system more directly as a core element of the architecture has only recently been given attention (?, ?).

As a research topic, humanoid robotics is more concerned with achieving human-like behaviour than with how this behaviour is achieved. Hence, control architectures for humanoid robots can often be greatly biased towards either a behaviourist approach (?, ?, ?) or a cognitive approach (?), but more often are examples of hybrid architectures (?, ?, ?, ?, ?). Sense-Plan-Act architectures, behaviourist architectures, hybrid architectures and humanoid robotics have in general studied the design and behaviour of single autonomous robots. In the following two subsections, collective robotics and social robotics are introduced as research disciplines which study group behaviour amongst architecturally simple and complex robots respectively.
2.1.5 Collective Robotics
A natural solution to the problem of the intellectually limited reactive robot was the adoption of an approach using a team of robots to perform complex tasks (?). Collective Robotics is a general term used to describe research aimed at studying techniques which allow a team of robots to perform a task that cannot be performed, or would be performed less efficiently, by an individual robot. Dudek (?) provides an excellent taxonomy of collective robotic systems in terms of features such as communication method, collective size and individual robot intelligence. Collective robotics is largely inspired by biological evidence that high level emergent behaviour can be seen in a collection of simple organisms (?). The classic example of this phenomenon is the construction of an architecturally complex ant hill by a colony of apparently simple individual ants, without any central coordination system.

Cooperative Robotics is the largest branch of collective robotics, and is characterised by the use of a cooperative interaction strategy between team members, rather than a competitive one. Experiments have been performed that appraise the effective intelligence of a group, given different levels of communication between individuals in the group (?). A major difficulty in collective or cooperative robotics research is the quantification of cooperation in a robot team using a given architecture (?). Since quantitative analysis of multi-robot performance is difficult, actual competitions in areas such as Robot Soccer (?) have been used to compare and contrast different cooperative techniques. Clear similarities exist between the fields of collective robotics research and MAS. In this sense, collective robotics entails a bottom-up approach to problem solving, where a number of possibly heterogeneous robots cooperate, with some goal being achieved as a by-product of that cooperation.

Cooperative robotics differs from social robotics in the level of intelligence bestowed on the individual team members. Whereas the robots used in a cooperative robotic team are simple, behavioural entities, those used in social robotic teams are typically of a hybrid architecture design, allowing any individual robot to model and reason about other robots in its environment.
2.1.6 Social Robotics
Where collective robotics has mainly focused on emergent intelligent behaviour from a collection of cognitively limited, insect-like robots, Social Robotics studies group robotic behaviour in teams of robots based on hybrid style architectures. Individual robots within a social robot collective are capable of modelling other robots within the collective and of using this model to reason and plan to achieve their goals. MAS have provided social robotics with a framework by which robots may reason about the actions of humans as well as other robots (?), and have provided means by which robots may communicate abstract ideas to one another (?). Social Robotics is still a relatively new research discipline. Although work has been carried out on the construction of social robots (?), as well as the analysis of embodiment in social agents (?), work has yet to be started on the analysis of group behaviour in social robot communities.
2.1.7 Safe Robotics
Real world applications for robotic systems invariably place responsibility with the robot for both its own safe operation, and the safety of its environment and those agents (artificial and human) with which it interacts. As robotic applications take on more and more responsibility, it may be advisable for us to be able to make guarantees about the robot's operation. One approach which can be taken to address this requirement is the adoption of methodologies from the formal methods community to highly engineer the system, and formally analyse its operational model to eliminate any sources of error. This Safe Robotics (?) approach aims to eliminate all unpredictable behaviour in the robot's actions.

This highly engineered approach has the clear advantage that certain guarantees can be made about the robot's behaviour in dangerous situations. In practice, however, robotic systems are often highly complex in nature, and formal analysis of such complex systems is extremely difficult (if not computationally intractable). Using a simplified model of the robot's behaviour can reduce the problem to something which can be analysed. The approach also has another limitation in that, by definition, the formal analysis of a robot's actions aims to eliminate autonomy and emergent ability from the robot. Essentially the robot is purposefully reduced to being a mechanism in order to make guarantees of behaviour. At first glance this is clearly in contrast with the goal of truly intelligent mobile robots. In the long term there may, however, be possibilities to find a middle ground between the two approaches, where certain safety guarantees can be made about the robot's behaviour, while still leaving room for emergent activity on the part of the robot.
2.2 Case Studies in Social and Service Robot Architectures
It is now commonly accepted that a purely reactive, or purely deliberative approach to the design of a robot control architecture is limited. Therefore the following case studies will only concern architectures which are hybrid in nature. It should be noted that not all architectures are equally hybrid, with many hybrid architectures choosing to omit an explicit planning layer in favour of using plan libraries or a behaviour sequencer. Each case study aims to identify core elements which are essential to providing an intelligent robot control architecture. These architectures will be compared in the next section to produce a set of requirements for an intelligent robot control architecture.
2.2.1 Atlantis
Description

Depicted in figure 2.2, Atlantis by Gat was one of the first hybrid architectures to emerge in the early 90s (?). Atlantis is a true hybrid architecture with separate layers for functional control, behaviour sequencing, and high level deliberation/planning. Atlantis's
sequencing layer is based around Firby’s RAPs system. The deliberation layer provides a planner which operates on a world model to provide suggested plans of action. Essentially the deliberator provides RAPs which are executed by the sequencing layer.
Figure 2.2: The Atlantis Architecture
At the time, Gat was keen to point out that the deliberator merely provided advice to the sequencer and was not in the control loop. In essence, the deliberator ran in its own thread, allowing the sequencer layer and the control layer to provide reactive capabilities to the system. This was a key step in the design of a truly hybrid system, with independent components taking care of the three different levels of control.

Analysis

The architecture was one of the first of its kind, but was only designed to provide a limited amount of functionality to any robot implementing it. There was only one central planning system, which essentially meant that as the robot's capabilities and world view grew, the complexity of this one component would grow at a huge rate. In essence, there was no distribution of high level control, which would eventually lead to the controller module becoming bulky and overburdened.
2.2.2 Saphira
Description

Shown in figure 2.3, Saphira (?) is a hybrid architecture based around: (a) a collection of reactive and goal seeking behaviours; (b) a monolithic representation of the outside world called LPS (Local Perceptual Space); and (c) a high level PRS based controller which was used to decide the sequence of behaviour activations. Behaviours can be combined using fuzzy logic, but designers must take care to avoid combining conflicting behaviours. A limited form of interaction via the speech channel was possible, with speech input fed directly into the high level controller.
Figure 2.3: The Saphira Architecture. Perceptors are presented on the left hand side, with actuators on the right. As we move vertically up the cognitive abstraction increases, culminating with the PRS-Lite interpreter which provides high level reasoning for the robot.
Analysis

The interesting thing about Saphira is its use of the PRS-Lite reasoner to provide the high level cognitive abilities of the robotic agent. PRS is broadly designed around a BDI architecture, and as such results in Saphira being one of the first major robot
architecture implementations to model the robot as an intentional agent. Saphira was not, however, a true three layer hybrid architecture in the style of Atlantis. Instead it was essentially a two layer architecture with low level controllers and behaviour sequencing. Although PRS-Lite did provide a reactive controller mechanism, it did not perform planning in the classical AI sense of the term. Thus the system was limited to those behaviours which were coded into the schema library at deployment time.
2.2.3 The Social Robotic Architecture
Description

The Social Robotic Architecture, or SRA, as shown in figure 2.4, is a hybrid robot control architecture developed at University College Dublin (?). Broadly speaking, the SRA was inspired by the Machiavellian Intelligence Hypothesis (?), which proposes that the development of high level cognitive ability came about due to a need to deal with complex social interactions, rather than a need to improve the performance of manual tasks. Interesting features of this architecture include:
• The use of an intentional BDI style agent at the deliberative level.
• The use of a collection of movement and visual analysis behaviours to provide a reactive level of control.
• The ability of SRA based robots to work as a team, thanks to their social ability to model and reason about other agents in their environment.
Figure 2.4: The Social Robotic Architecture.
Analysis

The unique and interesting feature of this architecture is that it was one of the first to clearly acknowledge the existence of the robot agent as only one component in a greater
social community of robots. Until this point, robot architectures had rarely considered the interactions of the robot with those around it in a social manner. Cooperative robotics had looked at a reactive robot's interactions with other reactive agents, but the study of social robots was only beginning. That said, the architecture did have limitations:
• The SRA, like many robot architectures, did not take full advantage of the power of MAS in the design of the internals of the robot. Although it was not expressly precluded, the deliberative layer did not split control amongst a number of intentional agents. Instead, all behaviour was controlled by one centralised agent. At the reactive level, all behaviour components were encapsulated as libraries and modules, where an agent encapsulation may have been a possibility.
• In practice, little was done to address the question of coordination of reactive behaviours. Although subsumption was quoted as being used in the arbitration process, there was in practice no such subsumption, nor any other form of coordination of conflicting behaviours.
• Although the singling out of social ability as a separate layer of the architecture was interesting, this led to a discontinuity in the integration of social abilities with the rest of the robot's control systems.
• Like Saphira, the SRA was not a true three layer hybrid architecture in the style of Atlantis. Although the intentional agent at the deliberative layer did provide some form of reactive choice and provision of plan schemas, this was not a true deliberative layer. In practice, neither goals nor means-end reasoning were provided, hence limiting the robot to pre-formulated plans and scenarios.
2.2.4 Intelligent Machine Architecture
Description

The Intelligent Machine Architecture (IMA) has been used in recent years in the production of a number of different platforms in the Intelligent Robotics Lab at Vanderbilt University, Tennessee (?, ?). Figure 2.5 presents an abstraction of the IMA, along with the schematic of a specific robot control application based around the IMA. The IMA is not in itself a robot control architecture. It is an application framework which provides agents for use in robot control architectures. As seen in figure 2.5, IMA is generally speaking a hybrid architecture which goes from an inner core of actuator agents to outer levels of composite behaviour and task interaction. The core selling point of IMA and its application in ISAC is that all aspects of the robot's control system are modelled around agents. Agents are loosely coupled and communicate through DCOM (Distributed Component Object Model) over multiple platforms.

Figure 2.5: The Intelligent Machine Architecture (top) along with a schematic of an architecture implemented for the ISAC robot.

Analysis

IMA was developed with the aim of meeting the requirements of the integration of large scale software systems for mobile robots. The use of multiple agents to provide all levels of control is a key selling point of the architecture and worthy of study. The architecture is however limited on a number of levels:
• IMA agents are modelled as software objects which communicate over DCOM. The agents do not have inbuilt intelligence in the shape of communication or cooperation protocols. Essentially the agents are socially dumb, and require a lot more work before any advantage can be gained from the agent abstraction.
• It has not been made clear how either planning or behaviour sequencing will be achieved for the control of an autonomous robot. To date their applications centre around direct human control of a complex software system. For the control architecture to be used in the development of autonomous robotic systems, these questions will need to be addressed.
2.3 Comparison of Case Study Architectures
A comparison of the case study architectures is now made. The qualities under which the architectures are compared are based on the perceived advantages and disadvantages of each of the case studies. Figure 2.6 presents a summary of this comparison.

Figure 2.6: A feature based comparison of the Robot Control Architectures discussed above.

• Composition - An architecture's composition indicates how the software is constructed and fits together to form one complete robot control system. These can range from monolithic C style architectures to agent oriented designs. Both Atlantis and Saphira make use of a modular design. Unfortunately, little detail is given as to how these components are connected. The architectures' descriptions indicate that coupling between the modules is static, hence indicating low level network communication or direct method calls between modules. The SRA is implemented around one deliberative agent. All behaviours or controllers are statically accessed by actuators and perceptors which are embedded within the
deliberative agent. IMA/ISAC takes a more flexible approach through the construction of a MAS which controls the whole robot. The individual agents within this MAS are however limited, being essentially reactive in style, providing bare legacy system encapsulation. Due to this static design, the MAS is limited in not being able to adapt to new situations easily.

• Explicit Deliberation/Planning Capabilities - Explicit planning capabilities in the classical AI sense are algorithms which can build plans dynamically for novel scenarios. The specific design of the algorithm is not important here, but the use of static plan libraries does not constitute deliberative control. Although reasoning is required to choose a suitable plan from a library, these static plans are merely high level methods which must be provided at deployment time. As one of the first true hybrid architectures, Atlantis has a very clear role for a dynamic planner. Saphira, like many hybrid architectures to follow, replaced a dynamic planner with a set of pre-defined behaviour schemas. These schemas were effectively static plans. In practice the SRA had no dynamic planning, but did make references to the existence of a static plan library. Also in this vein, IMA/ISAC does not specifically describe any mechanisms for dynamic planning in the architecture as a whole. It is of course possible that IMA/ISAC might have a planning agent built within the architecture.

• Reactivity - For almost 20 years it has been widely accepted that explicit reactive abilities are essential to guaranteeing that an intelligent robot can operate in real world environments. Unsurprisingly, reactivity is an emphasised feature in each of the case study architectures.

• Behaviour Sequencing & Coordination - Closely related to the topic of reactivity is the sequencing and coordination of reactive behaviours. Where overlapping behaviours exist, it is very important to provide mechanisms to deal with the coordination and arbitration of conflicts between reactive behaviours. This is an absolute necessity if situations such as the Stuck in the Middle robot are not to occur. (Here one behaviour wants to move the robot forward, and another behaviour wishes to move backwards. Without an intelligent form of arbitration between these behaviours, the robot will simply stay stationary and never move from its current position.) Both Atlantis and Saphira have explicit handling of behaviour sequencing and coordination. Atlantis uses RAPs to provide this mechanism, while Saphira uses a combination of PRS for the sequencing of behaviours and fuzzy logic at a lower level for the actual combination of behaviour effects. Saphira was however still prone to some Stuck in the Middle problems. The SRA calls for the use of subsumption to combine low level behaviours, but in practice this was never done. Although IMA/ISAC makes several references to behaviour agents and composite agents, little information is provided on the details of arbitration.

• Openness - Many early architectures were designed in a static manner. Each robot was designed from scratch to meet a specific set of hardware and functional requirements. Such designs are inevitably not open to extension or modification. The autonomous intelligent mobile robot will require a large number of software components, which are unlikely to all be prepared in advance for a grand Integration Day. It is far more likely that components will need to be added in and
out of a running software architecture for some time after it is initially developed. Recently some architectures have begun to tackle this issue, but in truth all efforts are still in their infancy. From another perspective, the need for designers to worry about questions of integration takes away from more interesting development issues. An open agent design would mean that a designer would not have to spend so much time worrying about integration issues, but could instead focus on the development of intelligent components which would actually improve the usefulness of the robot. Neither Atlantis nor Saphira is an open architecture. They were both designed for very concrete implementations and pay no attention to the need for extensibility and openness. The SRA is a more open architecture in that it does not explicitly define what abilities are possessed by a robot. The design principle was that if a software component could be encapsulated as a behaviour, actuator or perceptor, then it could be incorporated into the overall design in some way. Apart from this philosophy though, little was done to provide mechanisms to ease this development and integration process. The IMA was specifically developed to ease the integration of heterogeneous software components. The general premise is that all software components can be modelled as basic agents which are loosely coupled over a DCOM connection.

• Portability - An issue often closely related to the subject of openness is portability. Portability simply means that the architecture can be easily moved from one platform (hardware and/or OS) to another. The portability of a system effectively increases the usefulness of the architecture, since the architecture is not constrained in terms of where it can be applied. Saphira and Atlantis were designed for specific platform implementations. As such they have little room for portability. There are however some possibilities for portability in Saphira due to its use of a server/client model for communication between the main system architecture and the low level hardware layer. Since the SRA is an abstract architecture definition, it can in principle be implemented on any platform. No tools or methodologies were provided to help in such a porting process, and the only implementation of the SRA was very much hard-coded to the implementation platform. The IMA allows little in the way of explicit portability. The architecture was designed to run on a PC platform using NT4, and all inter-agent communication is explicitly made through a Windows protocol.

• Robustness - Perhaps one of the most important criteria for the development of useful mobile intelligent robots will be the robustness of these systems. Since these systems will be built around a large range of software and hardware components, there will always be a high possibility of some component failure. In such a case we do not want the system to simply fall over. We instead need it to degrade gracefully, with many subsystems showing no adverse effects from the failure of some other component. As static designs, Saphira and Atlantis have little robustness; the failure of any one component will easily lead to the failure of the whole system. In principle the intentional agent used to provide the deliberative layer of the SRA should provide some level of robustness. In practice, however, the only concrete implementation of this agent did not have mechanisms in place to provide such robustness. In
principle, the loose coupling between agents in IMA should provide a considerable amount of system robustness. In practice, however, this is not demonstrated, and it remains to be seen whether the implementation will actually live up to this ideal.

• Social Ability - We define a Social Architecture as one where the architecture takes a social perspective on the view of the robot. A social agent is an agent which can model, and communicate directly with, other agents in its environment. Similarly, a social robot is one which models and can communicate directly with other agents in its environment. Saphira and Atlantis are expressly not Social Architectures. The architectures were designed for specific single robot implementations, and take no account of the possibility of other social entities in their environment. It is noted that Saphira is designed with human control in mind, but the processing of instructions from an operator does not count as social modelling. By definition the SRA is a Social Robot Architecture, while the IMA/ISAC does provide social modelling agents within the complete architecture.

• Intelligent Agent Modelling - This is a question of modelling, and of whether the robot was modelled as an intelligent agent either in its specification or in its explicit implementation. This question is highly related to the question of whether the robot architecture models the robot as a social entity. Atlantis made no explicit modelling of the robot as an intelligent agent; it was purely a control system, with no notions of BDI, intentionality or dynamic systems style agenthood. Coming a couple of years later, Saphira was born into the world of emerging BDI architectures. Its control sequencer, PRS-Lite, is specifically designed around a BDI metaphor, and references are made specifically to the use of actuators and perceptors on the robot. The SRA both implements the high level controller of the robot as an intentional agent, and models the robot as a whole as an intelligent agent interacting in a social community. IMA/ISAC also models the robot as an intelligent agent, but as an agent with limited social interactions.

• Human Robot Interaction - An architecture which explicitly considers Human Robot Interaction (HRI) issues undoubtedly makes a more suitable candidate for a service or humanoid robot implementation than one which does not. Saphira and Atlantis were primarily developed for the study of hybrid design. Unfortunately this meant that HRI issues took a back seat in comparison to the basic reactive and deliberative issues. Although the SRA was by definition a social architecture, the concrete implementation did not address direct interaction with humans. ISAC/IMA was explicitly developed for the study of interaction between humans and robots. Natural language is explicitly addressed in the concrete ISAC implementation, but little detail is given on how this interaction is achieved.

A Note on Other Perceived Problems in Current Architecture Designs

Above, we have seen how the architectures compare on issues which are addressed by at least one of the architectures. There are of course many issues which are not considered by any of the above architectures. One of the clearest deficiencies in current approaches is the lack of an inclusive natural language (NL) treatment. The typical
approach is to take pre-packaged software and treat it as a black box within the complete architecture. This often leads to performance issues, since the pre-packaged software is often developed for use on a desktop. In particular, speech recognition software fails when simply 'thrown onto' the robot platform. Given that the NL modality will be such an important tool in the development of intelligent robots which can interact with users in an easy way, this neglect of NL usefulness is regrettable. Projects such as HERMES (?) attempt to circumvent these problems by placing constraints on the usability of the robot, in the form of microphone headsets for users or extremely rigid dialog systems. The limitations of the speech recognition system are effectively ignored, meaning that the developers of these tools never receive feedback as to the usefulness of their systems on mobile robot platforms. The author argues that a more open approach to issues of NL integration is required. Architectures must be developed with a mind open to the issues which are going to arise with NL integration. Limitations of speech recognition systems should not simply be ignored and worked around; instead, the design of speech systems should be influenced by experiences in the mobile robot domain.
2.4 Discussion
We have seen that hybrid architectures such as Saphira and Atlantis were successful in combating the dual needs for reactive and goal driven control of mobile robots. These architectures were, however, static, designed for explicit hardware platforms and scenarios. Control was centralised, leading to a bloating of the central processing which could cause deliberative ability to become sluggish. The architectures were not easily extendable, nor could they robustly cope with component failure. These limitations motivate distributing control across a number of software components. However, rather than simply splitting the control program into a number of components or weak agents, the author believes that it is essential to model each agent as a strong intentional agent, with reasoning, reactive, and social abilities. Only through this decomposition can dynamic cooperative abilities eventually emerge, while still providing the essential deliberative and reactive control qualities of the robot. Based on these motivations, this report presents a complete framework which can be used to develop robot control architectures out of communities of intelligent intentional agents.
Chapter 3
SharC Architecture Description

Although hybrid architectures were successful in combating the dual needs for reactive and goal driven control of Service Robots, they had many limitations. Hybrid architectures were traditionally static, designed for explicit hardware platforms and scenarios. Control was centralised, leading to a bloating of the central processing which could cause deliberative ability to become sluggish. The architectures were not easily extendable, nor could they robustly cope with component failure. Furthermore, these architectures offered little in the way of intelligent natural language support.

Our architectural approach is to split Service Robot control amongst a number of deliberative agents. Each of the agents has the capacity for high level reasoning akin to that of the traditional hybrid architectures. But, by distributing control amongst a number of agents, we achieve gains in robustness and scalability. The SharC Cognitive Control Architecture, presented here, is based on a more abstract MultiAgent Architecture for Robot Control (MARC) (?).
3.1 Architectural Approach
Many traditional service robot implementations view control as a monolithic software stack. Good software engineering methodology requires that such stacks be split into a number of disparate components. In such a decomposition, each component takes care of a particular system on the robot (e.g. one component manages natural language synthesis). A Component Based Software Engineering (CBSE) approach can improve modularity through the use of a middleware solution such as the Open Agent Architecture (OAA) or JADE. In such an approach components are loosely connected to each other, resulting in a system that is distributed, open, and more scalable. This CBSE approach has already been applied in many different Service Robot projects (?).

Although a CBSE approach can improve the robustness of a service robot design, it does little to elevate the system's intelligence or user-friendliness. To address these issues, our approach must go beyond a simple middleware decomposition. Therefore, we split the software stack into a number of intentional deliberative agents. This is done by decomposing the software stack into a number of components - as with a CBSE approach - and then encapsulating each component in an Agent Oriented Programming
(AOP) wrapping. This AOP wrapping abstracts a component to the Intentional Stance (?, ?), rather than the more traditional Physical or Design Stances. The abstraction - made in folk psychological terms like Beliefs, Commitments, and Desires - results in a specification that is free of low level implementation details. Furthermore, the abstraction, made in human-like terms, allows a natural representation of the robot's internal state. These intentional abstractions are formalised in BDI logics (?, ?), thus allowing practical reasoning about the state of any one agent.

Figure 3.1: Decomposing a Software Stack into a number of Deliberative Agents

Through this decomposition, our approach, illustrated in figure 3.1, keeps the CBSE gains of scalability, open design, and distribution. Furthermore, by abstracting to strong deliberative agents, rather than weak agents or software components, our architectural approach gains in reasoning, robustness, and human-computer interaction. An important aspect of this approach is that the underlying components are unaffected. This means, first, that we can easily integrate legacy systems into a complete control architecture. Second, individual component developers do not need to be aware of AO programming design issues. Furthermore, the approach gives such designers great freedom in internal implementation choices - due to the loose coupling, components can be implemented in a wide variety of programming languages and run on a wide variety of hardware platforms.
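As a purely illustrative sketch of this wrapping idea, the fragment below shows a component with a plain string interface exposed to the agent layer through perceptor and actuator adapters, while the agent itself holds only intentional state such as beliefs and commitments. The class and method names are hypothetical and do not reflect the actual AgentFactory or AF-APL API described in Section 3.3.

import java.util.ArrayList;
import java.util.List;

/* A plain component with a string-in/string-out interface, as required of SharC components. */
interface Component {
    String process(String request);
}

/* Hypothetical AOP wrapping: the component is exposed as an agent whose externally
   visible state is intentional (beliefs about the world, commitments to act). */
class ComponentAgent {
    private final Component component;
    private final List<String> beliefs = new ArrayList<>();      // e.g. "BELIEF(atPosition(corridor))"
    private final List<String> commitments = new ArrayList<>();  // e.g. "COMMIT(announceArrival)"

    ComponentAgent(Component component) {
        this.component = component;
    }

    /* Perceptor side: query the component and adopt the result as a belief. */
    void perceive(String query) {
        beliefs.add(component.process(query));
    }

    /* Actuator side: discharge a commitment by invoking the component. */
    void act(String commitment) {
        commitments.remove(commitment);
        component.process(commitment);
    }
}

The point of the sketch is only that the component's own interface is untouched; everything agent-like lives in the wrapper.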
3.2 The SharC Architecture for Rolland
The SharC architecture is being developed primarily for use on Rolland, the Bremen Autonomous Wheelchair. As well as being a case study in the production of safe service robots, Rolland serves as a demonstration platform for many techniques developed by the SFB/TR 8 Spatial Cognition. The SharC architecture is a high level control architecture for Rolland. This should be contrasted with Rolland's lower level automation architecture, which has previously been described in (?). The automation control architecture, which addresses low level automation and safety issues, is encapsulated in
one SharC agent. Although primarily developed for the Rolland platform, our agent oriented approach will allow us to easily migrate SharC to other platforms as needed.

Figure 3.2: The SharC Architecture for Rolland

Figure 3.2 presents the SharC architecture for Rolland. Each yellow block represents a complete control agent that encapsulates a system component; each of these agents is discussed in detail in Chapter 4. As mentioned above, control of the physical wheelchair is encapsulated within one complete agent. This is in contrast to an earlier approach where control of the physical robot was split between a number of agents (?). Arrows between the agents show primary information flow. All information exchange is via messages rather than more tightly coupled method calls. This provides a loosely coupled distributed system which can be implemented across a number of different machines.

Where possible, we have based the agents around off-the-shelf components. This code re-use approach was essential in procuring the tools for speech synthesis and recognition. However, when integrating legacy components, there is always a risk that some components may not behave as expected. In such cases it is important that the overall architecture be robust to faults. SharC's agent oriented approach is ideally suited to such occurrences.

The architecture is being developed for both German and English use. This bilingual requirement is facilitated by linguistic components that map from either German or English to internal representations. Key to this mapping is the use of formally verified Linguistic and Domain Ontologies (?). These two bodies of knowledge provide the agents with a common ontological viewpoint, based on which they can reason about the environment and their internal states. The pink area in figure 3.2 shows where the Spatial Ontology is principally used. This ontology provides SharC agents with a common-sense style of spatial knowledge, and is used in the definition of Rolland's
internal map representation, the RouteGraph (?). The blue region shows the influence of the Linguistic Ontology over the SharC architecture. Concepts from the Linguistic Ontology, or Generalized Upper Model (?), form the cornerstone of SharC's handling of natural language. As can be seen from the ontological overlaps, the SharC architecture can be split into a natural language independent internal representation and a language dependent section. The job of natural language generation and understanding is to mediate between these different viewpoints.
3.3 Architectural Framework Implementation
Implementation and deployment of the SharC architecture requires: (1) the development of individual components; (2) the development or adoption of appropriate data interchange formats; and (3) the development of the architectural framework. The individual components and interchange formats are discussed in Chapter 4. In this section we look at SharC's architectural framework.

The architectural framework could be designed from first principles, but tools and languages exist to help us in this endeavour. We have used the AgentFactory Agent Development Framework (?, ?) to produce the underlying agent architecture for SharC. Unlike other middleware solutions, AgentFactory is based around a true Agent Oriented (AO) language. The language, AF-APL (?), constructs a program out of folk psychological primitives like Beliefs and Commitments to abstract component implementations to a high level. The language is well defined, with an asynchronous reasoning model that guarantees the continued reactivity of all agents. The language also provides deliberation and means-end reasoning capabilities to allow agents, and the robot as a whole, to reason and arbitrate between conflicting goals.

AgentFactory (?, ?) is a complete agent prototyping environment for the fabrication of deliberative agents. As shown in figure 3.3, AgentFactory can be conceptually split into two components: the AgentFactory Runtime Environment (AF-RTE) and the AgentFactory Development Environment (AF-DE). The AF-RTE includes an agent interpreter; a library of standard actuators, perceptors, and plans; a platform management suite, which can hold a number of running agents at any time; and an optional graphical interface to view and manipulate agents and the platform. The AF-DE includes an agent/role compiler along with an Integrated Development Environment. These tools have been implemented in Java, meaning that we can develop SharC agents on any platform that supports a Java VM. Also, since the underlying language, AF-APL, has its own clear semantics, we could, if necessary, implement an interpreter in C.

The AgentFactory Programming Language (AF-APL) models knowledge and action at the intentional level. To do this effectively, the language has logical features for knowledge representation, as well as imperative features for the structuring of action. Technically, an AF-APL agent is a tuple of component sets, each of which is indicated by a separate box in figure 3.3. The agent's Belief Set contains the agent's world knowledge, or more precisely what the agent believes to be true. The agent's Commitments are a collection of promises that the agent has made to itself or to others. This idea of a commitment is key to the ability of the agent to reason about its own actions. The agent's Plans are imperative-like definitions which are used to perform complex actions that are built out of more basic actions and Plan Operators.
Figure 3.3: The AgentFactory Development Framework
The agent's Goals are states of the world, or sets of beliefs, that the agent wishes to see brought about. These goals are typically resolved in a two-step process: first, means-end reasoning or planning attempts to come up with a plan to achieve the goal; this plan must then be executed in some way.
AF-APL provides a form of inheritance in agent design through the use of explicitly defined role classes. AF-APL Roles allow a collection of actuators, perceptors and other agent components to be grouped together into an agent prototype. These agent prototypes can then either be instantiated directly into agents or included in other agent designs. The AgentFactory Runtime includes a standard library of agent roles that can be used to rapidly prototype AF-APL agents. These roles will be used to impart basic social skills to the SharC agents.
For a team of agents to perform some task, information must be exchanged between the cooperating agents. Agent Communication Languages (ACLs) represent a middle ground in communication between the expressiveness of natural language and the conciseness of a simple flag or clearly defined function call. ACLs are constrained communication languages based on Searle's Speech Act Theory [53]. The AgentFactory standard library provides a number of role classes that can be used to provide full FIPA-compliant ACL communication for developed agents. The FIPA (Foundation for Intelligent Physical Agents) ACL standard provides an outer communication language which acts as a transport mechanism for an inner content language. This is analogous to how a TCP/IP packet has both carrier information and a content payload. The outer language includes a number of performatives such as request, commit, and confirm, while the inner language is application dependent. Section 4.3 discusses the content languages to be used between SharC agents. These languages are currently under development, and will be mediated by concepts from the linguistic and spatial ontologies.
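To illustrate the outer/inner split, the following sketch shows roughly what a FIPA ACL message between two SharC agents could look like in its string rendering. The agent names and the content string are illustrative placeholders only; as noted above, the actual content languages are still under development.

    (request
      :sender   (agent-identifier :name DialogManager)
      :receiver (set (agent-identifier :name RouteGraphAgent))
      :language sharc-content-language
      :content  "placeholder - to be defined by the SharC content languages")

Here the request performative and the surrounding message envelope come from the FIPA standard, while everything inside the :content parameter is application dependent.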
3.4
Anatomy of a SharC Agent
From an implementation standpoint, the general design of a SharC agent is shown in figure 3.4. At the core of the agent lies the task code that is to be encapsulated as an agent; we refer to this task code as the component.
Figure 3.4: Anatomy of an Agent, Approach 1 - Component is embedded as a library in the Agent Code.
Figure 3.5: Anatomy of an Agent, Approach 2 - Agent communicates with a component running as a server application.
The component - with a traditional C or Java style interface - is wrapped in AF-APL code. From a practical point of view, this language acts as the glue between the more conventional component code and the Agent Communication Language protocols that define the interface to the agent.
3.4.1
Components
A component is a conventional program, object, library, or general piece of code that performs some task. Within the SFB demonstrator, typical components might include: a C implementation of the RouteGraph with searching and management algorithms; a LISP-based speech generator; or the Nuance Speech Recognizer. Within a component, any appropriate design strategy or coding approach may be taken - developers are free to implement their components as they see fit. Externally, however, a component should expose a very simple interface, with methods that take only strings as arguments and return only strings as results. Interface functions cannot take ints, chars,
doubles, or any other type except a string of text. Object serialisation is a simple way to produce and process information in string format. A more useful long-term strategy would be to develop XML schemas to define the information that flows into or out of a component. As with arguments, return values must also be strings. This loss of static typing is essential to providing a component architecture that is independent of programming language, operating system, or hardware implementation. For example, an interface function for a developed C library might take the form:

    char* process_data(char *inMesg)

It should be noted that the components developed - and their external interfaces - will be independent of AgentFactory or any particular agent implementation. Therefore, if at any time the components were to be used in a new project (or if the agent approach were to be abandoned for any reason), it would be easy to get these components working together with a different framework.
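As a concrete illustration, the following is a minimal sketch of such a string-in/string-out component entry point in C. The XML-style tags, the fixed buffer and the echo behaviour are illustrative assumptions only - they are not part of any agreed SharC interchange format.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical component entry point: takes a request encoded as a
       string and returns a result, also encoded as a string. */
    char* process_data(char *inMesg)
    {
        /* A static buffer keeps the sketch short; a real component would
           manage its output memory explicitly. */
        static char outMesg[256];

        /* A real component would parse inMesg against the agreed schema
           and invoke its task code; here we simply acknowledge it. */
        if (strstr(inMesg, "<request>") != NULL) {
            snprintf(outMesg, sizeof(outMesg), "<reply>ok</reply>");
        } else {
            snprintf(outMesg, sizeof(outMesg), "<reply>unrecognised</reply>");
        }
        return outMesg;
    }

Because both the argument and the return value are plain strings, the same entry point can be exposed unchanged whether the component is linked into the agent as a library (figure 3.4) or run as a stand-alone server with which the agent communicates (figure 3.5).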
3.4.2
Agents
The components developed by the individual sub-projects need middleware to allow them to communicate with each other over a potentially distributed network. As mentioned above, AgentFactory will provide this wrapping. Each component will be wrapped in AF-APL - an agent oriented programming language. This language will essentially act as the glue between the conventional component code and an Agent Communication Language based interface to the agent. As mentioned above, those designing individual components will not need to worry about the 'agent-oriented' aspects of designing these agents; they need only design their own code and provide a suitable component interface. Agent design will be supported through the integration process. For this reason, we will not address the agent-oriented aspects further here.
Chapter 4
The SharC Architecture for Rolland
Implementation of the SharC architecture for the Rolland platform requires the development of a number of software agents. In this section we provide a description of each of the SharC agents that are being developed for Rolland. Figure 4.1 provides a high-level schematic of the SharC architecture, showing hardware and software relations. The chapter is composed as follows: Section 4.1 discusses the ontologies to be used by the SharC agents. Section 4.2 then introduces each of the agents, before Section 4.3 discusses communication channels and data interchange formats.
4.1
Ontologies
The SharC Architecture makes use of two formally verified ontologies that cover the linguistic and spatial domains. These two bodies of knowledge provide SharC agents with a common ontological viewpoint, based on which they can communicate and reason about the environment and about internal states.

Figure 4.1: Detailed SharC Architecture for Rolland

Spatial Ontology
Summary: Ontology of spatial concepts and relationships to allow for reasoning about space when interacting with users. The RouteGraph, and analysis of the RouteGraph, will directly influence and use the concepts of the Spatial Ontology.
Contacts: John Bateman, I1-OntoSpace, I4-Spin
Code: The description of the ontology will take a two-fold approach: the ontology will be specified at its highest level in CASL, but Description Logic (LOOM or RACER) specifications are likely for use within the runtime demonstrator.
Platform: Unknown
Hardware: Unknown
Uses:
Used By: RouteGraph, Dialog Manager, Automation Control, Language Generation, Language Analysis
Linguistic Ontology
Summary: A linguistic ontology partly based on the Generalized Upper Model. The ontology covers high-level linguistic knowledge, and is key to language generation and analysis.
Contacts: John Bateman, I1-OntoSpace, I4-Spin
Code: The description of the ontology will take a two-fold approach: the ontology will be specified at its highest level in CASL, but Description Logic (LOOM or RACER) specifications are likely for use within the runtime demonstrator.
Platform: Unknown
Hardware: Unknown
Uses:
Used By: Natural Language Analysis, Understanding and Generation components, Dialog Manager.

4.2
Software Agents/Components
Speech Recognizer (E)
Summary: Take audio input in English and generate a plain English text representation of the input speech.
Contacts: Tilman Verkuff, Reinhard Moratz
Code: Nuance
Platform: Windows XP
Hardware: Nuance can only be run on a select number of PCs.
Uses:
Used By: Directly by the Speech Analyser (E), and potentially indirectly by the Automation Controller to implement reactive stopping behaviours.
Issues:

Speech Recogniser (G)
Summary: Take audio input in German and generate a plain German text representation of the input speech.
Contacts: Tilman Verkuff, Reinhard Moratz
Code: Nuance
Platform: Windows XP
Hardware: Nuance can only be run on a select number of PCs.
Uses:
Used By: Directly by the Speech Analyser (G), and potentially indirectly by the Automation Controller to implement reactive stopping behaviours.
Issues:

Speech Analyser (G)
Summary: Take in a string of German text and attempt to produce a semantic representation of the meaning of the string.
Contacts: Tilman Verhuff
Code: C
Platform: Unknown
Hardware: Unknown
Uses: The Speech Recogniser (G)
Used By: The Dialog Manager
Issues: The output of this component will need to conform to a model used by both the Speech Generator and the Dialog Manager.

Speech Analyser (E)
Summary: Take in a string of English text and attempt to produce a semantic representation of the meaning of the string.
Contacts: Unknown
Code: Unknown
Platform: Unknown
Hardware: N/A
Uses: The Speech Recogniser (E)
Used By: The Dialog Manager
Issues: The output of this component will need to conform to a model used by both the Speech Generator and the Dialog Manager. We need to choose a suitable English speech analysis component.
Dialog Manager
Summary: Provide command interpretation, dialog management and high-level control for the wheelchair, based on input from several sources.
Contacts: Hui Shi
Code: Unknown
Platform: Unknown
Hardware: N/A
Uses: The Speech Analyser (E), Speech Analyser (G), GUI, RouteGraph, Automation Controller.
Used By: Unclear
Issues: How much reasoning will be performed by the Dialog Manager, and how much will be provided by other agents? Will the Dialog Manager also provide general command interpretation abilities?
RouteGraph
Summary: Provide a representation of the robot's environment.
Contacts: Christian Mandel, Thomas Röfer
Code: Custom C/C++ Code
Platform: Windows XP
Hardware: Odometry Sensors, 3D Laser Scanner
Uses: The Dialog Manager (Speech) might provide annotations to the RouteGraph.
Used By: The Dialog Manager.
Issues: Will runtime analysis of the RouteGraph be performed by the RouteGraph component, or will the RouteGraph only be a data type which must be reasoned on by another component?
Speech Generator
Summary: Translate a high-level semantic representation of an utterance into an annotated English or German string for output by a speech synthesis agent.
Contacts: John Bateman
Code: Custom LISP Code
Platform: Unknown
Hardware: N/A
Uses: Speech Recognition (E), Speech Recognition (G), GUI
Used By: Dialog Manager
Issues: Will two separate speech generators be developed, or can the same running generator be used both for English and German, simply by tagging the input appropriately?
Speech Synthesiser (E)
Summary: Take an annotated string of English text to be output and vocalise it.
Contacts: Rob Ross
Code: Festival
Platform: Linux
Hardware: Basic sound card, speakers or headset.
Uses:
Used By: Speech Generator
Issues: None

Speech Synthesiser (G)
Summary: Take an annotated string of German text to be output and vocalise it.
Contacts: Rob Ross
Code: MARY
Platform: Linux
Hardware: Basic sound card, speakers or headset.
Uses:
Used By: Speech Generator
Issues: None
GUI
Summary: Provide the user with graphical interaction with the wheelchair when vocal interaction is infeasible or unwanted.
Contacts: Rob Ross
Code: Java
Platform: Cross-platform
Hardware: Display Unit
Uses:
Used By: Dialog Manager, Speech Generator
Issues: What amount of visual interaction will this component attempt to provide?
Movement Sequencer
Summary: Sequences high level movement descriptions into low level direction commands.
Contacts: Udo Frese
Code: Custom C/C++
Platform: Windows
Hardware: N/A
Uses: The Wheelchair Controller
Used By: The Dialog Manager
Issues: Will this exist? Maybe this should be incorporated as a module into the Automation Controller?
Automation Controller
Summary: Take low level direction commands and actuate them while providing safety-critical obstacle avoidance behaviours.
Contacts: Christian Mandel, Paolo Torrini, Udo Frese
Code: Custom C/C++ Code
Platform: Windows
Hardware: Odometry Sensors, Front + Rear Laser Scanners
Uses:
Used By: Movement Sequencer
Issues:
4.3
Main Component Relationships
(DM,SG) - Dialog Manager to Speech Generator: The Dialog Manager sends an abstract representation of the speech to be output to the Speech Generator.
(DM,GUI) - Dialog Manager to GUI: The Dialog Manager sends visual information to be displayed to the user to the GUI for rendering. Similarly, the GUI can send the Dialog Manager information that has been input by the user through the GUI.
(SG,SS) - Speech Generator to Speech Synthesiser: The Speech Generator sends a representation of the speech to be vocalised to the Speech Synthesiser. Main exchange format - SABLE.
(SU,DM) - Speech Understander to Dialog Manager: The Speech Understander sends an abstract representation of a recognised linguistic input to the Dialog Manager.
(DM,SCM) - Dialog Manager to Shared Control Manager
(SG,GUI) - Speech Generator to GUI: The Speech Generator sends the GUI a representation of speech to be presented to the user.
(GUI,SU) - GUI to Speech Understander: The GUI sends the Speech Understander a representation of textual input given by the user through the GUI.
(SR,SU) - Speech Recognizer to Speech Understander: The Speech Recognizer sends the Speech Understander a textual representation of a recognised utterance.
(DM,MS) - Dialog Manager to Movement Sequencer: The Dialog Manager sends the Movement Sequencer an abstract representation of some path to be taken.
(MS,WC) - Movement Sequencer to Wheelchair Controller: The Movement Sequencer sends low level movement commands to the Wheelchair Controller.
(DM,RG) - Dialog Manager to RouteGraph: The Dialog Manager queries the RouteGraph for spatial information to resolve ambiguous user instructions.
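To give a flavour of the (SG,SS) exchange, a SABLE fragment sent to the Speech Synthesiser might look roughly as follows. The utterance, speaker name and choice of markup are illustrative assumptions only; the exact subset of SABLE used by the Speech Generator has yet to be fixed.

    <SABLE>
      <SPEAKER NAME="default">
        Turning <EMPH>left</EMPH> at the next
        <BREAK LEVEL="small"/> corridor junction.
      </SPEAKER>
    </SABLE>

Because SABLE is plain text, such fragments fit directly into the string-only component interface described in section 3.4.1.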
Bibliography

[1] Bryan Adams, Cynthia Breazeal, Rodney A. Brooks, and Brian Scassellati. Humanoid robots: A new kind of tool. 2000.
[2] Ronald C. Arkin and Tucker R. Balch. AuRA: principles and practice in review. JETAI, 9(2-3):175–189, 1997.
[3] Ronald C. Arkin, Masahiro Fujita, Tsuyoshi Takagi, and Rika Hasegawa. Ethological modeling and architecture for an entertainment robot. In Proceedings of the IEEE International Conference on Robotics and Automation, Seoul, Korea, 2001.
[4] Minoru Asada, Karl F. MacDorman, Hiroshi Ishiguro, and Yasuo Kuniyoshi. Cognitive developmental robotics as a new paradigm for the design of humanoid robots. Robotics and Autonomous Systems, 37:185–193, 2001.
[5] M. Asada et al. Progress in RoboCup soccer research in 2000. In Proceedings of the 2000 International Symposium on Experimental Robotics, 2000.
[6] John A. Bateman, Robert T. Kasper, Johanna D. Moore, and Richard A. Whitney. A general organization of knowledge for natural language processing: the PENMAN upper model. Technical report, USC/Information Sciences Institute, Marina del Rey, California, 1990.
[7] Rainer Bischoff. Hermes - a humanoid experimental robot for mobile manipulation and exploration services. IEEE International Conference on Robotics and Automation, 2001. Video abstract.
[8] R. P. Bonasso, D. Kortenkamp, D. P. Miller, and M. Slack. Experiences with an architecture for intelligent, reactive agents. In M. Wooldridge, J.-P. Müller, and M. Tambe, editors, Intelligent Agents II — Agent Theories, Architectures, and Languages (LNAI 1037), pages 187–202. Springer-Verlag: Heidelberg, Germany, 1996.
[9] Johan Bos, Ewan Klein, and Tetsushi Oka. Meaningful conversation with a mobile robot. In Proceedings of the Research Note Sessions of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL'03), 2003.
[10] R. Brooks. Prospects for human level intelligence for humanoid robots, 1996.
[11] R. A. Brooks and A. Flynn. Fast, cheap and out of control: A robot invasion of the solar system. Journal of the British Interplanetary Society, 42(10):478–485, October 1989.
[12] Rodney A. Brooks. A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, RA-2(1):14–23, April 1986.
[13] Rodney A. Brooks. Intelligence without representation. Artificial Intelligence, 47(1-3):139–159, January 1991.
[14] Rodney A. Brooks. From earwigs to humans. Robotics and Autonomous Systems, 20(2):291–304, 1997.
[15] R. Byrne and A. Whiten. Machiavellian Intelligence. Clarendon Press, 1988.
[16] Y. Uny Cao, Alex S. Fukunaga, and Andrew B. Kahng. Cooperative mobile robotics: Antecedents and directions. Autonomous Robots, 4:23, 1997.
[17] Philip R. Cohen and Hector J. Levesque. Communicative actions for artificial agents. In Victor Lesser and Les Gasser, editors, Proceedings of the First International Conference on Multi-Agent Systems (ICMAS'95), pages 65–72, San Francisco, CA, USA, 1995. The MIT Press: Cambridge, MA, USA.
[18] Rem W. Collier. Agent Factory: A Framework for the Engineering of Agent Oriented Applications. PhD thesis, University College Dublin, 2001.
[19] Rem W. Collier, G.M.P. O'Hare, Terry Lowen, and C.F.B. Rooney. Beyond prototyping in the factory of the agents. In 3rd Central and Eastern European Conference on Multi-Agent Systems (CEEMAS'03), Prague, Czech Republic, 2003.
[20] J. H. Connell. SSS: A hybrid architecture applied to robot navigation. In Proc. of the IEEE Int. Conf. on Robotics and Automation, Nice, France, May 1992.
[21] Silvia Coradeschi and Alessandro Saffiotti. Perceptual anchoring of symbols for action. In Bernhard Nebel, editor, Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01), pages 407–416, San Francisco, CA, August 4–10 2001. Morgan Kaufmann Publishers, Inc.
[22] Kerstin Dautenhahn. Embodiment and interaction in socially intelligent life-like agents. Lecture Notes in Computer Science, pages 102–142, 1999.
[23] Terrence W. Deacon. The Symbolic Species: The Co-Evolution of Language and the Brain. W.W. Norton & Company, 1998.
[24] Daniel C. Dennett. The Intentional Stance. The MIT Press, Cambridge, Massachusetts, 1987.
[25] Daniel C. Dennett. Kinds of Minds: Toward an Understanding of Consciousness. Basic Books, New York, 1996.
[26] Gregory Dudek, Michael Jenkin, Evangelos Milios, and David Wilkes. A taxonomy for multi-agent robotics. Autonomous Robots, 3:375–397, 1996.
[27] Brian R. Duffy. The Social Robot. PhD thesis, University College Dublin, November 2000.
[28] T. Estlin, R. Volpe, I.A.D. Nesnas, D. Mutz, F. Fisher, B. Engelhardt, and S. Chien. The CLARAty architecture for robotic autonomy. In Proceedings of the 2001 IEEE Aerospace Conference, Big Sky, Montana, 2001. IEEE.
[29] R. James Firby. Adaptive Execution in Complex Dynamic Domains. Yale University Technical Report YALEU/CSD/RR 672, New Haven, CT, January 1989.
[30] R. James Firby, Roger E. Kahn, Peter N. Prokopowicz, and Michael J. Swain. An architecture for vision and action. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pages 72–79, 1995.
[31] E. Gat. Integrating planning and reacting in a heterogeneous asynchronous architecture for controlling real-world mobile robots. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), pages 809–815, San Jose, CA, USA, July 1992. AAAI Press.
[32] E. Gat. On three-layer architectures. In Artificial Intelligence and Mobile Robots. MIT/AAAI Press, 1997.
[33] Michael P. Georgeff and Amy L. Lansky. Reactive reasoning and planning: an experiment with a mobile robot. In Proceedings of the 1987 National Conference on Artificial Intelligence (AAAI 87), pages 677–682, Seattle, Washington, July 1987.
[34] Scott B. Huffman and John E. Laird. Flexibly instructable agents. Journal of Artificial Intelligence Research, 3:271–324, 1995.
[35] R. A. Peters II, D.M. Wilkes, D.M. Gaines, and K. Kawamura. A software agent based control system for human-robot interaction. In Proc. Second International Symposium on Humanoid Robotics, 1999.
[36] K. Konolige, K. L. Myers, E. H. Ruspini, and A. Saffiotti. The Saphira architecture: A design for autonomy. Journal of Experimental & Theoretical Artificial Intelligence (JETAI), 9(1):215–235, 1997.
[37] Kurt Konolige and Karen Myers. The Saphira architecture for autonomous mobile robots. In AI-based Mobile Robots: Case Studies of Successful Robot Systems. MIT Press, 1996.
[38] C. Ronald Kube and Hong Zhang. Collective robotic intelligence. In Second International Conference on Simulation of Adaptive Behaviour, pages 460–468, 1992.
[39] Claus Ronald Kube. Collective Robotics: From Local Perception to Global Action. PhD thesis, University of Alberta, 1997.
[40] A. Lankenau and T. Röfer. A versatile and safe mobility assistant. IEEE Robotics and Automation Magazine, 7(1):29–37, 2001.
[41] Axel Lankenau and O. Meyer. Formal methods in robotics: Fault tree based verification. In Proc. of Quality Week Europe, 1999.
[42] Douglas C. MacKenzie, Ronald C. Arkin, and Jonathan M. Cameron. Multiagent mission specification and execution. Autonomous Robots, pages 29–52, 1997.
[43] Maja J. Mataric. Learning in behavior-based multi-robot systems: Policies, models, and other agents. In Cognitive Systems Research, special issue on multi-disciplinary studies of multi-agent learning, pages 81–93, April 2001.
[44] H. Meng, S. Busayapongchai, J. Glass, D. Goddeau, L. Hetherington, E. Hurley, C. Pao, J. Polifroni, S. Seneff, and V. Zue. WHEELS: A conversational system in the automobile classifieds domain. In Proc. ICSLP '96, volume 1, pages 542–545, Philadelphia, PA, 1996.
[45] Nils J. Nilsson. Shakey the robot. Technical Report 323, AI Center, SRI International, 333 Ravenswood Ave., Menlo Park, CA 94025, April 1984.
[46] R.T. Pack, D. M. Wilkes, G. Biswas, and K. Kawamura. Intelligent machine architecture for object-based system integration. In Proceedings of the 1997 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Waseda University, Japan, June 1997.
[47] Rolf Pfeifer. Cognition - perspectives from autonomous agents. Robotics and Autonomous Systems, 15(1-2):47–70, 1995.
[48] Anand S. Rao and Michael P. Georgeff. Modeling rational agents within a BDI-architecture. In James Allen, Richard Fikes, and Erik Sandewall, editors, Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning (KR'91), pages 473–484. Morgan Kaufmann Publishers Inc.: San Mateo, CA, USA, April 1991.
[49] Anand S. Rao and Michael P. Georgeff. BDI agents: from theory to practice. In Victor Lesser, editor, Proceedings of the First International Conference on Multi-Agent Systems (ICMAS'95), pages 312–319, San Francisco, CA, USA, 1995. The MIT Press: Cambridge, MA, USA.
[50] C. F. B. Rooney, R. P. S. O'Donoghue, B. R. Duffy, G. M. P. O'Hare, and R. W. Collier. The social robotic architecture: Towards sociality in a real world domain. In Towards Intelligent Mobile Robotics, Bristol, England, 1999.
[51] Robert Ross. MARC - applying multiagent systems to service robot control. Master's thesis, University College Dublin, 2004.
[52] Robert Ross, Rem Collier, and G.M.P. O'Hare. AF-APL: Bridging principles & practice in agent oriented languages. In Proc. of the Second International Workshop on Programming Multiagent Systems: Languages and Tools (PROMAS 2004), held at AAMAS'04, New York, USA, 2004.
[53] John Searle. Speech Acts. Cambridge University Press, Cambridge, England, 1969.
[54] Luc Steels. The artificial life roots of artificial intelligence. Artificial Life Journal, 1(1):89–125, 1994.
[55] Luc Steels. The origins of syntax in visually grounded robotic agents. Artificial Intelligence, 103(1–2):133–156, 1998.
[56] John K. Tsotsos. Behaviorist intelligence and the scaling problem. Artificial Intelligence, 75:135–160, June 1995.
[57] S. Werner and B. Krieg-Brückner. Modelling navigational knowledge by route graphs. In C. Freksa, C. Habel, and K.F. Wender, editors, Spatial Cognition II, number 1849, pages 295–317. Springer-Verlag, Heidelberg, Germany, 2000.