Cerberus'08 2-Legged Team Description Paper

H. Levent Akın¹, Çetin Meriçli¹, Tekin Meriçli¹, Barış Gökçe¹, Can Kavaklıoğlu¹, Ergin Özkucur¹, and Olcay T. Yıldız²

¹ Boğaziçi University, Dept. of Computer Engineering, Bebek, Istanbul, TURKEY
² Işık University, Dept. of Computer Engineering, Şile, Istanbul, TURKEY
{akin, cetin.mericli, tekin.mericli, sozbilir}@boun.edu.tr, [email protected], [email protected], [email protected]

1 Introduction

"Cerberus" has been competing in the 4-Legged League since the RoboCup 2001 competition [1]. Boğaziçi University has a strong research group in AI. The introduction of RoboCup as a unifying theme for different areas of study in autonomous robots has attracted many talented students and accelerated our research efforts, resulting in more than 40 journal and conference papers. Until 2008, the department had teams in both the RoboCup Four-Legged and Rescue Simulation leagues. With the introduction of the humanoid robot Nao as the new standard platform replacing the Aibo, the team now faces a new challenge, making the robots play soccer in a more "human-like" manner, and gains the opportunity to test its methods for bipedal walking and control against other teams. Boğaziçi University is collaborating with Işık University. The variety of robotic platforms the team has worked on so far has also brought the necessity of developing a platform-independent robotics library, the details of which are provided in this document.

The rest of the document describes the research done by our team in various fields of AI and robotics that we apply to the new platform for playing soccer: vision, localization, locomotion, planning, and coordination/cooperation.

2 Software Architecture

Our code base is organized as a general library architecture. The repository is separated into three components: BOUNLib, Cerberus Player, and Cerberus Station.

2.1 BOUNLib

We have collected the general parts of our code base in a library called BOUNLib. Using this library enables us to write code for different platforms or different robots easily, reusing most of our code base. BOUNLib includes a versatile input/output interface called BOUNio, which provides essential connectivity services to higher-level processes, such as a reliable UDP protocol, TCP connections, and file logging. Connections are established transparently from the sender's point of view, so no connection-specific code has to be written for any particular application or test case.
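The transparency idea can be sketched as follows: application code writes to an abstract channel and never sees the concrete transport. This is a minimal Python sketch; the names `Channel`, `MemoryLogChannel`, and `Sender` are illustrative placeholders, not the actual BOUNio API.

```python
from abc import ABC, abstractmethod


class Channel(ABC):
    """A transport-agnostic output channel; senders never see the concrete type."""
    @abstractmethod
    def send(self, payload: bytes) -> None: ...


class MemoryLogChannel(Channel):
    """Stands in for a logging backend; records payloads in memory."""
    def __init__(self):
        self.records = []

    def send(self, payload: bytes) -> None:
        self.records.append(payload)


class Sender:
    """Application code writes to whatever channel it is handed,
    so swapping UDP, TCP, or file logging needs no application changes."""
    def __init__(self, channel: Channel):
        self.channel = channel

    def log(self, message: str) -> None:
        self.channel.send(message.encode("utf-8"))


log = MemoryLogChannel()
Sender(log).log("vision frame processed")
```

A UDP or TCP implementation of `Channel` would plug into `Sender` the same way, which is the property the text describes.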

2.2 Cerberus Station

Using the BOUNio library enabled us to implement a much more general version of our previous Cerberus Station on top of Trolltech's Qt Development Framework [2]. Thanks to the well-structured architecture of our runtime code and Cerberus Station, testing any new feature to be added to the robot is very easy, which is a vital resource for experimentation. The new Cerberus Station is designed to provide the features of the old one and more; it is mainly aimed at visualizing the new library-based code repository, which helps in debugging components both on-line and off-line.

2.3 Cerberus Player

The Cerberus Player code base is also being switched to a more robust library structure instead of the previous static modular approach. The new design consists of common, vision, and planner libraries together with robot-specific elements.

– The Common library is dedicated to containing robot data and the associated primitive functions, such as serialization/deserialization. Its primary element is the Robot class, which defines the data elements a robot needs to keep track of at run time. Using this single source of data greatly simplifies some of the architectural constraints. Other elements of the Common library include sensor-, vision-, and configuration-related classes.

– The Vision library is designed to accommodate multiple vision algorithms seamlessly. Such a standard interface provides a great visual research platform, where any new algorithm can be tested easily.

– The Planner library is a refactoring of our planner under the new architecture. Having a Planner library enables us to use the implemented planning facilities on other platforms as well.

– Robot-specific elements provide the bindings between the library components and the robot hardware for a specific platform.

• The Core Object coordinates communication and synchronization between the other OPEN-R objects; all other OPEN-R objects are connected to it. It takes the camera image as its main input and sets the corresponding actuator commands within the Robot object. It can be thought of as the container and hardware interface of the Vision, Localization, and Planner modules. This combination is chosen because of the execution sequence of these modules: all of them are executed for each received camera frame, and there is an input-output dependency and execution order vision→localization→planner. The Core Object is pruned and better organized with the use of the libraries; however, robot-specific elements still reside in it as a design decision of the current architecture.
• The Communication Object is responsible for receiving game data from the game controller and for managing robot-robot communication, since both use UDP as the communication protocol. As communication is also robot- and platform-specific, this module is likewise kept among the robot-specific elements.

• The Dock Object manages the communication between a robot and the Cerberus Station. It redirects the received messages to the Core Object and sends debug messages to the station. The Dock Object uses TCP to send and receive serialized messages to and from Cerberus Station.
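The per-frame execution order described above (vision→localization→planner, all writing into a single Robot object) can be sketched as follows. This is a toy Python sketch; the module bodies and field names are placeholders, and only the fixed pipeline structure reflects the text.

```python
from dataclasses import dataclass, field


@dataclass
class Robot:
    """Single source of run-time data, in the spirit of the Common library's Robot class."""
    frame: object = None
    percepts: list = field(default_factory=list)
    pose: tuple = (0.0, 0.0, 0.0)
    actuator_commands: list = field(default_factory=list)


def vision(robot):
    # Placeholder: real code would classify colors and extract percepts.
    robot.percepts = ["ball"] if robot.frame is not None else []


def localization(robot):
    # Placeholder pose estimate; real code runs one of the localization engines.
    robot.pose = (1.0, 2.0, 0.5)


def planner(robot):
    # Placeholder decision based on the percepts vision just produced.
    if "ball" in robot.percepts:
        robot.actuator_commands.append("approach_ball")


def core_object_step(robot, camera_frame):
    """One Core Object cycle: the modules run in a fixed order for each frame,
    communicating only through the shared Robot object."""
    robot.frame = camera_frame
    for module in (vision, localization, planner):
        module(robot)


r = Robot()
core_object_step(r, camera_frame=b"\x00" * 10)
```

Because every module reads from and writes to the same Robot instance, the input-output dependency between the stages is expressed by ordering alone, which is the simplification the text attributes to the single-source-of-data design.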

3 Vision

The vision process starts with receiving a camera frame and ends with a calculated egocentric world model consisting of a collection of visual percepts, as shown in Fig. 1.

Fig. 1. Phases of image processing: a) original image, b) color-classified image, c) found blobs, d) perceived objects, e) egocentric view.

3.1 Color Classification

We use the Generalized Regression Network approach [3] for color classification. The training process is as follows. First, a set of images is hand-labeled with the proper colors. Then, classifiers are trained with the labeled data; instead of using only the Y, U, V triplet, an extra dimension indicating the Euclidean distance of the pixel from the center of the image is also used. After the training phase, the network is simulated over the input space to generate a color lookup table with four bits (16 levels) of Y, six bits (64 levels) of U, six bits of V, and three bits (eight levels) of the radius. The resulting color lookup table is very robust to luminance changes and allows our vision system to work without any extra lighting beyond the standard ceiling fluorescents. With the introduction of the distance component, the effect of radial distortion is drastically reduced. According to our measurements, we can play reasonably well at an illumination level of 150-200 lux.
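The quantization scheme above (4 bits of Y, 6 of U, 6 of V, 3 of radius) can be sketched as an index computation into a flat lookup table. This is a sketch of the indexing only; the table contents would come from simulating the trained network, which is omitted here, and the function name is illustrative.

```python
Y_BITS, U_BITS, V_BITS, R_BITS = 4, 6, 6, 3
TABLE_SIZE = (1 << Y_BITS) * (1 << U_BITS) * (1 << V_BITS) * (1 << R_BITS)


def lut_index(y, u, v, radius, max_radius):
    """Quantize a (Y, U, V, radius) sample into a flat lookup-table index.
    Channels are assumed 8-bit; radius is the pixel's Euclidean distance
    from the image center, normalized by the maximum possible radius."""
    yq = y >> (8 - Y_BITS)                                   # 16 levels
    uq = u >> (8 - U_BITS)                                   # 64 levels
    vq = v >> (8 - V_BITS)                                   # 64 levels
    rq = min(int(radius / max_radius * (1 << R_BITS)),
             (1 << R_BITS) - 1)                              # 8 levels
    return ((yq * (1 << U_BITS) + uq) * (1 << V_BITS) + vq) * (1 << R_BITS) + rq


# One byte per cell (a color label), filled offline by the trained network.
color_table = bytearray(TABLE_SIZE)
```

With this layout the table has 16 × 64 × 64 × 8 = 524,288 cells, small enough to hold in memory on the robot while still capturing the radial-distance dimension.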

3.2 Object Detection

The classified image is processed to obtain blobs. Instead of using run length encoding (RLE), we use an optimized region growing algorithm that performs connected component finding and region building at the same time; it runs nearly twice as fast as the RLE-based approach. Since the robot must turn its head to expand its field of view, the obtained image must be rotated according to the actual position of the head. However, since this is a very expensive operation, typically only the identified regions are rotated. We use octagons as bounding regions, since they are more appropriate for rotation than boxes and reduce the information loss due to rotation. We employ a very efficient partial circle fit algorithm for detecting partially occluded balls and balls lying on the borders of the image. Accuracy in the estimation of ball distance and orientation is needed most when the ball is very close, and the ball is often only partially visible in exactly those cases, so having a cheap and accurate ball perception algorithm is a must. The line perception process is an important part of the vision module, since it provides important information to the localization module; we use a Hough Transform based line perception algorithm. This year, we re-implemented our goal perceptor using a scan-based method; the perception process consists of two stages, scanning and voting.
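The combined connected-component-finding and region-building step can be sketched as a single pass that grows each region while tracking its statistics. This is a minimal 4-connected sketch with axis-aligned bounding boxes; the actual implementation uses octagonal bounds and further optimizations not shown here.

```python
from collections import deque


def grow_regions(image):
    """One-pass region growing on a color-classified image: each discovered
    connected component carries its color, area, and bounding box, so no
    separate region-building pass is needed. `image` is a 2D list of labels."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx]:
                continue
            color = image[sy][sx]
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            min_x = max_x = sx
            min_y = max_y = sy
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                # Grow into 4-connected neighbors of the same color label.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] \
                            and image[ny][nx] == color:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append({"color": color, "area": area,
                            "bbox": (min_x, min_y, max_x, max_y)})
    return regions


img = [[0, 0, 1],
       [0, 1, 1],
       [2, 2, 2]]
found = grow_regions(img)
```

Each region's bounds are accumulated as its pixels are visited, which is what lets component finding and region building share one traversal.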

3.3 Obstacle Perception

A general-purpose scan line code base allows us to implement a Visual Sonar [4], which provides robust obstacle perception and gives us information about the robot's immediate surroundings. We are currently experimenting with combining Visual Sonar data from multiple robots to further increase the precision of our implementation.

3.4 World Modeling

The vision module provides instantaneous but noisy data, whereas the higher levels of planning need smoother pose information and knowledge about past perceptions. This is a state estimation problem, and the literature offers many approaches for linear or non-linear models under different assumptions, such as Gaussian or distribution-free uncertainty approximation methods. Previously we used MyEnvironment [5] for this purpose; MyEnvironment keeps a window over the past observations and estimates the state using statistics over this history. Starting from 2007, we switched to Kalman Filtering (KF) and Extended Kalman Filtering (EKF) methods [6] for tracking the objects around the robot. First, we grouped the objects in the environment into static and dynamic objects. For static objects (beacons, goals, goal-bars, etc.) we used only KF, and for dynamic objects we used both KF and EKF. To estimate the state, an EKF is set up and an observation model and a transition model are defined. We use two filters on the dynamic objects, since it is difficult to estimate both the location and the velocity of an object at the same time; one filter estimates the location and the other estimates the velocity. The pose of the robot, the position, direction, and velocity of the ball, and the distance and orientation of the opponent goal are estimated through this egocentric world model. In our implementation of the market algorithm, which is explained in Section 5, the robots share their egocentric world models. Each robot fuses its own world model with the world models coming from its teammates and forms a shared world model.
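The KF used for static objects can be illustrated in its simplest scalar form: fusing a stream of noisy distance observations of a fixed landmark, with odometry entering as the motion term. This is a one-dimensional textbook sketch, not our multivariate implementation; the numbers are illustrative.

```python
def kalman_predict(mean, var, motion, motion_var):
    """Time update: shift the estimate by the odometry-reported motion
    and inflate the uncertainty by the motion noise."""
    return mean + motion, var + motion_var


def kalman_update(mean, var, z, obs_var):
    """Measurement update: blend the prediction with observation z,
    weighted by the Kalman gain."""
    k = var / (var + obs_var)          # Kalman gain in [0, 1]
    new_mean = mean + k * (z - mean)
    new_var = (1.0 - k) * var
    return new_mean, new_var


# Fuse noisy distance observations of a static goal post (true distance ~2.0 m).
mean, var = 0.0, 1e6                   # diffuse prior: we know nothing yet
for z in [2.1, 1.9, 2.0, 2.05]:
    mean, var = kalman_predict(mean, var, motion=0.0, motion_var=0.01)
    mean, var = kalman_update(mean, var, z, obs_var=0.25)
```

After a few observations the variance drops well below the single-observation noise, which is the smoothing behavior the world model relies on; the dynamic-object case additionally runs an EKF for velocity, as described above.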

4 Localization

Cerberus employs three different localization engines.

– Simple Localization (S-LOC) is an in-house developed localization module. It keeps a history of the percepts seen and modifies this history according to the received odometry feedback. The perception update of S-LOC depends on the perception of landmarks and the previous pose estimate.

– X-MCL is a vision-based Monte Carlo Localization with a set of practical extensions [7].

– Reverse Monte Carlo Localization (R-MCL) is a contribution of our lab to the literature [8]. This is a hybrid method based on both Markov Localization (ML) and Monte Carlo Localization (MCL), where the ML module finds the region where the robot should be and MCL predicts the geometric location with high precision by selecting samples in this region. The method is very robust, requires less computational power and memory than similar approaches, and is accurate enough for high-level decision making, which is vital for robot soccer.
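For reference, the plain MCL cycle that X-MCL and R-MCL build upon can be sketched on a one-dimensional toy world: motion update, measurement weighting, importance resampling. This sketch illustrates only the generic particle filter loop, not the extensions of either engine.

```python
import math
import random

random.seed(0)


def mcl_step(particles, motion, landmark, measured_dist, noise=0.5):
    """One Monte Carlo Localization cycle on a 1-D corridor (a toy sketch)."""
    # Motion update: move every particle, with additive odometry noise.
    moved = [p + motion + random.gauss(0.0, 0.1) for p in particles]
    # Measurement update: weight each particle by how well the predicted
    # landmark distance matches the observed one (Gaussian sensor model).
    weights = []
    for p in moved:
        err = abs(landmark - p) - measured_dist
        weights.append(math.exp(-(err * err) / (2 * noise * noise)))
    # Resampling: draw a new particle set proportionally to the weights.
    return random.choices(moved, weights=weights, k=len(moved))


particles = [random.uniform(0.0, 10.0) for _ in range(500)]
# True pose 3.0; the landmark is at 8.0 and the robot repeatedly observes
# a distance of 5.0 while standing still.
for _ in range(10):
    particles = mcl_step(particles, motion=0.0, landmark=8.0, measured_dist=5.0)
estimate = sum(particles) / len(particles)
```

R-MCL differs from this loop in that the ML stage first narrows the region from which samples are drawn, which is where its computational savings come from.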

4.1 Multi-Robot Localization

We have introduced a new method for multi-robot localization which can be applied to more than two robots without identifying each of them individually [9]. Monte Carlo Localization for the single-robot case is extended with negative landmark information and a shared belief state in addition to perception. A robot perceives a teammate and broadcasts its observation without identifying the teammate; whenever a robot receives an observation, it processes the observation only if it decides that the observation concerns itself. We have demonstrated successful robot identification and localization in scenarios where it is impossible for a single robot to localize precisely and the shared information is ambiguous.
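The "concerns itself" decision can be illustrated with a simple gating test: a robot accepts an anonymous broadcast observation only if it lands near the robot's own pose estimate. The function name and gate threshold are illustrative; the actual method [9] integrates such decisions into the particle filter together with negative information rather than using a hard threshold.

```python
def concerns_me(my_estimate, observed_position, gate=0.5):
    """Decide whether an anonymous teammate observation plausibly refers to
    this robot: accept it only if it falls within `gate` meters of the
    robot's own (x, y) pose estimate."""
    dx = my_estimate[0] - observed_position[0]
    dy = my_estimate[1] - observed_position[1]
    return (dx * dx + dy * dy) ** 0.5 <= gate


# A teammate broadcasts where it saw *some* robot, without naming it.
broadcast = (2.1, 3.0)
robot_a_accepts = concerns_me((2.0, 3.1), broadcast)   # close: A processes it
robot_b_accepts = concerns_me((6.0, 1.0), broadcast)   # far: B ignores it
```

When several robots are near the observed position the information is ambiguous, which is exactly the case the shared-belief extension is designed to resolve.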

5 Planning and Cooperation

Planning is also a major research area of the Cerberus team. The soccer domain is a continuous environment, but the robots operate in discrete time steps. At each time step, the environment and the robots' own states change. The planner keeps track of those changes and makes decisions about new actions. The main aim of the planner is to model the environment appropriately and to update its status accordingly; the planner should also provide control inputs based on this model. In robot soccer, the players make both individual plans (single-agent planning) and plans as a team (multi-agent planning). The rest of this section provides the details of our approach to these two types of planning.

5.1 Single-Agent Planning

We use hand-coded Finite State Automata (FSA) based planners which generate the necessary behaviors. We are also researching several approaches for generating planners autonomously.

– POMDP Based Approach: Recently, we extended POMDP algorithms to continuous domains [10] by combining ART-2A, the Kalman Filter, and Q-learning. We used this approach to generate behaviors like approaching the ball and kicking the ball toward the goal [11]. We will use it to generate more complex behaviors to be used by the planners.

– Learning Based Approaches: The robots learn both individual skills and collaboration, which emerge from very primitive behaviors. The action space is designed so that more complex behaviors are obtained by combining simple actions, similar to the subsumption architecture. The simple actions include

• going toward the opponent goal,
• going toward the own goal,
• going toward the ball and a specific target.
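A hand-coded FSA planner of the kind mentioned at the start of this subsection can be sketched as a small state machine. The states and transitions here are illustrative, not the actual Cerberus behavior set.

```python
class AttackerFSA:
    """A hand-coded finite state automaton sketch for a single-agent behavior:
    search for the ball, approach it, kick when close, then search again."""

    def __init__(self):
        self.state = "search_ball"

    def step(self, ball_visible, ball_close):
        """Advance one discrete time step based on the current percepts."""
        if self.state == "search_ball" and ball_visible:
            self.state = "approach_ball"
        elif self.state == "approach_ball":
            if not ball_visible:
                self.state = "search_ball"     # lost the ball: search again
            elif ball_close:
                self.state = "kick"
        elif self.state == "kick":
            self.state = "search_ball"         # after kicking, relocate the ball
        return self.state


fsa = AttackerFSA()
trace = [fsa.step(ball_visible=True, ball_close=False),
         fsa.step(ball_visible=True, ball_close=True),
         fsa.step(ball_visible=False, ball_close=False)]
```

The appeal of the FSA formulation is that each state maps directly to a behavior and the transition conditions are readable at a glance, which is why it remains the baseline alongside the learned approaches above.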

5.2 Multi-Agent Planning

Soccer is a team game; the decision of an individual soccer player is affected by the decisions of all its teammates. This makes multi-agent planning an important research field for robotic soccer. We have been working on this problem by pursuing several different approaches.

– Metric Definition and Validation: Regardless of the planning algorithm to be used, one needs quantitatively measured and informative metrics. We developed a set of novel metrics based on the positions of the players and the ball over time, together with a novel statistical validation technique for testing whether a proposed metric is informative or not [12]. We are currently investigating extending the immediate reward mechanism of the reinforcement learning framework to use it as an evaluation testbed for metric validation.

– Market Based Approach: We use a market-driven multi-agent collaboration algorithm [13] for dynamic role assignment of the players. Once assigned, an agent makes decisions according to its role until its role changes. The market-driven method for multi-agent coordination employs the properties of free markets, in which every individual tries to maximize its own profit and thereby maximizes the system's total profit. The overall goal of the system is decomposed into smaller tasks, and an auction is performed for each of these tasks. In each auction, the participating agents calculate their estimated costs for accomplishing the task and offer prices to the auctioneer. At the end of the auction, the bidder with the lowest offered price wins the right to execute the task and receives its revenue on behalf of the auctioneer. To implement this strategy, a revenue function and a cost function must be defined; the net profit is then the difference between the values of the revenue and cost functions. We proposed this method for multi-robot collaboration in the robot soccer domain: we defined the attacker, defender, and midfielder roles as tasks and run an auction mechanism for each task. The robots calculate their costs for the tasks and participate in the auctions; the auctioning is done with the help of our shared world model structure.

– DEC-POMDP Approach: The Decentralized POMDP (DEC-POMDP) is an extended version of the POMDP in which there are multiple agents in the environment. It has been shown that solving DEC-POMDP problems is NEXP-complete, while solving POMDPs is PSPACE-complete. We have developed an Evolution Strategies (ES) based approximate solution approach for this purpose [14]. We are currently extending it to continuous states, which will enable us to use it in realistic robotic applications like robot soccer.
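The auction-based role assignment described under the market-based approach can be sketched as follows: each role is auctioned in turn, every unassigned robot bids its estimated cost, and the lowest bidder wins. This is a simplified greedy sketch with illustrative cost values; the real system computes costs from the shared world model and re-runs auctions as the game state changes.

```python
def assign_roles(costs, roles):
    """Greedy sealed-bid auction sketch: for each role (task), every
    unassigned robot bids its estimated cost; the lowest bidder wins.
    `costs[robot][role]` is that robot's estimated cost for the role."""
    assignment = {}
    free_robots = set(costs)
    for role in roles:
        # Collect bids from all robots that still lack a role
        # (sorted for deterministic tie-breaking).
        bids = [(costs[r][role], r) for r in sorted(free_robots)]
        winning_cost, winner = min(bids)
        assignment[winner] = role
        free_robots.discard(winner)
    return assignment


costs = {
    "robot1": {"attacker": 1.0, "defender": 4.0, "midfielder": 2.5},
    "robot2": {"attacker": 3.0, "defender": 1.5, "midfielder": 2.0},
    "robot3": {"attacker": 2.0, "defender": 2.0, "midfielder": 1.0},
}
assignment = assign_roles(costs, ["attacker", "defender", "midfielder"])
```

Because each robot wins at most one role and the cheapest bidder takes each task, low individual costs translate into low total cost, mirroring the free-market intuition in the text.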

6 Locomotion

Our team started doing research on bipedal walking at the beginning of 2007. As with all the other robotic platforms we have worked on so far, our goal with our humanoid robots has been to develop a platform-independent software architecture that works both in the simulation environment and on the real robot. Our walking engine is essentially a parametric Central Pattern Generator (CPG) [15] based bipedal walking engine; we develop our algorithms on the simulator and tune the engine on the real Nao. Each joint is controlled by a CPG, and these CPGs are coupled so as to produce harmonious motions and make the robot move in the desired direction with the desired velocity. Omnidirectional walking capability is very important for a soccer-playing robot; hence, our walking engine is designed for such flexibility by interpolating between the signals obtained for discrete walking directions according to the requested motion command. Sensory feedback is also vitally important for keeping the robot stable: the gait is optimized for a flat surface with no obstruction or pushing, yet the robot is very likely to collide with other players on the field and lose its balance. Therefore, the signals produced by the CPGs are modified according to the sensory feedback in order to compensate for perturbations.

Acknowledgements

This work is supported by the Boğaziçi University Research Fund through project 06HA102 and TUBITAK project 106E172.

References

1. Akın, H. L., "Managing an Autonomous Robot Team: The Cerberus Team Case Study," International Journal of Human-friendly Welfare Robotic Systems, Vol. 6, No. 2, pp. 35-40, 2005.
2. Qt: Cross-Platform Rich Client Development Framework, http://trolltech.com/products/qt
3. Schioler, H. and Hartmann, U., "Mapping Neural Network Derived from the Parzen Window Estimator," Neural Networks, Vol. 5, pp. 903-909, 1992.
4. Lenser, S. and Veloso, M., "Visual Sonar: Fast Obstacle Avoidance Using Monocular Vision," Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003.
5. Köse, H., B. Çelik and H. L. Akın, "Comparison of Localization Methods for a Robot Soccer Team," International Journal of Advanced Robotic Systems, Vol. 3, No. 4, pp. 295-302, 2006.
6. Welch, G. and Bishop, G., "An Introduction to the Kalman Filter," Technical Report, Chapel Hill, NC, USA, 1995.
7. Kaplan, K., B. Çelik, T. Meriçli, Ç. Meriçli and H. L. Akın, "Practical Extensions to Vision-Based Monte Carlo Localization Methods for Robot Soccer Domain," RoboCup 2005: Robot Soccer World Cup IX, A. Bredenfeld, A. Jacoff, I. Noda, Y. Takahashi (Eds.), LNCS Vol. 4020, pp. 420-427, 2006.
8. Köse, H. and H. L. Akın, "The Reverse Monte Carlo Localization Algorithm," Robotics and Autonomous Systems, Vol. 55, No. 6, pp. 480-489, 2007.
9. Kurt, B., E. Özkucur and H. L. Akın, "A Collaborative Multi-Robot Localization Method Without Robot Identification," RoboCup International Symposium 2008, Suzhou, China, July 15-18, 2008. (Accepted)
10. Sardağ, A. and H. L. Akın, "Kalman Based Finite State Controller for Partially Observable Domains," International Journal of Advanced Robotic Systems, Vol. 3, No. 4, pp. 331-342, 2006.
11. Sezen, D., "Implementation of Continuous POMDP Algorithms on Autonomous Robots," M.Sc. Thesis, Boğaziçi University, 2008.
12. Meriçli, Ç. and H. L. Akın, "A Layered Metric Definition and Validation Framework for Multirobot Systems," RoboCup International Symposium 2008, Suzhou, China, July 15-18, 2008. (Accepted)
13. Köse, H., K. Kaplan, Ç. Meriçli, U. Tatlıdede and H. L. Akın, "Market-Driven Multi-Agent Collaboration in Robot Soccer Domain," in V. Kordic, A. Lazinica and M. Merdan (Eds.), Cutting Edge Robotics, pp. 407-416, pIV pro literatur Verlag, 2005.
14. Eker, B. and H. L. Akın, "Solving DEC-POMDP Problems Using Evolution Strategies," Soft Computing, 2007. (Submitted)
15. Pinto, C. and Golubitsky, M., "Central Pattern Generators for Bipedal Locomotion," Journal of Mathematical Biology, 2006.