Quality Management for Mobile Robot Development

Hans-Ulrich Kobialka, Peter Schöll
GMD, Schloß Birlinghoven, D-53754 Sankt Augustin, Germany
email: {kobialka,schoell}@gmd.de
http://ais.gmd.de/BE/

Abstract

Robot programs have to be tested systematically on a daily basis in order to improve robot performance in a controlled way. This requires automated support. We present a framework in which the designer can specify a number of test cases and metrics; a simulator then executes the test cases and computes the metrics off-line.

1 Introduction

Today, robots are no longer bound to factories but appear in open (i.e. non-deterministic from the robot's point of view) environments, for instance, homes, museums, or soccer fields. Improving the performance of mobile robots is not straightforward. Often performance declines dramatically because of changes in the environment setting, or after changes in the program which controls the robot, the robot program. Investigations of such phenomena often reveal that the robot program is overfitted with respect to the test environment and a few test experiments. Overfitting means that the robot program contains assumptions which match perfectly the simplifications and errors contained in the test environment, and which do not show up as errors when executing the usual few test experiments. Nevertheless, such assumptions significantly decrease robot performance in a realistic setting [Cohe95]. Thus, tests have to be performed systematically in order to exclude such Trojan improvements. The effects of a change have to be tested in many situations in order to assess it correctly. A new feature may be an advantage in one situation but a disaster in another.

As experiments take some time, human observers have difficulty assessing the outcome of an experiment. It is hard to concentrate on long and fast event sequences, so the same experiment is often judged differently by different designers. Often there are merely gradual changes in robot performance which only show up in statistics. Statistical performance data are hard to collect manually. Furthermore, observation by humans is too time consuming and expensive. Systematic tests on a daily basis are only feasible if they are performed automatically.

We experienced these problems during the development of a team of soccer robots which participate in the RoboCup middle size league [Bred99]. RoboCup (http://www.robocup.org) is a long-term research effort of the academic and industrial community. It demonstrates scientific and engineering problems related to the development of autonomous robots playing soccer. Our robots are equipped with several kinds of sensors, including a camera, distance sensors, and bump detectors. Robot programs are specified using a mathematical model, called Dual Dynamics (DD). The basic rationale behind Dual Dynamics is that a situated intelligent agent organizes its behavior according to behavioral modes; behavior modes are specified through differential equations [Jaeg98].

Our approach to systematic quality management for mobile robots is twofold: First, we define test cases [Myer79] and collect metrics about robot performance. Metrics are computations on performance data which express some degree of fitness. Metrics have to be carefully chosen according to the goals of a robot program in order to measure improvements in an objective way [Möll93]. Second, we let our simulator, called DDSim, perform a suite of test cases automatically. Afterwards, the designer can look at the collected statistics and metrics, and replay interesting experiments, without fear of forgetting important tests.

The interface of the simulator and the run time environment on the real robot is exactly the same. A robot program which has passed all test cases on the simulator can be executed on the physical robot instantly, without any change to the program. Of course, the performance of the real robot may differ from the simulation.
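The paper does not show this common interface; the following sketch merely illustrates the idea of one control API shared by the simulator and the on-robot run time. All names (RobotRuntime, read_sensors, set_motors) are hypothetical, and Python is used only for illustration; DDSim and the robot would each provide an implementation of such an interface.

    from abc import ABC, abstractmethod

    class RobotRuntime(ABC):
        """Hypothetical common interface: a robot program is written only
        against this API, so the simulator and the physical robot are
        interchangeable back ends."""

        @abstractmethod
        def read_sensors(self):
            """Return current camera, distance sensor, and bump detector data."""

        @abstractmethod
        def set_motors(self, left, right):
            """Command the drive motors (a differential drive is assumed
            here purely for illustration)."""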

2 Off-line Assessment of a Robot Program

The DDSim simulator reads a number of test case specifications from a file, executes them, and writes the results into an output file. Each test case specification consists of

• its name,
• the duration of the simulation,
• the start situation, and
• the formula used to compute the performance metric at the end of the test run.

We distinguish between single robot test cases (section 2.1) and team test cases (section 2.3).
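As a concrete illustration, such a specification can be written down as a record; the class below is a sketch (the paper does not disclose DDSim's file format or internal representation), with the start situation expanded into the robot, ball, and obstacle positions used in figure 2 below.

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class SingleRobotTestCase:
        name: str                          # e.g. "gleft180"
        duration_sec: int                  # length of the simulated run
        robot_pos: Tuple[int, int, int]    # start situation: x, y, heading
        ball_pos: Tuple[int, int]
        obstacles: List[Tuple[int, int]]   # static 40x40 cm obstacle robots
        metric: str                        # formula evaluated after the run

    # The first test case of figure 2, expressed as such a record:
    gleft180 = SingleRobotTestCase(
        name="gleft180", duration_sec=60,
        robot_pos=(800, 50, 180), ball_pos=(450, 225),
        obstacles=[(600, 150), (750, 200)],
        metric="1 / $min_needed_to_score")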

Figure 1: A single robot test case: The robot in the lower right corner should shoot the ball into the goal on the right. The two other robots are just obstacles to be avoided.

2.1 Single Robot Test Cases

A single robot test case describes a challenge for a robot, for instance, kicking the ball into the goal, or getting the ball out of the own penalty area. The challenge can be aggravated by obstacles (modeled as static, non-moving robots). For example, a typical goal scoring challenge is displayed in figure 1. The metric for scoring is simple: the faster the better. As the time limit is 60 seconds, the metric

1 / minutes_needed_to_score_a_goal

will yield 1 if the robot misses (the full minute is charged), and greater values the faster the robot scores. For getting the ball out of the corner in the own half of the field, the metric is simply the distance between the ball and the own goal (which is equivalent to the x coordinate of the ball). Examples of both kinds of test cases are contained in figure 2.

# test
# type   name        duration  robotPos    ballPos  Obstacles          Metric
#                    (sec)                          (40x40cm)
single   gleft180    60        800,50,180  450,225  {600,150 750,200}  {1 / $min_needed_to_score}
single   cUpperLeft  30        100,100,0   15,450   {20,300}           {$Ball_PosX_after}

sensorNoise optimalnoise 1000,3 900,2 0,3,10
# blue: every 1000 secs the robot erroneously sees a blue goal for 3 secs.
# yellow: every 900 secs the robot erroneously sees a yellow goal for 2 secs.
# heading (odometry): when driving straight there is no error (0%);
# when turns sum up to 90 degrees, there is a potential error of 3%;
# when the robot bumps against an obstacle, there is a potential error of 10%.

sensorNoise badVision 100,10 95,10 0,5,10
# blue: every 100 secs the robot erroneously sees a blue goal for 10 secs.
# yellow: every 95 secs the robot erroneously sees a yellow goal for 10 secs.

Figure 2: Two single robot test cases (named gleft180 and cUpperLeft) and two configurations of sensor noise (optimalnoise, badVision).
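How DDSim evaluates such metric formulas is not described in the paper; a minimal sketch, assuming the $-prefixed names refer to performance data collected during the run, could look as follows.

    import re

    def evaluate_metric(formula, run_data):
        """Substitute $variables collected during a test run into the
        metric formula and evaluate the resulting expression."""
        expr = re.sub(r"\$(\w+)", lambda m: repr(run_data[m.group(1)]), formula)
        return eval(expr, {"__builtins__": {}})  # arithmetic only

    # gleft180: scoring after 0.217 minutes (about 13 seconds) yields 4.608;
    # a miss is charged the full 60-second limit, i.e. 1 / 1.0 = 1.
    print(evaluate_metric("1 / $min_needed_to_score",
                          {"min_needed_to_score": 0.217}))  # 4.608...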

# program   noise config  test case   raw value  metric
striker1b   optimalnoise  gleft180    0.217      4.608
striker3    optimalnoise  gleft180    0.136      7.353
striker1b   optimalnoise  cUpperLeft  40         40
striker3    optimalnoise  cUpperLeft  122        122
striker1b   badVision     gleft180    0.211      4.739
striker3    badVision     gleft180    0.667      1.499
striker1b   badVision     cUpperLeft  56         56
striker3    badVision     cUpperLeft  98         98

Figure 3: Output after testing two robot programs (striker1b, striker3) with the noise configurations and test cases displayed in figure 2. The last value in each row is the computed metric. We see that striker3 performs better with optimal noise, but its performance is worse when scoring with bad vision.
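The loop that produces such a table is not shown in the paper; a plausible sketch, where run_test stands in for invoking the simulator on one combination and is assumed to return the raw performance value and the computed metric, is:

    def release_test(noise_configs, test_cases, programs, run_test):
        """Run every (noise, test case, program) combination and print
        one figure-3-style row per run."""
        for noise in noise_configs:
            for case in test_cases:
                for program in programs:
                    raw, metric = run_test(program, noise, case)
                    print(f"{program:<12} {noise:<14} {case:<12} "
                          f"{raw:<10} {metric:<10}")

    # e.g. release_test(["optimalnoise", "badVision"],
    #                   ["gleft180", "cUpperLeft"],
    #                   ["striker1b", "striker3"], run_test)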

2.2 Sensor Noise

The simulator also contains statistical models of sensor errors ("sensor noise"). Noise can vary a lot depending on the current situation. For instance, when driving straight there is less odometry error than during turns or when bumping into obstacles. Under bad lighting conditions, the detection of yellow and blue becomes very unreliable. For off-line testing, we define several configurations of sensor noise (see figure 2). Comparing two versions of a robot program (striker1b and striker3), using the test cases and noise configurations listed in figure 2, yields the test output shown in figure 3.
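Interpreting the sensorNoise lines of figure 2 according to their comments, one configuration could be captured in a record like the following sketch; the field grouping is our reading of those comments, not a documented format.

    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class SensorNoise:
        name: str
        blue_goal_error: Tuple[int, int]     # every N secs, a phantom blue
                                             # goal is seen for M secs
        yellow_goal_error: Tuple[int, int]   # likewise for the yellow goal
        heading_error: Tuple[int, int, int]  # potential odometry error in %:
                                             # straight, per 90 deg of turns,
                                             # per bump against an obstacle

    optimalnoise = SensorNoise("optimalnoise", (1000, 3), (900, 2), (0, 3, 10))
    badVision    = SensorNoise("badVision",    (100, 10), (95, 10), (0, 5, 10))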

2.3 Team Test Cases

Besides individual skills, team behaviors will become increasingly important. As with human soccer players, passing the ball to a team member is often the best move when facing an opponent robot. Team behaviors have to be tested using team test cases. For a team test case, several robots and their properties are defined, including the name of the team to which each robot belongs and the robot program which should be executed by this robot (see figure 4). The ability of DDSim to simulate robots executing different robot programs enables

• different roles within a team (e.g. goalie, defender, striker),
• soccer games between an old robot program and an improved one, and
• simulation of anticipated strategies of other teams participating in the RoboCup tournament.

For a game, we define the teams which should play, the colors of their target goals, the duration of the game, the ball position at kickoff time, and the information about the game which should be printed to the log file. Figure 4 shows a team test case for a regular game as defined in the RoboCup rules (each team has 4 robots; they play two halves of 10 minutes each). Using a different kickoff position for the ball, a shorter duration, and fewer robots, we can simulate more specific game situations, for instance, passing, dribbling, or a penalty situation.

#      team   robot  name of        start pos   start pos
#      name   no     robot program  defence     kickoff
robot  team1  1      goalie2a       25,250,0    25,250,0
robot  team1  2      striker1b      250,300,0   350,300,0
robot  team1  3      striker1b      350,150,45  400,200,45
robot  team1  4      striker1b      250,200,0   250,200,0
robot  team2  1      goalie2a       25,250,0    25,250,0
robot  team2  2      defend4        250,300,0   350,300,0
robot  team2  3      striker3       350,150,45  400,250,0
robot  team2  4      striker3       250,200,0   250,200,0

#      target  target  duration  ball kickoff
#      blue    yellow  (secs)    position      output
game   team1   team2   600       450,225       goalShooters fouls ballIsStuck_count
game   team2   team1   600       450,225       goalShooters fouls ballIsStuck_count   # 2nd half

Figure 4: Specification of two teams and two test cases (i.e. games) between these teams.
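The team test case of figure 4 can likewise be captured in records; the sketch below is hypothetical in the same sense as the earlier ones (the field names are ours, derived from the column headers of figure 4).

    from dataclasses import dataclass
    from typing import List, Tuple

    @dataclass
    class TeamRobot:
        team: str
        number: int
        program: str                        # robot program run by this robot
        defence_pos: Tuple[int, int, int]   # x, y, heading
        kickoff_pos: Tuple[int, int, int]

    @dataclass
    class Game:
        target_blue: str      # team attacking the blue goal
        target_yellow: str    # team attacking the yellow goal
        duration_sec: int
        ball_kickoff: Tuple[int, int]
        output: List[str]     # statistics written to the log file

    # First half of the regular game of figure 4:
    first_half = Game("team1", "team2", 600, (450, 225),
                      ["goalShooters", "fouls", "ballIsStuck_count"])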

3 Change of the Test Environment

Test cases, sensor models, and the simulator itself are subject to change, too. The simulator has to be adapted continuously in order to keep it close to the driving performance of the real robot. If the motor controller is changed, stronger motors are used, additional weight is put on the robot, or other kinds of wheels are mounted, the driving characteristics can change considerably.

It happens that the robot program is overfitted with respect to the sensor models and the simulator. Thus, changes in the sensor models and the simulator can decrease the performance of the robot program, or can yield better assessment metrics for older, non-overfitted versions of the robot program. Therefore, we keep several previous versions of robot programs for re-evaluation. If a sensor model or the simulator changes, we select several archived robot programs and compare their performance with that of the current one.
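A sketch of this re-evaluation step, assuming a run_suite helper that returns one metric per test case name (both the helper and the 10% tolerance are hypothetical):

    def reevaluate(current, archived, run_suite, tolerance=0.1):
        """After a simulator or sensor model change, compare the current
        robot program against archived versions and report test cases
        where an old version now scores noticeably better."""
        current_metrics = run_suite(current)
        for old in archived:
            for test, old_value in run_suite(old).items():
                if old_value > current_metrics[test] * (1 + tolerance):
                    print(f"{test}: {old} scores {old_value:.3f}, "
                          f"{current} only {current_metrics[test]:.3f}")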

4 Conclusions and Future Work

The use of quality management techniques like metrics and test cases is mandatory for professional system development. When developing mobile robots acting in open environments, the problem is more severe because there is no complete specification of the environment to test against. Defensive strategies (e.g. stopping the robot in unclear situations) do not promise any success in competitive settings like a RoboCup soccer game. We have to try maneuvers which we do not fully understand at the time of coding. Thus, robot programs have to be tested systematically on a daily basis.

We have developed a simulator for soccer robots and a test framework for off-line execution of test cases. The currently used test suite consists of 11 single robot test cases, 2 team test cases, and 2 sensor noise configurations. We usually compare a changed robot program with its most recently released predecessor. This results in 52 test runs which last 2 hours and 22 minutes in total when executed by our simulator on a laptop computer. Additional test cases can easily be added, and a release test can still be performed in a reasonable time, on several computers in parallel if necessary. Smaller sets of test cases are used for quick online tests.

This automated test support was established only recently. There are several obvious improvements to be made, for instance, the specification of expected results and mechanisms to signal unexpected results, such as a significant decrease in performance. As the number of robot programs, test cases, and sensor configurations grows, we will need selection and visualization techniques and appropriate test plans. In the long run, we wish to use empirical methods to explain how the external behavior of robots relates to the internal parameters and states of the executed robot program [Cohe95]. Our test environment will generate the statistical data for this purpose.
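One of these planned improvements, signalling a significant decrease in performance, could for instance look like the following sketch; the 20% threshold and all names are chosen arbitrarily for illustration.

    def check_against_expected(test_name, metric, expected, max_drop=0.2):
        """Signal an unexpected result: the metric fell more than
        max_drop (here 20%) below the expected value for this test case."""
        if metric < expected * (1 - max_drop):
            print(f"WARNING {test_name}: metric {metric:.3f} is "
                  f"significantly below the expected {expected:.3f}")

    # Using the figure 3 values: striker3's badVision score of 1.499 is
    # well below its optimalnoise score of 4.608, so a warning is printed.
    check_against_expected("gleft180", 1.499, 4.608)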

References

[Bred99] A. Bredenfeld, W. Göhring, H. Günter, H. Jaeger, H.U. Kobialka, P.-G. Plöger, P. Schöll, A. Siegberg, A. Streit, C. Verbeek, and J. Wilberg. "Behavior engineering with 'dual dynamics' models and design tools." In M.M. Veloso, editor, Proc. 3rd Int. Workshop on RoboCup at IJCAI-99, pages 57-62. IJCAI Press, 1999.

[Cohe95] Paul R. Cohen. Empirical Methods for Artificial Intelligence. MIT Press, 1995.

[Jaeg98] Herbert Jaeger and Thomas Christaller. "Dual dynamics: Designing behavior systems for autonomous robots." Artificial Life and Robotics, 2:108-112, 1998. http://www.gmd.de/People/Herbert.Jaeger/Publications.html

[Möll93] Karl Heinrich Möller and Daniel J. Paulish. Software Metrics: A Practitioner's Guide to Improved Product Development. Chapman & Hall, 1993.

[Myer79] Glenford J. Myers. The Art of Software Testing. John Wiley & Sons, New York, 1979.
