Model Learning for Robot Control: A Survey

Duy Nguyen-Tuong · Jan Peters
D. Nguyen-Tuong · J. Peters
Max-Planck Institute for Biological Cybernetics, Spemannstrasse 38, 72076 Tübingen, Germany
Tel.: +49-7071-601-585
E-mail: [email protected]
E-mail: [email protected]

Abstract Models are among the most essential tools in robotics, such as kinematics and dynamics models of the robot's own body and of controllable external objects. It is widely believed that intelligent mammals also rely on internal models in order to generate their actions. However, while classical robotics relies on manually generated models based on human insight into physics, future autonomous, cognitive robots will need to generate such models automatically, from the data streams accessible to the robot. In this paper, we survey the progress in model learning with a strong focus on robot control, on the kinematic as well as the dynamic level. Here, a model describes essential information about the behavior of the environment and the influence of an agent on this environment. In the context of model-based learning control, we view the model from three different perspectives. First, we study the different possible model learning architectures for robotics. Second, we discuss what kinds of problems these architectures and the domain of robotics imply for the applicable learning methods. From this discussion, we deduce future directions for real-time learning algorithms. Third, we show where these scenarios have been applied successfully in several case studies.

Keywords Model learning · Robot control · Machine learning · Regression

1 Introduction

Machine learning may allow us to avoid pre-programming all possible scenarios: instead, the system learns during operation. There have been many attempts at creating learning frameworks that enable robots to autonomously acquire complex skills, ranging from task imitation to motor control [132, 175, 133]. However, learning is not an easy task. For example, reinforcement learning can require more trials and data than are practical to collect on a real physical system.