Research Article
Behaviour coordinations and motion synchronizations for humanoid robot
International Journal of Advanced Robotic Systems September-October 2017: 1–15 © The Author(s) 2017 DOI: 10.1177/1729881417728453 journals.sagepub.com/home/arx
SP Parasuraman1, Phua Seong Hock2, MKA Ahamed Khan2, D Kingsly Jeba Singh3 and Chin Yun Han1
Abstract
A humanoid robot has to solve many tasks during a soccer game to gather evidence from the environment, such as detecting the ball, goal, lines and robot teammates. With these data, the robot has to self-localize, proceed reactively to its next action and carry out the sense–think–act process efficiently. Sense–think–act processes are still a challenging task for humanoid robots. Hence, a modular framework is proposed for the soccer game in which the architecture is mainly composed of object detection, field detection and motion synchronization behaviours. Object detection is modularized into ball detection, segmentation and depth estimation to facilitate the control actions. Similarly, field detection is modularized into goalpost and boundary detection. Motion synchronization is modularized into primitives such as scoring, kip up and diving, which use the proposed support polygon and centre of mass (COM) methods. Behaviour synchronization and execution take place in multiple layers: the player and keeper modes form the expert layer, the modular behaviours form the reactive layer, and servo and motor commands are executed in the skill layer. The behaviour analysis and performance evaluation target the trigonometric depth estimation, the grid-based segmentation pattern learning and recognition, and the support polygon and COM methods. Experimental results are demonstrated and discussed. The proposed modular framework has been tested using the NAO robot.
Keywords Humanoid robot, motion synchronization, humanoid robot soccer, colour blob segmentation, depth estimation
Date received: 4 May 2016; accepted: 5 July 2017
Topic: Humanoid Robotics
Topic Editor: Chrystopher L Nehaniv
Introduction
The challenges in the robot soccer league become tougher every year, and the competitive level of the humanoid teams has risen. Development has gone through many stages in mechanical structure, hardware design, electronic control and related software to fulfil the requirements of the robot competitions.1–3 Simple control systems cannot satisfy the requirements of a soccer game. The need arises to migrate to a more powerful modular control scheme that endows the robot with a high computational capability. There are embedded solutions which use popular control schemes such as distributed embedded control architectures and locomotion control schemes.3–5 These schemes support network communication, mapping and localization.
The remaining limitations include behaviour coordination, analysis of the camera images, and stability while kicking the ball and during goalie diving. Work is still needed to better understand how information is transferred between joints, gait transitions and locomotion. Several
1 School of Engineering, Monash University Malaysia, Petaling Jaya, Selangor, Malaysia
2 Faculty of Engineering, University Selangor, Malaysia
3 Mechanical Engineering, SRM University, Chennai, India
Corresponding author: S Parasuraman, School of Engineering, Monash University Malaysia, Petaling Jaya, Selangor 46300, Malaysia. Email: [email protected]
Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
modular-based behaviour approaches have been tested on wheeled robots,6–8 whereas these schemes are not prevalent in humanoid robots. The proposed approach differs in that the robot task is separated into a modular-based behaviour scheme that is broken into layers. This kind of behaviour decomposition makes it easier to monitor the behaviour of a humanoid robot while it performs tasks. New technical challenges were included9,10 later in 2013, aiming to make visual perception and localization more realistic by providing additional landmarks in the environment. NimbRo-OP [VERSION 15-04]9 was introduced in 2014 and had a big impact on the humanoid leagues because it could quickly adopt, out of the box, many behaviours such as walking, kicking a ball and recovering from a fall. BHuman10, from the Standard Platform League, provides suitable communication protocols and infrastructure on which robot behaviours can be designed and built. Furthermore, even higher level functionality in the software (e.g. localization, vision and behaviour coordination) is implemented using different and often custom middleware. There are now several initiatives to implement soccer robot middleware for important modules such as vision, localization, walking engine and communication. The Robot Operating System is a popular candidate to simplify the interoperability of software developed by different teams.11
A robot has to solve many tasks during a soccer game to gather evidence from the environment, such as detecting the ball, goal, lines and other robots. With this information, the robot has to self-localize, proceed reactively to its next action and carry out the sense-think-act process efficiently. To satisfy these requirements, a modular framework for the soccer game is proposed in which the architecture is mainly composed of object detection followed by field detection and motion synchronization.
Object detection is modularized into ball detection, segmentation and depth estimation to facilitate the control actions. Similarly, field detection is modularized into goalpost and boundary detection. Motion synchronization is modularized into primitives such as scoring, kip up and diving, which use the proposed support polygon and centre of mass (COM) methods. Behaviour synchronization and control take place in three layers: an expert layer for player and keeper mode control (Appendix-4, Figure-4A), a reactive layer for modular behaviours and a skill layer for motor command execution. Behaviour analysis and performance evaluation target the trigonometric depth estimation, the grid-based segmentation pattern learning and recognition, and the support polygon and COM methods, and their results are demonstrated and discussed. The proposed modular framework has been tested using the NAO robot. Two general aspects are needed to make the NAO robot play soccer: motion and vision. In a soccer game, omnidirectional walking is crucial, and the robot’s walking uses
a simple dynamic model of the linear inverted pendulum and quadratic programming. The joint sensors provide feedback to stabilize the robot, which makes walking robust and resistant to small disturbances, and torso oscillations in the frontal and lateral planes are absorbed.12 For the walking gait, the robot uses the three-dimensional linear inverted pendulum mode and generates the trajectory for a biped walk and foot planner. This gives the COP and foot trajectory.13 For vision, the NAO robot sees through two 920p cameras which capture up to 30 fps and are located on the robot’s forehead and at mouth level. These cameras are used to detect the ball, the goalpost and the field lines using model- and feature-based vision.14 Once an object is detected, the distance to it can be calculated using the triangulation method.15 Therefore, the vision and stability of the robot are very important in a soccer game. To play a simple soccer game, the robot needs to detect the soccer ball and know its position relative to the robot, and to detect the goalpost and know its direction relative to the robot. Motion behaviours are also required for the robot to score a goal and to stand back up after a fall. To detect the soccer ball, the colour blob segmentation method is used. Blob detection is carried out through the computation of local maxima or minima of some normalized derivatives of the linear scale-space image representation, which provides information that cannot be obtained from edge detectors or corner detectors.16 Blob detection is widely used because it is a fast and simple method. The resulting regions of interest are then further processed in colour histogram analysis, where the blob descriptors are used for peak detection over a specified range of colour, which can be used to segment the region of interest from the image.
The input image must be in HSV format, where the colour image is separated into individual H, S and V components. Thresholds are then selected based on the H, S and V values of the colour of interest.17 To know the position of the soccer ball relative to the robot, depth estimation based on trigonometric properties can be used. The distance of the object from the camera can be calculated without knowing the intrinsic parameters of the camera, such as its focal length. A single image is obtained from the camera and processed to detect and segment the desired object, giving a blob area. This blob area is then tracked by the camera in two DOF so that it always stays at the centre of the image. The angle of the camera view to the horizontal line and the vertical height of the camera are known. Hence, the depth of the object can be calculated from the trigonometry of a right-angled triangle.18 Pattern recognition via feature points can be used to detect the goalpost. Images taken at different angles have different features, and these features can be extracted from the image using the scale invariant feature transform (SIFT), where the image data are transformed into
Figure 2. Trigonometric properties to determine depth.
Figure 1. Picture explanation of colour blob segmentation.
scale-invariant coordinates relative to local features. For image matching and pattern recognition, SIFT features are extracted from a set of reference images. These features are then stored in a database as the reference SIFT features of the patterns to be recognized. During pattern recognition, each feature of a new image is matched individually against the features prepared in the database. The features are compared by calculating the Euclidean distance between their feature vectors. The pattern is said to be recognized when most of the feature points of the new image match the feature points of the pattern learned and stored in the database.19,20
Methods
Colour blob segmentation
The colour blob segmentation technique is proposed to detect coloured objects; here it is used for red ball detection. The front camera of the robot continuously feeds raw images to the processor so that they can be processed using colour blob segmentation. These raw images are in YUV format and are converted into HSV format to separate the colours. The HSV colour image is then separated into individual H, S and V components, and the thresholds for red are selected based on the H, S and V values of red. These thresholds are then applied to the HSV image to detect and segment the regions that contain red. Once a red blob is segmented, its edge is measured. If one of the segmented blobs is circular in shape, the red ball is said to have been detected. Figure 1 shows the same image in each colour space. The original image shown on the camera monitor, as seen by the user, is in RGB format (Figure 1, top left). The image obtained by the camera located at the robot’s head is in YUV format (Figure 1, top right). This YUV image is processed and converted into an HSV image (Figure 1, bottom left). Based on the HSV image, a colour histogram can be obtained (Figure 1, bottom right). From the colour histogram, the thresholds for red can be selected, and the red ball can be segmented and detected in the HSV image. In the HSV image, the red ball appears very distinct from its surroundings. Hence, colour blob segmentation is very effective for detecting the red ball.21,22
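The thresholding and shape check described above can be sketched in a few lines. This is only an illustrative sketch, not the implementation used on the robot: it assumes RGB input pixels rather than the robot's YUV frames, the threshold values (`hue_tol`, `s_min`, `v_min`, `min_fill`) are hypothetical, and the circle test is a crude bounding-box fill ratio rather than the edge measurement mentioned in the text.

```python
import colorsys


def is_red(pixel, hue_tol=0.05, s_min=0.5, v_min=0.3):
    """True if an (R, G, B) pixel (0-255 per channel) lies in the assumed red HSV band."""
    r, g, b = (c / 255.0 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Red wraps around hue 0, so accept hues close to 0 or close to 1.
    return (h <= hue_tol or h >= 1.0 - hue_tol) and s >= s_min and v >= v_min


def red_blobs(image):
    """Group red pixels into 4-connected blobs; each blob is a list of (row, col)."""
    rows, cols = len(image), len(image[0])
    mask = [[is_red(px) for px in row] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            stack, blob = [(r0, c0)], []
            seen[r0][c0] = True
            while stack:  # iterative flood fill
                r, c = stack.pop()
                blob.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            blobs.append(blob)
    return blobs


def looks_circular(blob, min_fill=0.6, max_fill=0.95):
    """Rough circle test: near-square bounding box filled to roughly pi/4 by the blob."""
    rs = [r for r, _ in blob]
    cs = [c for _, c in blob]
    height = max(rs) - min(rs) + 1
    width = max(cs) - min(cs) + 1
    if min(height, width) / max(height, width) < 0.8:
        return False
    fill = len(blob) / (height * width)
    return min_fill <= fill <= max_fill
```

A frame would be passed to `red_blobs`, and the ball reported as detected when any returned blob satisfies `looks_circular`. A filled square is rejected because its fill ratio is 1.0, whereas a disc fills about 0.785 of its bounding box.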
Trigonometric depth estimation
The depth of the red ball is estimated in order to calculate the position of the red ball relative to the robot’s torso, which allows the robot to move towards the red ball until it reaches the specified distance. Simple trigonometric properties of a right-angled triangle are used to calculate the horizontal distance from the red ball to the camera without knowing the intrinsic parameters of the camera. If the red ball blob is present, the camera tracks the blob area so that it always stays at the centre of the image. Hence, the robot’s head moves in the pitch and yaw directions to keep the blob area at the centre of the image. The robot’s head pitch angle is extracted as the angle between the line from the camera to the red ball and the horizontal line. As the robot moves closer to the red ball, the angle between the centre line of the camera and the horizontal line increases so that the red ball blob area is maintained at the centre of the image. Figure 2 shows the trigonometric relations, including the vertical heights of the robot’s top and bottom cameras when the robot is standing still, taken from the robot’s length information:21 $h_t = 0.38995$ m and $h_b = 0.34405$ m. Based on the camera specification provided, the centre lines of both cameras have an offset from the horizontal line given by $y_t = 1.2^\circ$ and $y_b = 39.7^\circ$, which results in $\theta_t = \text{pitch angle} + 1.2^\circ$ and $\theta_b = \text{pitch angle} + 39.7^\circ$.23–25 Hence, the depth can be calculated from one of two equations, (1) or (2), depending on which camera is used
$$D' = \frac{h_t}{\tan(\theta_t)} \tag{1}$$

$$D' = \frac{h_b}{\tan(\theta_b)} \tag{2}$$
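Equations (1) and (2) can be evaluated directly from the head pitch angle. The following is a minimal sketch using the camera heights and angular offsets quoted in the text; the function and constant names are hypothetical, and it assumes the head pitch is given in degrees with positive values looking downwards.

```python
import math

# Camera heights (m) and centre-line offsets from the horizontal (degrees), as quoted in the text.
CAMERA_HEIGHT_M = {"top": 0.38995, "bottom": 0.34405}
CAMERA_OFFSET_DEG = {"top": 1.2, "bottom": 39.7}


def estimate_depth(head_pitch_deg, camera):
    """Estimated horizontal distance D' (m) to a ball centred in the chosen camera image."""
    angle_deg = head_pitch_deg + CAMERA_OFFSET_DEG[camera]
    if not 0.0 < angle_deg < 90.0:
        # The right-angle-triangle model only holds when the camera looks below the horizontal.
        raise ValueError("camera centre line must point below the horizontal")
    return CAMERA_HEIGHT_M[camera] / math.tan(math.radians(angle_deg))
```

For example, `estimate_depth(5.3, "bottom")` corresponds to a 45° viewing angle, so the estimate equals the bottom camera height of about 0.344 m. A larger pitch angle gives a smaller estimated depth, matching the behaviour described for a robot approaching the ball.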
Grid-based segmentation pattern learning and recognition
At different angles of view, the goalpost appears differently to the robot, and different feature points appear in the image captured by the camera. During the learning phase, these features are extracted from the image using SIFT and stored in a database for use in the pattern recognition phase. During the recognition phase, the new image is processed to extract its SIFT features, and these features are compared with the features stored in the database. Two features are said to match when the Euclidean distance between them is minimum. The pattern is recognized if most of the feature points of the new image match the feature points prepared in the database. Grid-based segmentation pattern learning splits the region of movement, that is, half of the soccer field, into a grid of size 2 × 3. During the learning phase, the robot is placed at the centre of a region with its body facing front and its head turned to face the goalpost. The database is prepared manually by selecting the region of interest to be learned, which enhances the accuracy of pattern recognition. For a grid of size 2 × 3, only six images are required to prepare the database, as each region requires only one image.26,27 Figure 3 shows the segmentation of the soccer field into a 2 × 3 grid. For each region, the centre of the region is marked with a small box indicating the location where the robot is taught to recognize the goalpost. Figure 4 shows the manual selection of the region that the robot needs to learn. It can be seen that the goalpost appears differently from different regions. Hence, SIFT is able to extract different feature points that differentiate each region.
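The matching rule described above (nearest feature by Euclidean distance, then a majority decision) can be sketched as follows. This is a simplified illustration under stated assumptions: the descriptors are short toy vectors rather than real 128-dimensional SIFT descriptors, and `recognise_region`, `max_dist` and `min_ratio` are hypothetical names and thresholds, not part of the robot's vision module.

```python
import math
from collections import Counter


def recognise_region(query_features, database, max_dist=0.5, min_ratio=0.5):
    """Return the grid region whose learned features match most query features, or None.

    database maps a region label to the list of feature vectors learned for it.
    """
    votes = Counter()
    for q in query_features:
        best_region, best_dist = None, float("inf")
        for region, features in database.items():
            for f in features:
                d = math.dist(q, f)  # Euclidean distance between feature vectors
                if d < best_dist:
                    best_region, best_dist = region, d
        if best_dist <= max_dist:  # ignore features with no close counterpart
            votes[best_region] += 1
    if not votes:
        return None  # corresponds to a failed detection
    region, count = votes.most_common(1)[0]
    # "Most of the feature points" is read here as more than min_ratio of the query features.
    return region if count > min_ratio * len(query_features) else None
```

Returning `None` when no region wins a majority separates a failed detection from a false detection, which is the distinction made later in the analysis of the grid sizes.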
Motion synchronization
The synchronization of predefined behaviours which enable the robot to play soccer is divided into two roles, namely player and keeper.

Player
The synchronization of predefined behaviours for the player role includes red ball detection, goalpost detection, kip up and scoring, where each individual behaviour is proposed and designed based on certain primitive rules that make up the behaviour.

Red ball detection. The image obtained from the robot is processed by the vision module core function called ALRedBallDetection. Under this core function, the proposed colour blob segmentation method is used to detect the red ball. For the robot to move towards the red ball until the specified threshold limit, the whole body joints of the robot are required to move. This is activated using the function called ALTracker, which needs to know the position of the red ball relative to the robot. Under this core function, the proposed trigonometric depth estimation method is used to calculate the position of the red ball relative to the robot. Figure 6 shows the primitive branch that defines the red ball detection behaviour, where each level of components that make up the behaviour is defined clearly.

Goalpost detection. To detect the goalpost, the front cameras of the robot are essential, and their images are processed by the vision module. This module activates the core function called ALVisionRecognition for the pattern recognition of the goalpost. Under this core function, a pattern recognition database is proposed and uploaded to the robot. This database must be trained and prepared manually by the user. To recognize the goalpost, the grid-based segmentation pattern learning and recognition techniques are proposed, as shown in Figure 7, which gives the primitive branch that defines the goalpost detection behaviour, where each level of components that make up the behaviour is defined clearly.

Kip up motion. For the robot to perform the respective kip up motions, its three-axis accelerometer is mandatory. The accelerometer values are obtained by the sensing module. This module activates the core function called ALMemory, which provides the accelerometer values used to determine the current pose of the robot (pose determination using accelerometer). From the determined pose, the respective motions are carried out accordingly to enable the robot to stand back up. To carry out these motions, the whole body joints of the robot are required. The joints are controlled by the motion module, which is proposed and developed as shown in Figure 8. This module activates the motion sequences proposed as front kip up III, back kip up I, lean left struggle and lean right struggle. Figure 8 shows the primitive branch that defines the kip up motion behaviour, where each level of components that make up the behaviour is defined clearly.

Scoring motion. For the robot to carry out the kicking motions that enable it to kick the red ball and try to score a goal, the whole body joints of the robot are required to move. These joints are controlled by the motion module as shown in Figure 9. This module activates the motion sequences proposed as the kick I and kick II behaviours. These motion sequences are designed based on the proposed support polygon and COM. Figure 9 shows the primitive branch that defines the scoring motion behaviour, where each level of components that make up the behaviour is defined clearly.

Figure 4. Pattern learning for preparing the database.
Figure 5. One of the motion sequences designed for front kip up.
Figure 6. Primitive branch of red ball detection. (Diagram branches: front camera ×2, vision module, ALRedBallDetection, colour blob segmentation method; whole body joints, motion module, ALTracker, trigonometric depth estimation method.)

Keeper (Appendix-4)
The synchronization of predefined behaviours for the keeper role includes diving and kip up, where each individual
behaviour is designed based on certain primitive rules that make up the behaviour, as shown in Figure 3A of Appendix 3.

Diving. For the robot to locate the red ball, its front cameras are essential, and the obtained image is processed by the vision module. This module activates the core function called ALRedBallDetection, which is required for the detection of the red ball. Under ALRedBallDetection, colour blob segmentation techniques are pooled and applied to detect the red ball. For the robot to track the red ball, its head joint motion is controlled by the motion module. This module activates the core function ALTracker, which needs to know the position of the red ball relative to the robot. Under this core function, the proposed trigonometric depth estimation method is used to calculate the position of the red ball relative to the robot. For the robot to carry out the diving motions that enable it to block the red ball, the whole body joints of the robot are required. These joints are controlled by the motion module (Figure 9). This module activates the motion sequences designed for this behaviour, that is, dive left II, dive right II and squat. These motion sequences are proposed based on the support polygon and COM method. Figure 10 shows the primitive branch that defines the diving motion behaviour, where each level of components that make up the behaviour is defined clearly.

Figure 7. Primitive branch of goalpost detection (goalpost detection behaviour). (Diagram branch: front camera ×2, vision module, ALVisionRecognition, pattern recognition database, grid-based segmentation pattern learning and recognition method.)
Figure 8. Primitive branch of kip up motion (ALMemory). (Diagram branches: 3-axis accelerometer, sensing module, ALMemory (accelerometer values), pose determination using accelerometer; whole body joints, motion module, motion sequence designed (front kip up III, back kip up I), support polygon and centre of mass method.)
Figure 9. Primitive branch of scoring motion (motion module). (Diagram branch: whole body joints, motion module, motion sequence designed (kick I, kick II), support polygon and centre of mass method.)
Figure 10. Primitive branch of diving motion (ALRedBallDetection). (Diagram branches: front camera ×2, vision module, ALRedBallDetection, colour blob segmentation method; NAO’s head joints, motion module, ALTracker, trigonometric depth estimation method; whole body joints, motion module, motion sequence designed (DiveLeft II, DiveRight II, squat), support polygon and centre of mass method.)

Behaviour analysis and results
The behaviour analysis is focused on the trigonometric depth estimation, grid-based segmentation pattern learning and recognition as well as support polygon and COM.
Trigonometric depth estimation
The red ball was placed at distances from the furthest point of 1.4 m to the nearest distance of 0.1 m from the robot, and both cameras were used to analyse the accuracy of this depth estimation method. For each distance, the resulting head pitch angle of the robot was obtained to calculate the estimated depth from equation (1) or (2), depending on which camera was used. Five samples were taken for each distance to obtain the average values shown in Table 1. From the results in Table 1, plots were generated to show the relationships between the factors to be analysed. Figure 11 shows the plot of the estimated depth calculated from the robot’s head joint angle against the actual depth. Figures 12 and 13 show the plots of the absolute error of the estimated depth against the actual depth and the estimated depth, respectively. In Figure 13, the data points could be fitted with polynomials, which are used for further analysis. Figure 14 shows the plot of the percentage error, calculated from the absolute error relative to the actual depth.

Table 1. Summary of results obtained for trigonometric depth analysis.

Actual depth,   Estimated depth, D' (m)     Absolute error, |e| (m)     Percentage error (%)
D (m)           Top camera  Bottom camera   Top camera  Bottom camera   Top camera  Bottom camera
0.1             N/A         0.1777          N/A         0.0777          N/A         77.73
0.2             N/A         0.2627          N/A         0.0627          N/A         31.33
0.3             N/A         0.3494          N/A         0.0494          N/A         16.46
0.4             N/A         0.4388          N/A         0.0388          N/A         9.70
0.5             N/A         0.5326          N/A         0.0326          N/A         6.53
0.6             0.6727      0.6263          0.0727      0.0263          12.12       4.39
0.7             0.7713      0.7239          0.0713      0.0239          10.19       3.42
0.8             0.8669      0.8265          0.0669      0.0265          8.36        3.32
0.9             0.9720      0.9321          0.0720      0.0321          8.00        3.57
1.0             1.0808      1.0415          0.0808      0.0415          8.08        4.15
1.1             1.1917      1.1541          0.0917      0.0541          8.33        4.92
1.2             1.3199      1.2664          0.1199      0.0664          9.99        5.54
1.3             1.4574      1.3841          0.1574      0.0841          12.11       6.47
1.4             1.591       1.5023          0.1901      0.1023          13.58       7.31

Figure 11. Plot of estimated depth against actual depth for both cameras. (Axes: estimated depth, D' (m) against actual depth, D (m); series: top camera, bottom camera.)
Figure 12. Plot of absolute error against actual depth for both cameras. (Axes: absolute error, |e| (m) against actual depth, D (m); series: top camera, bottom camera.)
Figure 13. Plot of absolute error against estimated depth for both cameras. (Axes: absolute error, |e| (m) against estimated depth, D' (m); series: top camera, bottom camera, with polynomial trend lines y = 0.2212x² – 0.3707x + 0.223, R² = 0.9965 for the top camera and y = -0.0457x³ + 0.2584x² – 0.299x + 0.1238, R² = 0.9987 for the bottom camera.)
Figure 14. Plot of percentage error for both cameras. (Axes: percentage error (%) against actual depth, D (m); series: top camera, bottom camera.)
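The two error measures reported in Table 1 follow from the actual and estimated depths by simple arithmetic. A minimal sketch is given below; the function name is hypothetical, and because the published table was presumably computed from unrounded averages, recomputing from the rounded entries may differ in the last digit.

```python
def depth_errors(actual_m, estimated_m):
    """Return (absolute error in metres, percentage error relative to the actual depth)."""
    abs_err = abs(estimated_m - actual_m)
    return abs_err, 100.0 * abs_err / actual_m


# Bottom camera at an actual depth of 0.4 m, estimated as 0.4388 m in Table 1.
abs_err, pct_err = depth_errors(0.4, 0.4388)
```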
Grid-based segmentation pattern learning and recognition
The goalpost appears differently to the robot from different angles of view and has different feature points. The database prepared during the pattern learning phase holds a list of feature points for different angles of view, which is used in the pattern recognition phase. Hence, the preparation of the database is important, as the accuracy of goalpost recognition depends on the effectiveness of the database. This analysis compares how effectively the robot recognizes the goalpost when different methods are used to prepare the database, that is, when half of the field is divided into grid sizes of 2 × 2, 2 × 3, 3 × 2 and 3 × 3. The robot is placed at the centre of each region of the segmented grid, and its head is turned in the yaw direction to face the goalpost so that pattern learning can take place. In the pattern recognition phase, each region has five trials, and in each trial the robot is placed at a different position in that region. The five trials are trial 1 (centre of region), trial 2 (top left of region), trial 3 (top right of region), trial 4 (bottom left of region) and trial 5 (bottom right of region). Pattern recognition is carried out for each trial in each region, and the summary of the results obtained for each grid size is shown in Table 2.

Table 2. Summary of results obtained for grid-based segmentation learning and recognition analysis.

Grid size   Percentage of accurate   Percentage of false   Percentage of failed
            detection (%)            detection (%)         detection (%)
2 × 2       85.00                    0.00                  15.00
2 × 3       83.33                    16.67                 0.00
3 × 2       83.33                    3.33                  13.33
3 × 3       66.67                    33.33                 0.00

Support polygon and COM
The designed motion behaviours are made up of a series of motion sequences, where the stability of each motion sequence is analysed and the overall stability of the motion behaviour is examined. The stability of the robot is a very important aspect of designing motion sequences, as it prevents the robot from falling down while performing the desired motion. The stability of the motion behaviours created is ensured by checking the support polygon of each motion sequence and its respective COM. The support polygon is determined by first locating all the contact points between the robot and the soccer field. These contact points are then connected to each other, forming a base region under the robot. The largest region is classified as the support polygon of the robot for that particular motion sequence. The robot is said to be stable if the COM falls within the support polygon. The robot will fall if the COM falls outside the support polygon. This analysis was carried out on all six motion behaviours designed, namely front kip up III, back kip up I, kick I, kick II, dive left II and dive right II. The stability of each motion sequence was observed and summarized in Figure 15.

Figure 15. Bar chart comparing the performance of different motion behaviours. (Chart title: comparison between motion behaviours; vertical axis: number of times; horizontal axis: motion behaviours (front kip up III, back kip up I, kick I, kick II, dive left II, dive right II); series: motion sequence, COM close to edge, unstable occurrence.)
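The support polygon check described in this section can be illustrated with a small geometric sketch. It makes two assumptions that are not stated in the article: the "largest region" formed by the contact points is taken to be their convex hull, and the COM is represented by its projection onto the ground plane. All function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def support_polygon(contact_points):
    """Convex hull of the ground contact points (Andrew's monotone chain), counter-clockwise."""
    pts = sorted(set(contact_points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def is_stable(com_xy, contact_points):
    """True if the ground projection of the COM lies inside or on the support polygon."""
    hull = support_polygon(contact_points)
    if len(hull) < 3:
        return False  # a point or line contact encloses no area
    n = len(hull)
    # For a counter-clockwise polygon, an interior point is to the left of every edge.
    return all(cross(hull[i], hull[(i + 1) % n], com_xy) >= 0 for i in range(n))
```

With foot contact points at the corners of a 0.1 m by 0.2 m rectangle, a COM projected at (0.05, 0.1) is reported as stable and one at (0.2, 0.1) as unstable. A "COM close to edge" category like the one in Figure 15 could be derived by additionally measuring the distance from the COM projection to the nearest hull edge.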
Discussions
Trigonometric depth estimation
In the plot shown in Figure 11, both the top and bottom cameras display the correct trend, in which the estimated depth increases linearly with the actual depth. Each time the actual depth increases by 0.1 m, the estimated depth is expected to increase by 0.1 m as well. Although the increase in estimated depth is not exactly 0.1 m, it is almost linear, giving the correct trend of increasing estimated depth as the actual depth increases. The maximum depth that both cameras can estimate is only 1.4 m because of the limitation of the object classification, which is the colour blob segmentation method. The red ball at a depth of 1.5 m appears as a very small region in the image captured by the robot, which is not considered an appropriate region to be classified. The minimum depth that can be estimated is 0.1 m for the bottom camera and 0.6 m for the top camera. The limitation of the bottom camera is the chest of the robot, which blocks the camera’s view of the red ball when the depth goes below 0.1 m. For the top camera, the limitation is the maximum joint angle of the robot’s head pitch, so the red ball cannot be kept at the centre of the image if the depth goes below 0.6 m. In Figure 12, the absolute error for the bottom camera is larger at both extremes of 0.1 m and 1.4 m, whereas it is smallest at the centre of the working range, at a depth of 0.7 m. Although the working range of the top camera is only from 0.6 m up to 1.4 m, a similar trend can be seen, where the absolute error is large at the extreme of 1.4 m. The plot also shows that the absolute error of the top camera is larger than that of the bottom camera. On further analysis, the absolute error of each camera can be approximated by a quadratic or cubic equation that defines the relationship between the absolute error and the estimated depth. This is shown in Figures 12 and 13.
For the bottom camera, a cubic equation (3) is found with an R² value of 99.87%, whereas a quadratic equation (4) is found with an R² value of 99.65% for the top camera.

$$|e| = -0.0457D'^3 + 0.2584D'^2 - 0.299D' + 0.1238 \tag{3}$$

$$|e| = 0.2212D'^2 - 0.3707D' + 0.223 \tag{4}$$
Both equations fit the data well, as their R² values are greater than 99%. These equations could be used to determine the absolute error so that the results obtained are as close as possible to the actual depth. Figure 14 shows the percentage error of the estimated depth, where the percentage error for the bottom camera is as high as 77.73% at 0.1 m. It decreases as the actual depth increases, but after 0.7 m, the percentage error increases again. This conforms to the trend shown in Figure 12, where the absolute error is larger at both extremes. However, the percentage error from 0.4 m to 1.4 m is less than 10%. Similarly, for the top camera, the percentage error decreases as the actual depth increases until 0.9 m and then increases again. The percentage error of the top camera from 0.8 m to 1.2 m is less than 10% but is still higher than that of the bottom camera. In this project, both cameras are used, although the bottom camera appears to have an advantage in the accuracy of the estimated depth. To view objects at a distance such as 1.4 m, the top camera has a wider search range, where the robot’s head yaw can rotate from −70° to 70° without colliding with its shoulder. For the bottom camera to view objects at the same distance of 1.4 m, the range over which the robot’s head yaw can rotate decreases to −40° to 40° only. Hence, the top camera is used to locate the red ball at distances of more than 0.7 m, and the bottom camera is used to move the robot towards the red ball until a threshold of 0.2 m, where accuracy is critical. Reliable depth perception eases and enables a large variety of attentional and interactive behaviours on humanoid robots. However, in the earlier methods,28,29 the use of depth estimation in real-world scenarios is hindered by the difficulty of computing real-time and robust disparity maps from moving stereo cameras.
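One way the fitted polynomials and the camera hand-over could be applied is sketched below. This is an illustrative sketch rather than the authors' procedure: the coefficient signs are taken from the trend-line labels of Figure 13, the correction assumes the estimate always overshoots the actual depth (as every entry in Table 1 does), and the function names are hypothetical.

```python
def abs_error_bottom(d_est):
    """Fitted absolute error (m) of the bottom camera, equation (3)."""
    return -0.0457 * d_est**3 + 0.2584 * d_est**2 - 0.299 * d_est + 0.1238


def abs_error_top(d_est):
    """Fitted absolute error (m) of the top camera, equation (4)."""
    return 0.2212 * d_est**2 - 0.3707 * d_est + 0.223


def corrected_depth(d_est, camera):
    """Subtract the fitted error from the estimate (assumes the estimate overshoots)."""
    err = abs_error_top(d_est) if camera == "top" else abs_error_bottom(d_est)
    return d_est - err


def choose_camera(depth_m, switch_m=0.7):
    """Top camera beyond the 0.7 m hand-over distance, bottom camera otherwise."""
    return "top" if depth_m > switch_m else "bottom"
```

For the top camera estimate of 0.6727 m in Table 1, the corrected value is about 0.599 m against an actual depth of 0.6 m. The fits are only meaningful inside the measured working ranges (roughly 0.6 to 1.4 m actual depth for the top camera and 0.1 to 1.4 m for the bottom camera).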
In these scenarios, the proposed depth estimation techniques help the robot to identify the ball and the environment and to prepare for its next action in real-world settings. For the estimation of the camera parameters, the procedure described by Fanello et al.28 and Ciliberto et al.29 was adapted, except for the calibration procedures. The proposed method is viable in dynamic conditions compared with the methods of Pasquale et al.30 and Sadeh-Or and Kaminka,31 which impose static constraints in real-world scenarios. In the RoboCup Humanoid League (RCHL),20 few of the robots are able to use the inherent dynamics of the motion, find the ball location and estimate the actual depth of the environment. Compared with RCHL approaches, a qualitative comparison suggests that the proposed depth estimation techniques are more energy efficient. The use of compliance in control is greatly improved compared to the RCHL,20 while supporting high-speed kicks, robustness to falls and safe robot interactions.11
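The camera hand-over described in this section can be sketched as a small decision function. This is an illustration only: the names are hypothetical, and applying the yaw limits stated for a 1.4 m viewing distance to every distance is a simplifying assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the constant names are hypothetical.
TOP_CAMERA_MIN_RANGE_M = 0.7  # top camera locates the ball beyond this depth
APPROACH_STOP_M = 0.2         # robot approaches the ball until this depth

# Head yaw limits (degrees) reported for viewing objects at 1.4 m.
# ASSUMPTION: the same limits are reused here for every distance.
YAW_LIMIT_DEG = {"top": 70.0, "bottom": 40.0}


@dataclass(frozen=True)
class CameraChoice:
    camera: str
    yaw_min_deg: float
    yaw_max_deg: float
    keep_approaching: bool


def choose_camera(depth_m):
    """Pick the camera and head yaw search range for an estimated ball depth."""
    camera = "top" if depth_m > TOP_CAMERA_MIN_RANGE_M else "bottom"
    limit = YAW_LIMIT_DEG[camera]
    # The robot keeps walking towards the ball until the 0.2 m threshold.
    return CameraChoice(camera, -limit, limit, depth_m > APPROACH_STOP_M)
```

For example, `choose_camera(1.0)` selects the top camera with a search range of −70° to 70°, whereas `choose_camera(0.5)` selects the bottom camera.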
Grid-based segmentation pattern learning and recognition
From Table 2, the percentage of accurate detection is highest for the 2 × 2 grid, followed by the 2 × 3 and 3 × 2 grids, which share the same percentage, and is lowest for the 3 × 3 grid. As the number of grid cells increases, the robot has more regions in which to learn about the goalpost. Images of the goalpost are captured and features are extracted from them. When the number of grid cells, and hence the number of regions, increases, the goalpost features of some regions look very similar to those of other regions, which causes the robot to confuse the regions. The percentage of false detection is higher for grids with three columns than for grids with two columns. Most of the false detections occur in region 2, which is directly in front of the goalpost. When the robot rotates its head in the yaw direction, it tends to detect region 2 as region 1 or region 3, because the feature points of region 2 make up part of the feature points of regions 1 and 3. Which region is detected depends on the direction of the robot's head rotation. However, the percentage of failed detection is high for grids with two columns, while grids with three columns have a 100% detection rate. Failed detection always happens along the centre line, where the robot faces the goalpost perpendicularly. This is because no feature points are learned for a head-on view of the goalpost: the features learned by the robot come from either the left or the right side of the goalpost. There is no failed detection for grids with three columns because the robot is able to learn the features of the goalpost from three different directions, namely left, right and front. In this project, the percentage of successful detection is very important, as the goalpost needs to be found in order for the robot to perform scoring actions.
The classified regions are less important because they are not utilized. The rotation that turns the robot to face the goalpost is determined from the head yaw angle, which is independent of the classified region. Hence, a grid size with a 100% detection rate is required to ensure that the robot is always able to detect the goalpost. Of the 2 × 3 and 3 × 3 grids, which both have a 100% detection rate, the 2 × 3 grid is chosen because it has a higher percentage of accurate detection. The results demonstrate the feasibility of the proposed method. Compared with feature-based pose estimation and learning,31,32 the proposed method significantly reduces the computational resources required, since very few features are used. The computational resources needed for feature-based learning are quite expensive, and the method cannot run in real time for the entire journey of the robot. As a result, it may still be possible that the error in the robot pose estimate becomes too high and cannot always be corrected by landmarks.
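The selection rule described above (keep only the grid sizes that never fail to detect the goalpost, then take the one with the highest accurate detection) can be sketched as follows. The function name and the data layout are hypothetical, and the percentages in the example are made-up placeholders that only follow the ordering described in the text; they are not the values of Table 2.

```python
def choose_grid_size(results):
    """Return the (rows, cols) grid with no failed detections and the best accuracy.

    results maps (rows, cols) to a dict with the percentages
    'accurate', 'false' and 'failed' for that grid size.
    """
    always_detects = {g: r for g, r in results.items() if r["failed"] == 0.0}
    if not always_detects:
        raise ValueError("no grid size reaches a 100% detection rate")
    return max(always_detects, key=lambda g: always_detects[g]["accurate"])


# PLACEHOLDER percentages for illustration only, NOT the measurements of Table 2.
illustrative_results = {
    (2, 2): {"accurate": 80.0, "false": 10.0, "failed": 10.0},
    (2, 3): {"accurate": 70.0, "false": 30.0, "failed": 0.0},
    (3, 2): {"accurate": 70.0, "false": 15.0, "failed": 15.0},
    (3, 3): {"accurate": 60.0, "false": 40.0, "failed": 0.0},
}

chosen = choose_grid_size(illustrative_results)
```

With these placeholder values the rule returns the 2 × 3 grid, matching the choice made in this work.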
Support polygon and COM
Figure 1A shows that a more complicated motion behaviour requires a larger number of motion sequences to ensure stability. Kip up behaviours (Figure 5), which enable the robot to stand back up after a fall, are the most complicated of the six motion behaviours: front kip up III and back kip up III required 23 and 10 motion sequences, respectively, so that they could be executed in a stable manner. Diving behaviours, which enable the robot to fall down in a specific manner, are the least complicated motion behaviours. Hence, they require the smallest number of motion sequences to be performed in a stable manner. Stability during dynamic walking is achieved by applying the support polygon criterion in the control scheme to guarantee balance control while walking.24,33 To maintain stability, the COM has to be completely within the support polygon at all times. If one leg is in the air, the support polygon is equal to the footprint of the foot in contact with the ground, so the COM has to be completely within that footprint to maintain stability. If both feet are in contact with the ground, the COM can lie within the area spanned by the two footprints.34 The effectiveness of the proposed techniques has been demonstrated by generating the support polygons as suggested; the results are shown in Figures 1A and 2A. As seen in Figures 1A and 2A, in most cases the COM was within the area spanned by the two footprints. As shown in Figure 15, unstable occurrences take place when the COM is close to the edge, and suitable remedies have been applied. The analysis of the support polygon of each motion sequence shows that the stability of the robot for a particular motion sequence decreases when the COM falls close to the edge of the support polygon.
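The support polygon criterion above can be sketched as a convex hull of the ground contact points followed by a point-in-polygon test on the ground projection of the COM. This is a generic geometric illustration, not the authors' implementation; the foot dimensions and names below are hypothetical.

```python
def cross(o, a, b):
    """z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def com_inside(polygon, com):
    """True if the COM ground projection lies inside or on a CCW convex polygon."""
    n = len(polygon)
    if n < 3:
        raise ValueError("a support polygon needs at least three vertices")
    return all(cross(polygon[i], polygon[(i + 1) % n], com) >= 0 for i in range(n))


# HYPOTHETICAL rectangular footprints (x forward, y left, in metres).
left_foot = [(0.0, 0.05), (0.16, 0.05), (0.16, 0.13), (0.0, 0.13)]
right_foot = [(0.0, -0.13), (0.16, -0.13), (0.16, -0.05), (0.0, -0.05)]

double_support = convex_hull(left_foot + right_foot)  # both feet on the ground
single_support = convex_hull(left_foot)               # right leg in the air
```

With these hypothetical footprints, a COM projected at (0.08, 0.0) lies inside the double-support polygon but outside the single-support polygon of the left foot, which is why the COM must be shifted over the stance foot before the other leg is lifted.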
When the number of motion sequences in which the COM falls close to the edge of the support polygon increases, the number of unstable occurrences throughout the whole motion increases. This can be seen from Figures 1A and 2A (Appendixes 1 and 2), where kip up behaviours have the largest number of unstable occurrences throughout the whole motion, as these behaviours have the largest number of motion sequences in which the COM falls close to the edge of the support polygon. Similarly, kick behaviours, which have the smallest number of motion sequences in which the COM falls close to the edge of the support polygon, have the smallest number of unstable occurrences throughout the whole motion. This is because, when the motion sequences are combined, the COM of a motion sequence that is close to the edge of the support polygon might shift out of the support polygon due to the moment and inertia caused by each motion sequence, causing a momentary instability of the robot. However, the shift is small and the degree of instability of the robot is low. The robot regains its stability when the next stable motion sequence is executed. In conclusion, a more complicated motion behaviour requires a larger number of motion sequences to perform a stable motion. While designing each motion sequence, stability has to be ensured by keeping the COM within the support polygon. The higher the number of motion sequences in which the COM falls close to the edge of the support polygon, the higher the number of unstable occurrences throughout the whole motion. However, the overall motion behaviour still remains stable.
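One way to quantify how close the COM falls to the edge of the support polygon is the shortest distance from the COM ground projection to the polygon boundary. The sketch below counts the motion sequences whose margin is below a threshold. The names and the threshold value are hypothetical, and the COM is assumed to lie inside the polygon, since the distance alone does not distinguish inside from outside.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Clamp the projection of p onto the segment to the range [0, 1].
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def edge_margin(polygon, com):
    """Shortest distance from the COM ground projection to the polygon boundary."""
    n = len(polygon)
    return min(distance_to_segment(com, polygon[i], polygon[(i + 1) % n])
               for i in range(n))


def count_near_edge(sequences, threshold_m):
    """Count motion sequences whose COM margin is below threshold_m.

    sequences is a list of (polygon, com) pairs, one per motion sequence.
    """
    return sum(1 for polygon, com in sequences
               if edge_margin(polygon, com) < threshold_m)
```

A behaviour with a higher count would be expected to show more momentary instabilities, in line with the comparison between kip up and kick behaviours above.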
Conclusions
This article has presented the primitive rules used in designing the individual behaviours required to enable a humanoid robot to play soccer. The behaviours are separated into two categories, namely object detection behaviours and motion behaviours. Object detection behaviours include red ball detection, which uses the colour blob segmentation and trigonometric depth estimation primitive rules, and goalpost detection, which uses the 2 × 3 grid-based segmentation pattern learning and recognition primitive rule. Motion behaviours include scoring, diving and kip up, which use the primitive rule of the support polygon and centre of mass of the robot. The primitive rules used to synchronize all the individual behaviours are also presented. The synchronization of these behaviours results in two different roles that can be performed by the robot, namely the player and keeper roles: the player role consists of red ball detection, goalpost detection, scoring and kip up behaviours, whereas the keeper role consists of diving and kip up behaviours.
Author note
Phua Seong Hock is now affiliated to School of Engineering, Monash University Malaysia, Petaling Jaya, Selangor, Malaysia.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
1. Matsumura R, Shibatani N, Imagawa T, et al. Teamosaka team description paper. In: RoboCup 2007, Atlanta GA: Humanoid League Team Descriptions, 2007.
2. Kitano H and Asada M. The RoboCup humanoid challenge as the millennium challenge for advanced robotics. Adv Robot 2000; 13(8): 723–737.
3. Matsumura R and Ishiguro H. Development of a high-performance humanoid soccer robot. Int J Humanoid Robot 2008; 5(03): 353–373.
4. Ha I, Tamura Y, Asama H, et al. Development of open humanoid platform DARwIn-OP. In: Proceedings of IEEE SICE annual conference, 13–18 September 2011, pp. 2178–2181, The Society of Instrument and Control Engineers.
5. Cañas JM and Matellán V. From bio-inspired vs. psycho-inspired to etho-inspired robots. Robot Autonom Syst 2007; 55: 841–850. ISSN 0921-8890.
6. Herrero D and Martinez H. Embedded behaviour control of four legged robots. In: RoboCup symposium, 2008.
7. Friedmann M, Kiener J, Petters S, et al. Versatile, high-quality motions and behavior control of a humanoid soccer robot. Int J Humanoid Robot 2008; 5(03): 417–436.
8. RoboCup Humanoid League Technical Committee. RoboCup Soccer Humanoid League rules and setup for the 2013 competition in Eindhoven, Netherlands, 28 May 2013. http://www.tzi.de/humanoid/.
9. Schwarz M, Pastrana J, Allgeuer P, et al. Humanoid teen size open platform NimbRo-OP. In: Proceedings of 17th RoboCup International Symposium, Eindhoven, Netherlands, June 2013. DOI:10.1007/978-3-662-44468-9_51.
10. Röfer TR, Laue T, Muller JM, et al. B-human team code release, 6 January 2014.
11. Haddadin S, Laue T, Frese U, et al. Kick it with elasticity: safety and performance in human–robot soccer. Robot Autonom Syst 2009; 57(8): 761–775.
12. Gutmann JS, Fukuchi M and Fujita M. 3D perception and environment map generation for humanoid robot navigation. Int J Robot Res (IJRR) 2008; 27(10): 1117–1134.
13. Michel P, Chestnutt J, Kagami S, et al. GPU-accelerated real-time 3D tracking for humanoid locomotion and stair climbing. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2007.
14. Chestnutt J, Nishiwaki K, Kuffner J, et al. An adaptive action model for legged navigation planning. In: Proceedings of the IEEE-RAS international conference on humanoid robots (humanoids), 29 November 2007, pp. 196–202.
15. Krivanec S, Petrackova A, Thi Phuong Linh T, et al. Robot Application developed in choregraphe environment. Czech technical University Faculty of Electrical Engineering Department of Cybernetics, Czech Technical University, CTU-CMP-2010-21, 9 December 2010, pp. 1–20.
16. Ming A and Ma H. A Blob detector in color images. Amsterdam, The Netherlands: CIVR'07, 2007.
17. Emami S. Facebots: steps towards enhanced long term human robot interaction. Paladyn J Behav Robot 2010; 1(3).
18. Fischler MA and Bolles RC. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. CSCW 2000, ACM, Philadelphia, USA, 2 December 2000.
19. Lowe DG. Distinctive image features from scale-invariant key points. Int J Comput Vis 2004; 60(2): 91–110.
20. Dominey PF, Metta G, Nori F, et al. Anticipation and initiative in human-humanoid interaction proc. In: IEEE conference on humanoid robotics, South Korea, 1 December 2008. 10.1109/ICHR.2008.4755974.
21. Fox D, Burgard W and Thrun S. Markov localization for mobile robots in dynamic environments. J Artif Intell Res 1999: 11.
22. Atherton T, Kerbyson D and Nudd G. Passive estimation of range to objects from image sequences. Coventry, UK: University of Warwick, 24 December 1991, pp. 343–344.
23. Cupec R, Schmidt G and Lorch O. Experiments in vision-guided robot walking in a structured scenario. In: Proceedings of the IEEE international symposium on industrial electronics, 2005.
24. Parasuraman S, Hang FJ and Ahmed Khan MKA. Robot-crawler: statically balanced gaits. Int J Adv Robot Syst 2012; 9: 1–9.
25. Hornung A, Wurm KM and Bennewitz M. Humanoid robot localization in complex indoor environments. In: Proceedings of the IEEE/RSJ international conference on intelligent robots and systems (IROS), 2010.
26. Parasuraman S, Ganapathy V and Shirinzadeh B. Behavior rule selection using α-level FIS for mobile robot navigation during multiple rule conflicts in the complex environments. Inderscience. Int J Autom Control 2008; l1(4): 342–336.
27. Joglekar A, Joshi D, Khemani R, et al. Depth estimation using monocular camera. Int J Comput Sci Inf Technol 2011; 2(4): 1758–1763.
28. Fanello S, Pattacini U, Gori I, et al. 3D stereo estimation and fully automated learning of eye-hand coordination in humanoid robots. In: IEEE/RAS international conference on humanoid robots, 2014, pp. 1028–1035.
29. Ciliberto C, Fanello S, Santoro M, et al. On the impact of learning hierarchical representations for visual recognition in robotics. In: Proceedings IEEE/RSJ international conference on intelligent robots and systems, 2013. DOI: 10.1109/IROS.2013.6696893.
30. Pasquale G, Ciliberto C, Odone F, et al. Teaching iCub to recognize objects using deep Convolutional Neural Networks. In: Proceedings of the 4th workshop on machine learning for interactive systems, 32nd international conference on machine learning, France, 6 July 2015, pp. 21–25.
31. Sadeh-Or E and Kaminka G. Anysurf: flexible local features computation. In: Röfer T, Mayer N, Savage J and Saranlı U (eds) RoboCup 2011: Robot soccer world cup XV. Vol. 7416, Springer, 2012, pp. 174–185.
32. Yan W, Weber C and Wermter S. A hybrid probabilistic neural model for person tracking based on a ceiling-mounted camera. J Ambient Intell Smart Environ 2012; 3(3): 237–252.
33. Vukobratovic M and Borovac B. Zero-moment point - thirty five years of its life. Int J Humanoid Robot 2004; 1: 157–173.
34. Suleiman W, Kanehiro F, Miura K, et al. Enhancing zero moment point-based control model: system identification approach. J Adv Robot 2012; 25(3-4): 427–446.
Appendix 1 Motion sequence designed: Front kip up III
Figure 1A. First eight motion sequences of front kip up III designed.
Appendix 2 Motion sequence designed: Back kip up I
Figure 2A. Motion sequences of back kip up I designed.
Appendix 3 Behaviour synchronization player mode
Figure 3A. Program flow chart for behaviours synchronization in player mode.
Appendix 4 Behaviour synchronization keeper mode
Figure 4A. Program flow chart for behaviours synchronization in keeper mode.