Using integrated vision systems: Three Gears and Leap Motion, to control a 3-finger dexterous gripper

Igor Zubrycki and Grzegorz Granosik
Lodz University of Technology, tel. +48 42 6312554
[email protected], [email protected]
http://robotyka.p.lodz.pl/

Abstract. In this paper we have tested two vision-based technologies as possible control interfaces for a dexterous 3-finger gripper. Both a qualitative analysis and a quantitative comparison with a sensor glove are presented. We also provide ready-to-use solutions to directly control the movements of the gripper and to support the operator in difficult manipulation tasks by applying gestures.

Keywords: vision system, Leap Motion, Three Gear, ROS

1 Motivation

Manipulators with dexterous grippers can manipulate objects in constrained environments in ways that would otherwise be impossible with only a 6-DOF manipulator and a parallel-jaw gripper [1]. As robots become more and more used in areas less structured than typical industrial environments, the expectation that robots can manipulate objects in more dexterous ways is growing. However, coordinated motion of a dexterous gripper is not trivial and, in fact, the whole system can be considered as a group of several manipulators - fingers - working in a common space. In tasks like telemanipulation, programming or teaching the robot gripping actions, there is a common need for intuitive control of the gripper. As we have explained in previous work, a sensor glove can be used to successfully operate grippers in an intuitive way [2], [3]. However, this is not an ideal solution, as sensor gloves have some important disadvantages: they need to be calibrated as their readings drift with time and temperature, they wear mechanically, and one hand of the user is always occupied by the device. Recently, several solutions for tracking hand motion using 3D vision systems became available; they have advantages that can be favorable when controlling a dexterous gripper:

– based only on vision information, they track finger poses without any mechanical devices (even markers) attached to the hand,

– they return data on the position and orientation of hands in space,
– they provide some motion and gesture information that can additionally be used to control the behavior of the manipulator or gripper.

Based on these features we have examined two easily available solutions: the Leap Motion (LM) sensor and the Three Gears (3G) system, shown in Fig. 1. The gripper used in our tests was the 3-finger Schunk Dexterous Hand (SDH).

Fig. 1. Three Gears system (left) and Leap Motion (center) vision systems, SDH gripper (right)

2 Basic features of Leap Motion and Three Gears vision systems, and their comparison

Both the Three Gears system and the Leap Motion are integrated vision systems; that means they are not conventional 3D scanners providing information in the form of a 3D point cloud, but systems designed specifically for hand tracking, providing additional information about hand movements and gestures. These are commercially available products whose main market is PC users seeking gestural interfaces to control various applications. Where did they come from? With the arrival of iPhones and Android devices, gestural interfaces became available to a large group of people, proving that this kind of control can be very intuitive and comfortable for users [4]. As the original solution requires a touchscreen to function, users of ordinary desktops started looking for a substitute. Additionally, the big success of the Kinect sensor proved that there is also a large market for 3D gestural interfaces. Both tested systems use IR cameras and computer vision to determine the 3D shape of the observed object, but as integrated solutions designed for hand tracking they provide additional functionalities, namely:

– hand position and orientation: the systems automatically determine which part of the 3D scene is a hand, extract it and give information (position and orientation) about the center of the palm,
– position and orientation of every visible finger,
– recognition of some set of gestures,
– additional high-level information (e.g. tracking accuracy, hand scale).

Although the basic functionality of both systems is similar, they differ significantly in application; a comparison of a few important features is provided in Table 1.

Table 1: Comparison of Three Gears and Leap Motion systems.

Construction
– Three Gears: uses 3D cameras developed by Microsoft and PrimeSense. The sensor uses Light Coding technology, where a built-in laser projector projects a pattern on the whole scene, which is observed by an IR camera. Readings are then decoded and sent to the computer as 3D point cloud data and RGB video. The device was created to track the pose of the whole body and because of that its focal point is set for a corresponding distance. All devices designed by PrimeSense can be used, but Three Gears recommends the short-range sensor Carmine 1.09 (range 0.35-1.4 m). The 3D scanner has to be set up 0.7 m above the workspace on a camera stand.
– Leap Motion: designed specifically for hand tracking and can be placed on the desk just next to the keyboard. It is small (76 x 30 x 17 mm) and lightweight (45 g). Leap Motion uses two cameras and 3 IR LEDs to improve lighting conditions. All calculations necessary to transform the stereovision images into a point cloud, and further into object positions, are performed on the host computer. This can result in a high machine load: about 20% processor usage on an Intel Core i3 2.4 GHz laptop.

Positioning precision
– Three Gears: 1 mm stated in the specs (for the PrimeSense Carmine 1.09).
– Leap Motion: 0.01 mm stated in the specs and 0.2 mm measured in tests.

Hand tracking API
– Three Gears: quaternion of palm rotation and vector of translation.
– Leap Motion: vector of direction and vector of translation; speed and direction of hand motion.

Finger tracking API
– Three Gears: each joint identified by name, with a quaternion of rotation and a vector of position relative to the base.
– Leap Motion: direction and translation vectors of visible fingers, without names.

Gestures API
– Three Gears: static gestures of the hand using one or two hands (pressed, dragged, released, moved, simultaneously pressed, etc.).
– Leap Motion: gestures based on hand motion (swipe, keyboard tap, wall tap, circle).

Additional API
– Three Gears: tracking accuracy, hand scale, calibration information.
– Leap Motion: ability to track long, straight objects (giving their direction and translation vector), calculation of the radius of a ball fitting inside the palm (e.g. when imitating catching something).

Operating systems and language availability
– Three Gears: the main binary server has to run on a Mac/Windows computer; libraries are available in Java and C++; the server also publishes information in a text protocol described on the website, so any language that can parse it can be used (each message ends with a carriage return).
– Leap Motion: Windows/Mac with full API and Airspace (the App Store for Leap Motion programs); Linux: beta version for developers of libraries and driver; WebSocket JSON server, with libraries for Python, Java, C++ and JavaScript.

Robustness
– Three Gears: basic 3D cloud acquisition is done directly by the Kinect device. The laser projector provides good cloud acquisition everywhere except in direct sunlight or around bright lights. Hand position tracking is reliable when the hand is at least a few cm above the ground. Finger pose tracking reliability is highly dependent on the hand's orientation and works well only for small angles relative to the sensor plane, when the fingers do not self-occlude. Gestures are classified well.
– Leap Motion: based on an IR stereo vision camera with an artificially lighted scene. Two modes, normal and robust, are switched based on the intensity of ambient light; the worst tracking was observed for light conditions near the switching point. Stereovision is based on edge detection, with poor results for smooth surfaces such as gloved hands. Finger tracking is poor for large angles; fingers that reappear in the scene get a different position on the list, which makes them difficult to track.

Popularity and support
– Three Gears: 133 followers on Twitter; we have email contact with the developers.
– Leap Motion: 26,894 followers on Twitter; an active developer forum and phone customer support.

3 Gripper Control

We have been working with the SDH gripper for a few months now, preparing a convenient ROS-based control architecture [2] and testing various grips [3]. With its 7 DOFs this three-finger hand is really dexterous, providing a few default grasps and independent control of all joints. However, intuitive operation of this device is challenging. We started our development using a sensor glove as the operator's interface; now we are proposing vision systems as a possible substitution or improvement because:

– they offer non-contact control; the operator's hand is not occupied permanently (which might be important for some tasks, e.g. SWAT operations),

– information about the position and orientation of the entire hand can be used to control the manipulator, while the finger poses can be used to control the gripper,
– there is no need to calibrate,
– some hand gestures, based on the movement or the position of the hands relative to each other, can be detected and used.

However, vision systems have their own limitations, mainly instability and self-occlusion of objects. The latter problem is important for single-view schemes - as in the case of Three Gears and Leap Motion. Taking into account both advantages and limitations, we have tested two possible scenarios of controlling the SDH gripper, direct and indirect, which we explain in the following sections.

3.1 Direct control

Integrated vision systems can be used for direct control of grippers. In this case, information about the position and orientation of fingers is translated into a particular pose of the gripper. For the SDH, the motion of 5 human fingers has to be transformed into joint positions for 7 DOFs. We have tested different ways of direct mapping, taking into account the specific features of the two devices, namely:

– 3G provides angle values for all phalanges and therefore we can calculate (using some constant ratio) angle positions for each joint of the SDH; we can choose three particular human fingers to follow, or a more sophisticated mapping (a minimal sketch of this mapping is given below),
– LM provides information about the position of the fingertip and its orientation, and therefore we have to solve the inverse kinematics problem for the gripper's fingers to calculate appropriate joint angles,
– LM also has a special function to track the position and orientation of elongated objects visible in the scene - based on this feature we have built a mock-up gripper (reproducing the Willow Garage experiment) whose position/orientation can be followed by the gripper's position/orientation.

The most important limitations we have observed are: occlusion, erroneous pose detection and API malfunctions.
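The constant-ratio mapping mentioned in the first item can be sketched as a small ROS node, shown below. This is only an illustration under stated assumptions: the topic names, SDH joint names, scaling ratios and joint limits are placeholders that would have to match the actual Three Gears wrapper and SDH driver configuration.

#!/usr/bin/env python
# Sketch of the constant-ratio direct mapping (assumed topic and joint names).
import rospy
from sensor_msgs.msg import JointState

SDH_JOINTS = ['sdh_knuckle_joint', 'sdh_thumb_2_joint', 'sdh_thumb_3_joint',
              'sdh_finger_12_joint', 'sdh_finger_13_joint',
              'sdh_finger_22_joint', 'sdh_finger_23_joint']
RATIOS = [0.5, 1.0, 0.8, 1.0, 0.8, 1.0, 0.8]   # one scaling factor per SDH joint (to be tuned)
LIMITS = [(0.0, 1.57)] * 7                     # simplified joint limits [rad]

def clamp(value, low, high):
    return max(low, min(high, value))

def hand_callback(msg, pub):
    # msg.position is assumed to carry 7 human flexion angles [rad] selected
    # from the tracked hand (thumb, index and middle finger joints + spread)
    cmd = JointState()
    cmd.header.stamp = rospy.Time.now()
    cmd.name = SDH_JOINTS
    cmd.position = [clamp(r * a, lo, hi)
                    for r, a, (lo, hi) in zip(RATIOS, msg.position, LIMITS)]
    pub.publish(cmd)

if __name__ == '__main__':
    rospy.init_node('direct_finger_mapping')
    pub = rospy.Publisher('/sdh_controller/command', JointState, queue_size=1)
    rospy.Subscriber('/threegear/finger_angles', JointState, hand_callback, pub)
    rospy.spin()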

3.2 Indirect control - using gestures and higher level information for gripper control

The recent success of devices such as tablets or smartphones proves that gestural control can be intuitive and ergonomic for users. Previous works on robot control using gestures focused on the important task of gesture recognition from flat 2D pictures, as well as on providing a set of robust gestures that could be used in many environments [5], [6]. Integrated vision systems use 3D data in gesture recognition and provide additional information about the speed, direction and position of hands or fingers when the gesture was made. This can give users additional control without enlarging the set of gestures necessary to control the gripper [7].

Both LM and 3G systems can recognize a number of gestures, although these are gestures of different types. Leap Motion recognizes gestures based on hand and finger movement, similarly to smartphones or tablets. The API can process such motions as: swipe, rotate, wall tap (finger moving in the direction of an imagined wall) and keyboard tap (finger moving up or down). For each gesture the programmer also gets additional information about the speed of the motion, its direction and other gesture-specific data. There are also libraries for learning new gestures based on a frame motion or a hand pose; Leap Trainer provides tools for various learning algorithms, e.g. Geometric Template Matching or Neural Networks. For the Three Gears system, gestures are similar to those done with an ordinary mouse, but using the motion of the whole hand (as opposed to the finger movement recognized by Leap Motion). The system differentiates: pinch gesture, drag, release and pinch with both hands. At the same time, information about the positions of both hands is provided, which enables implementing a wider spectrum of gestures, like rotating with two hands instead of two fingers.

Control modes using gestures. We can propose three approaches to utilize gestures:

– Discrete-type gestures with force control: in this method the user can choose a different grip type using gestures. Each grip type provides a different finger orientation and different expected forces. The gripping action ends when the expected force structure is acquired or a force threshold is exceeded.
– Continuous movement gestures: in this mode, the gesture controls the degree of gripping. The recognized gesture is translated into continuous movement of the gripper, which ends with the end of the gesture (a sketch of this mode is given at the end of this subsection).
– Discrete-type gestures that have a helper function in the control system: i.e. they can change an operation mode of the gripper.

The advantages of using gestures are of several types: they can be implemented without the need for calibration, and recognizing hand gestures and tracking the hand is much easier than tracking particular finger movements, so it can be done robustly. However, a gestural interface for the gripper has one important limitation: the user needs to remember the functions of all movements. In case of direct control, hand motion translates directly into gripper motion; therefore, the user has a natural method to learn the control just by observing the movement, and for well-tuned parameters the interaction can be fully intuitive. In case of gestural control, the user has to remember the entire mapping, as it is based on a symbolic and not a direct connection [8]. Also, gestures recognized by a vision system are made in the air, without a screen that could provide additional information as in the case of tablets or smartphones. As gestural control is a new field, there is an ongoing effort to create a universal vocabulary of gestures that could be used in human-machine interaction. In case of interaction with robots, it is not yet standardized, and we can see many

different gestures used by researchers for even simple commands like stop [5], [9]. Fortunately, there are some procedures for designing gestural interfaces [10] and also some universal principles of designing any kind of human-machine interface, such as visibility, feedback, consistency, non-destructive operations, scalability and reliability [17]. As the gripping task requires full attention from the operator, we suggest the following four rules that make controlling the gripper with gestures feasible:

1. use a relatively small number of gestures to make the recall task easier, especially in stressful situations [10], [11],
2. give feedback both when a gesture is recognized and when the command is successful; this will help the operator, whose focus will be on the task performed by the gripper - we recommend using signals other than visual, such as haptic (e.g. vibrations) or audio [6], [12], [13],
3. provide the user with a simulation environment or a training mode of operation,
4. use only gestures that are recognized robustly. This means, for example, setting a wide range for gesture parameters (e.g. a swipe can be performed with various speeds). Use other features to avoid erroneous recognition, e.g. the position of the gesture can be used to discriminate intentional from unintentional actions, as it is sensed very well by both vision systems.
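The continuous-movement mode announced above can be sketched as follows. The snippet assumes the classic Leap SDK Python bindings; the scaling constant FULL_CLOSE_TURNS and the way the closing value reaches the gripper driver are illustrative assumptions, not part of the original implementation.

# Sketch: the progress of a Leap Motion circle gesture (number of completed
# turns) sets the gripper closing degree (0.0 = open, 1.0 = closed).
import time
import Leap

FULL_CLOSE_TURNS = 2.0        # assumed: two full turns close the gripper completely

controller = Leap.Controller()
while not controller.is_connected:
    time.sleep(0.1)
controller.enable_gesture(Leap.Gesture.TYPE_CIRCLE)

while True:
    for gesture in controller.frame().gestures():
        if gesture.type == Leap.Gesture.TYPE_CIRCLE:
            circle = Leap.CircleGesture(gesture)
            closing = min(circle.progress / FULL_CLOSE_TURNS, 1.0)
            # the value would be forwarded to the gripper driver here;
            # motion stops as soon as the gesture ends (no matching gestures)
            print("closing degree: %.2f" % closing)
    time.sleep(0.02)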

4 Control System

To control the 3-finger dexterous gripper and to test both vision systems we have created a control structure based on the Robot Operating System (ROS), as shown in Fig. 2. In this system we use universal pose and gesture commands that can be generated by either vision system, the sensor glove, or a combination of these. We have tested the efficiency, robustness and usability of the different solutions in controlling the 3-finger gripper. Data coming from a sensing device, such as a 3D scanner or the sensor glove, is formatted and sent as hand-pose and gestural commands to the trajectory generator node. Gesture commands can change the state of the device, and the hand pose can be used for direct control of the gripper. The direct control node realizes the reference trajectory (produced by the previous node) by sending velocity commands to the gripper, with position feedback directly from the SDH hand. It also transfers to the system all information about the current state of the gripper. To prepare this controller in ROS we had to write a few new ROS packages and to modify some existing ones, e.g. the ROS HandKinect package. Some implementation problems were caused by the fact that the Three Gears system is limited to Windows or Apple platforms. Therefore, this equipment runs on a separate Windows machine as a TCP/IP server that publishes a text-based protocol. On the Linux platform with the ROS middleware we have created a special wrapper package to receive and transform these data, as schematically shown in Fig. 3. Our package converts the data stream into ROS topics describing flexion angles and recognized gestures.
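A minimal sketch of such a wrapper is given below, assuming the Three Gears server streams carriage-return-terminated, JSON-like text messages over TCP. The host address, port, message fields and topic names are illustrative assumptions; they have to be adapted to the actual text protocol described on the Three Gears website.

#!/usr/bin/env python
# Sketch of a ROS wrapper for the Three Gears text protocol (all names assumed).
import socket
import json
import rospy
from std_msgs.msg import String
from sensor_msgs.msg import JointState

def run():
    rospy.init_node('threegear_wrapper')
    angle_pub = rospy.Publisher('/threegear/finger_angles', JointState, queue_size=10)
    gesture_pub = rospy.Publisher('/threegear/gesture', String, queue_size=10)

    # Windows machine running the Three Gears server (address assumed)
    sock = socket.create_connection(('192.168.0.10', 1988))
    buf = ''
    while not rospy.is_shutdown():
        buf += sock.recv(4096)
        while '\r' in buf:                     # each message ends with a carriage return
            line, buf = buf.split('\r', 1)
            try:
                msg = json.loads(line)
            except ValueError:
                continue                       # skip malformed messages
            if msg.get('type') == 'pose':      # assumed message layout
                js = JointState()
                js.header.stamp = rospy.Time.now()
                js.name = msg['joint_names']
                js.position = msg['flexion_angles']
                angle_pub.publish(js)
            elif msg.get('type') == 'gesture':
                gesture_pub.publish(String(data=msg['name']))

if __name__ == '__main__':
    run()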

Fig. 2. Control structure based on ROS

It also converts rotation quaternions and position vectors, described in an absolute coordinate system, into a series of transform frames that can be used for forward kinematics and visualization, as presented in Fig. 4. Application of the Leap Motion system in ROS required modification of the Rosleapmotion package. It contains a series of Python nodes that read the LM protocol and translate it into ROS topics. For our purpose it was necessary to update the code for the newest Leap Motion API and to extend it with additional functions providing gestural data.
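The gestural extension can be sketched as a node that publishes recognized Leap gestures on a ROS topic, as below. It assumes the Leap SDK Python bindings and an illustrative topic name and message format; the real Rosleapmotion nodes may structure this differently.

#!/usr/bin/env python
# Sketch: publish completed Leap Motion gestures (type, speed, direction).
import Leap
import rospy
from std_msgs.msg import String

GESTURE_NAMES = {
    Leap.Gesture.TYPE_SWIPE: 'swipe',
    Leap.Gesture.TYPE_CIRCLE: 'circle',
    Leap.Gesture.TYPE_KEY_TAP: 'key_tap',
    Leap.Gesture.TYPE_SCREEN_TAP: 'screen_tap',
}

class GesturePublisher(Leap.Listener):
    def __init__(self, pub):
        Leap.Listener.__init__(self)
        self.pub = pub

    def on_connect(self, controller):
        for gesture_type in GESTURE_NAMES:
            controller.enable_gesture(gesture_type)

    def on_frame(self, controller):
        for gesture in controller.frame().gestures():
            if gesture.state != Leap.Gesture.STATE_STOP:
                continue                         # report only completed gestures
            name = GESTURE_NAMES.get(gesture.type, 'unknown')
            if gesture.type == Leap.Gesture.TYPE_SWIPE:
                swipe = Leap.SwipeGesture(gesture)
                name = '%s speed=%.0f dir=(%.2f,%.2f,%.2f)' % (
                    name, swipe.speed,
                    swipe.direction.x, swipe.direction.y, swipe.direction.z)
            self.pub.publish(String(data=name))

if __name__ == '__main__':
    rospy.init_node('leap_gestures')
    pub = rospy.Publisher('/leap/gesture', String, queue_size=10)
    listener = GesturePublisher(pub)
    controller = Leap.Controller()
    controller.add_listener(listener)
    rospy.spin()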

5 Practical applications and tests

We have tested the two integrated vision systems in the following modes:

– direct control of the SDH gripper,
– calibration of the sensor glove,
– a hybrid system composed of direct control of the SDH gripper and discrete switching of states by gestures recognized by the Three Gears system.

Direct gripper control. In the direct control mode, joint angle data was translated into gripper movement. As noted in Section 3.1, finger pose recognition is very unstable and depends on the orientation of the whole hand (possibility of self-occlusion), and therefore the gripper movements were sometimes chaotic.

Fig. 3. Operation scheme of the Three Gears system

Fig. 4. Visualization of the hand posture in Rviz, based on the transformation frames

Sensor glove calibration. In our previous research [2] we used a sensor glove as the sensing device. It has many attractive features, such as no self-occlusion and no dependence on lighting conditions. However, the flexion sensors used in it change their parameters over time and wear out mechanically. The resistance of these sensors can vary by over 30% between batches, and therefore the sensor glove requires calibration. The Three Gears system with the HandKinect node provides information about the flexion of each joint of a finger; the same values can be measured by the flex sensors mounted in our sensor glove. We have collected these data and show them in Fig. 5. Although there is a large number of outliers, there is also a linear correlation between the readings. The relation was calculated using a least squares method. The best fit was obtained when each finger was flexing slowly

and separately. This minimizes the error of the vision system and gives enough data for a robust estimation.
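Such a calibration can be sketched as a simple linear least-squares fit, as below. The synthetic data, variable names and the assumption that the glove and vision readings are already time-synchronized are all illustrative; a robust estimator could be substituted if the outliers dominate.

# Sketch: least-squares calibration of one flex sensor against the Three Gears
# flexion angle of the same joint (angle ~= gain * raw_adc + offset).
import numpy as np

def calibrate_flex_sensor(raw_adc, vision_angle_rad):
    """Return (gain, offset) of the linear relation between sensor and angle."""
    gain, offset = np.polyfit(raw_adc, vision_angle_rad, 1)   # linear fit
    residuals = vision_angle_rad - (gain * raw_adc + offset)
    print("RMS residual: %.3f rad" % np.sqrt(np.mean(residuals ** 2)))
    return gain, offset

# synthetic data standing in for one slow, single-finger flexion recording
raw = np.linspace(300, 700, 200) + np.random.normal(0, 5, 200)      # ADC counts
angle = 0.004 * raw - 1.1 + np.random.normal(0, 0.02, 200)          # radians
gain, offset = calibrate_flex_sensor(raw, angle)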

Fig. 5. Data readings from 3 Gears and sensor glove for index finger’s proximal joint and middle finger’s distal joint

Hybrid system. The biggest problem with using the sensor glove to control a gripper is the fact that one hand is occupied all the time. Its pose is translated into gripper movement and therefore this hand cannot be used for other actions, like pushing buttons or making gestures. Moreover, when the operator is precisely controlling some manipulation task, he/she should keep the eyes focused on the target action, and as a consequence using a keyboard or control buttons has to be avoided. An interesting solution to this problem might be the use of gestures. They can be recognized by a vision system, they can be made in the air, almost anywhere in the scene, without touching anything, without finding an accurate spot, and even without watching. We have tested the following scenario of controlling the gripper with the sensor glove on one hand, and with special actions provided by gestures made with the second hand. Gestures recognized by the vision system can switch between three states:

1. the sensor glove controls the gripper directly,
2. the sensor glove controls the model (the so-called shadow hand [2]) and the movement of the real gripper is confirmed,
3. the gripper is stopped in the current pose.

The tested vision systems provide recognition of different gestures; therefore, we have used two sets of gestures for the same switching commands, as described in Table 2 (a sketch of the switching node is given below).
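The gesture-driven switching announced above can be sketched as a small ROS node holding the current operation mode. The topic names and gesture strings are assumptions that have to match the vision-system wrapper; the mapping loosely follows the Three Gears set of gestures.

#!/usr/bin/env python
# Sketch: switch between the three operation modes of the hybrid system.
import rospy
from std_msgs.msg import String

GRIPPER, MODEL, STOP = 'gripper_control', 'model_control', 'stopped'

class ModeSwitcher(object):
    def __init__(self):
        self.state = STOP
        self.state_pub = rospy.Publisher('/teleop/mode', String, queue_size=1, latch=True)
        rospy.Subscriber('/threegear/gesture', String, self.on_gesture)
        self.publish()

    def on_gesture(self, msg):
        if msg.data == 'single_pinch':
            # single pinch toggles between direct gripper control and stop
            self.state = STOP if self.state == GRIPPER else GRIPPER
        elif msg.data == 'double_pinch':
            self.state = MODEL          # sensor glove drives the shadow model only
        self.publish()

    def publish(self):
        self.state_pub.publish(String(data=self.state))

if __name__ == '__main__':
    rospy.init_node('mode_switcher')
    ModeSwitcher()
    rospy.spin()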

We have tested the proposed system with several students seeing it for the first time (see Fig. 6). After only a single description of the task and the control methods, they succeeded in manipulating the gripper and using all gestures.

Table 2. Set of gestures to activate different operation modes

Action / Gesture           Three Gears          Leap Motion
Activate gripper control   Single pinch         Keyboard tap
Activate model control     Double pinch         Circle
Stop motion                Single pinch again   Horizontal swipe

We have observed that users needed some kind of feedback about the gesture interpretation and confirmation of an action (see our postulates of successful gesture control collected in Section 3.2). This is particularly important when gestures are removed from the device [12] or the device is changed to another one with a different set of gestures. In our system we have used sounds generated from a ROS node using the pygame library.
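The audio feedback node can be sketched as below, assuming short .wav files for "gesture recognized" and "mode changed"; the topic name, gesture strings and file paths are illustrative placeholders.

#!/usr/bin/env python
# Sketch: audio feedback for recognized gestures using pygame in a ROS node.
import pygame
import rospy
from std_msgs.msg import String

SOUNDS = {}

def on_gesture(msg):
    sound = SOUNDS.get(msg.data, SOUNDS.get('default'))
    if sound is not None:
        sound.play()        # non-blocking playback

if __name__ == '__main__':
    rospy.init_node('gesture_audio_feedback')
    pygame.mixer.init()
    SOUNDS['single_pinch'] = pygame.mixer.Sound('sounds/gesture_ok.wav')
    SOUNDS['double_pinch'] = pygame.mixer.Sound('sounds/mode_change.wav')
    SOUNDS['default'] = pygame.mixer.Sound('sounds/beep.wav')
    rospy.Subscriber('/threegear/gesture', String, on_gesture)
    rospy.spin()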

Fig. 6. Students testing our control system with 3D vision

6 Conclusion

In this paper we have described two commercially available integrated vision systems, both giving accurate information about hand position and orientation. We have analyzed their potential role in controlling a dexterous gripper and proposed an adequate control system architecture. We also provide experimental results of direct and indirect control using Three Gears and Leap Motion.

Acknowledgment. Research was partially supported by the National Centre for Research and Development under grant No. PBS1/A3/8/2012.

References

1. Onda, H. and Suehiro, T. Motion planning and design of a dexterous gripper - graspability, manipulability, and virtual gripper. Proc. IEEE/RSJ Int. Conference on Intelligent Robots and Systems, Vol. 1, 1998.
2. Zubrycki, I. and Granosik, G. Test setup for multi-finger gripper control based on robot operating system (ROS). Proc. of 9th Int. Workshop on Robot Motion and Control, pp. 135-140, July 3-5, 2013, Wąsowo, Poland, 2013.
3. Zubrycki, I. and Granosik, G. Grip recognition and control of 3-finger gripper with sensor glove. To be published in Proc. of Int. Conference on Robotics and Artificial Intelligence: Problems and Perspective (RAIPAP13), Brest, Belarus, 4-6 November 2013.
4. Lomas, N. Gartner: 1.2 Billion Smartphones, Tablets To Be Bought Worldwide In 2013; 821 Million This Year: 70
5. Pietrasik, M. and Żarychta, D. Multimedial methods for control of robots (in Polish). In Postępy robotyki, part 1, pp. 331-338, Warszawa 2006.
6. Zhai, S. et al. Foundational Issues in Touch-Screen Stroke Gesture Design - An Integrative Review. Foundations and Trends in Human-Computer Interaction 5.2 (2012), pp. 97-205.
7. Dhawale, P., Masoodian, M. and Rogers, B. Bare-hand 3D gesture input to interactive systems. Proc. of the 7th ACM SIGCHI New Zealand chapter's international conference on Computer-human interaction: design centered HCI, 6 Jul. 2006, pp. 25-32.
8. Norman, D.A. and Nielsen, J. Gestural interfaces: a step backward in usability. Interactions 17.5 (2010), pp. 46-49.
9. Ruttum, M. and Parikh, S.P. Can robots recognize common Marine gestures? IEEE 42nd Southeastern Symposium on System Theory (SSST), 2010.
10. Nielsen, M. et al. A procedure for developing intuitive and ergonomic gesture interfaces for man-machine interaction. Proc. of the 5th International Gesture Workshop, Mar. 2003, pp. 1-12.
11. Nacenta, M.A., Kamber, Y., Qiang, Y. and Kristensson, P.O. Memorability of Pre-designed and User-defined Gesture Sets. Proc. of the 31st Annual ACM SIGCHI Conference on Human Factors in Computing Systems (CHI '13), 2013.
12. Wigdor, D. and Wixon, D. Brave NUI World: Designing Natural User Interfaces for Touch and Gesture. Elsevier, 2011, pp. 81-95.
13. Williamson, J. and Murray-Smith, R. Audio feedback for gesture recognition. Technical Report TR-2002-127, Dept. Computing Science, University of Glasgow, 2002.
14. Weichert, F. et al. Analysis of the Accuracy and Robustness of the Leap Motion Controller. Sensors 13.5 (2013), pp. 6380-6393.
15. https://forums.leapmotion.com/forum/general-discussion/general-discussion-forum/130-processing-where-it-is-computed
16. https://www.sparkfun.com/datasheets/Sensors/Flex/flex22.pdf
17. Norman, D.A. Natural user interfaces are not natural. Interactions 17.3 (2010), pp. 6-10.
18. Hasanuzzaman, Md. et al. Real-time vision-based gesture recognition for human-robot interaction. Proc. IEEE International Conference on Robotics and Biomimetics (ROBIO 2004), 22 Aug. 2004, pp. 413-418.
