2015 International Conference on Information and Communication Technology Research (ICTRC2015)
A Gesture Based Kinect for Quadrotor Control

Ahmed Mashood, Hassan Noura
Department of Electrical Engineering, UAE University, Al Ain, Abu Dhabi, UAE
e-mail: {a.mashood, hnoura}@uaeu.ac.ae

Imad Jawhar, Nader Mohamed
College of Information Technology, UAE University, Al Ain, Abu Dhabi, UAE
e-mail: {ijawher, nader.m}@uaeu.ac.ae

Abstract— This paper presents a novel approach to controlling and navigating mobile platforms in a GPS-denied environment using natural human gestures. Gesture-based interfaces and visual computing techniques are used to develop a system for Unmanned Aerial Vehicle (UAV) navigation through body postures. This approach reduces the complexity of Human-Computer Interaction (HCI), making it more intuitive and repeatable.

Keywords—Teleoperation; Natural User Interface; Kinect; Quadrotor control; Interactive system; Human Computer Interaction; Human-Machine Interfaces

1. INTRODUCTION
Scientific exploration in the teleoperation of mobile robots and Unmanned Aerial Vehicles (UAVs) has evolved considerably in recent years. Several countries have already mastered UAV technologies, and these aircraft, commonly known as drones, are a key military asset. Due to increased demand in military and civilian applications, the commercial drone market has grown and flourished rapidly. The United Arab Emirates recently unveiled plans for delivering documents using UAVs [1], and a pizza chain in Mumbai, India, recently delivered pizza using a UAV [2]. In a UAV, the on-board pilot is replaced by powerful processors and cameras; the vehicle therefore costs considerably less than its piloted counterparts and offers better design flexibility, such as increased mobility and smaller size. Control can be administered autonomously by the on-board processors or with the help of remote ground operators [3].

Human-Robot Interaction (HRI) has become an important domain of technological innovation and has paved the way for new and challenging research horizons. HRI is derived from Human-Computer Interaction (HCI) [4]. Traditional systems require direct physical contact with the interfacing unit's keyboard or mouse, which drastically limits the scope and dimension of interaction. With the advent of innovative gesture-based Natural User Interfaces (NUIs), a new frontier of communication has evolved, especially in the control of mobile platforms. Voice and gestures are the two main media for implementing reliable HCI; since gestures are a direct expression of mental concepts, they are the most preferred [5].

Gesture-based interaction with robots will soon reach non-expert users who have little knowledge of keyboards, allowing them to operate robots using natural gestures [6]. This points to the importance of equipping robots with natural user interfaces. The advent of sophisticated software in the field of digital image processing has garnered immense interest from all parts of the engineering spectrum. The main contribution of this work is an intuitive framework for controlling mobile robots in a GPS-denied environment; the framework raises a number of challenges that must be addressed carefully.

The paper is structured as follows: Section 2 reviews and compares related work. Section 3 introduces the system architecture and its components, and describes the gesture design and mapping in detail. Section 4 describes the implementation with an example. Sections 5 and 6 present the results and conclusions, respectively.

2. SIMILAR WORK
Robot navigation using gesture-based interfaces has been widely studied. A group at ETH Zurich implemented a way to interact with a single quadrotor based on the position of the user's arms [7]; the user's local 3D coordinates are mapped to the quadrotor to obtain a dynamic direct-mapping system [7]. The approach implemented in this paper is different: discrete control commands are triggered by a static gesture recognition technique rather than by tracking the quadrotor's position. Sanna et al. [4] also worked on gesture-based navigation of a single quadrotor. The degree of navigation in our system is not constrained by a predefined boundary; the human operator can control the quadrotor as long as there is a stable Wi-Fi connection between the drone and the laptop.
978-1-4799-8966-9/15/$31.00 ©2015 IEEE
3. SYSTEM ARCHITECTURE OVERVIEW
Teleoperation or tele-navigation of UAVs (mobile robots) by a single operator, coupled with a gesture interface, opens up a new dimension in navigation technology. The main advantage of teleoperation is the physical separation between the vehicle and the operator [8]: the controller hardware or joystick is replaced by a gesture-based user interface, so during navigation there is no physical contact between the operator and the UAV's controller.
Our system for teleoperation of a UAV is composed of a Microsoft Kinect [9] sensor, a natural-user-interface device developed by Microsoft Corporation for motion sensing; an electrically powered, radio-controlled quadcopter (the Parrot AR.Drone) [10] equipped with a 6-degree-of-freedom (DOF) inertial measurement unit; and a laptop serving as the base station. Figure 1 shows the high-level architecture of the system. OpenNI [11] is a cross-platform framework that defines an application programming interface (API) for writing applications for natural interaction. The Flexible Action and Articulated Skeleton Toolkit (FAAST) [12] uses cues from the RGB camera and depth sensor streams to form a skeleton model of the human body. The skeleton data consists of 24 joints, each carrying location and orientation information. Figure 2 shows a gesture and its skeletal image.
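As a concrete illustration of the skeleton stream described above, the following sketch shows one plausible way to represent a skeleton frame of 24 joints, each with a position and an orientation. The class and field names are assumptions made only for clarity; FAAST exposes its skeleton data through its own toolkit interfaces, not this layout.

```java
// Illustrative data model for a skeleton frame as described in the text:
// 24 joints, each with a 3D position and an orientation. Names are assumed.
public final class SkeletonFrame {
    public static final int JOINT_COUNT = 24;

    /** One tracked joint: a position plus an orientation. */
    public static final class Joint {
        public final String name;           // e.g. "LEFT_HAND", "HEAD" (illustrative)
        public final double x, y, z;        // position in sensor coordinates
        public final double qw, qx, qy, qz; // orientation as a quaternion

        public Joint(String name, double x, double y, double z,
                     double qw, double qx, double qy, double qz) {
            this.name = name;
            this.x = x; this.y = y; this.z = z;
            this.qw = qw; this.qx = qx; this.qy = qy; this.qz = qz;
        }
    }

    public final Joint[] joints = new Joint[JOINT_COUNT];
    public final long timestampMillis;

    public SkeletonFrame(long timestampMillis) {
        this.timestampMillis = timestampMillis;
    }
}
```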
Figure 1: High-level architecture of the NUI
Figure 2: RGB image (left), skeletal image (right)

A. Gesture Design
Designing intuitive gestures for recognition is a challenging task. The entire human body is used as the gesture source rather than a specific body part. A four-point rule was followed in designing the gestures:
1. Gestures should be natural, consistent and very easy to perform.
2. The information in the captured images should be related to the gestures, so that gesture changes can be identified.
3. There should be a clear distinction between the background and the gestures, avoiding interference between them.
4. Data should be processed with minimal time delay [13].

In static gesture recognition, the action depends on the gesture in the image and does not change until the threshold is breached again by a new gesture [13]. A total of eleven different static body postures were designed for this work. Figure 3 shows the different gesture postures modelled with the human body, and Table 1 shows the correspondence between the body postures detected by the FAAST application and the commands sent to the UAV.

Figure 3: Gestures and actions
Table 1: Correspondence between body postures and quadrotor commands

    Body posture                        Command
    Left arm above head                 Land
    Right arm above head                Take-off
    Right arm in front of face          Forward
    Left arm in front of face           Backward
    Right arm flexed right              Right
    Left arm flexed left                Left
    Lean left                           Yaw left
    Lean right                          Yaw right
    Right arm left of torso             Up
    Left arm right of torso             Down
    Left foot apart from right foot     Active
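To illustrate how Table 1 and the static-recognition rule above could be realised in the Java code base mentioned in Section 4, the sketch below maps detected postures to commands and emits a command only when the detected posture changes. The enum names, method signature and debouncing detail are illustrative assumptions, not the paper's actual implementation.

```java
import java.util.EnumMap;
import java.util.Map;

// Sketch of the posture-to-command mapping of Table 1 (names assumed).
public final class GestureMapper {
    public enum Posture {
        LEFT_ARM_ABOVE_HEAD, RIGHT_ARM_ABOVE_HEAD,
        RIGHT_ARM_IN_FRONT_OF_FACE, LEFT_ARM_IN_FRONT_OF_FACE,
        RIGHT_ARM_FLEXED_RIGHT, LEFT_ARM_FLEXED_LEFT,
        LEAN_LEFT, LEAN_RIGHT,
        RIGHT_ARM_LEFT_OF_TORSO, LEFT_ARM_RIGHT_OF_TORSO,
        LEFT_FOOT_APART_FROM_RIGHT_FOOT
    }

    public enum Command {
        LAND, TAKE_OFF, FORWARD, BACKWARD, RIGHT, LEFT,
        YAW_LEFT, YAW_RIGHT, UP, DOWN, ACTIVE
    }

    private static final Map<Posture, Command> TABLE = new EnumMap<>(Posture.class);
    static {
        TABLE.put(Posture.LEFT_ARM_ABOVE_HEAD, Command.LAND);
        TABLE.put(Posture.RIGHT_ARM_ABOVE_HEAD, Command.TAKE_OFF);
        TABLE.put(Posture.RIGHT_ARM_IN_FRONT_OF_FACE, Command.FORWARD);
        TABLE.put(Posture.LEFT_ARM_IN_FRONT_OF_FACE, Command.BACKWARD);
        TABLE.put(Posture.RIGHT_ARM_FLEXED_RIGHT, Command.RIGHT);
        TABLE.put(Posture.LEFT_ARM_FLEXED_LEFT, Command.LEFT);
        TABLE.put(Posture.LEAN_LEFT, Command.YAW_LEFT);
        TABLE.put(Posture.LEAN_RIGHT, Command.YAW_RIGHT);
        TABLE.put(Posture.RIGHT_ARM_LEFT_OF_TORSO, Command.UP);
        TABLE.put(Posture.LEFT_ARM_RIGHT_OF_TORSO, Command.DOWN);
        TABLE.put(Posture.LEFT_FOOT_APART_FROM_RIGHT_FOOT, Command.ACTIVE);
    }

    private Posture lastPosture; // static recognition: act only when the gesture changes

    /** Returns a command when a new static posture is detected, otherwise null. */
    public Command onPostureDetected(Posture posture) {
        if (posture == lastPosture) {
            return null; // same static gesture, no new command
        }
        lastPosture = posture;
        return TABLE.get(posture);
    }
}
```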
4. FRAMEWORK IMPLEMENTATION
The Kinect depth image sensor detects and tracks human movements and streams the gathered data, at a resolution of 640x480 pixels and 30 Hz, to the base-station laptop over a USB 2.0 connection. The data is then analyzed and processed by the FAAST software to generate control commands that are sent to the UAV over Wi-Fi.
The graphical user interface displays the video stream from the AR.Drone's on-board cameras on the base-station screen; by observing this information, the operator manipulates and controls the UAV. Code written in Java is used to facilitate the navigation of the UAV. Figure 4(a) shows the data exchange hierarchy among the NUI components, and Figure 4(b) shows the architecture from a software point of view.
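A minimal sketch of the base-station-to-drone link is shown below. It assumes the AR.Drone's standard AT-command interface (UDP port 5556 on the drone's default address 192.168.1.1) and the AT*REF bit patterns commonly used by open-source AR.Drone clients; the class layout and method names are illustrative, not the paper's actual code.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Illustrative Wi-Fi command link: sends AT commands to the drone over UDP.
public final class DroneLink implements AutoCloseable {
    private static final String DRONE_IP = "192.168.1.1"; // AR.Drone default address
    private static final int AT_PORT = 5556;              // AT-command port

    private final DatagramSocket socket;
    private final InetAddress droneAddress;
    private int sequence = 1; // every AT command carries an increasing sequence number

    public DroneLink() throws Exception {
        this.socket = new DatagramSocket();
        this.droneAddress = InetAddress.getByName(DRONE_IP);
    }

    /** Take-off and landing are issued with the AT*REF command. */
    public void takeOff() throws Exception {
        send("AT*REF=" + sequence++ + ",290718208"); // take-off bit set (assumed value)
    }

    public void land() throws Exception {
        send("AT*REF=" + sequence++ + ",290717696"); // take-off bit cleared (assumed value)
    }

    private void send(String atCommand) throws Exception {
        byte[] payload = (atCommand + "\r").getBytes(StandardCharsets.US_ASCII);
        socket.send(new DatagramPacket(payload, payload.length, droneAddress, AT_PORT));
    }

    @Override
    public void close() {
        socket.close();
    }
}
```

A posture recognized via Table 1 would then be translated into the corresponding call, for example takeOff() when the right arm is raised above the head.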
Figure 6: Formation flying example
Figure 4: (a) Data exchange among the NUI components; (b) software architecture

Figure 5 shows the flowchart the human operator follows to achieve a successful flight, and Figure 6 demonstrates navigation of the UAV. Figure 6a shows the UAV taking off in response to the controller command given by a human gesture, and Figures 6c and 6d show roll and pitch movements. A video of the demonstration is available in [14].

5. RESULTS
To evaluate the performance of the system, a test was conducted inside a room of 5 m x 7 m x 3.5 m. Because of the indoor environment, GPS tracking was not possible and the UAV's position was not tracked. The total test flight time was around 27 seconds, and data were collected from the UAV's built-in sensors.

Experiment Results and Discussion
Figures 7 and 8 show the altitude and speed data during the flight, and Figure 9 shows the yaw, pitch and roll angles. The maximum altitude was set at 2 meters. The data exchange rate between the UAV and the laptop was 30 updates per second. Figure 10 is a 3D plot of the flight.
Figure 7: Altitude graph
Figure 8: Average speed graph
Figure 5: Flowchart for the host controller decision-making algorithm
Figure 9: Yaw, pitch and roll angles
Figure 10: 3D plot of the flight

The latency of the system, i.e., the total communication delay between the user and the drone, averaged 0.32 seconds. This was measured by analyzing the video sequence in [14] and counting the number of frames elapsed between the user's movement and the AR.Drone's response. The factors that contribute to the latency are the delays introduced by the Kinect driver, the API library, the network, and the base-station-to-drone link.

6. CONCLUSION AND FUTURE WORK
In this paper, a framework for navigating a UAV using human body postures in a GPS-denied environment was introduced. The experimental results point to the niche market that is taking shape in the field of Human-Computer Interaction: affordable devices such as the Kinect and other NUI devices help create innovative human-robot interaction. As future work, a gesture-based centralized multi-agent architecture, i.e., formation flight of two or more UAVs using a leader-wingman approach, is being considered.

7. ACKNOWLEDGEMENT
This research is supported by the United Arab Emirates University, Al Ain, UAE.

8. REFERENCES
[1] "How drones are speeding up Dubai documents delivery." [Online]. Available: http://gulfnews.com/news/gulf/uae/general/how-drones-are-speeding-up-dubai-documents-delivery-1.1317319. [Accessed: 12-Aug-2014].
[2] "Mumbai eatery delivers pizza using a drone," The Indian Express.
[3] R. McCune, R. Purta, M. Dobski, A. Jaworski, G. Madey, A. Madey, Y. Wei, and M. B. Blake, "Investigations of DDDAS for command and control of UAV swarms with agent-based modeling," in Simulation Conference (WSC), 2013 Winter, 2013, pp. 1467–1478.
[4] A. Sanna, F. Lamberti, G. Paravati, and F. Manuri, "A Kinect-based natural interface for quadrotor control," Entertain. Comput., vol. 4, no. 3, pp. 179–186, Aug. 2013.
[5] V. I. Pavlovic, R. Sharma, and T. S. Huang, "Visual interpretation of hand gestures for human-computer interaction: a review," IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 677–695, Jul. 1997.
[6] C. Hu, M. Q. Meng, P. X. Liu, and X. Wang, "Visual gesture recognition for human-machine interface of robot teleoperation," in 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), 2003, vol. 2, pp. 1560–1565.
[7] Interaction with a Quadrotor via the Kinect, ETH Zurich, 2011.
[8] T. M. Lam, V. D'Amelio, M. Mulder, and M. M. Van Paassen, "UAV tele-operation using haptics with a degraded visual interface," in IEEE International Conference on Systems, Man and Cybernetics (SMC '06), 2006, vol. 3, pp. 2440–2445.
[9] "Microsoft Kinect." [Online]. Available: http://www.microsoft.com/en-us/kinectforwindows/.
[10] "Ar.Drone web site." [Online]. Available: http://ardrone.parrot.com.
[11] "OpenNI." [Online]. Available: http://structure.io/openni.
[12] "Flexible Action and Articulated Skeleton Toolkit (FAAST)."
[13] C. Hu, M. Q. Meng, P. X. Liu, and X. Wang, "Visual gesture recognition for human-machine interface of robot teleoperation," in 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), 2003, vol. 2, pp. 1560–1565.
[14] "Ar drone kinect maneuver," 30-Dec-2013. [Online]. Available: https://www.youtube.com/watch?v=j5J27FzoVao&feature=youtube_gdata_player.