Intelligent Human-Machine Interface Using Hand Gestures Recognition

Stefan Oniga, János Vegh
Informatics Systems and Networks Department, University of Debrecen, Debrecen, Hungary
[email protected]

Ioan Orha
Electric, Electronic and Computer Engineering Department, Technical University of Cluj-Napoca, North University Center of Baia Mare, 62A Dr. Victor Babes Street, 430083 Baia Mare, Romania
[email protected]

Abstract— Due to the rapid increase in the number of industrial and domestic systems that must be controlled, it is clear that new, more natural methods of control are needed. This paper presents an intelligent human-machine interface based on hand gesture recognition. The gesture-based control system is composed of two subsystems that communicate via radio waves. The first subsystem is a bracelet that captures the movement of the hand using accelerometers. The second subsystem is the control box on which the data processing takes place. Artificial Neural Networks (ANN) are used to add learning capabilities and adaptive behavior to intelligent interfaces that can be used even by elderly or impaired people. Field Programmable Gate Array (FPGA) implementation is an easy and attractive way to realize such networks in hardware. The desired network is modeled, trained and simulated using the Neural Network Toolbox. Many network architectures trained with different methods can be simulated, and the network that performs best for the given application is chosen for hardware implementation using the System Generator tool developed by Xilinx Inc. This also allows the easy generation of Hardware Description Language (HDL) code from the system representation in Simulink. This HDL design can then be synthesized for implementation in the Xilinx family of FPGA devices.

Keywords— Human-Machine Interface, Gesture recognition, Artificial Neural Networks, Field Programmable Gate Arrays

I. INTRODUCTION

Due to the rapid increase in the number of industrial and domestic systems that must be controlled, it is clear that new methods of control are needed. Gesture recognition is important for developing alternative human-computer interaction modalities that enable humans to interface with machines in a more natural way. There are many types of gesture research, covering body gestures, finger-pointing movements, etc. Hand-gesture-based man-machine interfaces have been developed intensively in recent years. In an earlier study we used a data glove with a microcontroller, connected to the device through a wire. Both static and dynamic hand gestures have been used in research, but static hand gestures (postures) have been the dominant subject of research on gesture control systems. Direct control via hand posture is limited in the number of choices, but is immediate.

II. THE 3D WIRELESS INTERFACE PROJECT

The intelligent human-machine interface presented in this paper is composed of two main subsystems that communicate via radio waves, and a localization identification system. The user has the possibility to control electronic/electric devices with hand gestures alone. The presented implementation supports the recognition of five basic hand gestures: idle, up, down, left and right.

The system can be used in two main operation modes: on-screen control and direct control. The on-screen control mode implies the use of a monitor for feedback purposes. The user can navigate through a menu and select an action using simple hand movements. The second mode of operation is control without feedback, allowing the control of different devices without the use of a display system. This method of control uses a localization identification system in order to pinpoint the user's location, and based on this location different actions are possible.

The complete gesture-based human-machine interface is presented in Fig. 1. The first subsystem is a bracelet that acquires dynamics and motion data of the hand. In the meantime, the location identification system sends data regarding the user's position by infrared. The data are sent in raw form via radio waves to the second subsystem.

Figure 1. Gesture based control system

The second subsystem is the control box on which the user position and approximate hand movement are processed. These hand movements allow the user to navigate in a menu displayed on a screen. The menu entries are real-world actions that the system can undertake. These entries are stored on an SD card so that they can be easily modified. This subsystem can control devices directly or through the X10 protocol. X10 is an international and open industry standard for communication among electronic devices used for home automation, using power-line wiring for signaling and control. Artificial Neural Networks (ANN) are used to add learning capabilities and adaptive behavior to intelligent interfaces that can be used even by elderly or impaired people.

A. The bracelet

This subsystem detects the hand movements and the user location. The bracelet presented in Fig. 2 is composed of an accelerometer, an infrared receiver, a microcontroller and the transceiver. The bracelet hardware components are presented in Fig. 3. The hand movements are detected using the ADXL345 three-axis accelerometer from Analog Devices. The accelerometer was chosen for its 13-bit resolution and its ability to measure acceleration up to ±16 g. The user location is determined by receiving information from the location identification system. Localization data are received using a TSOP31238 infrared module, which has built-in filters and a preamplifier. The microcontroller receives data from the accelerometer and the infrared module and sends these data by radio to the control box subsystem to be processed. The microcontroller used is an ATmega168 from Atmel, and the transceiver, an AT86RF212 on a PmodRF1 board from Digilent Inc., is a low-power type that ensures reliable communication even in a large house.
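As a rough illustration of the data involved, the sketch below converts raw full-resolution ADXL345 samples to acceleration in g. The 3.9 mg/LSB scale factor is the datasheet's nominal value and the sample values are placeholders; the paper does not give calibration details.

% Convert raw 13-bit ADXL345 readings (full-resolution mode) to g.
% Scale factor: nominal 3.9 mg/LSB per the device datasheet (assumption;
% the paper does not specify the calibration used).
raw_xyz = [120; -35; 260];            % placeholder signed 13-bit x, y, z samples
accel_g = double(raw_xyz) * 0.0039;   % 3.9 mg per least significant bit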

B. The location identification system

The location identification system consists of infrared beacons that are spread across the area to be controlled. Each small beacon uses the RC-5 protocol to emit via infrared a unique code that represents the beacon's location. The beacon is composed of a SAA3010 IC, an infrared LED and a small number of analog components.

C. The control box

The control box is built using a Nexys 2 FPGA board from Digilent Inc., equipped with a Spartan-3E FPGA with 500K gates. This assures easy upgrades and a cheap yet powerful platform for data processing. The data are received using the same type of radio transceiver, the PmodRF1. The FPGA board has a VGA output that is used for displaying the menu on any VGA-compatible monitor, or even on a regular TV set using an external converter. The menu entries are stored on an SD card in an XML-style format and are accessed using the SPI protocol. Each menu entry represents a possible action, or a category of actions, that the user can choose in order to control real devices. Devices can be controlled via the X10 protocol using the CM11 controller from Marmitek, which transmits X10 commands over the electrical grid. The controller receives commands via its serial interface, connected to the Nexys board's built-in serial port. This way, any X10-compatible receiver can be used to control any household appliance without a direct connection to the control box. The components of the control box are shown in Fig. 4. The control box is implemented using the VHDL language and PicoBlaze microcontrollers. The software modules of the control box are presented in Fig. 5.

Figure 4. Control box components

Figure 2. The bracelet

Figure 3. Bracelet components

Figure 5. Control box software modules

D. The menu

The menu is composed of the entries stored on the SD card. These entries are organized in a tree-like structure of actions that groups similar entries, offering a more intuitive interface. The entries are stored in an XML-like raw format, presented in Fig. 6.
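Since Fig. 6 is not reproduced here, the exact storage format is unknown; the sketch below is a hypothetical MATLAB illustration of the tree-like organization only. The field names (label, action, children) and the action strings are our assumptions, not the authors' format.

% Hypothetical illustration of the tree-like menu structure; field names
% and action strings are invented for the example.
lightOn  = struct('label', 'TURN LIGHT ON',  'action', 'X10_ON',  'children', {{}});
lightOff = struct('label', 'TURN LIGHT OFF', 'action', 'X10_OFF', 'children', {{}});
lights   = struct('label', 'LIGHTS', 'action', '', 'children', {{lightOn, lightOff}});
root     = struct('label', 'MAIN MENU', 'action', '', 'children', {{lights}});
% A front gesture descends into 'children', back returns to the parent,
% and left/right move between siblings at the current level.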

III. GESTURE RECOGNITION

In order to determine the movement identification method, a series of data from the accelerometer were captured and plotted into graphs. For example, a right-movement graph is presented in Fig. 7. Based on these graphs, the method of movement identification was chosen. In this case, due to the necessity of recognizing only five basic gestures, a compare-based algorithm was tested first. This algorithm works well if the number of gestures to be identified is small, but if the number of gestures increases, a more powerful algorithm is needed in order to ensure precise identification.
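The paper does not detail the comparison rule; the following is a minimal sketch assuming simple template matching, where the captured movement is compared against one stored reference recording per gesture.

% Compare-based gesture identification: pick the stored template with the
% smallest sum of absolute differences to the captured movement.
% sample    : N x 3 matrix of raw accelerometer data (x, y, z)
% templates : cell array of N x 3 reference recordings, one per gesture
% names     : cell array of gesture names, e.g. {'idle','up','down','left','right'}
function gesture = classify_by_compare(sample, templates, names)
    d = zeros(1, numel(templates));
    for k = 1:numel(templates)
        d(k) = sum(abs(sample(:) - templates{k}(:)));  % distance to template k
    end
    [~, best] = min(d);
    gesture = names{best};
end

As the text notes, such direct comparison scales poorly as the gesture set grows, which motivates the neural-network approach described below.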

A. Hand Gestures

The number of gestures that need to be recognized depends on the field of the application:

• To manipulate in a 3D virtual space, 3 gestures are normally sufficient: grab, start, stop;
• In more complex systems, 5 to 10 gestures need to be recognized;
• In order to recognize sign language, 26 to 51 gestures must be taken into consideration.

Navigation in the menu is done with different hand movements: front, back, left, right. These movements can be replaced by tilting the bracelet, which is especially useful for impaired users. The purpose of this work is to create a system capable of recognizing 5 gestures used to navigate through a menu displayed on a screen. The five gestures are:

• horizontal = idle
• front (up) = enter a submenu or activate the attached command
• back (down) = exit the current menu and go back to the parent menu
• left, right = navigate left/right within the menu

The entries can have an external action attached, e.g. the light is turned on by selecting the TURN LIGHT ON option from the menu. The front movement of the hand enters a submenu of the current menu or activates an external action (if attached); the back movement exits from a menu to the parent menu; the left and right movements navigate within the menu.

B. Gesture Recognition Methods

There are several methods used to recognize gestures, such as angle filtering, statistical methods, principal component analysis, neural networks, testing pattern proximity, etc. These methods can be divided into two large groups: those that can learn to recognize simple gestures and those that cannot. From the known methods of recognition, we chose the solution that utilizes neural networks, because it offers the best results regarding precision. Well known for their success in pattern recognition, neural networks are used in numerous gesture recognition systems. Neural networks offer multiple advantages: training based on examples, recognition of gestures even if noise is present or the data are incomplete, and, last but not least, generalization. This last characteristic plays a crucial role in the system's performance, because not even the same user will reproduce his or her gestures exactly. It is difficult to compare the recognition methods, because the implemented systems do not work with the same gestures, were not tested in the same way, etc. Even so, it is clear that the recognition method that uses neural networks offers the best recognition rate.

C. Neural Network Design

The neural networks used to recognize gestures differ depending on the gestures that have to be recognized. The desired network architecture is simulated using the Neural Network Toolbox; the neural network weights can be saved in a file and are loaded automatically from the Matlab workspace into the weight (ROM) memory of the hardware model represented in Simulink. Many network architectures trained with different methods were simulated, and the network that performs best for the given application was chosen for hardware implementation.

Figure 6. Menu entries storage format

The design and implementation of the hardware model are done using System Generator and Xilinx ISE.

Figure 7. Right movement captured data (x, y and z acceleration traces over approximately 22 samples, with amplitudes between −200 and 400)

D. Neural Network Model in Simulink

In order to choose a NN capable of recognizing the gestures of a hand as accurately as possible, several types of NN were simulated, as well as combinations of these. Finally, we chose a feed-forward architecture with 2 layers:

• The first layer has 20 neurons with a sigmoid (logsig) activation function;
• The second layer has 5 neurons with a purelin activation function.

The software (Simulink) model of the neural network designed, trained and simulated using the Neural Network Toolbox is presented in Fig. 8. Training of the NN was done using the TRAINLM Levenberg-Marquardt backpropagation training function and a data set composed of 1500x5 vectors. The neuron model designed in System Generator (Fig. 9) is equivalent to the well-known McCulloch-Pitts model and implements (1).
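A sketch of this design/training flow is given below. It assumes the newer feedforwardnet API of the Neural Network Toolbox (the original work may have used the older newff interface) and placeholder data dimensions: 66 inputs, roughly 22 samples x 3 axes per Fig. 7, with one-hot targets for the 5 gestures.

% Two-layer feed-forward network: 20 logsig neurons + 5 purelin neurons,
% trained with Levenberg-Marquardt (TRAINLM), as described in the text.
X = rand(66, 1500);             % placeholder inputs, one movement per column
T = repmat(eye(5), 1, 300);     % placeholder one-hot targets for 5 gestures

net = feedforwardnet(20, 'trainlm');
net.layers{1}.transferFcn = 'logsig';   % sigmoid hidden layer
net.layers{2}.transferFcn = 'purelin';  % linear output layer
net = train(net, X, T);
Y   = net(X);                           % simulate the trained network
% net.IW, net.LW and net.b hold the trained weights and biases, which can
% then be exported from the Matlab workspace into the ROM blocks of the
% hardware model (quantized to the fixed-point formats discussed later).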

Figure 9. The hardware implemented neuron model

\( y(x) = f(a - \theta) = f\left( \sum_{i=1}^{N} w_i x_i - \theta \right) \)   (1)
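A direct MATLAB counterpart of (1), with the activation f passed as a function handle:

% McCulloch-Pitts-style neuron: weighted sum of inputs, threshold, activation.
function y = neuron(x, w, theta, f)
    a = sum(w .* x);    % a = sum_i w_i * x_i
    y = f(a - theta);   % y = f(a - theta), as in (1)
end
% Example with the logistic sigmoid used in the hidden layer:
%   y = neuron(x, w, theta, @(s) 1 ./ (1 + exp(-s)));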

The model was implemented using blocks from the Xilinx Blockset library and neural-network-specific blocks created by us.

Figure 8. Software model of the ANN used to recognize hand gestures (input processing, IW{1,1} weights and b{1} bias feeding a netsum and logsig block, then LW{2,1} weights and b{2} bias feeding a netsum and purelin block, and output processing)

Among the created blocks, the following can be mentioned:

• Weights memory (ROM)
• MAC block (weighted sum)
• Bias block
• Activation function block
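A behavioral sketch of the MAC block's weighted sum is shown below; the bit widths (14-bit data, 16-bit weights) follow the Results section, while the input count and sample values are placeholders.

% MAC (multiply-accumulate) block behavior: serialized weighted sum of the
% neuron inputs, accumulated at full precision (int64 comfortably holds the
% 30-bit products summed over all inputs).
N       = 66;                                     % assumed number of inputs
x_fixed = int64(randi([-2^13, 2^13 - 1], N, 1));  % 14-bit data (placeholder)
w_rom   = int64(randi([-2^15, 2^15 - 1], N, 1));  % 16-bit weights (placeholder)
acc = int64(0);
for i = 1:N
    acc = acc + x_fixed(i) * w_rom(i);            % one MAC step per clock
end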

The activation function block is common to all the neurons and is implemented using a look-up table; a sketch of generating its contents follows the list below. Using this neuron model, we have implemented in hardware the model presented in Fig. 10. The ANN is composed of the following blocks:

• Control blocks
• Input layer
• Preprocessing (normalization)
• Hidden layer (20 neurons)
• Activation function block
• Output layer (5 neurons)
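As an illustrative sketch, the shared activation look-up table could be filled as follows; the table depth, input range and output scaling are assumed parameters, chosen only to be consistent with the 14-bit data representation mentioned in the Results section.

% Fill a ROM look-up table with quantized logsig values. Depth (1024
% entries), input range ([-8, 8]) and output scaling (2^13) are assumptions.
addrBits = 10;
s   = linspace(-8, 8, 2^addrBits);            % sampled activation inputs
lut = round((1 ./ (1 + exp(-s))) * 2^13);     % logsig, quantized to 14 bits
% 'lut' would initialize the ROM of the activation function block shared
% by all neurons.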

Figure 10. Neural network model in System Generator

Figure 11. Simulation of the hardware model of the ANN - Outputs of the neurons from the output layer

IV. RESULTS

A. Simulations

Fig. 11 shows the results obtained by simulating the hardware model of the ANN. Only one output is active for a given input, the one corresponding to the recognized gesture.
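Interpreting the five outputs thus reduces to picking the most active one; a minimal sketch (variable names and values are ours):

% Decode the winning output-layer neuron into a gesture name.
names    = {'idle', 'up', 'down', 'left', 'right'};
y        = [0.05; 0.91; 0.02; -0.03; 0.04];   % placeholder output vector
[~, idx] = max(y);
gesture  = names{idx};                        % -> 'up'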

B. Resources

The estimated resources used for the implementation of a neuron using full precision for the multipliers and accumulators are presented in Fig. 12.

Figure 12. Estimated resources needed for the implementation of a neuron

The resources depend mainly on the number of bits used for data and weight representation. The estimated resources are for a 14-bit representation of data and 16 bits for the weights. The multiply-accumulate operation requires a large number of logic blocks and therefore represents the bottleneck of NN implementations in FPGAs. The resources needed for the implementation of a neuron could be as low as 12-22 slices if we use the various techniques presented in [7] for reducing the number of bits used by the multiplier and the accumulator. The resources needed to implement the 25-neuron FF-BP ANN with neuron parallelism using full precision in a Spartan XC3S500E FPGA are presented in Fig. 13.

Figure 13. Resource estimation for the implementation of 25 neurons

In a bigger device, with from some hundreds up to 2128 DSP blocks and BRAMs, we can implement hundreds or even thousands of neurons with neuron parallelism.

V. CONCLUSIONS

For a more complex method of control, more gestures must be recognized. We have demonstrated the functionality of the system by implementing the recognition of five basic gestures: idle, left movement, right movement, up movement and down movement. By recognizing more gestures, the system will allow a greater degree of control and an even more intuitive control experience for the end user. The system can also be controlled by hand tilting alone, giving persons with a low degree of mobility the possibility to use the system.

REFERENCES

[1] S. Oniga, A. Tisan, D. Mic, A. Buchman, A. Vida-Ratiu, "Hand postures recognition system using artificial neural networks implemented in FPGA," 30th International Spring Seminar on Electronics Technology, ISSE 2007, Cluj-Napoca, Romania, May 9-13, 2007, pp. 507-512.
[2] L. Marchese, "Digital neural network based smart devices," project proposal, FP7 European Research Program, http://www.synaptics.org, December 2006, unpublished.
[3] S. Oniga, A. Tisan, D. Mic, C. Lung, I. Orha, A. Buchman, A. Vida-Ratiu, "FPGA implementation of feed-forward neural networks for smart devices development," Proceedings of the ISSCS 2009 International Symposium on Signals, Circuits and Systems, July 9-10, 2009, Iasi, Romania, pp. 401-404.
[4] A. Tisan, S. Oniga, A. Buchman, C. Gavrincea, "Architecture and algorithms for synthesizable neural networks with on-chip learning," International Symposium on Signals, Circuits and Systems, ISSCS 2007, July 12-13, 2007, Iasi, Romania, vol. 1, pp. 265-268.
[5] J. Torresen, S. Tomita, "A review of implementation of backpropagation neural networks," Chapter 2 in N. Sundararajan and P. Saratchandran (eds.), Parallel Architectures for Artificial Neural Networks, IEEE CS Press, 1998, ISBN 0-8186-8399-6.
[6] J. Zhu, P. Sutton, "FPGA implementations of neural networks - a survey of a decade of progress," 13th International Conference on Field Programmable Logic and Applications (FPL 2003), Lisbon, Sep. 2003, Lecture Notes in Computer Science, vol. 2778, 2003, pp. 1062-1066, DOI: 10.1007/978-3-540-45234-8_120.
[7] S. Oniga, A. Tisan, D. Mic, A. Buchman, A. Vida-Ratiu, "Optimizing FPGA implementation of feed-forward neural networks," Proceedings of the 11th International Conference on Optimization of Electrical and Electronic Equipment, OPTIM 2008, Brasov, Romania, May 22-23, 2008, pp. 31-36.
