
Development and Implementation of an Application that Translates the Alphabet and the Numbers from 1 to 10 from Sign Language to Text to Help Hearing Impaired by Android Mobile Devices

María Gabriela Vintimilla, Darwin Alulema, Derlin Morocho, Mariela Proaño, Francisco Encalada and Evelio Granizo

Abstract— This article presents the design and implementation of an application for mobile devices running the Android operating system that facilitates communication with people with hearing impairment. Using artificial neural networks, the application learns and recognizes static (motionless) sign language signs for letters and numbers. The application indicates whether the captured image corresponds to the letter to be recognized; if it does not, the application displays an error message.

Index Terms— Android, Neural Networks, Sign Language

I. INTRODUCTION

MOBILE technologies are considered the fastest-growing industry in today's technological environment, which is why a large number of technological innovations focus on mobile telephony [5], [6], [7]. Mobile devices are no longer considered just devices for making calls; they have many everyday uses such as messaging, Internet access, and music playback. In a way, the mobile phone has become a personal computer. New technologies for the integration of people with disabilities are currently being developed to achieve a "plural" society "without digital divides". Over time this integration has improved, and people with disabilities have achieved greater interaction with mobile devices [5], [6], [7].

II. SIGN LANGUAGE

Sign language is a natural language of gestural-spatial expression and visual perception. With sign language, people with hearing impairments can establish a channel of communication with their social environment, whether with other people who have the same impairment or with anyone who knows how to use sign language [8]. Almost everywhere in the world, signs are used to represent the letters of the alphabet in which the country's oral language is written. This is called the manual alphabet or fingerspelling. In Spanish-speaking countries, where the Latin alphabet is used, deaf people use a single manual alphabet that is common to all countries, with some minor variations. The manual representation of the alphabet and of the numbers from 1 to 10 used in the education of deaf people is shown in Figures 1 and 2.

María Gabriela Vintimilla. She is with Universidad de las Fuerzas Armadas-ESPE, Sangolquí-Ecuador. She is now with the Department of Electric and Electronic ([email protected]).

Darwin Alulema. He is with Universidad de las Fuerzas Armadas-ESPE, Sangolquí-Ecuador. He is now with the Department of Electric and Electronic ([email protected]).
Derlin Morocho. He is with Universidad de las Fuerzas Armadas-ESPE, Sangolquí-Ecuador. He is now with the Department of Electric and Electronic ([email protected]).
Mariela Proaño. She is with Universidad de las Fuerzas Armadas-ESPE, Sangolquí-Ecuador. She is now with the Department of Electric and Electronic ([email protected]).
Francisco Encalada. He is with Universidad de las Fuerzas Armadas-ESPE, Sangolquí-Ecuador. He is now with the Department of Electric and Electronic ([email protected]).
Evelio Granizo. He is with Universidad de las Fuerzas Armadas-ESPE, Sangolquí-Ecuador. He is now with the Department of Electric and Electronic ([email protected]).

978-1-5090-1147-6/16/$31.00 ©2016 IEEE.

Fig. 1. Alphabet

The project objective is to recognize letters of the sign language. As in all applications of neural networks, there are two phases:

- Training phase: a data set of training patterns is used to determine the weights that define the neural model. These data are sent to a database server hosted in the cloud.
- Test phase or direct operation: once the model is trained, test patterns, which constitute the normal network input, are processed.

The application design consists of three layers, as shown in Figure 3:

- The client layer, which is the application installed on the mobile device.
- The MySQL database layer, which is in the cloud and is made up of two tables: the first stores the images captured after image preprocessing, and the second stores the weight values generated in the learning mode.
- The Artificial Intelligence server layer, where the Back Propagation algorithm is executed.

Fig. 2. Number from 1 to 10

III. NEURAL NETWORKS

Artificial Neural Networks are inspired by the biological neural networks of the human brain. They consist of elements that behave similarly to the biological neuron in its most common functions and share a number of characteristics with the human brain. For example, they learn from experience, generalize from previous examples to new ones, and can offer, within a range, correct responses to inputs that vary slightly due to noise or distortion. One of the main features of neural networks is their ability to learn from previous training. The aim of training is for the network, given a set of inputs, to produce the desired set of outputs, or at least outputs consistent with them.

The algorithm used for training is the Back Propagation algorithm, which matches each input vector with its corresponding output vector: an input vector is presented, the output of the network is calculated and compared with the desired output, and the resulting error is fed back into the network to change the weights so as to minimize the error. The Back Propagation algorithm is a method for training multilayer networks; its power lies in its ability to train hidden layers. Its implementation distinguishes two passes of computation: the forward pass and the backward pass. In the forward pass, the weights remain unaltered throughout the network, and the function signals of the network are computed neuron by neuron. In the backward pass, the calculation of the weight modifications starts at the output layer and proceeds backward through all layers of the network to the input layer.
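The two passes described above can be illustrated with a minimal sketch in plain Python. It is not the code of the application: the sigmoid activation, the squared-error measure, the learning rate, the layer sizes, and all function names (`forward`, `backward`, `make_network`, `squared_error`) are assumptions chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_network(sizes, seed=0):
    """Random weights; each neuron has one extra weight for a bias input."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(weights, inputs):
    """Forward pass: weights stay fixed; returns the activations of every layer."""
    activations = [list(inputs)]
    for layer in weights:
        prev = activations[-1] + [1.0]  # constant bias input
        activations.append([sigmoid(sum(w * a for w, a in zip(neuron, prev)))
                            for neuron in layer])
    return activations


def backward(weights, activations, target, rate):
    """Backward pass: weight changes are computed from the output layer back."""
    output = activations[-1]
    # Error term of each output unit (squared error, sigmoid derivative).
    deltas = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
    for k in range(len(weights) - 1, -1, -1):
        prev = activations[k] + [1.0]
        lower = []
        if k > 0:
            # Error terms of the layer below, using the weights before the update.
            lower = [sum(weights[k][j][i] * deltas[j] for j in range(len(deltas)))
                     * prev[i] * (1.0 - prev[i])
                     for i in range(len(activations[k]))]
        for j, neuron in enumerate(weights[k]):
            for i in range(len(neuron)):
                neuron[i] -= rate * deltas[j] * prev[i]
        deltas = lower


def squared_error(weights, inputs, target):
    output = forward(weights, inputs)[-1]
    return sum((o - t) ** 2 for o, t in zip(output, target))


# Toy usage: one input pattern and its desired output (made-up values).
net = make_network([4, 3, 2])
sample, target = [1.0, 0.0, 1.0, 0.0], [1.0, 0.0]
before = squared_error(net, sample, target)
for _ in range(200):
    backward(net, forward(net, sample), target, rate=0.5)
after = squared_error(net, sample, target)
```

In this toy run the error after training (`after`) should be smaller than the initial error (`before`), which is the behaviour the algorithm aims for; a real training set would contain many patterns per letter.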

V. TALK TO SIGN ARCHITECTURE

Fig. 3. Talk to Sign Architecture Diagram.

A. Analysis of the Distance to Obtain Ideal Information

For the training phase, two modes are used in the application interface shown in Figure 4:

• Learning Mode (Modo Aprender)
• Generating Weights Mode (Modo Generar pesos)

IV. APPLICATION DEVELOPMENT

Fig. 4. GUI Application.

The aim of training is to obtain an array of bits from the original image, store it in the database, and generate the respective weights for each letter. To achieve this, the image is processed as follows:

- Obtaining the grayscale: the image is converted to grayscale, which yields an image with values in the range [0, 255], where 0 is black and 255 is white.

Image = 0.299 × R + 0.587 × G + 0.114 × B    (1)

where R, G, and B are the red, green, and blue components.

- Binarization: a binary image (black and white, 0 and 255) is obtained. This rules out intermediate gray levels and leaves an image with only two values. The process uses a threshold value of 40, which was obtained by testing. In Equation (2), Pixel[i] gives the grey value of pixel number i, with i ∈ [0, image_size].
- Reduction: to optimize the input to the neural network, the image is reduced to 40x40 pixels. The input therefore does not depend on the size of the captured image, because every image is reduced to a size the neural network can handle. This value was determined by trial: it was observed to be the minimum size at which similar letters could still be distinguished.
- Obtaining features: an array of 0s and 1s containing the pixels of the processed image is obtained. The array comprises 1600 bits obtained from the binarized 40x40 image and serves as input to the neural network. The image size is reduced so that the array is manageable for the neural network and the application responds quickly, without a long delay.

Weight generation is carried out according to the number of samples stored in the database: the greater the number of samples, the higher the recognition precision. The weights are stored in another table in the database.

B. Test Phase or Direct Operation

Once the training phase is done, the next step is the test phase. Internally, the neural network is formed by the input layer, which receives the values of the binarized image; the data then pass through the three hidden layers of the neural network in sequential order and finally reach the output layer, which has 8 units, each returning a value between 0 and 1. This vector of 8 components represents the binary value of the letter or number, as shown in Table I.
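The image preprocessing steps of the training phase (grayscale, binarization, reduction, and feature extraction) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the grayscale weights are the standard luma coefficients 0.299, 0.587, and 0.114; pixels below the threshold are assumed to become black and the rest white; the reduction is assumed to be a nearest-neighbour resize; and all function names are hypothetical.

```python
THRESHOLD = 40  # binarization threshold reported in the text
SIDE = 40       # side of the reduced image reported in the text


def to_gray(pixel):
    """Grayscale value in [0, 255] from an (R, G, B) tuple (standard luma weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(gray, threshold=THRESHOLD):
    """Assumption: values below the threshold become 0 (black), the rest 255 (white)."""
    return 0 if gray < threshold else 255


def reduce_image(image, side=SIDE):
    """Nearest-neighbour resize of a 2-D list to side x side (assumed method)."""
    height, width = len(image), len(image[0])
    return [[image[y * height // side][x * width // side] for x in range(side)]
            for y in range(side)]


def extract_features(rgb_image):
    """Return the flat list of 0/1 values that serves as network input."""
    binary = [[binarize(to_gray(p)) for p in row] for row in rgb_image]
    small = reduce_image(binary)
    return [1 if value == 255 else 0 for row in small for value in row]


# Toy usage: an 80x80 image whose left half is white and right half is black.
white, black = (255, 255, 255), (0, 0, 0)
image = [[white] * 40 + [black] * 40 for _ in range(80)]
features = extract_features(image)  # 1600 values, one per pixel of the 40x40 image
```

The resulting list has 40 × 40 = 1600 entries, matching the size of the input array described in the text.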

TABLE I. ALPHABET LETTERS AND NUMBERS IN BINARY

Letters
Letter  Binary     Letter  Binary
A       1000001    N       1001110
B       1000010    O       1001111
C       1000011    P       1010000
D       1000100    R       1010010
E       1000101    S       1010011
F       1000110    T       1010100
G       1000111    U       1010101
H       1001000    V       1010110
I       1001001    W       1010111
L       1001100    X       1011000
M       1001101    Y       1011001

Numbers
Number  Binary
1       00000001
2       00000010
3       00000011
4       00000100
5       00000101
6       00000110
7       00000111
8       00001000
9       00001001
10      00001010
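As an illustration of how the 8 output values could be mapped back to a symbol, the sketch below rounds each output to a bit and looks the resulting code up against Table I. The letter codes in Table I coincide with the 7-bit ASCII codes of the uppercase letters, so `chr` is used for the lookup. The rounding threshold of 0.5, the most-significant-bit-first ordering, and the function name `decode_output` are assumptions, not details given in the paper.

```python
# Letters listed in Table I (static signs only).
LETTERS = set("ABCDEFGHILMNOPRSTUVWXY")


def decode_output(outputs):
    """Map 8 network outputs in [0, 1] to a letter, a number string, or None."""
    if len(outputs) != 8:
        raise ValueError("expected 8 output units")
    code = 0
    for value in outputs:  # most significant bit first (assumed ordering)
        code = (code << 1) | (1 if value >= 0.5 else 0)
    if 1 <= code <= 10:
        return str(code)      # numbers 1 to 10
    if chr(code) in LETTERS:
        return chr(code)      # letters, whose codes match ASCII
    return None               # code not in Table I: unrecognized image


# Toy usage with made-up output values close to 01000001.
symbol = decode_output([0.1, 0.9, 0.2, 0.1, 0.0, 0.3, 0.1, 0.8])
```

A return value of `None` corresponds to the case in which the application reports that the image was not recognized.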

VI. RESULTS

Tests were carried out for the letter A, for which correct and incorrect signs for the letter A were captured. The test scenario used a black background without glare, because brightness can cause problems during image processing by generating false zeros and ones in the image binarization. Table II summarizes the test results.

TABLE II. TEST RESULTS

Selected Letter   Captured Letter      Answer
A                 A                    Letter A
A                 A                    Letter A
A                 A                    Letter A
A                 A                    Letter A
A                 A                    Letter A
A                 O                    Unknown frame
A                 R                    Unknown frame
A                 Displays anything    Letter A
A                 5                    Unknown frame
A                 Displays anything    Unknown frame

VII. CONCLUSIONS

The application is able to recognize symbols from previously learned pictures; if the captured image is not recognized, the application does not return any letter. This could be the basis for an application that recognizes whole words.

Not every symbol of the international sign language has been treated, because the symbols for letters such as j, k, n, q, x, and z require movement; recognizing them would require capturing video and adapting the system to identify movement.

Variations in lighting and background color can make images difficult to recognize, because the training does not provide sufficient information to the network. This is because, during image processing, brightness or a light-colored background is displayed as "black", and the application responds with an error.

The application consists of three levels: one is the application on the mobile device, and the other two, Artificial Intelligence and Database, are in the cloud. They are placed there because implementing them on the mobile device would increase the processing time and therefore the response time.

REFERENCES

[1] Sordos Ecuador. (n.d.). Lenguaje de señas. Retrieved January 23, 2014, from http://www.sordosecuador.com/lengua-de-senas/
[2] Fiszelew, A. (2002). Generación automática de redes neuronales con ajuste de parámetros basado en algoritmos genéticos. Universidad de Buenos Aires.
[3] Hilera, J.R. (2000). Nuevas técnicas de modelización y predicción de fenómenos complejos. Retrieved March 10, 2014, from Redes Neuronales Artificiales y Algoritmos Genéticos. Alcalá de Henares, España.
[4] Andes. (2012). Vicepresidencia del Ecuador impulsa el primer Diccionario Oficial de Lengua de Señas Ecuatoriana. Retrieved March 10, 2014, from http://www.andes.info.ec/es/sociedad/7820.html
[5] P. L. Flavio, A. Darwin, S. Jenny, M. Derlin and I. Alexander, "Design and implementation of an oximetry monitoring device," 2015 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), Santiago, 2015, pp. 185-189.
[6] P. L. Flavio, A. Darwin, V. Mauricio, G. Edwin and M. Derlin, "Prototype for measuring blood pressure on the Android platform for mobile devices," 2015 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), Santiago, 2015, pp. 199-204.
[7] B. Simbaña, D. Alulema, Ch. Vega, and D. Morocho, "Diseño de una Aplicación basada en Realidad Aumentada para el Centro Histórico de Quito," in STSIVA2015, Bogotá, 2015.
[8] C. Guerra, D. Novillo, D. Alulema, H. Ortiz, D. Morocho and A. Ibarra, "Electromechanical prototype used for physical deployment of Braille characters for digital documents," 2015 CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), Santiago, 2015, pp. 191-198.

María Gabriela Vintimilla. First author, Electronic and Telecommunication Engineer from the Universidad de las Fuerzas Armadas-ESPE. She works with the Electrical and Electronic Department of the university. Her research interests are related to the design and implementation of mobile applications based on Android.

Darwin Alulema. Second author, Electronic and Telecommunication Engineer from the Universidad de las Fuerzas Armadas-ESPE (2006). He received a Master's degree in Tele-Informatics and Computer Networks from Universidad Tecnológica Equinoccial in 2009. He is currently a teacher at Universidad de las Fuerzas Armadas-ESPE and a member of the Electrical and Electronic Department of the university.

Derlin Morocho. Third author, Electronic and Telecommunication Engineer from the Universidad de las Fuerzas Armadas-ESPE (1996). He received a Master's degree in Innovation and Research in ICT from Universidad Autónoma de Madrid in 2013. He is currently a teacher at Universidad de las Fuerzas Armadas-ESPE and a member of its Electrical and Electronic Department, and he works with the ATVS group of the Universidad Autónoma de Madrid in the biometrics area, especially signature recognition. He is currently continuing his studies to obtain a PhD degree in Informatics and Telecommunications.

Mariela Proaño. Fourth author, Electronic and Telecommunication Engineer from the Universidad de las Fuerzas Armadas-ESPE (2016). She works with the Electrical and Electronic Department of the university. Her research interests are related to signal processing and biometric recognition.

Francisco Encalada. Fifth author, Electronic and Telecommunication student at Universidad de las Fuerzas Armadas-ESPE. He works with the Electrical and Electronic Department of the university. His research interests are related to biometric sciences, data and image processing, and wireless communication.

Evelio Granizo. Sixth author, Electronic at the Universidad de las Fuerzas Armadas-ESPE. He holds a Master's degree in Communication Networks from Pontificia Universidad Católica del Ecuador, a Degree in Curricular Design, and a Graduate Diploma in Management for University Learning. He is currently director of the Department of Electric and Electronic of Universidad de las Fuerzas Armadas-ESPE and a member of the Wicom energy Group of the same university.
