Article

Hand-Based Gesture Recognition for Vehicular Applications Using IR-UWB Radar

Faheem Khan †, Seong Kyu Leem † and Sung Ho Cho *

Department of Electronics and Computer Engineering, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul 133-791, Korea; [email protected] (F.K.); [email protected] (S.K.L.)
* Correspondence: [email protected]; Tel.: +82-(02)-2220-4883
† Co-first authors, these authors contributed equally to this work.

Academic Editors: Jesús Ureña, Álvaro Hernández Alonso and Juan Jesús García Domínguez
Received: 26 February 2017; Accepted: 6 April 2017; Published: 11 April 2017
Abstract: Modern cars continue to offer more and more functionalities, for which they need a growing number of commands. As the driver tries to monitor the road and the graphic user interface simultaneously, his/her overall efficiency is reduced. In order to reduce the visual attention necessary for monitoring, a gesture-based user interface is very important. In this paper, gesture recognition for a vehicle through impulse radio ultra-wideband (IR-UWB) radar is discussed. The gestures can be used to control different electronic devices inside a vehicle. The gestures are based on human hand and finger motion. We have implemented a real-time version using only one radar sensor. Studies on gesture recognition using IR-UWB radar have rarely been carried out, and the existing studies either use simple methods based on the magnitude of the reflected signal or suffer from large performance degradation when the distance or direction changes. In this study, we propose a new hand-based gesture recognition algorithm that works robustly against changes in distance or direction while responding only to defined gestures and ignoring meaningless motions. We used three independent features, i.e., the variance of the probability density function (pdf) of the magnitude histogram, the time of arrival (TOA) variation and the frequency of the reflected signal, to classify the gestures. A data fitting method is included to differentiate between gesture signals and unintended hand or body motions. We have used a clustering technique for the classification of the gestures. Moreover, the distance information is used as an additional input parameter to the clustering algorithm, so that the recognition technique is not vulnerable to distance change. The hand-based gesture recognition proposed in this paper would be a key technology of future automobile user interfaces.

Keywords: gesture recognition; IR-UWB radar; unsupervised learning; motion recognition; radar sensor; distance compensation; user interface
1. Introduction

Hand-based gesture recognition is one of the hottest research fields, since it is of great significance in designing artificially intelligent human-computer interfaces. Driving a modern car is an extremely difficult task [1]. A driver has to perform multi-tasking, such as observing the road, monitoring the vehicle's status, Global Positioning System (GPS) monitoring, operating numerous electronic and mechanical devices and using audio entertainment. The gesture interface inside a car can assist the driver in performing various tasks. Different sensors have been used for gesture recognition, such as cameras, radio-frequency identification (RFID), data-gloves, etc. [2–9]. Cameras, however, have a number of line-of-sight-related challenges that may prevent gesture recognition from being effective. For example, poorly-lit environments may have a negative impact on the image quality and in turn degrade the performance of gesture detection through the camera. The other main issue with camera-based gesture recognition is privacy [10]. An alternate method for gesture recognition is
glove-based sensors. The data-glove-based methods use sensor devices for digitizing hand and finger motions into multi-parametric data [5]. The extra sensors make it easy to collect hand movement and configuration. However, the devices are quite expensive and cumbersome for the users [6]. The environment inside a vehicle is usually dark at night, and it is inconvenient to wear something while driving; therefore, the above-mentioned techniques are not suitable for vehicular applications. To overcome the above problems, radar-based gesture recognition can be used as a user interface inside a vehicle. Radar-based gesture recognition techniques have the advantage of better performance in dark environments, do not have privacy issues and do not require wearing sensors. In [11–15], the researchers have used Doppler radar sensors for gesture recognition. Molchanov et al. [11] used multiple sensors, including a depth camera and Doppler radar, for gesture recognition inside a vehicle. Portable radar sensors for gesture recognition in smart home applications are discussed in [12]. Kim Youngwook et al. [13] have performed hand-based gesture recognition using Doppler radar with machine learning techniques; however, the results are too dependent on the orientation and distance between the hand and the radar. In addition to Doppler radars, UWB radars have been in the spotlight in recent years. There are many advantages of using IR-UWB radar, such as high range resolution and robustness to multipath due to the extremely wide bandwidth [16]. The major application areas of UWB radar technology are sensors and communications, localization, tracking and biomedical research [17]. The IR-UWB sensor has been used in various radar applications, such as non-invasive vital sign monitoring [18–20], multiple object counting [21] and direction recognition of moving targets [22], and it has the ability to detect stationary or slowly moving targets [23]. However, there is very little reference work available in the literature about gesture recognition based on IR-UWB radar sensors. Ren Nan et al. [24] have presented an algorithm for big gesture recognition through IR-UWB radar, but the gestures detected in that work were simply based on the position difference of the hand and may not be useful in practical applications. Junbum Park et al. [25] used an IR-UWB radar sensor for detecting hand-based gestures through machine learning techniques. Although the results show high accuracy, there was an overfitting problem, and the gesture testing in a real environment showed much lower accuracy. Furthermore, no method was included for distance compensation or for robustness of the algorithm against a change in distance or the orientation of the hand. The main problem noted in the past radar-based gesture recognition algorithms was that they were vulnerable to distance and orientation changes, and the feature extraction through machine learning caused the overfitting problem in some cases, which made them error prone. To overcome these problems, we have presented a robust algorithm for hand-based gesture recognition using an IR-UWB radar sensor in this paper. We do not use the completely raw data as input to the classifier, in order to avoid the overfitting problem. We extracted three robust features, i.e., the variance of the pdf of the magnitude histogram, the frequency and the variance of the time of arrival (TOA), from the pre-processed signal reflected from the human hand.
The features extracted were robust and showed better performance even if we changed the orientation of the hand. After the feature extraction, we used the K-means clustering algorithm for classification of the gestures. In order to make the algorithm robust against distance and orientation variation, we have integrated the TOA-based distance information into the clustering algorithm. In order to differentiate the gesture motion from random hand or body motion, we included a data-fitting algorithm. Since the gesture motion defined in our work is almost periodic, we fit the received gesture signal to a sinusoid and check the R-square value. If the R-square value is above a certain threshold, the motion is considered periodic and, hence, classified as a gesture signal; otherwise, it is classified as a non-gesture motion. The process block diagram of our algorithm is shown in Figure 1.
[Figure 1 block diagram. Blocks: Raw Signal; Clutter Removal; Combine the waveforms into a matrix; Locate the human body; Separate the hand-reflected part of the signal; Extract the Features; Unsupervised Clustering; Compare the received and trained Features; Output Gesture.]

Figure 1. Process block diagram for gesture recognition.
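To make the overall flow of Figure 1 concrete, a high-level sketch of the processing pipeline could look as follows. Every helper name here (remove_clutter, gesture_matrix, toa_per_scan, is_gesture, extract_features, classify_gesture) is a hypothetical placeholder for the steps described later in Section 2, not code from the paper:

```python
import numpy as np

def recognize_gesture(raw_scans, scaler, km, cluster_to_gesture):
    """End-to-end pipeline of Figure 1 (illustrative sketch only)."""
    W = remove_clutter(raw_scans)                 # loopback filter, Section 2.1
    G = gesture_matrix(W)                         # hand-reflected region of the signal
    toa = toa_per_scan(G)                         # TOA per slow-time scan, Algorithm 2
    ok, _ = is_gesture(G[:, G.shape[1] // 2])     # sinusoid fit: intended gesture or not
    if not ok:
        return None                               # ignore random, non-gesture motion
    features = extract_features(G, toa)           # [sigma, TOA variance, frequency, TOA]
    return classify_gesture(scaler, km, features, cluster_to_gesture)  # K-means, Section 2.3
```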
The main contribution of our work is that it is the first real-time IR-UWB-based gesture recognition technique which avoids the overfitting problem and shows robustness when a change in distance or orientation of the hand occurs, because of the selection of robust parameters and the integration of the TOA information into the clustering algorithm. Additionally, we proposed an algorithm for the detection of only intended gestures while ignoring any random movement in front of the radar sensor. Considering these advantages, this method would be an important technology of the car user interface as one of the core technologies of the future autonomous vehicles.

The hand-based gestures for our work are shown in Figure 2. The first gesture (Gesture 0) is the empty gesture when there is no hand movement in front of the radar. Table 1 shows the detailed explanation of the defined gestures. Gestures 1, 2 and 3 are broadly classified as small gestures, while Gestures 4 and 5 are classified as big gestures with larger displacements. The rest of the paper is organized as follows. In Section 2 of the paper, the feature extraction and classification are discussed.
In Section 3, the results of gesture training and classification are presented, and conclusions are given in Section 4 of the paper. References are given at the end of the paper.
Figure 2. Gesture set for our experiments.
Table 1. Gestures defined for our work.

Gesture # | Explanation | Inference
0 | When there is no hand movement in front of the radar | Empty Gesture
1 | The finger is moving to the left and right slowly | NO
2 | Thumbs up and the hand moving to and fro quickly, but with little displacements | BEST (Thumbs Up)
3 | Three fingers going upward and downward while thumb and index fingers make an "O" symbol | OK
4 | The hand palm is open and moving forward and backward with greater displacements while facing the radar transceiver | STOP
5 | The hand palm is open and moving backward and forward diagonally with respect to the radar transceiver | NEXT

2. Feature Extraction and Classification

2.1. Signal Pre-Processing
From the raw signal reflected from the human hand, the clutter has to be removed. A loopback filter is used for removal of the clutter [26]. The loopback filter, represented by Figure 3, works as:

c_k(t) = \alpha \, c_{k-1}(t) + (1 - \alpha) \, r_k(t)    (1)

y_k(t) = r_k(t) - c_k(t)    (2)

In the above equations, the symbol "α" represents a constant used for weighting. For our experiments, the value of "α" was 0.97. The symbol c_k(t) represents the clutter signal estimated up to the k-th received sample, r_k(t) is the k-th received waveform, and y_k(t) is the background-subtracted signal. From the above equations, it is clear that the new estimated clutter has two parts: one part is from the previous estimate, and one is from the current reflected signal. We need to store each filtered signal waveform and combine them into a matrix W_mn of size "m × n". The "m" represents the slow time length, whereas the "n" represents the fast time length of the matrix. The "n" depends on the measurement distance or range of the radar. The slow time length "m" depends on the number of waveforms that we want to process at a single time. Since the gestures detected are all dynamic gestures, the hand gesture area is detected and separated based on the maximum variance index of the signal in the fast time domain throughout the matrix duration "m". The maximum variance index in the fast time shows the biggest change in the values over the gesture duration "m", which we assume is the center of the location of the hand. We make the gesture matrix by combining the regions at the left and right side of the maximum variance index in the fast time domain. For example, in Figure 4, the gesture location is from Sample 140 to Sample 190 in the fast time domain. The slow time length of the gesture matrix is determined by the gesture duration.
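As a concrete illustration, a minimal Python sketch of the loopback clutter-removal filter of Equations (1) and (2) and of the gesture-matrix extraction described above might look as follows; the helper names and the half-width of the hand region are our own illustrative choices, not code from the paper:

```python
import numpy as np

def remove_clutter(scans, alpha=0.97):
    """Loopback (exponential averaging) clutter filter, Equations (1)-(2).

    scans : (m, n) array of m slow-time scans, each with n fast-time samples.
    Returns the background-subtracted matrix of the same shape.
    """
    clutter = np.zeros(scans.shape[1])                 # initial clutter estimate c_0(t)
    W = np.empty_like(scans, dtype=float)
    for k, r in enumerate(scans):
        clutter = alpha * clutter + (1 - alpha) * r    # Eq. (1): update clutter estimate
        W[k] = r - clutter                             # Eq. (2): background-subtracted scan
    return W

def gesture_matrix(W, half_width=25):
    """Cut the hand region around the fast-time bin with maximum slow-time variance."""
    n_max_var = np.argmax(np.var(W, axis=0))           # bin with the biggest change
    lo = max(n_max_var - half_width, 0)
    return W[:, lo:n_max_var + half_width]             # e.g., Samples 140-190 in Figure 4
```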
Figure 3. Clutter-removing filter (loopback filter).
Figure 4. Gesture matrix (a) before clutter removal and (b) after clutter removal.

After we find the gesture matrix, we need to find whether the motion is due to the gesture signal or due to unintended hand motion. As the gestures defined in our work are periodic, we use sinusoidal fitting to show how much the received data fit into a sinusoid. For the small gestures (Gestures 1, 2 and 3), the input data used for sinusoidal fitting are the magnitude data at the maximum variance index in the fast time, as shown in Figure 5. However, for the big gestures (Gestures 4 and 5), the input data used for sinusoidal fitting are the TOA of each radar scan, as shown in Section 2.2.3. The R-square value is used for finding the fit of the signal, which is defined as follows:

R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}    (3)

In Equation (3), "ŷ_i" represents the values of "y_i" estimated by the fitting algorithm, whereas "ȳ" is the mean of "y_i" [27]. The value of R-square lies between zero and one. A higher value of R-square shows that the prediction model is more accurate, and hence, the motion is due to a gesture signal, whereas a lower value of R-square indicates unintended hand motion. Figure 5 shows the fitting result for a gesture signal. The resulting R-square value for the signal in Figure 5 is high, as the sinusoid is a very accurate prediction model for it.
Figure 5. Gesture signal and the fitted sinusoid signal.
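To make the data-fitting step concrete, the following sketch fits a sinusoid to a slow-time sequence with SciPy and computes the R-square of Equation (3). The 59 Hz slow-time sampling rate comes from Table 3 and the 0.2 decision threshold from Section 3.3; the initial-guess heuristic and the function names are our own assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

def sinusoid(t, amp, freq, phase, offset):
    return amp * np.sin(2 * np.pi * freq * t + phase) + offset

def r_square(y, y_hat):
    """R-square of Equation (3)."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def is_gesture(y, fs=59.0, threshold=0.2):
    """Fit a sinusoid to the slow-time sequence y and decide gesture vs. random motion."""
    t = np.arange(len(y)) / fs                       # slow-time axis in seconds
    # crude initial guess: dominant FFT bin for the frequency, data spread for the amplitude
    spectrum = np.abs(np.fft.rfft(y - np.mean(y)))
    f0 = np.fft.rfftfreq(len(y), d=1.0 / fs)[np.argmax(spectrum[1:]) + 1]
    p0 = [np.std(y) * np.sqrt(2), f0, 0.0, np.mean(y)]
    try:
        popt, _ = curve_fit(sinusoid, t, y, p0=p0, maxfev=5000)
    except RuntimeError:                             # fit did not converge: not a gesture
        return False, 0.0
    r2 = r_square(y, sinusoid(t, *popt))
    return r2 > threshold, r2
```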
2.2. Features Extraction

The next step is to extract the features of interest from the gesture signal matrix. We extracted three features, i.e., the spread of the pdf of the gesture magnitude histogram, the frequency of the hand gesture and the variance of the TOA of the gesture signal. These three features are parameters that can represent the characteristics of the human hand gesture. When an ordinary person carries out a repetitive hand gesture, each person has his or her own unique movements. This intrinsic motion is related to the movement range of the hand gesture, the speed of the hand gesture and the shape and size of the hand. The range of motion of the hand gesture is related to the variance of TOA; the speed of the hand gesture is related to the frequency; and the shape and size of the hand are related to the spread of the magnitude histogram.
2.2.1. Variance of the Magnitude Histogram

The magnitude histogram of the gesture matrix is found, and we use a data fitting technique to find the normal distribution pdf of the histogram. The variance "σ" of the resulting pdf is used as a feature for the classification of the gestures.

In Figure 6, the magnitude histogram over the gesture duration is shown, and Figure 7 shows the pdf of the magnitude histogram. By using the pdf fitting method, the sigma value turns out to be a specific value. The sigma value is different for different gestures, as shown in the Results section. A large value of sigma means that the received signal has a higher magnitude over a certain period of time, which means a large hand gesture. Using the sigma method rather than simply using some received signal magnitudes makes the algorithm more robust, because it statistically represents the magnitude characteristics of the reflected signal of each gesture over a certain period of time rather than the magnitude at a particular time or distance. The concrete method of calculating the spread of the histogram of the gesture matrix is shown in Algorithm 1.
Figure 6. Magnitude histogram for a slow time of gesture duration.
Figure 7. Pdf of magnitude histogram for a slow time of gesture duration.
Algorithm 1. Calculation of the spread of the histogram.

1. Find the magnitude histogram of the gesture matrix, as shown in Figure 6.
2. Although the small values appear the most in the histogram, we ignore them, as these smaller values distort the shape of the histogram, and the most important values for classification of gestures are the higher values. We set the threshold for discarding the smaller values by trial and error.
3. Fit the resulting histogram to a normal pdf distribution, as shown in Figure 7. The histogram is considered the best way for density estimation, as discussed in reference [28].
4. Find the spread "σ" of the normal pdf, as shown in Figure 7.
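A possible realization of Algorithm 1 in Python is sketched below; the small-value threshold is an illustrative assumption, since the paper only states that it was set by trial and error, and fitting the normal distribution directly to the retained samples is used here as a stand-in for fitting the normalized histogram of Figure 7:

```python
import numpy as np
from scipy.stats import norm

def histogram_spread(gesture_matrix, small_value_threshold=2.0):
    """Algorithm 1: spread (sigma) of the normal pdf fitted to the magnitude histogram."""
    magnitudes = np.abs(gesture_matrix).ravel()
    # Step 2: discard the small values that would distort the histogram shape
    magnitudes = magnitudes[magnitudes > small_value_threshold]
    # Steps 1, 3 and 4: fit a normal pdf to the retained magnitudes and return its spread
    mu, sigma = norm.fit(magnitudes)
    return sigma
```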
2.2.2. Time of Arrival Variance

Time of arrival (TOA) variance represents the range of motion of a hand gesture. Each hand gesture has a different moving distance. A large hand gesture has a large TOA variance, and a small hand gesture has a small TOA variance. The gestures defined in our work have different TOA variances. The gesture set defined in our paper has five gestures; two of the gestures have large variation in TOA, whereas the other three gestures have very small variation of TOA. The specific method for the estimation of TOA is explained in Algorithm 2. The center of mass concept is used to find the centroid index of the waveform in Step 3 of Algorithm 2. The center of mass is a more robust feature because it reflects the characteristics of the entire waveform, as compared to simply using the peak of the waveform as the centroid index.

Algorithm 2. TOA estimation.

1. Find the index of the maximum variation column in slow time:

n_{max\_var} = \arg\max_{n\_fixed} \left| \mathrm{variation}\{W_{mn}(:, n\_fixed)\} \right|    (4)

2. W_mn is the gesture matrix, and the "m" represents the slow time length, whereas the "n" represents the fast time length of the matrix. Extract size_fast data around n_max_var from one slow time scan (the x axis is fast time, and the y axis is magnitude), and apply the Hilbert transform to obtain the envelope of these data:

r_{hilbert} = \mathrm{abs}\left[\mathrm{hilbert}\left\{ y\left(n_{max\_var} - \tfrac{size_{fast}}{2} : n_{max\_var} + \tfrac{size_{fast}}{2}\right) \right\}\right]    (5)

y represents the background-subtracted signal after the loopback filter in Section 2.1. size_fast is basically determined by the length of the Gaussian-modulated pulses transmitted and received, with margins added to take into account the slight length changes of the Gaussian-modulated pulses that occur during reflection from the main target.

3. Find the center fast time index n_center_mass using r_hilbert and Equation (6). The equation below is similar to the center of mass concept:

n_{center\_mass} = \frac{\sum_{n = n_{max\_var} - size_{fast}/2}^{\,n_{max\_var} + size_{fast}/2} r_{hilbert}(n) \cdot n}{\sum_{n = n_{max\_var} - size_{fast}/2}^{\,n_{max\_var} + size_{fast}/2} r_{hilbert}(n)}    (6)

4. Find n_center_mass using the method in Step 3 for each slow time.

5. Find the TOA variance using the n_center_mass data from Step 4:

\mathrm{TOA\ variance} = \mathrm{variance}\{ n_{center\_mass}(1 : m) \}    (7)

The symbol "m" represents the slow time length.
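A minimal Python sketch of Algorithm 2 could look like the following; the envelope window size (size_fast) is an assumption that, in practice, would be matched to the transmitted Gaussian-modulated pulse length, as the text explains:

```python
import numpy as np
from scipy.signal import hilbert

def toa_per_scan(W, size_fast=32):
    """Algorithm 2, Steps 1-4: center-of-mass TOA index for every slow-time scan."""
    m, n = W.shape
    n_max_var = np.argmax(np.var(W, axis=0))           # Step 1, Eq. (4)
    lo = max(n_max_var - size_fast // 2, 0)
    hi = min(n_max_var + size_fast // 2, n)
    idx = np.arange(lo, hi)
    toa = np.empty(m)
    for k in range(m):
        env = np.abs(hilbert(W[k, lo:hi]))             # Step 2, Eq. (5): envelope
        toa[k] = np.sum(env * idx) / np.sum(env)       # Step 3, Eq. (6): center of mass
    return toa

def toa_variance(W, size_fast=32):
    """Algorithm 2, Step 5, Eq. (7): variance of the per-scan TOA."""
    return np.var(toa_per_scan(W, size_fast))
```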
2.2.3. Frequency of the Gesture Signal

Some hand gestures are fast, and other hand gestures are relatively slow. This characteristic can be modeled through the frequency of the gesture signal. The other main purpose of introducing the frequency parameter is to distinguish between gestures and non-gesture motions of the body or some other change in the environment. We have defined two kinds of frequencies, i.e., frequency on the basis of magnitude variation and frequency based on the TOA variation. The magnitude-based spectrum is used for the small gestures, while the TOA frequency is used in the case of gestures that result in large displacement. The algorithm for obtaining frequency information using IR-UWB radar is widely used for the measurement of biological signals, such as respiration and pulse [18–20]. In these studies, frequency information was obtained by using the magnitude change of the slow time data at a fixed fast time point. In this study, frequency information was obtained by applying the same method to small gestures. However, in the case of a big gesture, the conventional method does not produce a satisfactory result. Figure 8 illustrates why existing methods do not work well for big gestures. First, as shown in Figure 8a, when the moving distance of the gesture is small, the waveform is similar to a sine wave. However, when the moving distance of the gesture is large, as shown in Figure 8b, a distorted waveform is generated, reflecting the waveform of the modulated Gaussian pulse. Moreover, real hand gestures do not have perfect periodicity, so the degree of distortion becomes worse. To solve this problem, we proposed a new frequency acquisition method based on TOA, as shown in Figure 8c. It estimates the optimum TOA for each slow time frame and predicts the frequency by observing the change of this TOA. The concrete method is given in Algorithm 3 as follows.
Figure 8. Illustrative explanation of the reason for introducing a new frequency extraction method based on TOA. (a) Slow time domain signal by the conventional magnitude-based method for a small gesture; (b) slow time domain signal by the conventional magnitude-based method for a big gesture; (c) slow time domain signal by the proposed TOA-based method for a big gesture.
Algorithm 3. Finding the frequency by the TOA-based method.

1. Find the TOA of every column of the matrix W_mn using the method given by Algorithm 2.
2. Mean value subtraction: Find the mean of all of the TOA values and subtract it from each value. The resulting signal after mean subtraction is shown in Figure 9.
3. Find the frequency domain signal by using the fast Fourier transform (FFT) algorithm.
4. Search for the peak value of the spectrum, as in Figure 10.
5. The location of the peak value of the spectrum represents the frequency of the big gesture.
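The sketch below illustrates both frequency definitions under the assumptions already stated: the magnitude-based spectrum used for small gestures and the TOA-based spectrum of Algorithm 3 used for big gestures. The 59 Hz slow-time sampling rate is taken from Table 3, and toa_per_scan is the helper sketched after Algorithm 2:

```python
import numpy as np

def dominant_frequency(x, fs=59.0):
    """Return the frequency (per minute) of the strongest non-DC spectral peak."""
    x = np.asarray(x, dtype=float) - np.mean(x)          # Step 2: mean subtraction
    spectrum = np.abs(np.fft.rfft(x))                    # Step 3: FFT magnitude
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    peak = np.argmax(spectrum[1:]) + 1                   # Step 4: peak search, skipping DC
    return freqs[peak] * 60.0                            # Step 5: peak location, per minute

def gesture_frequency(W, big_gesture, fs=59.0):
    """Magnitude-based frequency for small gestures, TOA-based (Algorithm 3) for big ones."""
    if big_gesture:
        slow_time_signal = toa_per_scan(W)               # Algorithm 3, Step 1
    else:
        n_max_var = np.argmax(np.var(W, axis=0))         # fixed fast-time bin, as in [18-20]
        slow_time_signal = W[:, n_max_var]
    return dominant_frequency(slow_time_signal, fs)
```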
Figure 9. TOA of gesture with greater displacements (big gesture).
Figure 10. Spectrum of the TOA of gesture with greater displacements (big gesture).
2.3. Gestures Classification

There are two broad classes of learning algorithms, i.e., supervised and unsupervised learning. In supervised learning, each output unit is told what its desired response to the input signals should be. However, in unsupervised learning, the algorithm is based on local information, and it does not know the target output for each input. It is also referred to as a self-organized network, as it self-organizes the data presented to the network and detects any collective properties in the data. In order to classify the gestures, the unsupervised learning algorithm (K-means clustering) is used. Clustering is a popular approach to implement the partitioning operation [29–31]. The clustering algorithm partitions a set of objects into clusters, such that the objects belonging to the same cluster have more similarity among themselves than with those of different clusters, based on some defined criteria [32,33]. The main idea is to define K cluster centers. The next step is to associate each point of the given dataset to the nearest center. After that, we recalculate the centroids by taking the mean of the clusters. Now, a loop has been generated. We continue this process until the centers do not move their position any more. The cost function for the K-means algorithm is given by Equation (8):

J(V) = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left( \| x_i - v_j \| \right)^2    (8)

where \| x_i - v_j \| is the Euclidean distance between x_i and v_j, "c_i" is the number of data points in the i-th cluster, and "c" is the number of cluster centers. The main advantage of the K-means algorithm is that it is fast, robust and very simple to understand. It gives the best result when the datasets are distinct and well separated.

In the case of our classification task, we train the algorithm by using the three input features defined in Section 2.2 and then use the newly made gesture to find to which class it belongs. The number of centroids is the same as the number of gestures, i.e., five. The training result of the classification is shown in Figure 11.

In our work, we are focused on gesture recognition within some area (not just a fixed point), so that the driver can make gestures freely, which means that the training and testing locations for gestures might be different. Therefore, we need to compensate for the distance change. To this end, we have proposed a clustering algorithm which trains each gesture at two locations (nearest and farthest) and uses the location information along with the three features defined in Section 2.2 as an input to the clustering algorithm. To explain the concept, we cannot show all four parameters on a plane surface; therefore, we used only magnitude and distance parameters for the explanation, as shown in Figure 12. In Figure 12, we used only two gestures for the explanation of our concept. The magnitude parameter changes inversely with distance. The clustering in Figure 12a has fixed point training, whereas Figure 12b has training at two points for every gesture, so it has two clusters for every gesture and uses the distance information to make decisions that are more robust.
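As an illustration of this distance-compensated clustering, the following sketch trains a K-means model on the three features of Section 2.2 plus the TOA (distance) value, with two training locations per gesture. The use of scikit-learn, the normalization choice and the centroid initialization from per-cluster feature means are our assumptions consistent with Sections 2.3 and 3.4, not code from the paper:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def train_gesture_clusters(features, labels):
    """features : (N, 4) array of [sigma, toa_variance, frequency, toa_distance]
    labels   : (N,) cluster labels, one per (gesture, training distance) pair,
               i.e., 10 clusters for 5 gestures trained at two distances each."""
    scaler = StandardScaler()
    X = scaler.fit_transform(features)                 # normalize/scale the inputs
    # initialize each centroid with the mean feature vector of its cluster (Section 3.4)
    init = np.vstack([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    km = KMeans(n_clusters=len(init), init=init, n_init=1).fit(X)
    return scaler, km

def classify_gesture(scaler, km, feature_vector, cluster_to_gesture):
    """Assign a new gesture to the nearest trained cluster and map it back to a gesture id."""
    cluster = km.predict(scaler.transform([feature_vector]))[0]
    return cluster_to_gesture[cluster]
```

With two clusters per gesture, the predicted cluster is mapped back to its gesture id, which is how the two training distances of Figure 12b both resolve to the same gesture.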
Figure 11. K-means clustering for gesture recognition.
Figure 12. Logical explanation of clustering (a) without TOA information and (b) with TOA information.
As is clear from Figure 12a, if the TOA is not much different for the training and testing set, as in Case 1, then the gesture will be classified correctly as Gesture 1, as the test gesture has the nearest distance along the features' axis to Gesture 1. However, if the training and testing sets differ much in TOA, as in Case 2, then the test gesture will be classified as Gesture 2 by the clustering algorithm, because it has the nearest distance to Gesture 2 along the features' axis. The reason for the incorrect decision in Case 2 is that training without TOA information does not take into account the decrease in magnitude of the signal with increasing TOA. In Figure 12b, every gesture is trained at two different distances, and for every gesture, two clusters are made at two different TOA locations. In Case 1 of Figure 12b, the test gesture is correctly classified, as it is nearest to the Gesture 1-trained cluster along the gesture features' axis. In Case 2 of Figure 12b, the test gesture is located near the second set of clusters along the TOA line. The algorithm will check the TOA information along with the features' information; therefore, it is classified as Gesture 1, because it is nearer to the second cluster of Gesture 1 as compared to the two clusters of Gesture 2. The gesture classification results of Figure 12 are shown clearly in Table 2.

Table 2. Gesture classification results of Figure 12.
Clustering without taking TOA as classification input:
Case # | Test Gesture (Original) | Classification Result
01 | 1 | 1
02 | 1 | 2

Clustering with taking TOA as classification input:
Case # | Test Gesture (Original) | Classification Result
01 | 1 | 1
02 | 1 | 1

3. Results and Discussion

3.1. Experimental Setup

The experimental setup for gesture recognition inside the car is shown in Figure 13a. The radar was placed in front of the driver, and the gestures were made by the right hand. The gesture area was almost 30 cm. Figure 13b shows the shape of the radar module. The transmit antenna and receive antenna are connected to the IR-UWB transceiver, and a low noise amplifier (LNA) is mounted on the receiver to increase the reception performance.
Figure 13. (a) Experimental setup inside the car; (b) IR-UWB radar.
In our experiments, we used the commercially available single-chip impulse radar transceiver (part number NVA6201) made by NOVELDA (Novelda AS, Kviteseid, Norway). The parameter specifications are given in Table 3.
Table 3. Radar parameters' values.

Parameter | Value
Centre Frequency | 6.8 GHz
Pulse Repetition Frequency | 100 MHz
Bandwidth | 2.3 GHz
Output Power | −53 dBm/MHz
Slow Time Sampling Frequency | 59 samples/s
3.2. Feature Extraction Result

We removed the clutter from the gesture matrix. The gesture matrix for each gesture after removing the clutter is plotted in Figure 14.
Figure 14. Gesture matrices after removing the clutter: (a) Gesture 1; (b) Gesture 2; (c) Gesture 3; (d) Gesture 4; (e) Gesture 5.
After preprocessing the signal, the next step is to extract the features. The size of the matrix W_mn for our experiments is (100 × 256), which means that the length of radar scans to be processed for each gesture is 100 (1.69 s), and the detection range is 256 samples or one meter. The pdf graphs for the small and big gestures are shown in Figure 15. It is clear from the figure that the small gesture has a low spread value as compared to the big gesture. The average spread values for all of the gestures defined in our work are shown in Table 4.
Figure 15. pdf graphs for small and big gestures (pdf of Gesture #1 and pdf of Gesture #4).
The sigma values for every pdf of the magnitude histogram are given in the second column of Table 4. The TOA variance is calculated by using the center of mass concept for finding the centroid index of the waveform. The third column of Table 4 shows the TOA variance result for all of the gestures averaged over the slow time. Another important feature is the frequency extraction. We conducted experiments to find the frequency for each gesture and found that the big gestures have relatively lower frequency as compared to the smaller gestures due to the greater displacements by the big gestures. The red line in Figure 16 shows the frequency information graph of Gesture 3 based on the magnitude of the signal, whereas the blue line in Figure 16 shows the frequency information graph of Gesture 4 based on the TOA variation. The frequency results for all gestures are given in Table 4.
Figure 16. The red line represents the magnitude variation of Gesture 3, whereas the blue line represents the TOA variation of Gesture 4 across the slow time axis.

Table 4. Magnitude histogram, TOA variance and frequency results.

Gesture #   Spread of pdf of the Magnitude Histogram (Sigma)   TOA Variance   Frequency (per Minute)
01          4.2                                                1.9            57
02          5.7                                                3.1            115
03          7.3                                                2.1            96
04          11.3                                               5.5            62
05          9.1                                                4.9            49
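The TOA-variance and frequency features described above could be computed along the following lines. This is a hedged sketch: the frame rate (about 59 scans per second, implied by the 100 scans per 1.69 s noted earlier) and the FFT-based frequency estimate are our assumptions.

```python
import numpy as np

def toa_centroid(scan):
    """Center-of-mass (centroid) index of one radar scan, used as its TOA estimate."""
    power = np.abs(scan) ** 2
    idx = np.arange(len(scan))
    return np.sum(idx * power) / np.sum(power)

def toa_variance(W):
    """Variance of the per-scan TOA across the slow-time axis."""
    toas = np.array([toa_centroid(row) for row in W])
    return float(np.var(toas))

def motion_frequency(signal, frame_rate_hz=59.0):
    """Dominant oscillation frequency of a slow-time signal, in cycles per minute.

    `signal` may be the per-scan magnitude (small gestures) or the per-scan
    TOA (large gestures), matching the red and blue curves of Figure 16.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                                  # remove the DC component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / frame_rate_hz)
    dominant_hz = freqs[1:][np.argmax(spectrum[1:])]  # skip the DC bin
    return float(dominant_hz * 60.0)
```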
From the results in Table 4, we note that the big gestures (4 and 5) have a relatively lower frequency than the smaller gestures. However, the frequency of each gesture depends on the user, so it can be determined during the training of the gestures.

3.3. The Detection of Only Intended Gestures' Result

The first important step is to differentiate between intended gesture motion and unintended random hand motion. To this end, we used the sinusoidal fitting algorithm, and the results are summarized in Table 5; a minimal sketch of the fitting step is given after the table. We set the detection threshold of R-square to 0.2 for differentiating between gesture motion and undesired random motion. Table 5 shows that the sinusoidal data-fitting technique achieved 100 percent accuracy in detecting random motion; random hand motion can therefore be ignored, so that only the useful gesture motion is passed to the classification step. The number of trials for each gesture in Table 5 was 150, obtained from three subjects (50 trials per subject).
Table 5. Data fitting results.

Motion                          Mean (R-Square)   Detection Accuracy (%)
Gesture #1                      0.62              100
Gesture #2                      0.71              100
Gesture #3                      0.61              100
Gesture #4                      0.53              100
Gesture #5                      0.40              100
Hand moving for steering        0.03              100
Empty gesture                   0.01              100
Body moving                     0.10              100
Hand moving for gear change     0.08              100
Empty gesture                   0.02              100
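A minimal sketch of the sinusoidal data-fitting check is shown below, assuming the slow-time feature signal of a candidate motion is available. The sinusoid parameterisation, the initial guesses and the use of scipy's curve_fit are illustrative assumptions; the 0.2 R-square threshold is the one reported above.

```python
import numpy as np
from scipy.optimize import curve_fit

def is_intended_gesture(signal, r_square_threshold=0.2, frame_rate_hz=59.0):
    """Fit a sinusoid to the slow-time feature signal and threshold its R-square.

    Returns True when the motion is periodic enough to be treated as a gesture,
    False for random hand or body motion.
    """
    signal = np.asarray(signal, dtype=float)
    t = np.arange(len(signal)) / frame_rate_hz
    y = signal - signal.mean()

    def sinusoid(t, amp, freq, phase):
        return amp * np.sin(2.0 * np.pi * freq * t + phase)

    # Initial guess: amplitude from the data, roughly 1 Hz motion, zero phase.
    p0 = [np.std(y) * np.sqrt(2.0), 1.0, 0.0]
    try:
        popt, _ = curve_fit(sinusoid, t, y, p0=p0, maxfev=5000)
    except RuntimeError:
        return False          # fit did not converge: treat as random motion

    residuals = y - sinusoid(t, *popt)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_square = 1.0 - ss_res / ss_tot
    return r_square >= r_square_threshold
```

Periodic gesture motions fit the sinusoid well (R-square well above the threshold, as in Table 5), whereas random steering, gear-change or body motions leave most of the variance unexplained and are discarded.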
3.4. Clustering Classification Results

From the feature extraction results, it is clear that although some gestures overlap in a particular parameter, no two gestures overlap in all of the parameters. Therefore, we used a clustering technique in which all three parameters are inputs for the classification of the gestures. The K-means classifier is used to cluster the gestures. The input parameters are first normalized and then scaled before clustering by the K-means algorithm; a sketch of this training step is given after Table 6. Figure 11 shows a sample result of the clustering in three dimensions using the K-means algorithm. The K-means algorithm is very sensitive to the initialized values; in our experiments, we initialized the centroid of each gesture with the mean values of its features. We conducted 150 trials of every gesture with three subjects (50 trials per subject), calculated the percentage accuracy of every gesture and noted how often a gesture was confused with other gestures, resulting in wrong detections. The results for the fixed-point gesture training and testing are shown in Table 6. The diagonal elements in the table represent trials in which the original and classified gestures are the same; the off-diagonal elements represent misdetected gestures.

Table 6. Fixed point training and testing results.
Original / Classified   Gesture #1   Gesture #2   Gesture #3   Gesture #4   Gesture #5
Gesture #1              100          0            0            0            0
Gesture #2              0            99.33        0.66         0            0
Gesture #3              0            0            100          0            0
Gesture #4              0            0            0.66         97.33        2
Gesture #5              0            0            0            2            98
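The clustering step could be trained roughly as follows. The min-max scaling and the scikit-learn K-means call are illustrative assumptions, while the centroid initialisation from the mean feature values of each gesture follows the description above.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_gesture_clusters(features, labels):
    """Cluster gesture feature vectors with K-means.

    features: array of shape (n_trials, 3) holding
              [pdf spread, TOA variance, frequency] per training trial.
    labels:   known gesture index per training trial, used only to
              initialise one centroid per gesture with its mean feature values.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Normalise each feature to [0, 1] so no single feature dominates the distance.
    f_min, f_max = features.min(axis=0), features.max(axis=0)
    scaled = (features - f_min) / (f_max - f_min + 1e-12)

    # One initial centroid per gesture: the mean of its scaled feature vectors.
    gestures = np.unique(labels)
    init_centroids = np.array([scaled[labels == g].mean(axis=0) for g in gestures])

    km = KMeans(n_clusters=len(gestures), init=init_centroids, n_init=1).fit(scaled)
    return km, (f_min, f_max)

def classify_gesture(km, scaling, feature_vector):
    """Assign a new trial to the nearest gesture cluster."""
    f_min, f_max = scaling
    scaled = (np.asarray(feature_vector, dtype=float) - f_min) / (f_max - f_min + 1e-12)
    return int(km.predict(scaled.reshape(1, -1))[0])
```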
As the driver's hand may move within a certain area, its distance and orientation relative to the radar can change. We therefore trained the parameters at two different distances from the radar and integrated the distance information into the clustering algorithm. As shown in Table 7, the performance of the algorithm deteriorates with changes in distance and orientation when no compensation is applied. Table 8 shows the results when the gestures are trained using the three features together with the TOA (distance) information; a sketch of this distance-compensated feature vector is given after Table 8. Comparing Table 8 with Table 7, the diagonal elements in Table 8 have higher values, which means that the proposed clustering algorithm based on the three features plus the TOA information is considerably more accurate than the algorithm without TOA information.

Table 7. Training and testing at different distances (without distance compensation).
Original / Classified   Gesture #1   Gesture #2   Gesture #3   Gesture #4   Gesture #5
Gesture #1              90           5.33         4.66         0            0
Gesture #2              2.66         93.33        4            0            0
Gesture #3              2.66         4.66         92.66        0            0
Gesture #4              0            20.66        10.66        58.66        10.02
Gesture #5              8            18.66        6.66         20           46.66
Table 8. Training and testing at different distances and orientation (with distance compensation).
Original / Classified   Gesture #1   Gesture #2   Gesture #3   Gesture #4   Gesture #5
Gesture #1              98.66        0.66         0            0.66         0
Gesture #2              0            97.33        1.33         0            1.33
Gesture #3              0            0.66         99.33        0            0
Gesture #4              0            5.31         2.66         88.66        6
Gesture #5              0            4.01         0            4.66         91.33
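A sketch of the distance-compensated training step follows: the mean TOA of each trial (i.e., the hand's distance from the radar) is appended as a fourth clustering input alongside the three features, and the trials from both training distances are clustered together. The column stacking, the scaling and the scikit-learn call are illustrative assumptions rather than the exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_with_distance_compensation(features, mean_toas, labels):
    """K-means training on [pdf spread, TOA variance, frequency, mean TOA].

    features:  (n_trials, 3) feature matrix gathered at both training distances.
    mean_toas: (n_trials,) average TOA of each trial, i.e. the hand's distance
               from the radar, appended as the distance-compensation input.
    labels:    gesture index of each training trial.
    """
    X = np.column_stack([np.asarray(features, dtype=float),
                         np.asarray(mean_toas, dtype=float)])

    # Scale all four inputs to [0, 1] so the distance term is comparable
    # to the three gesture features.
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    X_scaled = (X - x_min) / (x_max - x_min + 1e-12)

    # One initial centroid per gesture, as in the fixed-point training.
    gestures = np.unique(labels)
    init = np.array([X_scaled[labels == g].mean(axis=0) for g in gestures])

    km = KMeans(n_clusters=len(gestures), init=init, n_init=1).fit(X_scaled)
    return km, (x_min, x_max)
```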
4. Conclusions

We have presented a robust algorithm for gesture recognition using only a single IR-UWB radar. The three independent features showed good performance under the different circumstances inside the vehicle: even if one parameter of a gesture overlapped with that of another gesture, it was often compensated by the other two parameters, still yielding an accurate result. The magnitude-based frequency was not very accurate for gestures with larger displacements; therefore, we defined a TOA-based frequency for the larger-displacement gestures. We also integrated the TOA information along with the features into the clustering algorithm, which resulted in much better performance even when the training and testing locations and orientations were not the same. Unintended motion created by random movement of the hands or body is rejected by a data-fitting algorithm, which showed accurate results. The confusion matrices show that the results are very accurate within an area that covers the range of the driver's hand motion; therefore, the proposed hand-based gesture recognition may be useful in practical applications to control electronic equipment inside a vehicle and can serve as a key technology for future in-vehicle user interfaces.

Acknowledgments: This work was supported by the ICT R&D program of MSIP/IITP (B0126-17-1018, the IoT Platform for Virtual Things, Distributed Autonomous Intelligence and Data Federation/Analysis).

Author Contributions: Faheem Khan and Seong Kyu Leem designed and carried out the experiments and computer simulations and wrote the paper under the supervision of Sung Ho Cho.

Conflicts of Interest: The authors declare no conflict of interest.