
Human Action Recognition from Multiple Views Based on View-Invariant Feature Descriptor Using Support Vector Machines

Allah Bux Sargano 1,2,*, Plamen Angelov 1 and Zulfiqar Habib 2

1 School of Computing and Communications Infolab21, Lancaster University, Lancaster LA1 4WA, UK; [email protected]
2 Department of Computer Science, COMSATS Institute of Information Technology, Lahore 54000, Pakistan; [email protected]
* Correspondence: [email protected]; Tel.: +44-152-451-0525

Academic Editor: David He
Received: 5 September 2016; Accepted: 13 October 2016; Published: 21 October 2016

Abstract: This paper presents a novel feature descriptor for multiview human action recognition. The descriptor employs region-based features extracted from the human silhouette. To achieve this, the human silhouette is divided into regions in a radial fashion at intervals of a certain degree, and region-based geometrical and Hu-moments features are then obtained from each radial bin to articulate the feature descriptor. A multiclass support vector machine classifier is used for action classification. The proposed approach is quite simple and achieves state-of-the-art results without compromising the efficiency of the recognition process. Our contribution is two-fold. Firstly, our approach achieves high recognition accuracy with a simple silhouette-based representation. Secondly, the average testing speed of our approach is 34 frames per second, which is much faster than existing methods and shows its suitability for real-time applications. Extensive experiments on the well-known multiview IXMAS (INRIA Xmas Motion Acquisition Sequences) dataset confirmed the superior performance of our method compared to similar state-of-the-art methods.

Keywords: computer vision; human action recognition; view-invariant feature descriptor; classification; support vector machines

1. Introduction

In recent years, automatic human-activity recognition (HAR) based on computer vision has drawn much attention from researchers around the globe due to its promising results. The major applications of HAR include human–computer interaction (HCI), intelligent video surveillance, ambient assisted living, human–robot interaction, entertainment, video indexing, and others [1]. Depending on their complexity and duration, human activities can be categorized into four levels: gestures, actions, interactions, and group activities. Gestures are movements of a person's body parts, such as stretching an arm; actions are single-person activities such as walking, punching, kicking, and running; interactions are activities involving two or more persons, such as two persons fighting with each other; and a group having a meeting is an example of a group activity [2].

A large amount of work has already been done on human action recognition, but it is still a challenging problem. The major challenges and issues in HAR are as follows: (1) occlusion; (2) variation in human appearance, shape, and clothes; (3) cluttered backgrounds; (4) stationary or moving cameras; (5) different illumination conditions; and (6) viewpoint variations.

Among these challenges, viewpoint variation is one of the major problems in HAR, since most approaches for human-activity classification are view-dependent and can only recognize an activity from one fixed view captured by a single camera. These approaches assume the same camera view during training and testing. This condition cannot be maintained in real-world application scenarios. Moreover, if this condition is not met, their accuracy decreases drastically, because the same action looks quite different when captured from different viewpoints [3]. A single-camera approach also fails to recognize the action when an actor is occluded by an object or when some parts of the action are hidden due to unavoidable self-occlusion. To avoid these issues and get the complete picture of an action, more than one camera is used to capture the action; this is known as action recognition from multiple views or view-invariant action recognition [4].

There are two major approaches for the recognition of human actions from multiple views: the 3D approach and the 2D approach [5]. In the first approach, a 3D model of the human body is constructed from multiple views, and a motion representation is formed from it for action recognition. This model can be based on cylinders, ellipsoids, visual hulls generated from silhouettes, or a surface mesh. Some examples of motion representations are 3D optical flow [6], shape histograms [7], motion history volumes [8], 3D body skeletons [9], and spatiotemporal motion patterns [10]. Usually, the 3D approach provides higher accuracy than a 2D approach, but at a higher computational cost, which makes it less applicable for real-time applications. In addition, it is difficult to reconstruct a good-quality 3D model because the reconstruction depends on the quality of the extracted features or silhouettes of the different views. Hence, the model is exposed to deficiencies caused by segmentation errors in each viewpoint. Moreover, a good 3D model of different views can only be constructed when the views overlap. Therefore, a sufficient number of viewpoints has to be available to reconstruct a 3D model.

Recently, some 3D cameras have been introduced for capturing images in 3D form. Among these, 3D time-of-flight (ToF) cameras and the Microsoft Kinect have become very popular for 3D imaging. These devices overcome the difficulties faced by classical 3D multiview action-recognition approaches when reconstructing a 3D model. However, these sensors also have several limitations. For example, in contrast to a fully reconstructed 3D model from multiple views, they only capture the frontal surfaces of the human and other objects in the scene. In addition, they have a limited range of about 6–7 m, and their data can be distorted by light scattered from reflective surfaces [5].

Due to the limitations of the 3D approach, researchers prefer to employ a 2D approach for human-action recognition from multiple views [11]. The methods based on 2D models extract features from 2D images covering multiple views. Different methods have been proposed for multiview action recognition based on 2D models; three important lines of work are mentioned here. The first handles the problem at the feature level: it achieves view-invariant action representation using appropriate feature descriptor(s) or a fusion of different features [12,13], and action recognition is then performed using an appropriate classifier. The second handles it at the classification level by determining an appropriate classification scheme.
The classification is carried out either by a single universal classifier, or multiple classifiers are trained and their results are later fused to obtain the final result [14,15]. The third utilizes learning-based models, such as deep learning and its variations, to learn effective and discriminative features directly from the raw data for multiview action recognition [16,17].

Our method falls under the first category: we learn discriminative and view-invariant features from the human silhouette. The human silhouette contains much less information than the original image, but our approach shows that this information is sufficient for action representation and recognition. By taking advantage of this simple representation, our approach provides not only high accuracy but also a high recognition speed of 34 frames per second on the challenging multiview IXMAS (INRIA Xmas Motion Acquisition Sequences) dataset. Due to its high accuracy and speed, our approach is suitable for real-time applications.

The rest of the paper is organized as follows. Related work is discussed in Section 2, the proposed approach is described in Section 3, experiments and results are explained in Section 4, and finally the discussion and conclusion are presented in Section 5.


2. Related Works

This section presents state-of-the-art methods for multiview action recognition based on the 2D approach. These methods extract features from 2D image frames of all available views and combine these features for action recognition. A classifier is then trained using all these viewpoints. After training the classifier, some methods use all viewpoints for classification [18], while others use a single viewpoint for classification of a query action [19–21]. In both cases, the query view is part of the training data. If, however, the query view is different from the learned views, the task is known as cross-view action recognition, which is even more challenging than multiview action recognition [22,23].

Different types of features, such as motion features, shape features, or combinations of motion- and shape-based features, have been used for multiview action recognition. In [20], silhouette-based features were acquired from five synchronized and calibrated cameras, and action recognition from multiple views was performed by computing the R transform of the silhouette surfaces and manifold learning. In [24], contour points of the human silhouette were used for pose representation, and multiview action recognition was achieved by arrangements of multiview key poses. Another silhouette-based method was proposed in [25] for action recognition from multiple views; this method used contour points of the silhouette and a radial scheme for pose representation. Then, model fusion of multiple camera streams was used to build the bag of key poses, which worked as a dictionary of known poses and helped to convert training sequences into key poses for a sequence-matching algorithm. In [13], a view-invariant recognition method was proposed, which extracted uniform rotation-invariant local binary patterns (LBP) and contour-based pose features from the silhouette; the classification was performed using a multiclass support vector machine. In [26], scale-invariant features were extracted from the silhouette and clustered to build the key poses, and classification was done using a weighted voting scheme. Optical flow and silhouette-based features were used for view-invariant action recognition in [27], and principal component analysis (PCA) was used for reducing the dimensionality of the data. In [28], coarse silhouette features, radial grid-based features, and motion features were used for multiview action recognition. Another method for handling viewpoint changes and occlusion was proposed in [19]; it used histogram of oriented gradients (HOG) features with local partitioning and obtained the final results by fusing the results of local classifiers. A novel motion descriptor based on motion direction and a histogram of motion intensity was proposed in [29] for multiview action recognition, followed by a support vector machine used as the classifier. Another method based on 2D motion templates, motion history images, and histograms of oriented gradients was proposed in [30]. A hybrid CNN–HMM model, which combines convolutional neural networks (CNN) with a hidden Markov model (HMM), was used for action classification in [17]. In this method, the CNN was used to learn effective and robust features directly from the raw data, and the HMM was used to learn the statistical dependencies over contiguous subactions and infer the action sequences.
3. Proposed System

In recent years, various methods have been published for multiview action recognition, but very few are actually suitable for real-time applications due to their high computational cost. Therefore, the cost of feature extraction and action classification has to be reduced as much as possible. The proposed system has been designed around these parameters. It uses the human silhouette as input for feature extraction and for formulating the novel region-based feature descriptor. The human silhouette can easily be extracted using foreground detection techniques. At the moment, our focus is not on foreground segmentation; rather, we focus on the later phases of action recognition, such as feature extraction and classification. Although the human silhouette is a representation that contains much less information than the original image, we show that it contains sufficient information to recognize human actions with high accuracy. This work proposes a novel feature descriptor

for multiview human-action recognition based on the human silhouette. When a silhouette is partitioned into radial bins, region descriptors are formed which are represented as a triangle or quadrangle. Accordingly, it makes the pose description process simple and accurate. The descriptive detail of the proposed system is given in Algorithm 1, and a block diagram is shown in Figure 1.

Figure 1. Block diagram of the proposed system.

We consider the centroid of a silhouette as the point of origin and divide the human silhouette into R radial bins with equal angular intervals. Here, the value of R is selected as 12, which constitutes 12 radial bins with an interval of 30° each. This value has been selected through experimentation to cover the maximum viewing angles. This is unlike [31], where radial histograms were used as a feature descriptor, and [25], where the contour points of each radial bin were considered as features and model fusion was used to achieve multiview action recognition by obtaining key poses for each action through K-means clustering. Our method computes efficient and discriminative geometrical and Hu-moment features from each radial bin of the silhouette itself. Then, these features from all bins are concatenated into a feature vector for multiview action recognition using a support vector machine.

This approach has three major advantages. Firstly, it divides the silhouette into radial bins, which cover almost all viewing angles, and thus provides an easy way to compute the features for different views of an action. Secondly, it uses selected discriminative features and avoids extra computation such as fusion or clustering as used in [25]. Thirdly, because selected features are employed from each bin, it does not require any dimensionality reduction technique.
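To make the binning concrete, the following is a minimal sketch in Python (assuming NumPy; an illustration, not the authors' code) that assigns every foreground pixel of a binary silhouette to one of R = 12 radial bins of 30° each, measured around the silhouette centroid. The function name radial_bins and its arguments are hypothetical.

import numpy as np

def radial_bins(silhouette, num_bins=12):
    """Return a label map assigning each foreground pixel of a binary
    silhouette to one of `num_bins` radial bins around its centroid."""
    ys, xs = np.nonzero(silhouette)           # foreground pixel coordinates
    xc, yc = xs.mean(), ys.mean()             # centroid (Equations (1)-(2))
    angles = np.arctan2(ys - yc, xs - xc)     # angle of each pixel w.r.t. the centroid
    angles = np.mod(angles, 2 * np.pi)        # map angles to [0, 2*pi)
    bin_width = 2 * np.pi / num_bins          # 30 degrees when num_bins = 12
    bin_index = (angles // bin_width).astype(int)

    labels = np.zeros(silhouette.shape, dtype=int)  # 0 = background
    labels[ys, xs] = bin_index + 1                  # bins are labelled 1..num_bins
    return labels, (xc, yc)

The returned label map can then be used to select the pixels of each bin when computing the per-bin features of Sections 3.2.1 and 3.2.2.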


Algorithm 1
Step 1  Input video from multiple cameras
Step 2  Extraction of silhouettes from the captured video
Step 3  Feature extraction:
        a) Division of the human silhouette into radial bins
        b) Computation of region-based geometrical features from each radial bin
        c) Computation of Hu-moments features from each radial bin
Step 4  Action classification by multiclass support vector machine
Step 5  Output recognized action


3.1. Pre-Processing

Usually, some lines or small regions are formed around the silhouette due to segmentation errors, loose clothing of the subject under consideration, and other noise. These unnecessary regions do not offer any important information for action recognition, but rather create problems for the feature-extraction algorithm and increase the complexity. By removing these small regions, complexity can be reduced and the feature-extraction process can be made more accurate. In our case, a region was considered small and unnecessary if its area is less than 1/10 of the silhouette. This threshold was set based on an observation of 1000 silhouette images of different actors and actions. An example of the silhouette before and after noise removal is shown in the first row of Figure 1.
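A minimal sketch of this pre-processing step is given below, assuming scikit-image for connected-component analysis (the paper does not state which implementation was used). The largest connected component is treated as the silhouette, and any region smaller than one tenth of its area is discarded; the function name remove_small_regions is hypothetical.

import numpy as np
from skimage.measure import label, regionprops

def remove_small_regions(mask, ratio=0.1):
    """Remove connected regions whose area is below `ratio` times the
    area of the largest region (taken here as the silhouette itself)."""
    labeled = label(mask > 0)                 # connected-component labelling
    regions = regionprops(labeled)
    if not regions:
        return mask
    main_area = max(r.area for r in regions)  # area of the largest component
    cleaned = np.zeros_like(mask)
    for r in regions:
        if r.area >= ratio * main_area:       # keep only sufficiently large regions
            cleaned[labeled == r.label] = 1
    return cleaned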

3.2. Multiview Features Extraction and Representation

The success of any recognition system mainly depends on a proper feature selection and extraction mechanism. For action recognition from different views, a set of discriminative and view-invariant features has to be extracted. Our feature descriptor is based on two types of features: (1) region-based geometric features and (2) Hu-moments features extracted from each radial bin of the silhouette. An overview of the feature-extraction process is shown in Figure 2.

Figure 2. Overview of feature extraction process.



3.2.1. Region-Based Geometric Features

The shape of a region can be described by two types of features: region-based and boundary-based features. Region-based features are less affected by noise and occlusion than boundary-based features. Therefore, region-based features are a better choice to describe the shape of a region [32]. Generally, a shape is described by a set of numbers known as descriptors. A good descriptor is one which has the ability to reconstruct the shape from the feature points; the resultant shape should be an approximation of the original shape and should yield similar feature values. The proposed feature descriptor has been designed around these parameters. These features have been computed as follows:

(1) First of all, we calculate the centroid of the human silhouette using Equation (2), as shown in Figure 3:

    C_m = (x_c, y_c)                                                   (1)

where

    x_c = (∑_{i=1}^{n} x_i) / n   and   y_c = (∑_{i=1}^{n} y_i) / n    (2)

(2) After computing the centroid by Equation (2), a radius of the silhouette is computed.
(3) The silhouette is divided into 12 radial bins with respect to its centroid using a mask; this division has been made with intervals of 30°. These bins are shown in Figure 4.
(4) The following region-based features are computed for each bin of the silhouette (a minimal sketch of this per-bin computation is given after the list).

(a) Critical points: As we move the mask over the silhouette in the counter-clockwise direction, a triangle or quadrangle shape is formed in each bin. We compute the critical points (corner points) of each shape and their distances. There can be a different number of critical points for each shape; therefore, the mean and variance of these points have been computed as features. This gives us 2 × 12 = 24 features for each silhouette.
(b) Area: A simple and natural property of a region is its area. In the case of a binary image, it is a measure of the size of the foreground. We have computed the area of each bin, which provides 12 important features for each human silhouette.
(c) Eccentricity: The ratio of the major and minor axes of an object is known as eccentricity [33]. In our case, it is the ratio of the distance between the foci of the ellipse and the length of its major axis. Its value is between 0 and 1, depending upon the shape of the ellipse. If its value is 0, the region is actually a circle; if its value is 1, it is a line segment. We have computed the eccentricity for each of the 12 bins.
(d) Perimeter: This is also an important property of a region. The distance around the boundary of a region can be measured by computing the distance between each pair of adjacent boundary pixels. We have computed the perimeter of each bin forming a triangle or quadrangle.
(e) Orientation: This is an important property of a region, which specifies the angle between the x-axis and the major axis of the ellipse. Its value can be between −90° and 90°.
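Below is a minimal sketch of the per-bin feature computation referred to above, assuming OpenCV and scikit-image. The choice of corner detector (goodFeaturesToTrack) and the use of corner-to-centroid distances for the critical-point statistics are assumptions made for illustration, since the paper does not specify how the critical points are detected; the function name bin_features is hypothetical.

import cv2
import numpy as np
from skimage.measure import label, regionprops

def bin_features(bin_mask):
    """bin_mask: binary image containing only the pixels of one radial bin."""
    props = regionprops(label(bin_mask.astype(np.uint8)))
    if not props:
        return [0.0] * 6
    r = max(props, key=lambda p: p.area)      # largest connected component in the bin

    # (a) critical (corner) points: mean and variance of their distances to the centroid
    corners = cv2.goodFeaturesToTrack((bin_mask * 255).astype(np.uint8),
                                      maxCorners=10, qualityLevel=0.01, minDistance=3)
    if corners is not None:
        cy, cx = r.centroid                   # regionprops returns (row, col)
        d = np.linalg.norm(corners.reshape(-1, 2) - np.array([cx, cy]), axis=1)
        crit_mean, crit_var = float(d.mean()), float(d.var())
    else:
        crit_mean, crit_var = 0.0, 0.0

    return [crit_mean, crit_var,              # (a) critical-point statistics
            float(r.area),                    # (b) area
            float(r.eccentricity),            # (c) eccentricity
            float(r.perimeter),               # (d) perimeter
            float(r.orientation)]             # (e) orientation (in radians here)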

Figure 3. Example image of the human silhouette with centroid.

 



Figure 4. Example of division of silhouette into radial bins.

3.2.2. Hu-Moments Invariant Features

The use of invariant moments for binary shape representation was proposed in [34]. The moments which are invariant with respect to rotation, scales, and translations are known as Hu-moments invariants. We have computed seven moments for each radial bin of the silhouette. The moment (p,q) of an image f(x,y) of size M × N is defined as:

    m_{p,q} = ∑_{x=0}^{M−1} ∑_{y=0}^{N−1} f(x,y) x^p y^q               (3)


Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  x and q is order of y. We can calculate the central moment in the same way as7 of 14  7 of 14  7 of 14  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows: Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  7 of 14  7 of 14  7 of 14  7 of 14  , ,     (3) (3) Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  , 7 of 14  7 of 14  7 of 14  7 of 14  , M −1 N −1 ,   (4) , p q   ) (3) 7 of 14  (3) 7 of 14  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  7 of 14  7 of  µ, p,q = , ∑ ∑ f, ,( x, y)(,x  − x avg ), (y − y avg (4) (3) 7 of 14  , , , ,         x = 0 y = 0 (3) (3) (3) (3) Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  7 of 14  7 of 14  7 of 14  7 of 14  7 of 14  7 of 14  , , , , Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  where  ,    ,   (3) 7 of 14  (3) 7 of 14  (3)7 of 14  (3) 7 of 14  7 Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  7 of 14  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  , , Appl. Sci. 2016, 6, 309  ,, , ,   Appl. Sci. 2016, 6, 309  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  where , , , , , ,             (3) (3) (3) (3) Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  7 of 14  7 of 14 7 of 14  7 of 14  7 of 14  (3 7 these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  , Appl. Sci. 2016, 6, 309  , Appl. Sci. 2016, 6, 309  , m , ,m , Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  x avg = 10and and y avg = 01  (5) (5) , , , , , , , ,                 (3) (3) (3) (3) (3) (3) m m Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. Sci. 2016, 6, 309  Appl. 
Sci. 2016, 6, 309  7 of 14  7 of 14  7 of 14  7 of 14  7 of 14  7 of 14  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  , , , , , , , , Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  00 00 , ,     (4) (4) , , , moments  ,   , are  ,  are ,     Hence,  , normalized  , central ,  (4) (3) ,   ,   (3)   ,, Hence,   normalized   , , central  (3) (3) these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  , , , , , , , ,   (3) Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way  Here, p is order of x and q is order of y. We can calculate the central moment in the same , , ,     ByHere, p is order of x and q is order of y. We can calculate the central moment in the same way as  applying normalization, scale-invariant moments obtained. (4) (4) By applying normalization,  scale‐invariant  obtained.  , , , , ,   , , ,    , ,     ,,   , ,   (3)   (4)   (3) (3) (3) (4)(3) these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  , ,, , , , ,    Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  moments are defined as follows [35]. , ,Here, p is order of x and q is order of y. We can calculate the central moment in the  (4) moments are defined as follows [35].  (4) , Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  , ,Here, p is order of x and q is order of y. 
We can calculate the central moment in the same , where  where  , , , , , , , ,                 (3) (3) (3) (3) these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  , Here, p is order of x and q is order of y. We can calculate the central moment in th , Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same wa , ,these moments, except the value of x and y is displaced by the mean values as follows:  , these moments, except the value of x and y is displaced by the mean values as follows:   ,   ,     (4) (4) (4) (3) (4) (3) , Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  , ,μ , ,Here, p is order of x and q is order of y. We can calculate the central moment  q, , Here, p is order of x and q is order of y. We can calculate the centr where  where  where  p pHere, p is order of x and q is order of y. We can calculate the central mo 2 2, , Here, p is order of x and q is order of y. We can calculate the µp,q , + q + ɳ p,q , p, p  q and,,, γ γ and these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows these moments, except the value of x and y is displaced by the mean values as fol , = Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment = these moments, except the value of x and y is displaced by the mean v + q  2,3, = ,2,…  3, ,Here, p is order of x and q is order of y. We can calculate the central moment in the ,these moments, except the value of x and y is displaced by the mean values ,2,   these moments, except the value of x and y is displaced by the m these moments, except the value of x and y is displaced by the mean values as follows:    (5)  (4) (4) (5) (4)(6) (6) (4) (4 ,Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  ,Here, p is order of x and q is order of y. 
We can calculate the central moment in the same wa , μ00 ,   . . ,.   where Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  where  where  where  γ µ00and 2 and and       (5) (5) (5) these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as fol Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same way as  Here, p is order of x and q is order of y. We can calculate the central moment in the same  Here, p is order of x and q is order of y. We can calculate the central moment in the  , , , , , , , ,                 (4) (4) (4) (4) (4) (4) , , , , , , , , where where  where where  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  By applying normalization,  By applying normalization,  scale‐invariant  scale‐invariant  moments  moments  are  obtained.  are  obtained.  Hence,  Hence,  normalized  normalized  central  central  and and   and  and (5) (5),   (4) these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  these moments, except the value of x and y is displaced by the mean values as follows:  Based on these central of , (5) ,   , , combination these moments, except the value of x and y is displaced by the mean values as follows:  moments, , introduced   ,, ,   ,   ,   as  ,linear  , (4)   (5)   (4)   (4) (4) ,where  ,where  , , , [34] , , , , seven Hu-moments where  where  where  where  central moments defined as follows:  moments are defined as follows [35].  moments are defined as follows [35].  By applying normalization,  By applying normalization,  By applying normalization,  scale‐invariant  scale‐invariant  scale‐invariant  moments  moments  are  obtained.  moments  are  obtained.  Hence,  are  obtained.  
Hence,  normalized  normalized  Hence,  central  normalized  central  central  and and and   and       (5) (5) (5) (5) central moments defined as follows: , , , , , , , ,                 (4) (4) (4) (4) (4) , , , , , , , , where where where where  where where  where  where  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  By applying normalization,  By applying normalization,  By applying normalization,  By applying normalization,  scale‐invariant  scale‐invariant  moments  scale‐invariant  moments  scale‐invariant  are  obtained.  moments  are and obtained.  moments  Hence,  are  obtained.  Hence,  normalized  are  and obtained.  normalized  Hence,  central  Hence,  normalized  central  normalized  central  central  μ μ p q p 2 q 2 and and and       and     ɳ ɳ (5) (5) (5) (5) , , where  , q, p 2,3, ,   … 2,3, ɳ , ,, γ where      (6)(4)   (4)   (4) (4) (5 (4)  (6) (4) , ,where  , ,where  , q , , …    ,   , where where where where  where  where  γ,scale‐invariant  , p, where  scale‐invariant  , scale‐invariant  h1, By applying normalization,  = ɳ 20, ,+ moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  By applying normalization,  By applying normalization,  By applying normalization,  moments  scale‐invariant  moments  are  obtained.  moments  are  obtained.  moments  Hence,  are  obtained.  Hence,  are  normalized  obtained.  normalized  Hence,  central  Hence,  normalized  central  02 μ μ μ p q p 2 q 2 p q 2 2and 2and   and   and  and  μ00, μ00 and and   and     normalized  (5) (5) (5)   central  (5) central  (5) (5) , ɳ where  ɳ where  where where  where  where  where  where  , , ,pγ q, p 2,3, , γ4ɳ 2scale‐invariant  q …  2,3, , are  pHence,  …  q moments  2,3, … moments  (6) obtained.  (6) (6) ɳ 02, ) 2scale‐invariant  By applying normalization,  , , γ h2By applying normalization,  = ( ɳ 4ɳ + moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  By applying normalization,  By applying normalization,  By applying normalization,  scale‐invariant  By applying normalization,  moments  scale‐invariant  moments  are  moments  obtained.  scale‐invariant  are  moments  scale‐invariant  obtained.  obtained.  are  Hence,  normalized  obtained.  Hence,  normalized  are  central  obtained.  Hence,  normalized  are  central  normalized  Hence,  central  Hence,  normalized  central  normalized 20, − 11 μ μ μ μ p q p 2 q 2 p q p 2 q 2 2 2 2 μ00 μ00 μ00 and and and   and   and   and     and and   and and     and    centr (5) (5) (5) (5) (5) , , , , Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  2where  2p , γq , p,2,3, where  where  where  where  where  where  where  ɳ ɳ ɳ ɳ , γ , γ , γ q …  2,3, , p …  q , p 2,3, q …  2,3, …  (6) (6) (6) (6) , , , , h = ( − 3 ) + ( 3 − ) ɳ 3ɳ 3ɳ ɳ moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  
By applying normalization,  By applying normalization,  By applying normalization,  scale‐invariant  scale‐invariant  scale‐invariant  scale‐invariant  moments  scale‐invariant  are  moments  scale‐invariant  are  obtained.  scale‐invariant  obtained.  Hence,  scale‐invariant  normalized  obtained.  Hence,  normalized  moments  are  Hence,  normalized  central  obtained.  moments  Hence,  normalized  central  are  Hence,  central  are  normalized  normalized  Hence,  Hence,  normalized central  norm 3 By applying normalization,  30By applying normalization,  03 21and By applying normalization,  moments  μ 12, By applying normalization,  μμ00 pBy applying normalization,  q μpμ00 2 q 2 pmoments  p2Hence,  2moments  2moments  2 2μobtained.  2and μ00 μ00 and and  qare  and  qare    are  and   obtained.   and and   obtained.    central    central  (5) (5)obtained.  (5) (5) (5) central moments defined as follows:  central moments defined as follows:  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  2 , γ ɳ , , γ ɳ ,2 , ,pγ , q, ,pγ 2,3, ɳ 30By applying normalization,  q obtained.  …  ,2,3, pHence,  …  qscale‐invariant  , pHence,  2,3, q …  2,3, …  (6) (6) (6) (6) , + ɳ , )scale‐invariant  + , By applying normalization,  moments  , ) moments  h = ( ( + moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  By applying normalization,  By applying normalization,  By applying normalization,  By applying normalization,  scale‐invariant  By applying normalization,  scale‐invariant  scale‐invariant  By applying normalization,  By applying normalization,  are  scale‐invariant  moments  obtained.  By applying normalization,  are  scale‐invariant  moments  By applying normalization,  are  obtained.  moments  are  normalized  scale‐invariant  obtained.  moments  Hence,  are  normalized  scale‐invariant  obtained.  Hence,  scale‐invariant  moments  are  normalized  central  obtained.  moments  scale‐invariant  normalized  central  Hence,  are  moments  central  obtained.  Hence,  are  moments  normalized  obtained.  central  are  moments  normalized  Hence,  obtained.  are  central  Hence,  obtaine are  norm ce Ho 03 4 12 21 ɳ ɳ ɳ μ00 μɳ , μ00 μ ,p and μ μq   ,p and qμ00 2and 2 q 2  pμ2and 2, p  q pand 2 2μ00 and  2 μand  2 q and  2 (5) (5)  (5)   (5) (5) (5) , p ,2 q central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  2, γ ɳ ɳ ɳ ɳ ɳ ɳ , γ , γ , p q , γ , p 2,3, q …  , p , 2,3, γ q …  , p , 2,3, γ q …  2,3, , p …  q , p 2,3, q …  2,3, …  (6) (6) (6) (6) (6 ɳ ɳ ɳ ɳ , , , , , , −scale‐invariant  3 μ00 + 3moments  − 32moments are defined as follows [35].  ( are  )+ h5 =moments are defined as follows [35].  
( By applying normalization,  By applying normalization,  By applying normalization,  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  By applying normalization,  By applying normalization,  By applying normalization,  scale‐invariant  scale‐invariant  By applying normalization,  By applying normalization,  moments  moments  obtained.  scale‐invariant  moments  are  scale‐invariant  obtained.  are Hence,  scale‐invariant  obtained.  moments  normalized  are  Hence,  obtained.  are  normalized  obtained.  moments  normalized  central  are  obtained.  central  Hence,  normalized  are  obtained.  central  normalized  Hence,  central  Hence,  norma cen 30 moments are defined as follows [35].  30μ 03 12μ)(( 21 + μscale‐invariant  μqscale‐invariant  μHence,  qmoments are defined as follows [35].  212 q)moments are defined as follows [35].  p qare  p2)obtained.  qμ00 pmoments  q2normalized  p moments  2Hence,  qcentral  2 Hence,  2 , pμ00 , p , pɳ 2μ00 ,2 2 2μ ,μ00 ,22 q μ 2,2p ɳ central moments defined as follows:  3ɳ ɳ ɳ μ ,2 3ɳ 3 ɳ ,μ00 central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  ɳ ɳ ɳ ɳ ɳ ɳ ɳ , γ , γ γ , p , γ q , p 2,3, q , γ , …  p 2,3, q , γ , …  p 2,3, q …  , , p γ 2,3, q …  , , p γ 2,3, q …  , 2,3, p …  q , p 2,3, q …  2,3, …  (6) (6) (6)central  (6) central  (6)central  (6) ɳ ɳ By applying normalization,  ɳ ɳ ɳ , − ( 21By applying normalization,  ,scale‐invariant  moments are defined as follows [35].  ,scale‐invariant  moments  moments  ,scale‐invariant  obtained.  ,Hence,  obtained.  + )( 3μ(moments  + )2 − μobtained.  (2pɳ + )moments  )q2μ00 By applying normalization,  (3By applying normalization,  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  By applying normalization,  scale‐invariant  are  are  Hence,  moments  are  normalized  are  Hence,  normalized  obtained.  moments  are  Hence,  central  normalized  obtained.  moments  Hence,  normalized  are  Hence,  central  are  normalized  normalized  Hence,  Hence,  normalized  03, )scale‐invariant  03By applying normalization,  30 03 21 12 21 ɳ By applying normalization,  ɳ ɳ ɳ 4ɳ 4ɳ μ 2obtained.  ɳ p moments are defined as follows [35].  q p,are  qobtained.  pcentral  q 2 obtained.  
q 2 norma 2p,scale‐invariant  μ00 μ00 scale‐invariant  3ɳ By applying normalization,  ɳ ɳ μμ00 μmoments  ɳ μ00 ɳ pscale‐invariant  ɳ ɳ μ00 , Based on these central moments, [34] introduced seven Hu‐moments as linear combin , 2 ,22 ,μ00 ,22 q μ 2,2 , qμ ,22 pμ ,2q , pμ , p 2q p 2 q Based on these central moments, [34] introduced seven Hu‐moments as linear combination  2μ3 qμ00 pμ 2 central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  ɳ ɳ ɳ ɳ ɳ ɳ ɳ ɳ ɳ ɳ ɳ , γ , γ , γ , p , γ q , p 2,3, q , , …  p γ 2,3, q , , …  p γ 2,3, q …  , 2,3, p , γ …  q , γ p 2,3, q , γ …  2,3, , p , γ …  , q p , γ 2,3, q , p …  2,3, q , p …  2,3, q , p …  2,3, q… (6) (6) (6) (6) (6) ɳ ɳ ɳ ɳ ɳ ɳ ɳ ɳ , , , , , , , , , , , − (μ22q21p+22 +μ 4p211 (q 230μpμ00 + q2μ00 h6 = − 4ɳ )((moments are defined as follows [35].  moments are defined as follows [35].  moments are defined as follows [35].  02μ 30μ+ pμ00 03p) 2) 2 032)2 12 12 21 + ɳ moments are defined as follows [35].  ɳ ɳmoments are defined as follows [35].  ɳ moments are defined as follows [35].  ɳ 4ɳ 4ɳ ɳ( 20 μ2)ɳ μqμ00 μ)(ɳ ɳ 3ɳ 3ɳ moments are defined as follows [35].  ɳ moments are defined as follows [35].  q 2ɳ q 2pμ00 q μ00 22 2 2 μ00 μ00 ,Based on these central moments, [34] introduced seven Hu‐moments as linear combin ,Based on these central moments, [34] introduced seven Hu‐moments as linear co , pμ00 , , 2 ɳ qμ00 , ɳ ,2 ,2p μ00 ɳ Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  3ɳ ɳ ɳ 3ɳ ɳ central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  central moments defined as follows:  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  ɳ ɳ ɳ 4ɳ ɳ ,12p, ) 2,3, ɳ 2,3, ɳ γ, )( , γɳ 30,, p− γq ,− qγ, p…  q, p γ…  2,3, q …  ,)− p,2,3, γ q… ,,pγ2,3,q…  (6) ,2,3, p …  q(6) , p 2,3, q … 2,3, (6) (6)…  (6) ɳ ɳ ɳ ɳ ɳɳ ɳ ɳ ɳ 4ɳ ɳ ɳ (4ɳ ɳ ɳ ɳ 03 , )( ɳ ,21 ,4ɳ ɳ ,03 3( ,3 3ɳ ɳ 21 + (q h 7 = 03 21 ɳ 4ɳ γ μ , ,− μ μ μq , p2 μp, ,+ μ, ),2 3ɳ ɳ q pμ00 2 μ q 2μq p 2 2q μ00 p 2μ00 q p 2 q2 2 2 2Based on these central moments, [34] introduced seven Hu‐moments as lin 2Based on these central moments, [34] introduced seven Hu‐moments 2Based on these central moments, [34] introduced seven Hu‐mom 2Based on these central moments, [34] introduced seven Hu μ00 μ00 central moments defined as follows:  μ00 ,p , pμ00 ,2 2μ00 ɳ Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  ɳ 3ɳ ɳ Based on these central moments, [34] introduced seven Hu‐moments as linear combination of  3ɳ ɳ 3ɳ ɳ γBased on these central moments, [34] introduced seven 
Based on these central moments, [34] introduced seven Hu-moments as linear combinations of the normalized central moments, which are defined as follows:

η_pq = μ_pq / (μ_00)^γ,  γ = (p + q)/2 + 1,  p + q = 2, 3, …  (6)

Here, p and q denote the order of the moment along the horizontal and vertical image axes, respectively. The seven Hu-moment invariants are then computed from these normalized central moments as:

φ1 = η20 + η02
φ2 = (η20 − η02)² + 4η11²
φ3 = (η30 − 3η12)² + (3η21 − η03)²
φ4 = (η30 + η12)² + (η21 + η03)²
φ5 = (η30 − 3η12)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] + (3η21 − η03)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
φ6 = (η20 − η02)[(η30 + η12)² − (η21 + η03)²] + 4η11(η30 + η12)(η21 + η03)
φ7 = (3η21 − η03)(η30 + η12)[(η30 + η12)² − 3(η21 + η03)²] − (η30 − 3η12)(η21 + η03)[3(η30 + η12)² − (η21 + η03)²]
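To make the feature computation concrete, the following is a minimal NumPy sketch of the normalized central moments in Equation (6) and the first two Hu invariants. It assumes the silhouette is available as a binary mask; the function names are illustrative rather than taken from our implementation (which was written in MATLAB), and the remaining five invariants follow the same pattern using the expressions above.

```python
import numpy as np

def normalized_central_moment(mask, p, q):
    """Normalized central moment eta_pq of a binary silhouette mask, Equation (6)."""
    ys, xs = np.nonzero(mask)            # coordinates of silhouette pixels
    m00 = float(xs.size)                 # mu_00 equals the silhouette area for a binary mask
    if m00 == 0:
        return 0.0
    x_bar, y_bar = xs.mean(), ys.mean()  # silhouette centroid
    mu_pq = np.sum((xs - x_bar) ** p * (ys - y_bar) ** q)   # central moment mu_pq
    gamma = (p + q) / 2.0 + 1.0
    return mu_pq / m00 ** gamma

def hu_invariants(mask):
    """First two of the seven Hu invariants; the rest follow the same recipe."""
    eta = lambda p, q: normalized_central_moment(mask, p, q)
    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4.0 * eta(1, 1) ** 2
    return np.array([phi1, phi2])
```

In practice, library routines such as OpenCV's cv2.moments and cv2.HuMoments can be used to obtain the same quantities for each radial bin.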


3.3. Action Classification with SVM Multiclass Classifier

After computing the features from the video, an appropriate classifier has to be used for the classification of actions. In supervised learning, discriminative models are more effective than generative models [36]. The support vector machine (SVM) is one of the most successful classifiers in the discriminative category. It has shown better performance than some traditional classifiers such as backpropagation neural networks, Naïve Bayes, and k-nearest neighbors (KNNs) in many classification problems [37]. SVM was first proposed in [38]; it was originally developed for binary classification and later extended to the multiclass classification problem. There are two main approaches to multiclass SVM. The first considers all classes of data directly in one optimization formulation, while the second constructs and combines binary classifiers to build a multiclass classifier. The second approach is computationally less expensive and easier to implement. Many algorithms have been derived for multiclass classification using this approach, such as one-against-all [39], one-against-one [40], Directed Acyclic Graph-Support Vector Machine (DAG-SVM) [41], Error-Correcting Output Codes Support Vector Machine (ECOC-SVM) [42], and Support Vector Machines with Binary Tree Architecture (SVM-BTA) [43]. Among these, one-against-all [39] and one-against-one [40] are the two most commonly used methods for multiclass classification. One-against-all needs N binary SVM classifiers for an N-class classification problem, while one-against-one needs N(N − 1)/2 binary classifiers for N classes, each trained on samples of the two corresponding classes. Compared to one-against-all, one-against-one is better in terms of accuracy for many classification problems [44]. We used the multiclass SVM classifier implementation in [45] for multiview action recognition, which uses the one-against-one method with a radial basis function (RBF) kernel. Moreover, to estimate the best parameters for the classifier, we conducted a grid search to find the best values for the parameters γ and C. Here, γ represents the width of the RBF kernel and C represents the weight of the error penalty. An appropriate choice of (C, γ) increases the overall accuracy of the SVM classifier [46].
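As an illustration of this classifier setup, the sketch below uses scikit-learn's SVC, which wraps LIBSVM [45] and applies the one-against-one decomposition with an RBF kernel by default, together with a grid search over (C, γ). The descriptor dimensionality, class count, and parameter grids are placeholders for illustration only, not the values used in our experiments.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: one region-based descriptor per sample and 11 action labels.
rng = np.random.default_rng(0)
X = rng.random((200, 84))        # hypothetical descriptor length
y = rng.integers(0, 11, 200)     # 11 action classes

# SVC performs one-against-one multiclass classification with an RBF kernel.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Grid search over the error penalty C and the RBF kernel width gamma.
param_grid = {"svc__C": [1, 10, 100, 1000],
              "svc__gamma": [1e-3, 1e-2, 1e-1, 1.0]}
search = GridSearchCV(model, param_grid, cv=5)
search.fit(X, y)
print("Best (C, gamma):", search.best_params_)
print("Cross-validated accuracy:", search.best_score_)
```

Scaling the descriptors before applying the RBF kernel is a common practice that typically makes the (C, γ) search better conditioned.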
4. Experimentations

For the evaluation of the proposed method, comprehensive experiments have been conducted on the well-known multiview IXMAS [47] dataset. The leave-one-sequence-out (LOSO) scheme has been used for the view-invariance evaluation. In this scheme, the classifier is trained on all sequences except one, which is used for testing; the process is repeated for all possible combinations and the results are averaged. This is a common strategy used by different researchers, such as [19,29], for the evaluation of their methods, which makes it possible to compare our results with these state-of-the-art methods (a sketch of the LOSO protocol is given after the dataset description below).

4.1. Evaluation on Multiview Action Recognition Dataset

The IXMAS is a challenging and well-known dataset with multiple actors and camera views. This dataset is popular among human-action recognition methods for testing view-invariant action recognition algorithms, including both multiview and cross-view action recognition. It includes 13 daily life action classes recorded with 5 different cameras, including one top-view and four side cameras, as shown in Figure 5. Each action is performed 3 times by 12 different subjects, while the actors keep changing orientation in each sequence during action execution. The change in orientation is indicated by the action labels, and no additional information is provided other than these labels. Most of the existing methods use selected action classes and actors for experimentation [19,29]. For comparison, we have selected 11 action classes performed by 12 actors, as shown in Figure 6. The names and label indices of these actions are shown in Table 1.
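The following sketch shows how the LOSO protocol can be expressed with a group-aware cross-validation splitter, assuming each sample carries an identifier of the sequence it came from. The data shapes, the number of sequences, and the SVM parameters are hypothetical placeholders.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

# Hypothetical descriptors, action labels, and per-sample sequence identifiers.
rng = np.random.default_rng(1)
X = rng.random((360, 84))
y = rng.integers(0, 11, 360)
sequence_id = np.repeat(np.arange(36), 10)   # e.g., 36 sequences of 10 samples each

# Each sequence is held out once for testing while the classifier is trained on
# all remaining sequences; the per-fold accuracies are then averaged.
logo = LeaveOneGroupOut()
scores = cross_val_score(SVC(kernel="rbf", C=100, gamma=0.01),
                         X, y, groups=sequence_id, cv=logo)
print("LOSO recognition rate: %.2f%%" % (100.0 * scores.mean()))
```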


Figure 5. Five camera views of the same action (check watch).


Figure 6. Selected 11 actions in the IXMAS dataset.

Table 1. Action class names used in experimentation.

Index   Action Name         Index   Action Name
1       Checking watch      7       Walking
2       Crossing arms       8       Waving
3       Scratching head     9       Punching
4       Sitting down        10      Kicking
5       Getting up          11      Picking up
6       Turning around      -       -

4.2. Comparison with Similar Methods on IXMAS Dataset

Our method achieves a recognition rate of 89.75% with leave-one-sequence-out (LOSO) cross-validation on 11 actions. The recognition rate of each individual action is presented in the confusion matrix shown in Figure 7. The results confirm that our method outperforms similar state-of-the-art 2D-based methods such as [12,19,21,24,29,48–52], as recorded in Table 2. It should be noted that the number of classes, actors, and views used in the experiments varies among these methods. For example, in [52], an accuracy of 89.4% is reported, but camera 4 was excluded from the experimentation. Likewise, [51] also excluded the top camera and considered only the remaining four cameras. Moreover, most of the published methods are not suitable for real-time applications due to their high computational cost. The proposed method considers all views of the IXMAS dataset, including the top view, for recognition. The results indicate that the proposed method is superior to similar 2D methods, not only in recognition accuracy but also in recognition speed.

Table 2. Comparison with state-of-the-art methods on IXMAS (INRIA Xmas Motion Acquisition Sequences) dataset.

Year    Method                      Accuracy (%)
-       Proposed method             89.75
2016    Chun et al. [29]            83.03
2013    Chaaraoui et al. [24]       85.9
2013    Burghouts et al. [53]       96.4
2011    Wu et al. [52]              89.4
2011    Junejo et al. [12]          74
2010    Weinland et al. [19]        83.4
2009    Reddy et al. [48]           72.6
2008    Liu and Shah [21]           82.8
2008    Cherla et al. [51]          80.1
2008    Vitaladevuni et al. [50]    87.0
2007    Lv and Nevatia [49]         80.6


Figure 7. Confusion matrix of IXMAS (INRIA Xmas Motion Acquisition Sequences) dataset with 11 actions.


The resolution of the IXMAS dataset is only 390 × 291, which is very low compared to many other action recognition datasets. We performed the experiments using a MATLAB R2015b implementation on an Intel® Core i7-4770 CPU (8 logical cores) @ 3.4 GHz with 8 GB RAM and the Windows 10 operating system. However, only 4 cores of the CPU were utilized during the experimentation. Moreover, we did not use any optimization techniques in the code. The average testing time of our approach is 0.0288 s per frame, which is almost 34 frames per second (FPS); this is much better than the existing methods, as recorded in Table 3.

Table 3. Comparison of average testing speed on IXMAS dataset.

Method                    Average FPS    Accuracy (%)
Proposed method           34             89.75
Chaaraoui et al. [24]     26             85.9
Cherla et al. [51]        20             80.1
Lv and Nevatia [49]       5.1            80.6
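For context on how the throughput figures above relate to the reported per-frame time, the following is a small, hypothetical timing sketch; classify_frame stands for the full per-frame pipeline (silhouette extraction, descriptor computation, and SVM prediction) and is not an actual function from our implementation.

```python
import time

def measure_fps(classify_frame, frames):
    """Average per-frame processing time converted to throughput in frames per second."""
    start = time.perf_counter()
    for frame in frames:
        classify_frame(frame)
    per_frame = (time.perf_counter() - start) / len(frames)
    return 1.0 / per_frame

# A measured average of 0.0288 s per frame corresponds to 1 / 0.0288 ≈ 34.7,
# i.e., roughly 34 frames per second.
```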

5. Conclusions

In this paper, a multiview human-action recognition method based on a novel region-based feature descriptor has been presented. Existing multiview recognition methods can be divided into two categories: 2D approach-based and 3D approach-based methods. Usually, the 3D approach provides better accuracy than a 2D approach, but it is computationally expensive, which makes it less applicable for real-time applications. The proposed method uses the human silhouette as input to the feature extraction process. Although the human silhouette contains much less information than the original image, our experimentation confirms that it is sufficient for action recognition with high accuracy. Moreover, to obtain view-invariant features, the silhouette is divided into radial bins with an interval of 30° each. Then, carefully selected region-based geometrical features and Hu-moment features are computed for each bin. For action recognition, a multiclass support vector machine with an RBF kernel is employed. The proposed method has been evaluated on the well-known IXMAS dataset, a challenging benchmark for the evaluation of multiview action recognition methods. The results indicate that our method outperforms the state-of-the-art 2D methods both in terms of efficiency and recognition accuracy. The testing time for our approach is 34 frames per second, which makes it very suitable for real-time applications. We have not tested our approach on more complex datasets such as HMDB-51, YouTube, and Hollywood-II. However, we believe that our approach should also produce good accuracy on these datasets, because once the silhouette is extracted, the scene complexity does not matter very much. Nevertheless, accurate silhouette extraction in complex scenarios is still a challenging task, which may in turn affect the feature extraction process. As future work, we would like to extend our method to cross-view action recognition, a special case of multiview action recognition in which the query view differs from the learned views, and to evaluate our approach on the more complex datasets mentioned above.

Acknowledgments: This study is supported by the research grant of COMSATS Institute of Information Technology, Pakistan.

Author Contributions: Allah Bux Sargano, Plamen Angelov, and Zulfiqar Habib conceived and designed the experiments; Allah Bux Sargano performed the experiments; Allah Bux Sargano and Zulfiqar Habib analyzed the data; Allah Bux Sargano and Plamen Angelov contributed reagents/materials/analysis tools; Allah Bux Sargano wrote the paper.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Poppe, R. A survey on vision-based human action recognition. Image Vis. Comput. 2010, 28, 976–990.
2. Aggarwal, J.K.; Ryoo, M.S. Human activity analysis: A review. ACM Comput. Surv. (CSUR) 2011, 43, 16.
3. Rudoy, D.; Zelnik-Manor, L. Viewpoint selection for human actions. Int. J. Comput. Vis. 2012, 97, 243–254.
4. Saghafi, B.; Rajan, D.; Li, W. Efficient 2D viewpoint combination for human action recognition. Pattern Anal. Appl. 2016, 19, 563–577.
5. Holte, M.B.; Tran, C.; Trivedi, M.M.; Moeslund, T.B. Human pose estimation and activity recognition from multi-view videos: Comparative explorations of recent developments. IEEE J. Sel. Top. Signal Proc. 2012, 6, 538–552.
6. Holte, M.B.; Moeslund, T.B.; Nikolaidis, N.; Pitas, I. 3D human action recognition for multi-view camera systems. In Proceedings of the 2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT), Hangzhou, China, 16–19 May 2011; pp. 324–329.
7. Huang, P.; Hilton, A.; Starck, J. Shape similarity for 3D video sequences of people. Int. J. Comput. Vis. 2010, 89, 362–381.
8. Weinland, D.; Ronfard, R.; Boyer, E. Free viewpoint action recognition using motion history volumes. Comput. Vis. Image Underst. 2006, 104, 249–257.
9. Slama, R.; Wannous, H.; Daoudi, M.; Srivastava, A. Accurate 3D action recognition using learning on the Grassmann manifold. Pattern Recognit. 2015, 48, 556–567.
10. Ali, S.; Shah, M. Human action recognition in videos using kinematic features and multiple instance learning. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 288–303.
11. Holte, M.B.; Tran, C.; Trivedi, M.M.; Moeslund, T.B. Human action recognition using multiple views: A comparative perspective on recent developments. In Proceedings of the 2011 Joint ACM Workshop on Human Gesture and Behavior Understanding, Scottsdale, AZ, USA, 28 November–1 December 2011; pp. 47–52.
12. Junejo, I.N.; Dexter, E.; Laptev, I.; Perez, P. View-independent action recognition from temporal self-similarities. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 172–185.
13. Kushwaha, A.K.S.; Srivastava, S.; Srivastava, R. Multi-view human activity recognition based on silhouette and uniform rotation invariant local binary patterns. Multimedia Syst. 2016.
14. Iosifidis, A.; Tefas, A.; Pitas, I. View-invariant action recognition based on artificial neural networks. IEEE Trans. Neural Netw. Learn. Syst. 2012, 23, 412–424.
15. Iosifidis, A.; Tefas, A.; Pitas, I. Multi-view action recognition based on action volumes, fuzzy distances and cluster discriminant analysis. Signal Proc. 2013, 93, 1445–1457.
16. Wang, L.; Qiao, Y.; Tang, X. Action recognition with trajectory-pooled deep-convolutional descriptors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 4305–4314.
17. Lei, J.; Li, G.; Zhang, J.; Guo, Q.; Tu, D. Continuous action segmentation and recognition using hybrid convolutional neural network-hidden Markov model model. IET Comput. Vis. 2016, 10, 537–544.
18. Gkalelis, N.; Nikolaidis, N.; Pitas, I. View independent human movement recognition from multi-view video exploiting a circular invariant posture representation. In Proceedings of the IEEE International Conference on Multimedia and Expo 2009 (ICME 2009), New York, NY, USA, 28 June–3 July 2009; pp. 394–397.
19. Weinland, D.; Özuysal, M.; Fua, P. Making Action Recognition Robust to Occlusions and Viewpoint Changes. In Computer Vision–ECCV 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 635–648.
20. Souvenir, R.; Babbs, J. Learning the viewpoint manifold for action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2008 (CVPR 2008), Anchorage, AK, USA, 23–28 June 2008; pp. 1–7.
21. Liu, J.; Shah, M. Learning human actions via information maximization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2008 (CVPR 2008), Anchorage, AK, USA, 23–28 June 2008; pp. 1–8.
22. Zheng, J.; Jiang, Z.; Chellappa, R. Cross-View Action Recognition via Transferable Dictionary Learning. IEEE Trans. Image Proc. 2016, 25, 2542–2556.
23. Nie, W.; Liu, A.; Li, W.; Su, Y. Cross-View Action Recognition by Cross-domain Learning. Image Vis. Comput. 2016.
24. Chaaraoui, A.A.; Climent-Pérez, P.; Flórez-Revuelta, F. Silhouette-based human action recognition using sequences of key poses. Pattern Recognit. Lett. 2013, 34, 1799–1807.
25. Chaaraoui, A.A.; Flórez-Revuelta, F. A Low-Dimensional Radial Silhouette-Based Feature for Fast Human Action Recognition Fusing Multiple Views. Int. Sch. Res. Not. 2014, 2014, 547069.
26. Cheema, S.; Eweiwi, A.; Thurau, C.; Bauckhage, C. Action recognition by learning discriminative key poses. In Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, 6–13 November 2011; pp. 1302–1309.
27. Ahmad, M.; Lee, S.-W. HMM-based human action recognition using multiview image sequences. In Proceedings of the 18th International Conference on Pattern Recognition 2006 (ICPR 2006), Hong Kong, China, 20–24 August 2006; pp. 263–266.
28. Pehlivan, S.; Forsyth, D.A. Recognizing activities in multiple views with fusion of frame judgments. Image Vis. Comput. 2014, 32, 237–249.
29. Chun, S.; Lee, C.-S. Human action recognition using histogram of motion intensity and direction from multiple views. IET Comput. Vis. 2016, 10, 250–257.
30. Murtaza, F.; Yousaf, M.H.; Velastin, S. Multi-view Human Action Recognition using 2D Motion Templates based on MHIs and their HOG Description. IET Comput. Vis. 2016.
31. Hsieh, C.-H.; Huang, P.S.; Tang, M.-D. Human action recognition using silhouette histogram. In Proceedings of the Thirty-Fourth Australasian Computer Science Conference, Perth, Australia, 17–20 January 2011; Australian Computer Society, Inc.: Darlinghurst, Australia, 2011; pp. 213–223.
32. Rahman, S.A.; Cho, S.-Y.; Leung, M.K. Recognising human actions by analysing negative spaces. IET Comput. Vis. 2012, 6, 197–213.
33. Sonka, M.; Hlavac, V.; Boyle, R. Image Processing, Analysis, and Machine Vision; Cengage Learning: Belmont, CA, USA, 2014.
34. Hu, M.-K. Visual pattern recognition by moment invariants. IRE Trans. Inform. Theory 1962, 8, 179–187.
35. Huang, Z.; Leng, J. Analysis of Hu's moment invariants on image scaling and rotation. In Proceedings of the 2010 2nd International Conference on Computer Engineering and Technology (ICCET), Chengdu, China, 6–19 April 2010; pp. 476–480.
36. Jordan, A. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. Adv. Neural Inform. Proc. Syst. 2002, 14, 841.
37. Qian, H.; Mao, Y.; Xiang, W.; Wang, Z. Recognition of human activities using SVM multi-class classifier. Pattern Recognit. Lett. 2010, 31, 100–111.
38. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
39. Vapnik, V.N.; Vapnik, V. Statistical Learning Theory; Wiley: New York, NY, USA, 1998.
40. Kreßel, U.H.-G. Pairwise classification and support vector machines. In Advances in Kernel Methods; MIT Press: Cambridge, MA, USA, 1999.
41. Platt, J.C.; Cristianini, N.; Shawe-Taylor, J. Large Margin DAGs for Multiclass Classification. NIPS 1999, 12, 547–553.
42. Dietterich, T.G.; Bakiri, G. Solving multiclass learning problems via error-correcting output codes. J. Artif. Intell. Res. 1995, 2, 263–286.
43. Cheong, S.; Oh, S.H.; Lee, S.-Y. Support vector machines with binary tree architecture for multi-class classification. Neural Inform. Proc.-Lett. Rev. 2004, 2, 47–51.
44. Hsu, C.-W.; Lin, C.-J. A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 2002, 13, 415–425.
45. Chang, C.-C.; Lin, C.-J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST) 2011, 2, 27.
46. Manosha Chathuramali, K.; Rodrigo, R. Faster human activity recognition with SVM. In Proceedings of the 2012 International Conference on Advances in ICT for Emerging Regions (ICTER), Colombo, Sri Lanka, 12–15 December 2012; pp. 197–203.
47. Weinland, D.; Boyer, E.; Ronfard, R. Action recognition from arbitrary views using 3D exemplars. In Proceedings of the 2007 IEEE 11th International Conference on Computer Vision, Rio de Janeiro, Brazil, 14–20 October 2007; pp. 1–7.
48. Reddy, K.K.; Liu, J.; Shah, M. Incremental action recognition using feature-tree. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 27 September–4 October 2009; pp. 1010–1017.
49. Lv, F.; Nevatia, R. Single view human action recognition using key pose matching and viterbi path searching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2007 (CVPR'07), Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8.
50. Vitaladevuni, S.N.; Kellokumpu, V.; Davis, L.S. Action recognition using ballistic dynamics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2008 (CVPR 2008), Anchorage, AK, USA, 23–28 June 2008; pp. 1–8.
51. Cherla, S.; Kulkarni, K.; Kale, A.; Ramasubramanian, V. Towards fast, view-invariant human action recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2008 (CVPRW'08), Anchorage, AK, USA, 23–28 June 2008; pp. 1–8.
52. Wu, X.; Xu, D.; Duan, L.; Luo, J. Action recognition using context and appearance distribution features. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June 2011; pp. 489–496.
53. Burghouts, G.; Eendebak, P.; Bouma, H.; Ten Hove, J.M. Improved action recognition by combining multiple 2D views in the bag-of-words model. In Proceedings of the 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Krakow, Poland, 27–30 August 2013; pp. 250–255.

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).