Neurocomputing 149 (2015) 198–206
A Fully Complex-valued Fast Learning Classifier (FC-FLC) for real-valued classification problems

M. Sivachitra a,*, R. Savitha b, S. Suresh b, S. Vijayachitra c

a Department of Electrical and Electronics Engineering, Kongu Engineering College, Perundurai, India
b School of Computer Engineering, Nanyang Technological University, Singapore
c Department of Electronics and Instrumentation Engineering, Kongu Engineering College, Perundurai, India
Article history: Received 5 September 2013; received in revised form 21 April 2014; accepted 22 April 2014; available online 16 September 2014.

Abstract
This paper presents a Fully Complex-valued Fast Learning Classifier (FC-FLC) to solve real-valued classification problems. FC-FLC is a single hidden layer network with a nonlinear input and hidden layer, and a linear output layer. The neurons at the input layer of the FC-FLC employ the circular transformation to convert the real-valued input features to the Complex domain. At the hidden layer, the complex-valued input features are projected onto a hyper-dimensional Complex plane (C^m → C^K) using the K hidden neurons employing the Gudermannian (Gd) activation function. To investigate the suitability of the Gd as an activation function for a fully complex-valued network, we formulate the activation function in two forms. The output layer is linear. The input weights of the FC-FLC are chosen randomly and the output weights are estimated analytically. The input weights corresponding to the best generalization performance of the FC-FLC are obtained by k-fold cross-validation. The performance of the proposed classifier is evaluated, in comparison with other complex-valued and a few best performing real-valued classifiers, on a set of benchmark classification problems from the UCI machine learning repository and on a practical human emotion recognition problem. Statistical analysis is carried out to validate the performance of the classifier. Performance results show that FC-FLC has better classification ability than the other classifiers in the literature.

© 2014 Elsevier B.V. All rights reserved.
Keywords: Complex-valued neural network; Classification; Gudermannian function; Circular transformation; Extreme learning machine
1. Introduction

Signals in applications like communication [1], signal processing [2–5], image processing [6,7], etc., are inherently complex-valued and change with time. In order to preserve their physical characteristics, these signals and their nonlinear transformations have to be represented in the Complex domain. It is therefore imperative that adaptive learning algorithms be developed in the Complex domain, and researchers have focused on the development of neural networks and their learning algorithms in the Complex domain. The main challenge in the development of a complex-valued neural network is the lack of an entire and bounded activation function, as Liouville's theorem states that a function that is both entire and bounded in the Complex domain is a constant [8]. Initially, this constraint was overcome with the use of split complex-valued networks, where the complex-valued signals were divided into their real and imaginary (or magnitude and phase) components and real-valued networks were used to operate on these signals [9].
* Corresponding author. E-mail address: [email protected] (M. Sivachitra).
http://dx.doi.org/10.1016/j.neucom.2014.04.075
However, splitting complex-valued signals into their real components is not an efficient representation and results in the loss of significant complex-valued information [10]. Therefore, Kim et al. relaxed the desired properties of a complex-valued activation function to being analytic and bounded almost everywhere [11], and suggested a set of elementary transcendental functions with singularities [12] as activation functions for fully complex-valued multi-layer perceptron networks. The convergence of these activation functions and the effect of their singularities were experimentally studied in detail in [13]. On the other hand, the Gaussian function, which maps the complex-valued input features to the Real domain (C → R), was employed in the Complex-valued Radial Basis Function (CRBF) network [14]. Thus, although the centers of the Gaussian activation function and the output weights of the CRBF network were complex-valued, the responses of the neurons in the hidden layer are real-valued. Moreover, during the backward computation for parameter updates, the correlation between the real and imaginary parts of the error is lost. Hence, the CRBF is not capable of approximating the phase of a complex-valued signal accurately. To overcome this, a Fully Complex-valued Radial Basis Function network (FC-RBF) using the fully complex-valued 'sech' activation function, together with its learning algorithm, was developed in [15].
It was also shown that the FC-RBF network with the fully complex-valued 'sech' activation function has better phase approximation abilities than the CRBF network. However, the 'sech' activation function has its singularities in the finite region of the Complex domain. This imposes strict restrictions on the operating region of the network, so as to avoid hitting these singular points. Hence, there is a need to identify better activation functions that do not restrict the operating region of the complex-valued neural network.

It has recently been shown that complex-valued networks have better computational power than real-valued networks [16], and that their orthogonal decision boundaries [17] render them better decision makers than their real-valued counterparts. This decision making ability of complex-valued networks has motivated researchers to develop efficient classifiers in the Complex domain. Classifiers developed in the Complex domain include the Multi Layered Network with Multi-Valued Neurons (MLMVN) [18], the single layered complex-valued classifier referred to as the 'Phase Encoded Complex-valued Neural Network (PE-CVNN)' [19], the Bilinear Branch-cut Complex-valued Extreme Learning Machine (BB-CELM), the Phase Encoded Complex-valued Extreme Learning Machine (PE-CELM) [20], the Fully Complex-valued Radial Basis Function network (FC-RBF) classifier [21], and the Circular Complex-valued Extreme Learning Machine (CC-ELM) [22]. A brief summary of these classifiers is given in Section 2. A complete survey of supervised learning algorithms for complex-valued neural networks and of recent advancements in complex-valued neural networks can be found in [23,24].

In this paper, we present a Fully Complex-valued Fast Learning Classifier (FC-FLC) to solve real-valued classification problems. The classifier is a single hidden layer network with a nonlinear input and hidden layer and a linear output layer. The neurons in the input layer of the FC-FLC employ a circular transformation [22] to map the real-valued input features to the Complex domain. The neurons at the hidden layer of FC-FLC employ the fully complex-valued Gudermannian function ('Gd') as the activation function, as it has been shown to be a good discriminator in [25]. This function quantitatively characterizes the Hubble sequence of spiral galaxies, including grand design and barred spirals; special shapes such as ring galaxies with inward and outward arms are also described by the analytic continuation of the same Gd function. Hence, we investigate the suitability of this function as a complex-valued activation function for solving real-valued classification problems. We formulate two activation functions using the Gd function to develop the FC-FLC. The input weights of the FC-FLC are chosen randomly and the output weights are estimated as a solution to an unconstrained linear programming problem. Therefore, the FC-FLC requires minimal computational effort during the training process. It should be noted here that the optimal input weights of FC-FLC are chosen using a k-fold cross-validation approach, presented in [26].

The performance of the FC-FLC with the proposed Gudermannian activation functions at the hidden layer is evaluated using a set of benchmark real-valued problems with balanced and unbalanced data sets from the UCI machine learning repository [27]. In all these problems, the performance of FC-FLC is compared against other complex-valued classifiers and a few best performing real-valued classifiers available in the literature. Performance results show that the FC-FLC with the Gd activation function performs better than the best performing real-valued classifiers and is comparable with other complex-valued classifiers available in the literature.

The paper is organized as follows: a brief review of the existing complex-valued classifiers is presented in Section 2. In Section 3, the real-valued classification problem is defined in the Complex domain, and
the network architecture, the Gd function, and the learning algorithm of the FC-FLC are described in detail. Section 4 presents the performance study results of the proposed classifier using a set of multi-category and binary benchmark problems from the UCI machine learning repository. The performance of the FC-FLC classifier is further analyzed using a practical human emotion recognition problem, and statistical tests are performed to validate the performance of the classifier. Finally, in Section 5, the summary of the work and the observations from the study are presented.
2. A literature survey on complex-valued classifiers

In 2003, Nitta showed that complex-valued networks have better computational power than real-valued networks [16]. He also showed that the XOR problem, which cannot be solved with a single real-valued neuron, can be solved with a single complex-valued neuron [28]. It was later shown by him that complex-valued neural networks with split-type activation functions have two decision boundaries and that these decision boundaries are orthogonal to each other [17]. Thereafter, researchers focused on developing efficient complex-valued classifiers to solve real-valued classification problems. It must be noted that, in order to solve real-valued classification problems in the Complex domain, the input features have to be transformed to the Complex domain.

The multi-layered network with multi-valued neurons (MLMVN) [18] was the first classifier developed in the Complex domain. The MLMVN employs multi-valued neurons that use a piecewise continuous activation function to map the complex-valued inputs to C discrete class labels (i.e., C-valued threshold logic, where C is the number of classes). In MLMVN, the normalized real-valued input features (x) are mapped onto the unit circle using the transformation exp(i2πx), and the class labels are encoded by the roots of unity in the Complex plane. However, as the input features are mapped onto the unit circle, this mapping yields the same complex-valued feature for the real-valued feature values 0 and 1 (i.e., the transformation is not unique). This affects the classification performance of the MLMVN. Moreover, as the number of classes increases, the sector of the unit circle assigned to each output neuron shrinks, which might further affect the classification performance of MLMVN in multi-category classification problems. Further, the learning algorithm in MLMVN uses a derivative-free error correcting rule to estimate the parameters of the network. Therefore, MLMVN requires significant computational effort during the training process.

In the single layer phase encoded complex-valued neural network [19,29], the real-valued input features (x) are phase encoded in [0, π] using the transformation exp(iπx) to obtain the complex-valued input features. The transformation used by the PE-CVNN to convert the real-valued input features to the Complex domain therefore performs a one-to-one mapping of features from R → C and is, hence, unique. However, the activation functions used in the PE-CVNN are similar to those used in split complex-valued neural networks. Therefore, the gradients used in the parameter update of the PE-CVNN are not fully complex-valued, which might result in a significant loss of complex-valued information [11], leading to misclassification. Moreover, the PE-CVNN uses a gradient descent based learning algorithm to estimate the parameters of the network, and hence requires huge computational effort during training.

A multi-layer classifier, namely the 'Fully Complex-valued Radial Basis Function classifier (FC-RBF)', was introduced in [21]. An FC-RBF network is the basic building block of the FC-RBF classifier, with the phase encoded transformation introduced in [19,29] at the input layer to convert the real-valued features to the Complex domain. However, the FC-RBF classifier uses a gradient descent
based batch learning algorithm to estimate the parameters of the network and requires huge computational effort during training. Hence, two fast learning three-layered classifiers, referred to as the 'Bilinear Branch-cut Complex-valued Extreme Learning Machine (BB-CELM)' and the 'Phase Encoded Complex-valued Extreme Learning Machine (PE-CELM)', were developed in [20]. These classifiers are single hidden layer complex-valued neural networks. In BB-CELM, a bilinear transformation with a branch cut around 2π is employed at the input layer to transform the real-valued input features to the Complex domain, while in PE-CELM, the neurons in the input layer employ the phase encoded transformation. The fully complex-valued 'sech' function is used at the hidden layer of these classifiers, and the neurons in the output layer are linear. Although these classifiers are fast and use a fully complex-valued activation function at the hidden layer, the BB-CELM and PE-CELM classifiers inherit the bilinear transformation and the phase encoded transformation used in the MLMVN and PE-CVNN classifiers, respectively. Hence, the transformations used in BB-CELM and PE-CELM do not completely exploit the orthogonal decision boundaries of fully complex-valued classifiers. Therefore, a circular transformation that uniquely maps the real-valued input features to all four quadrants of the Complex domain was proposed and used to develop the 'Circular Complex-valued Extreme Learning Machine (CC-ELM)' in [22]; this transformation has also been used in [30,31]. For complete details on the complex-valued neural networks in the literature, one may refer to [32].

The sech(·) activation function used in the FC-RBF, PE-CELM, BB-CELM, and CC-ELM classifiers has its singularities in the finite region of the Complex domain. These singularities interfere with the operating region of the network and affect the convergence of the network during the learning process. Hence, there is a need to identify an analytic and almost everywhere bounded complex-valued activation function whose singularities do not interfere with the convergence of the network. To overcome these issues, in the next section, we introduce a Fully Complex-valued Fast Learning Classifier to solve real-valued classification problems.
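To make the uniqueness issue discussed above concrete, the following minimal Python sketch (our own illustration; the feature values are arbitrary) contrasts the unit-circle mapping exp(i2πx) used in the MLMVN with the phase encoding exp(iπx) used in the PE-CVNN:

```python
import numpy as np

x0, x1 = 0.0, 1.0  # two distinct normalized real-valued features

# MLMVN-style unit-circle mapping: exp(i*2*pi*x) is NOT unique on [0, 1].
print(np.exp(1j * 2 * np.pi * x0))  # (1+0j)
print(np.exp(1j * 2 * np.pi * x1))  # also ~(1+0j): same point on the circle

# PE-CVNN-style phase encoding: exp(i*pi*x) maps [0, 1] one-to-one,
# but only onto the upper half of the unit circle (two quadrants).
print(np.exp(1j * np.pi * x0))      # (1+0j)
print(np.exp(1j * np.pi * x1))      # (-1+0j): distinct images
```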
3. Description of the Fully Complex-valued Fast Learning Classifier (FC-FLC)

In this section, we first define the real-valued classification problem in the Complex domain. Next, we describe the architecture of the FC-FLC, and finally, we present the learning algorithm of FC-FLC in detail.
3.1. Definition of the real-valued classification problem in the Complex domain

Let {(x^1, c^1), …, (x^t, c^t), …, (x^N, c^N)} be the training data set with N samples, where x^t = [x_1^t, …, x_m^t]^T ∈ R^m is the m-dimensional real-valued input feature vector of the tth sample, and c^t ∈ {1, 2, …, C} is its class label. The coded class label of the tth sample, y^t = [y_1^t, …, y_C^t]^T ∈ C^C, in the Complex domain is given by

    y_l^t = 1 + i1    if c^t = l, l = 1, …, C
    y_l^t = −1 − i1   otherwise,    (1)

where i = √(−1) is the Complex operator. The classification problem in the Complex domain is defined as follows: given the training data set {(x^1, y^1), …, (x^t, y^t), …, (x^N, y^N)}, estimate the decision function DF: x^m → y^C, such that the network is capable of predicting the class labels of new, unseen samples as accurately as possible.

3.2. Architecture of FC-FLC

The FC-FLC is a single hidden layer fully complex-valued network with a nonlinear input and hidden layer, and a linear output layer, as shown in Fig. 1. In order to solve the real-valued classification problem in the Complex domain, the real-valued input features must be transformed to the Complex space (R^m → C^m). The neurons in the input layer of FC-FLC perform this transformation by employing the circular transformation proposed in [22], and the complex-valued input features are given by z^t = [z_1^t ⋯ z_j^t ⋯ z_m^t]^T. The circular transformation for the jth feature of the tth sample is given by

    z_j^t = sin(a x_j^t + i b x_j^t + α_j),    j = 1, …, m,    (2)

where a, b ∈ [0, 1] are randomly chosen scaling constants and α_j ∈ [0, 2π] is used to shift the origin to enable effective usage of the four quadrants of the Complex plane. As the circular transformation performs a one-to-one mapping of the real-valued input features to the Complex domain, it uses all four quadrants of the Complex domain effectively and overcomes the issues due to the transformations used in the existing complex-valued classifiers.

The neurons at the hidden layer of the FC-FLC use the Gd function, whose magnitude and phase responses have an inverted Gaussian characteristic, to map the complex-valued input features to a hyper-dimensional Complex domain. It has been shown in [25] that the Gd function is a good discriminator. In this paper, we use the Gd function to formulate two fully complex-valued activation functions. This function has the following properties:

1. The Gd function directly relates hyperbolic angles with circular angles (provides inner transformations).
2. The magnitude response plot of the Gd function has inverse Gaussian characteristics.
3. The Gd function provides a smooth transition between extremities.
4. It is continuously differentiable.

This function satisfies the desirable properties of a complex-valued activation function.

Fig. 1. The architecture of the FC-FLC. The inset shows the circular transformation used at the input layer. [Figure: input neurons x_1, …, x_m; circular transformation z_j^t = sin(a x_j^t + i b x_j^t + α_j); hidden neurons h_1, …, h_K with h_j = atan(sinh(p_j^T z)) or h_j = atan(sinh(u_j^T (z − v_j))); output weights w_lj; linear outputs ŷ_1, …, ŷ_C.]
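As an illustration, the input-layer operations of Eqs. (1) and (2), the complex-coded class labels and the circular transformation, could be implemented as in the following minimal Python sketch (the function names, the random seed, and the use of NumPy are our own choices, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def circular_transform(X, a, b, alpha):
    """Eq. (2): map real features X (N x m) to C^m, one sample per row.

    a, b are scalars drawn from [0, 1]; alpha is an m-vector in [0, 2*pi].
    """
    return np.sin(a * X + 1j * b * X + alpha)

def coded_labels(c, C):
    """Eq. (1): class labels c in {1, ..., C} -> complex-coded targets in C^C."""
    Y = np.full((len(c), C), -1.0 - 1.0j)
    Y[np.arange(len(c)), np.asarray(c) - 1] = 1.0 + 1.0j
    return Y

# Illustrative usage on random data (m = 3 features, C = 2 classes)
X = rng.random((5, 3))
a, b = rng.random(), rng.random()
alpha = rng.uniform(0.0, 2.0 * np.pi, size=3)
Z = circular_transform(X, a, b, alpha)   # complex-valued inputs z^t
Y = coded_labels([1, 2, 1, 1, 2], C=2)   # complex-coded class labels y^t
```

The per-feature shift α_j is what spreads the images of the different features over all four quadrants of the Complex plane.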
The magnitude response plot for the Gd function is shown in Fig. 2(a), and the phase response plot is shown in Fig. 2(b). The magnitude response plot follows the inverse Gaussian characteristics. At the origin, the magnitude is zero, and as the input increases, the magnitude of the Gd function increases. Moreover, we can observe that the magnitude of the Gd function does not reach infinity and is bounded. Similar characteristics can be observed in the phase response plot.

Fig. 2. Magnitude and phase response plots of the Gd function. (a) Magnitude response plot for the Gd of the FC-FLC network; (b) phase response plot for the Gd of the FC-FLC network.

As the singularities of the Gd function do not interfere with the operating region of a complex-valued neural network, we investigate the suitability of the Gd function in performing real-valued classification by formulating two activation functions. The two activation functions, referred to as AF1 and AF2 henceforth, are defined as

    AF1(p_j, z^t) = atan(sinh(p_j^T z^t)),    j = 1, …, K,    (3)

    AF2(u_j, v_j, z^t) = atan(sinh(u_j^T (z^t − v_j))),    j = 1, …, K,    (4)

where K represents the total number of hidden neurons, p_j ∈ C^m are the K × m input weights, u_j ∈ C^m is the complex-valued scaling factor, and v_j ∈ C^m is the complex-valued center of the jth hidden neuron. Hence, the response of the hidden layer (h_j^t) of the FC-FLC is given by

    h_j^t = AF1(p_j, z^t),    j = 1, …, K,    (5)

or

    h_j^t = AF2(u_j, v_j, z^t),    j = 1, …, K.    (6)

The output neurons are linear, and the predicted output at the lth output neuron of the FC-FLC is given by

    ŷ_l^t = Σ_{j=1}^{K} w_lj h_j^t,    l = 1, …, C,    (7)

where w_lj ∈ C is the output weight connecting the jth hidden neuron and the lth output neuron. Hence, W ∈ C^{C×K} are the C × K output weights of the network. The predicted class label is obtained using

    ĉ^t = arg max_{l = 1, …, C} real(ŷ_l^t).    (8)

In the next section, we present the learning algorithm of FC-FLC in detail.

3.3. Learning algorithm of FC-FLC

Similar to the C-ELM [33], the input weights of the FC-FLC are chosen randomly and the output weights are estimated as a solution to a set of linear equations. The response of the neurons in the hidden layer (H ∈ C^{K×N}) is given by

    H(P, Z) = [ AF1(p_1, z^1)  ⋯  AF1(p_1, z^N)
                      ⋮        ⋱        ⋮
                AF1(p_K, z^1)  ⋯  AF1(p_K, z^N) ]    (9)

or

    H(U, V, Z) = [ AF2(u_1, v_1, z^1)  ⋯  AF2(u_1, v_1, z^N)
                          ⋮            ⋱          ⋮
                   AF2(u_K, v_K, z^1)  ⋯  AF2(u_K, v_K, z^N) ].    (11)

Here H is a K × N hidden layer output matrix. The jth row of the matrix H represents the response of the hidden neuron h_j to the inputs (z^t, t = 1, …, N). Hence, the output weights can be obtained as a least-squares solution to the set of linear equations at the output layer, and are given by

    W = Y H^†,    (12)

where H^† is the Moore–Penrose pseudo-inverse of the hidden layer output matrix, and Y is the matrix of complex-valued coded class labels of all the N samples.

The number of hidden neurons and the choice of input parameters influence the performance of the FC-FLC. Therefore, a k-fold cross-validation, similar to that presented in [26], is used to choose the number of hidden neurons and the input weights corresponding to the best generalization performance of the FC-FLC. In such a k-fold cross-validation, the training data set is divided into k equal subsets. The FC-FLC algorithm is then called k times, each time leaving one of the subsets out of training. The generalization accuracy is calculated on the omitted subset. The input parameters corresponding to the best generalization accuracy are chosen as the optimal parameters of the FC-FLC.
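A compact Python sketch of the complete training and prediction procedure, assuming the AF1 form of the hidden layer, is given below (the function names and the sampling range of the random input weights are our assumptions; the paper only states that the input weights are chosen randomly):

```python
import numpy as np

rng = np.random.default_rng(0)

def fc_flc_train(Z, Y, K):
    """Fit an FC-FLC with AF1 hidden neurons.

    Z : (N, m) complex array of circularly transformed inputs, Eq. (2).
    Y : (N, C) complex array of coded class labels, Eq. (1).
    K : number of hidden neurons.
    """
    N, m = Z.shape
    # Random complex input weights p_j in C^m; the [-1, 1] sampling range
    # is an assumption, the paper only says they are chosen randomly.
    P = rng.uniform(-1, 1, (K, m)) + 1j * rng.uniform(-1, 1, (K, m))
    # Hidden layer output matrix H (K x N), Eqs. (3), (5), (9):
    # h_j^t = atan(sinh(p_j^T z^t)); NumPy's arctan/sinh accept complex input.
    H = np.arctan(np.sinh(P @ Z.T))
    # Output weights, Eq. (12): W = Y H^dagger (Moore-Penrose pseudo-inverse).
    W = Y.T @ np.linalg.pinv(H)          # (C x K)
    return P, W

def fc_flc_predict(Z, P, W):
    """Eqs. (7)-(8): linear outputs; class = index of the largest real part."""
    H = np.arctan(np.sinh(P @ Z.T))
    Y_hat = W @ H                        # (C x N) predicted outputs
    return Y_hat.real.argmax(axis=0) + 1  # class labels in {1, ..., C}
```

In practice, the number of hidden neurons K and the random draw of P would be selected by the k-fold cross-validation described above: retrain with several draws of P and keep the draw giving the best validation accuracy.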
4. Performance evaluation of the FC-FLC

In this section, we evaluate the performance of the FC-FLC on a set of benchmark classification problems from the UCI machine learning repository and on a practical human emotion recognition problem. The number of classes, the number of features, and the number of samples in the training and testing data sets for the various benchmark real-valued classification problems are presented in Table 1.
Table 1. Description of the benchmark real-valued classification data sets selected from the UCI machine learning repository for the performance study.

Type of data set  Problem                 No. of features  No. of classes  No. of samples (Training / Testing)  I.F.
Multi-category    Image segmentation      19               7               210 / 2100                           0
                  Vehicle classification  18               4               424 / 422                            0.1
                  Glass identification     9               6               109 / 105                            0.68
                  Acoustic emission        5               4                62 / 137                            0.1
Binary            Liver disorder           6               2               200 / 145                            0.17
                  PIMA data                8               2               400 / 368                            0.225
                  Breast cancer            9               2               300 / 383                            0.26
                  Ionosphere              34               2               100 / 251                            0.28
The availability of only a small number of samples, the sampling bias, and the overlap between classes may affect the classification performance of the classifier [34]. Hence, we also consider the Imbalance Factor (I.F.) of the training data sets to study the effect of these factors. The I.F. is defined as

    I.F. = 1 − (C / N) · min_{l = 1, …, C} N_l,    (13)

where N_l is the number of samples belonging to class l and N = Σ_{l=1}^{C} N_l. The imbalance factor gives a measure of the sample imbalance across the various classes of the training data set, and the imbalance factor of each data set considered in this study is also presented in Table 1. It can be observed from the table that the imbalance factors of the unbalanced data sets vary widely, from 0.1 to 0.68, while the image segmentation problem, with a balanced data set, has an imbalance factor of 0.

4.1. Performance measures

The average (η_a, Eq. (14)) and over-all (η_o, Eq. (15)) classification efficiencies [35] of the classifiers, derived from their confusion matrices, are used as the performance measures for comparison in this study:

    η_a = (1/C) Σ_{i=1}^{C} (q_ii / N_i) × 100%,    (14)

    η_o = (Σ_{i=1}^{C} q_ii / Σ_{i=1}^{C} N_i) × 100%,    (15)

where q_ii is the total number of correctly classified samples in class c_i and N_i is the total number of samples belonging to class c_i in the data set. In this paper, an appropriate number of hidden neurons is selected by the addition/deletion of neurons to obtain an optimal performance, as discussed in [36] for real-valued networks.
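As a concrete illustration, the following Python sketch (our own helper names; the confusion matrix values are made up) computes the imbalance factor of Eq. (13) and the efficiencies of Eqs. (14) and (15):

```python
import numpy as np

def imbalance_factor(labels, C):
    """Eq. (13): I.F. = 1 - C * min_l(N_l) / N, for labels in {1, ..., C}."""
    counts = np.bincount(np.asarray(labels) - 1, minlength=C)
    return 1.0 - C * counts.min() / counts.sum()

def efficiencies(conf):
    """Eqs. (14)-(15): average and overall efficiency from a C x C
    confusion matrix `conf`, rows = true class, conf[i, i] = q_ii."""
    q = np.diag(conf)
    N_i = conf.sum(axis=1)
    eta_a = 100.0 * np.mean(q / N_i)
    eta_o = 100.0 * q.sum() / N_i.sum()
    return eta_a, eta_o

# Illustrative 2-class confusion matrix (values are made up)
conf = np.array([[45, 5],
                 [10, 40]])
print(efficiencies(conf))  # (85.0, 85.0)
```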
4.2. Performance study on multi-category benchmark data sets

First, we present the results of the FC-FLC on the multi-category benchmark classification problems, viz., the image segmentation, vehicle classification, glass identification, and acoustic emission problems. In these problems, the performance results of the FC-FLC are compared with other complex-valued classifiers, viz., the CSRAN [37], FC-RBF [21], and CC-ELM [22] classifiers. To highlight the advantages of using a complex-valued classifier, the performance of the FC-FLC is also compared with a few best performing real-valued classifiers, viz., the Support Vector Machine (SVM) [38], the Self-regulatory Resource Allocation Network (SRAN) [39], and the Extreme Learning Machine (ELM) [40]. The number of hidden neurons, the training time, and the testing performance results of the FC-FLC, in comparison with the other complex-valued and real-valued classifiers for the four multi-category benchmark problems, are presented in Table 2. The results for the real-valued and complex-valued classifiers are reproduced from [30].

From the table, it can be seen that the performance of the FC-FLC (AF2) is better than that of the real-valued classifiers chosen for comparison, especially in the vehicle classification and glass identification problems with unbalanced data sets. In the vehicle classification problem, both the overall and average efficiencies are better than those of the other classifiers considered, whereas for the glass identification problem, the average efficiency is better than that of all the other classifiers taken for comparison. Comparing the performance of the FC-FLC (AF2) with the complex-valued classifiers, its performance is better than that of the CSRAN and FC-RBF classifiers and is comparable with that of the CC-ELM classifier. Moreover, the FC-FLC requires fewer hidden neurons and minimal computational effort compared to the other classifiers.

4.3. Performance study on binary benchmark classification problems

In this section, we present the performance results of the FC-FLC on the binary benchmark data sets presented in Table 1. The performance of these classifiers in comparison with the other complex-valued and best performing real-valued classifiers is presented in Table 3. From the table, it can be observed that the FC-FLC outperforms all the real-valued classifiers and the CSRAN and FC-RBF classifiers, and its performance is comparable to that of the CC-ELM classifier. On the PIMA data set, the performance of FC-FLC is slightly better than that of the CC-ELM classifier. Hence, it can be inferred that the proposed classifier has better decision making capabilities than the other real-valued and complex-valued classifiers.

Next, we perform statistical tests to validate the performance of the FC-FLC. First, the Friedman test [41] is conducted based on the overall and average efficiencies of the classifiers ranked in Tables 2 and 3. Based on the overall efficiencies of the classifiers, the average ranks of the SVM, ELM, SRAN, CSRAN, FC-RBF, CC-ELM, FC-FLC (AF1), and FC-FLC (AF2) classifiers are 6.25, 6.5, 4.437, 6.5, 4.937, 1.5, 3.68, and 2.1875, respectively. Under the null hypothesis, which states that all the classifiers are equivalent and their ranks must therefore be equal, the Friedman statistic (χ²_F) is 34.6192. A better statistic, derived by Iman and Davenport [42], follows the F-distribution; its value here is 11.33. The modified statistic follows the F-distribution with 7 and 49 degrees of freedom, and the critical value for rejecting the null hypothesis at a significance level of 0.05 is 2.20.
Table 2. Performance results for the multi-category benchmark data sets.

Problem                 Domain   Classifier    No. of parameters  Time (s)  Testing η_o (rank)  Testing η_a (rank)
Image segmentation      Real     SVM^a         127                721       91.38 (6)           91.38 (6)
                                 ELM           1323               0.25      90.23 (7)           90.23 (7)
                                 SRAN          1296               22        93 (3)              93 (3)
                        Complex  CSRAN         4860               339       88 (8)              88 (8)
                                 FC-RBF        3420               421       92.33 (4)           92.33 (4)
                                 CC-ELM        3120               0.03      93.52 (1)           93.52 (1)
                                 FC-FLC (AF1)  3640               0.2       93.3 (2)            93.3 (2)
                                 FC-FLC (AF2)  2600               0.18      92.1 (5)            92.1 (5)
Vehicle classification  Real     SVM^a         340                550       70.62 (8)           68.51 (8)
                                 ELM           3450               0.4       77.01 (5.5)         77.59 (5)
                                 SRAN          2599               55        75.12 (7)           76.86 (7)
                        Complex  CSRAN         6400               352       79.15 (4)           79.16 (4)
                                 FC-RBF        5600               678       77.01 (5.5)         77.46 (6)
                                 CC-ELM        3740               0.1084    82.23 (2)           82.52 (2)
                                 FC-FLC (AF1)  4400               1.09      81.5 (3)            81.95 (3)
                                 FC-FLC (AF2)  3080               0.67      83.18 (1)           83.29 (1)
Glass identification    Real     SVM^a         183                320       70.47 (8)           75.61 (8)
                                 ELM           1280               0.05      81.31 (7)           87.43 (3)
                                 SRAN          944                28        86.2 (2)            80.95 (5.5)
                        Complex  CSRAN         3840               452       83.5 (5)            78.09 (7)
                                 FC-RBF        4320               452       83.76 (4)           84.52 (4)
                                 CC-ELM        3000               0.08      94.44 (1)           80.95 (5.5)
                                 FC-FLC (AF1)  2400               0.34      82.85 (6)           90.65 (2)
                                 FC-FLC (AF2)  2400               0.1716    84.76 (3)           92.756 (1)
Acoustic emission       Real     SVM^a         22                 –         98.54 (6)           97.95 (6.5)
                                 ELM           100                0.05      98.54 (6)           97.95 (6.5)
                                 SRAN          100                22        99.27 (3)           98.91 (3.5)
                        Complex  CSRAN         224                32.6      98.54 (6)           98.08 (5)
                                 FC-RBF        280                129.6     96.35 (8)           95.2 (8)
                                 CC-ELM        180                0.031     99.27 (3)           99.17 (2)
                                 FC-FLC (AF1)  180                0.03      99.27 (3)           98.91 (3.5)
                                 FC-FLC (AF2)  180                0.03      100 (1)             100 (1)

^a Support vectors.
Since the modified Friedman statistic (F_F) is greater than the critical value (11.33 > 2.20), we can reject the null hypothesis, and it can be inferred that the classifiers used in this paper are not similar. Next, we conduct the post hoc Bonferroni–Dunn test [43] to emphasize the significance of the FC-FLC classifier. The test assumes that the performances of two classifiers are significantly different if their corresponding average ranks differ by at least a critical difference. As the FC-FLC (AF2) performs better than the FC-FLC (AF1), the FC-FLC (AF2) is used as the key classifier for the statistical study. The critical difference for the classifiers and data sets used in this study is calculated as 3.0 for a significance level of 0.10. The differences in average rank between the FC-FLC (AF2) classifier and the SVM, ELM, SRAN, CSRAN, FC-RBF, CC-ELM, and FC-FLC (AF1) classifiers are 4.0625, 4.3125, 2.2495, 4.3125, 2.7495, 0.6875, and 1.4925, respectively. Thus, out of the seven classifiers taken for comparison, the differences in average ranks between the FC-FLC (AF2) classifier and the SVM classifier (4.0625), the ELM classifier (4.3125), and the CSRAN classifier (4.3125) are greater than the critical difference (3.0). Although the differences in average ranks between the FC-FLC (AF2) classifier and the SRAN classifier (2.2495), the FC-RBF classifier (2.7495), and the CC-ELM classifier (0.6875) are less than the critical difference, it can be observed from Tables 2 and 3 that the FC-FLC (AF2) classifier performs better than the SRAN and FC-RBF classifiers and is comparable with the CC-ELM and FC-FLC (AF1) classifiers.
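These statistics can be reproduced, up to the rounding of the reported average ranks, with the short Python sketch below (the Bonferroni–Dunn critical value q_0.10 ≈ 2.450 for eight classifiers is taken from the table in Demšar [42] and is our assumption here):

```python
import numpy as np
from scipy import stats

# Average ranks (overall efficiency) reported above for the 8 classifiers
R = np.array([6.25, 6.5, 4.437, 6.5, 4.937, 1.5, 3.68, 2.1875])
N, k = 8, 8  # 8 data sets, 8 classifiers

# Friedman statistic and the Iman-Davenport F correction
chi2_F = 12 * N / (k * (k + 1)) * (np.sum(R ** 2) - k * (k + 1) ** 2 / 4)
F_F = (N - 1) * chi2_F / (N * (k - 1) - chi2_F)
crit = stats.f.ppf(0.95, dfn=k - 1, dfd=(k - 1) * (N - 1))
print(chi2_F, F_F, crit)  # ~34.9, ~11.6, ~2.20 (paper: 34.6192, 11.33, 2.20)

# Bonferroni-Dunn critical difference at the 0.10 level
q_010 = 2.450  # assumed from Demsar's table for k = 8 classifiers
CD = q_010 * np.sqrt(k * (k + 1) / (6 * N))
print(CD)  # ~3.0, matching the critical difference used above
```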
4.4. Performance study on a practical human emotion recognition problem

In this section, we study the decision making ability of the FC-FLC on a practical human emotion recognition problem. Emotion recognition is a challenging problem due to the similarity of the features for various facial expressions. In the literature, there are many benchmark facial expression data sets available for performance evaluation, such as the JAFFE, MMI, and Cohn–Kanade databases. In order to evaluate the performance of FC-FLC for emotion recognition, we extract LBP-based features from the JAFFE database with eight sampling points and a radius of four, as given in [44]. Table 4 presents the 10-fold cross-validation performance comparison results of the FC-FLC on the JAFFE data set. The results for SVM and Mc-FIS are reproduced from [44]. It can be inferred that the FC-FLC performs slightly better than the SVM and Mc-FIS classifiers.
Table 3. Performance comparison on the benchmark binary classification problems.

Problem          Domain   Classifier    No. of parameters  Training time (s)  Testing η_o (rank)  Testing η_a (rank)
Liver disorders  Real     SVM^a         141                0.0972             71.03 (6)           70.21 (6)
                          ELM           900                0.1685             72.41 (5)           71.41 (5)
                          SRAN          819                3.38               66.9 (8)            65.8 (8)
                 Complex  CSRAN         560                38                 67.59 (7)           69.24 (7)
                          FC-RBF        560                133                74.46 (4)           75.41 (1)
                          CC-ELM        160                0.059              75.5 (1)            74.83 (2)
                          FC-FLC (AF1)  800                0.18               75.17 (2.5)         74.23 (3)
                          FC-FLC (AF2)  768                0.078              75.17 (2.5)         74.03 (4)
PIMA data        Real     SVM^a         221                0.205              77.45 (7)           76.33 (5)
                          ELM           1100               0.2942             76.63 (8)           75.25 (6)
                          SRAN          2067               12.24              78.53 (4.5)         74.9 (7)
                 Complex  CSRAN         720                64                 77.99 (6)           76.73 (4)
                          FC-RBF        720                130.3              78.53 (4.5)         68.49 (8)
                          CC-ELM        400                0.073              81.25 (2)           76.86 (3)
                          FC-FLC (AF1)  300                0.09               80.97 (3)           77.4 (2)
                          FC-FLC (AF2)  300                0.0624             81.5 (1)            77.79 (1)
Breast cancer    Real     SVM^a         24                 0.1118             96.6 (6)            97.06 (5.5)
                          ELM           792                0.1442             96.35 (7)           96.5 (8)
                          SRAN          84                 0.17               96.87 (4)           97.3 (4)
                 Complex  CSRAN         800                60                 96.08 (8)           96.86 (7)
                          FC-RBF        400                158.3              97.12 (3)           97.45 (2.5)
                          CC-ELM        330                0.0811             97.39 (1)           97.84 (1)
                          FC-FLC (AF1)  660                0.29               96.86 (5)           97.06 (5.5)
                          FC-FLC (AF2)  330                0.0156             97.13 (2)           97.45 (2.5)
Ionosphere       Real     SVM^a         43                 0.0218             91.24 (3)           88.51 (4)
                          ELM           1184               0.0396             89.64 (6.5)         87.52 (7)
                          SRAN          777                3.7                90.84 (4)           91.88 (1)
                 Complex  CSRAN         420                86                 88.05 (8)           85.78 (8)
                          FC-RBF        1400               186.2              89.64 (6.5)         88.01 (5)
                          CC-ELM        1080               0.0312             92.43 (1)           89.93 (2)
                          FC-FLC (AF1)  1800               0.09               90.83 (5)           87.95 (6)
                          FC-FLC (AF2)  1800               0.078              91.63 (2)           89.31 (3)

^a Support vectors.
Table 4. Performance comparison results for the human emotion recognition problem (JAFFE data set).

Classifier domain  Classifier  No. of parameters  Testing η_o
Real-valued        SVM         50,730             88.09 ± 7.18
Real-valued        Mc-FIS      23,763             89.05 ± 3.214
Complex-valued     FC-FLC      37,380             89.43 ± 0.7314
5. Conclusion

In this paper, we have presented a Fully Complex-valued Fast Learning Classifier (FC-FLC) to solve real-valued classification problems. The FC-FLC is a single hidden layer network with a nonlinear input layer that transforms the real-valued input features to the Complex domain. The neurons in the input layer of FC-FLC employ a circular transformation to convert the real-valued input features to the Complex domain. At the hidden layer, the complex-valued input features are projected onto a hyper-dimensional Complex plane (C^m → C^K) using the K hidden neurons employing the Gudermannian (Gd) activation function. To investigate the suitability of the Gd as an activation function for a fully complex-valued network, we have formulated the activation function in two forms, AF1 and AF2. The neurons at the output layer are linear. The input weights of the FC-FLC are chosen randomly and the output weights are estimated as the solution to an unconstrained linear programming problem. The optimal input weights are chosen using a k-fold cross-validation. The performance of the FC-FLC is evaluated on a set of benchmark classification problems from the UCI machine learning repository and a practical human emotion recognition problem. The experimental and statistical studies on the various benchmark data sets show that the FC-FLC with the Gd activation function of the form AF2 outperforms the other best performing classifiers in the literature and the FC-FLC with the Gd activation function of the form AF1. The performance study in comparison with other complex-valued and the best performing real-valued classifiers clearly indicates that the proposed classifier has better classification ability.

References

[1] M.B. Li, G.B. Huang, P. Saratchandran, N. Sundararajan, Complex-valued growing and pruning RBF neural networks for communication channel equalisation, IEE Proc.—Vis. Image Signal Process. 153 (4) (2006) 411–418.
[2] J.C. Bregains, F. Ares, Analysis, synthesis and diagnosis of antenna arrays through complex-valued neural networks, Microw. Opt. Technol. Lett. 48 (8) (2006) 1512–1515.
[3] R. Savitha, S. Vigneshwaran, S. Suresh, N. Sundararajan, Adaptive beamforming using complex-valued radial basis function neural networks, in: IEEE Region 10 Conference (TENCON 2009), November 23–26, 2009, pp. 1–6.
[4] C. Shen, H. Lajos, S. Tan, Symmetric complex-valued RBF receiver for multiple-antenna-aided wireless systems, IEEE Trans. Neural Netw. 19 (9) (2008) 1659–1665.
[5] D. Jianping, N. Sundararajan, P. Saratchandran, Complex-valued minimal resource allocation network for nonlinear signal processing, Int. J. Neural Syst. 10 (2) (2000) 95–106.
[6] I. Aizenberg, D.V. Paliy, J.M. Zurada, J.T. Astola, Blur identification by multilayer neural network based on multivalued neurons, IEEE Trans. Neural Netw. 19 (5) (2008) 883–898.
[7] N. Sinha, M. Saranathan, K.R. Ramakrishna, S. Suresh, Parallel magnetic resonance imaging using neural networks, in: IEEE International Conference on Image Processing (ICIP 2007), vol. 3, 2007, pp. 149–152.
[8] R. Remmert, Theory of Complex Functions, Springer-Verlag, New York, USA, 1991.
[9] H. Leung, S. Haykin, The complex backpropagation algorithm, IEEE Trans. Signal Process. 39 (September) (1991) 2101–2104.
[10] A.J. Noest, Phasor neural networks, Neural Inf. Process. Syst. 2 (1989) 584–591. Online: 〈http://books.nips.cc/nips02.html〉.
[11] T. Kim, T. Adali, Fully complex multi-layer perceptron network for nonlinear signal processing, J. VLSI Signal Process. Syst. Signal Image Video Technol. 32 (1/2) (2002) 29–43.
[12] T. Kim, T. Adali, Approximation by fully complex multi-layer perceptrons, Neural Comput. 15 (7) (2003) 1641–1666.
[13] R. Savitha, S. Suresh, N. Sundararajan, P. Saratchandran, A new learning algorithm with logarithmic performance index for complex-valued neural networks, Neurocomputing 72 (16–18) (2009) 3771–3781.
[14] S. Chen, S. McLaughlin, B. Mulgrew, Complex valued radial basis function network. Part I: network architecture and learning algorithms, EURASIP Signal Process. J. 35 (1) (1994) 19–31.
[15] R. Savitha, S. Suresh, N. Sundararajan, A fully complex-valued radial basis function network and its learning algorithm, Int. J. Neural Syst. 19 (4) (2009) 253–267.
[16] T. Nitta, The computational power of complex-valued neuron, in: Artificial Neural Networks and Neural Information Processing (ICANN/ICONIP), Lecture Notes in Computer Science, vol. 2714, 2003, pp. 993–1000.
[17] T. Nitta, Orthogonality of decision boundaries of complex-valued neural networks, Neural Comput. 16 (1) (2004) 73–97.
[18] I. Aizenberg, C. Moraga, Multilayer feedforward neural network based on multi-valued neurons (MLMVN) and a backpropagation learning algorithm, Soft Comput. 11 (2) (2007) 169–183.
[19] M.F. Amin, K. Murase, Single-layered complex-valued neural network for real-valued classification problems, Neurocomputing 72 (4–6) (2009) 945–955.
[20] R. Savitha, S. Suresh, N. Sundararajan, H.J. Kim, Fast learning fully complex-valued classifiers for real-valued classification problems, in: D. Liu, et al. (Eds.), ISNN 2011, Part I, Lecture Notes in Computer Science, vol. 6675, 2011, pp. 602–609.
[21] R. Savitha, S. Suresh, N. Sundararajan, H.J. Kim, A fully complex-valued radial basis function classifier for real-valued classification, Neurocomputing 78 (1) (2012) 104–110.
[22] R. Savitha, S. Suresh, N. Sundararajan, Fast learning circular complex-valued extreme learning machine (CC-ELM) for real-valued classification problems, Inf. Sci. 187 (1) (2012) 277–290.
[23] S. Suresh, N. Sundararajan, R. Savitha, Supervised Learning with Complex-Valued Neural Networks, Studies in Computational Intelligence, vol. 421, Springer-Verlag, Berlin, Heidelberg, 2013.
[24] A. Hirose, Complex-valued Neural Networks, Studies in Computational Intelligence, vol. 400, Springer-Verlag, Berlin, Heidelberg, 2006.
[25] H.I. Ringermacher, L.R. Mead, A new formula describing the scaffold structure of spiral galaxies, Mon. Not. R. Astron. Soc. 397 (2009) 164–171.
[26] S. Suresh, R.V. Babu, H.J. Kim, No-reference image quality assessment using modified extreme learning machine classifier, Appl. Soft Comput. 9 (2) (2009) 541–552.
[27] C. Blake, C. Merz, UCI Repository of Machine Learning Databases, Department of Information and Computer Sciences, University of California, Irvine, URL: 〈http://archive.ics.uci.edu/ml/〉, 1998.
[28] T. Nitta, Solving the XOR problem and the detection of symmetry using a single complex-valued neuron, Neural Netw. 16 (8) (2003) 1101–1105.
[29] M.F. Amin, M.M. Islam, K. Murase, Ensemble of single-layered complex-valued neural networks for classification tasks, Neurocomputing 72 (10–12) (2009) 2227–2234.
[30] R. Savitha, S. Suresh, N. Sundararajan, Projection-based fast learning fully complex-valued relaxation neural network, IEEE Trans. Neural Netw. Learn. Syst. 24 (2013) 529–541.
[31] R. Savitha, S. Suresh, N. Sundararajan, A meta-cognitive learning algorithm for a fully complex-valued relaxation network, Neural Netw. 32 (2012) 209–218.
[32] K. Subramanian, R. Savitha, S. Suresh, A metacognitive complex-valued interval type-2 fuzzy inference system, IEEE Trans. Neural Netw. Learn. Syst. 25 (9) (2014) 1659–1672.
[33] M.B. Li, G.-B. Huang, P. Saratchandran, N. Sundararajan, Fully complex extreme learning machine, Neurocomputing 68 (1–4) (2005) 306–314.
[34] S. Suresh, N. Sundararajan, P. Saratchandran, Risk-sensitive loss functions for sparse multi-category classification problems, Inf. Sci. 178 (12) (2008) 2621–2638.
[35] S. Suresh, N. Sundararajan, P. Saratchandran, A sequential multi-category classifier using radial basis function networks, Neurocomputing 71 (7–9) (2008) 1345–1358.
[36] S. Suresh, S.N. Omkar, V. Mani, T.N.G. Prakash, Lift coefficient prediction at high angle of attack using recurrent neural network, Aerosp. Sci. Technol. 7 (8) (2003) 595–602.
[37] S. Suresh, R. Savitha, N. Sundararajan, A sequential learning algorithm for complex-valued self-regulating resource allocation network (CSRAN), IEEE Trans. Neural Netw. 22 (2011) 1061–1072.
[38] N. Cristianini, J.S. Taylor, An Introduction to Support Vector Machines, Cambridge University Press, Cambridge, UK, 2000.
[39] S. Suresh, K. Dong, H.J. Kim, A sequential learning algorithm for self-adaptive resource allocation network classifier, Neurocomputing 73 (16–18) (2010) 3012–3019.
[40] G.-B. Huang, D.H. Wang, Y. Lan, Extreme learning machines: a survey, Int. J. Mach. Learn. Cybern. 2 (2011) 107–122.
[41] M. Friedman, A comparison of alternative tests of significance for the problem of m rankings, Ann. Math. Stat. 11 (1940) 86–92.
[42] J. Demšar, Statistical comparisons of classifiers over multiple data sets, J. Mach. Learn. Res. 7 (2006) 1–30.
[43] O.J. Dunn, Multiple comparisons among means, J. Am. Stat. Assoc. 56 (1961) 52–64.
[44] K. Subramanian, S. Suresh, R.V. Babu, Meta-cognitive neuro-fuzzy inference system for human emotion recognition, in: International Joint Conference on Neural Networks (IJCNN), 2012, pp. 1–7.
M. Sivachitra received the B.E. degree in Electrical and Electronics Engineering from Bharathiyar University in 2000 and the M.E. degree (Applied Electronics) from Anna University, Chennai, in 2006. She is currently pursuing research at Anna University, Chennai, in the area of complex-valued neural networks. She has a total teaching experience of 14 years. At present she is working as an Assistant Professor (Selection Grade) at Kongu Engineering College, an autonomous institution affiliated to Anna University, Chennai. She has attended international conferences and published papers in international journals, and has conducted workshops and seminars in her research area.

Ramasamy Savitha (M'11) received the B.E. degree in electrical and electronics engineering from Manonmaniam Sundaranar University, Thirunelveli, India, the M.E. degree in control and instrumentation engineering from Anna University, Chennai, India, and the Ph.D. degree from the School of Electrical and Electronics Engineering, Nanyang Technological University, Singapore, in 2000, 2002, and 2011, respectively. She was a lecturer at MSEC, affiliated to Anna University, from 2002 to 2005. She is currently a Post-Doctoral Research Fellow with the School of Computer Engineering, Nanyang Technological University. Her current research interests include neural networks, metacognition, and bioinformatics.

Sundaram Suresh received the B.E. degree in Electrical and Electronics Engineering from Bharathiyar University in 1999, and the M.E. (2001) and Ph.D. (2005) degrees in Aerospace Engineering from the Indian Institute of Science, India. He was a post-doctoral researcher in the School of Electrical Engineering, Nanyang Technological University, from 2005 to 2007. From 2007 to 2008, he was at INRIA Sophia Antipolis, France, as an ERCIM research fellow. He was at Korea University for a short period as a visiting faculty member in Industrial Engineering. From January 2009 to December 2009, he was at the Indian Institute of Technology Delhi as an Assistant Professor in the Department of Electrical Engineering. He has been an Assistant Professor in the School of Computer Engineering, Nanyang Technological University, Singapore, since 2010. His research interests include flight control, unmanned aerial vehicle design, machine learning, optimization, and computer vision.
S. Vijayachitra received the B.E. degree in Electronics and Communication Engineering from Mookambigai College of Engineering, Pudukkottai, the M.E. degree from Annamalai University, Annamalai Nagar, Chidambaram, and the Ph.D. degree from Anna University, Chennai, in 1994, 2001, and 2009, respectively. In 1994, she started her teaching career at M.P. Nachimuthu M. Jaganathan Polytechnic, Chennimalai, as a Lecturer in the Department of Electronics and Communication Engineering, and served there for about six years. Since February 2002, she has been with Kongu Engineering College, Perundurai, Erode District, Tamil Nadu, India, where she is now a Professor in the Department of Electronics and Instrumentation Engineering. She has conducted many seminars, conferences, and short-term courses funded by reputed agencies such as ICMR, DBT, CSIR, BRNS, and AICTE-ISTE. She has published 30 papers in international journals and 53 papers in conference proceedings.