
Article

DOI: 10.1111/exsy.12131

Gradient-based back-propagation dynamical iterative learning scheme for the neuro-fuzzy inference system

Hadi Chahkandi Nejad,1* Mohsen Farshad,2 Fereidoun Nowshiravan Rahatabad3 and Omid Khayat4

(1) Electrical Engineering Department, Birjand Branch, Islamic Azad University, Birjand, Iran. Email: [email protected]
(2) Department of Electrical and Computer Engineering, University of Birjand, Birjand, Iran
(3) Department of Biomedical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
(4) Young Researchers and Elite Club, South Tehran Branch, Islamic Azad University, Tehran, Iran

Abstract: In this paper, a gradient-based back propagation dynamical iterative learning algorithm is proposed for structure optimization and parameter tuning of the neuro-fuzzy system. Premise and consequent parameters of the neuro-fuzzy model are initialized randomly and then tuned by the proposed iterative algorithm. The learning algorithm is based on the first order partial derivative of the output with respect to the structure parameters, which determines the sensitivity of the model to each parameter. The sensitivity values are then used to set the tuning factors and the parameter updating step sizes. An adaptive dynamical iterative scheme is therefore obtained, which adapts the learning procedure to the current state of the performance during the optimization process. Larger tuning step sizes yield faster convergence and vice versa, so the step size of each parameter is set according to the calculated sensitivity of the model to that parameter. The proposed learning algorithm is compared with the least square back propagation method, the genetic algorithm and the chaotic genetic algorithm in neuro-fuzzy model structure optimization. Smaller mean square error and shorter learning time are sought in this paper, and the performance of the proposed learning algorithm is verified with respect to these criteria.

Keywords: gradient-descent back propagation, iterative learning scheme, adaptive neuro-fuzzy inference system, structure parameter, tuning factor

1. Introduction

An important issue in system modeling is the identification of the structure and function of a system. The aim of system identification is to identify a predefined simulation model that approximates a real world system (Abiyev & Kaynak, 2008). Hence, the process of system identification can be treated as a kind of function approximation, and it is commonly encountered in systems where a set of input–output pairs is available. Neural networks have demonstrated great potential for system modeling, even where the system dynamics is nonlinear (Lapedes & Farber, 1987; Yager, 1994; Abiyev & Kaynak, 2008). Lapedes and Farber (1987) first proposed the use of a multilayer perceptron neural network (MLP) for nonlinear time series prediction. However, conventional neural networks process signals only on their finest resolutions. Fuzzy neural networks (FNNs) are hybrid systems that combine the theories of fuzzy logic and neural networks, and thus can make effective use of the easy interpretability of fuzzy logic as well as the superior learning ability and adaptive capability of neural networks (Studer & Masulli, 2007; Qiaoa & Wang, 2008; Khayat et al., 2009; Khayat et al., 2013).


Such integration overcomes the drawbacks of fuzzy systems and renders neuro-fuzzy systems more powerful than either approach alone. FNNs are widely used in areas such as adaptive control, adaptive signal processing, nonlinear system identification (Jang, 1992; Jang et al., 1997), and nonlinear function approximation and prediction (Wang & Chen, 2007; Qiaoa & Wang, 2008). The design of FNNs consists of structure and parameter identification. Parameter identification involves determining the parameters of the premises and consequences. Structure identification comprises the partitioning of the input–output space and the determination of the number of rules required for the desired performance. Neuro-fuzzy systems have made their way into a broad span of commercial and industrial applications that require the analysis of indefinite and indecisive information (Akcayol, 2004; Chen & Linkens, 2006; Kruse et al., 2007; Studer & Masulli, 2007). Hybrid integrated neuro-fuzzy systems are a major research interest because they make use of the complementary strengths of artificial neural networks and fuzzy inference systems (Akcayol, 2004). The Adaptive Neuro-Fuzzy Inference System (ANFIS), a neuro-fuzzy based model, is used in this study,


which is a hybrid integrated neuro-fuzzy model and part of MATLAB's Fuzzy Logic Toolbox (MATLAB, 1999). A fixed number of layers is used to represent the fuzzy inference system structurally. ANFIS is among the best function approximators of the neuro-fuzzy models, and its convergence is fast compared with the other neuro-fuzzy models, although it was one of the first integrated hybrid neuro-fuzzy models (Akcayol, 2004). Besides, ANFIS affords superior results when applied without any pre-training (Altug et al., 2004). Most neuro-fuzzy inference systems are based on the Takagi–Sugeno or the Mamdani type. For model-based applications, the Takagi–Sugeno fuzzy inference system is usually used (Sugeno & Tanaka, 1991; Sugeno & Yasukawa, 1993). The Mamdani fuzzy inference system, in contrast, is used for faster heuristics but with a lower performance (Thiesing & Vornberger, 1997). There are two adaptive layers in the ANFIS structure, the first and the fourth layer. The first layer contains two modifiable matrices of parameters which shape the input Gaussian membership functions; these are the so-called premise parameters. In the fourth layer, there is a modifiable matrix of parameters pertaining to the first order polynomial; these are the so-called consequent parameters (Jang, 1992; Guler & Ubeyli, 2005). Both the premise and consequent matrices of parameters are adjusted during the learning procedure, aiming to make the ANFIS output match the training data. The least squares method can be used to identify the optimal values of these parameters easily. When the premise parameters are not fixed, the search space becomes larger and the convergence of the training becomes slower. A hybrid algorithm combining the least squares method and the gradient descent method has commonly been adopted to solve this problem. The hybrid algorithm is composed of a forward pass and a backward pass. The least squares method (forward pass) is used to optimize the consequent parameters with the premise parameters fixed. Once the optimal consequent parameters are found, the backward pass starts immediately: the gradient descent method is used to optimally adjust the premise parameters corresponding to the fuzzy sets in the input domain. The output of the ANFIS is calculated by employing the consequent parameters found in the forward pass, and the output error is used to adapt the premise parameters by means of a standard back propagation algorithm. It has been proven that this hybrid algorithm is efficient in training the ANFIS (Jang, 1992; Akcayol, 2004; Altug et al., 2004; Guler & Ubeyli, 2005; Chen & Linkens, 2006; Studer & Masulli, 2007). However, for large ANFIS models, where the number of input variables and/or the number of fuzzy sets per input variable rises, the number of ANFIS parameters increases greatly, which calls for a more efficient and faster learning procedure for the model. In this paper, a learning algorithm is proposed for parameter identification and optimization of the structure of an adaptive neuro-fuzzy inference system (ANFIS).


A gradient-based back propagation algorithm with dynamical adjustment of the learning coefficients in an iterative scheme is proposed for structure parameter tuning and optimization of the neuro-fuzzy model. Fast convergence and accurate tuning of the structure parameters are sought in this proposal. Some benchmark multivariable nonlinear functions are used to verify the efficiency of the proposed learning algorithm in comparison with commonly used learning methods.
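As a rough illustration of the classic hybrid ANFIS training loop described above, the sketch below performs one training epoch: a forward pass that solves the consequent parameters by least squares with the premise parameters frozen, followed by a backward pass that nudges the premise parameters down the gradient of the squared output error. All names, array shapes and the finite-difference gradient (a stand-in for the analytic back propagation) are illustrative assumptions, not the authors' implementation, which was written in MATLAB.

```python
# Hedged sketch of one hybrid ANFIS epoch (forward LSE pass + backward
# gradient-descent pass). Shapes, names and the numerical gradient are
# assumptions made only for illustration.
import numpy as np
from itertools import product

def firing_strengths(X, C, delta):
    """Normalized firing strengths for a full grid of NR Gaussian MFs per input.
    X: (N, D) samples; C, delta: (D, NR) premise centers and widths."""
    D, NR = C.shape
    mf = np.exp(-((X[:, :, None] - C) ** 2) / (2.0 * delta ** 2))     # (N, D, NR)
    rules = list(product(range(NR), repeat=D))                         # NR**D rules
    W = np.stack([np.prod(mf[:, np.arange(D), r], axis=1) for r in rules], axis=1)
    return W / (W.sum(axis=1, keepdims=True) + 1e-12)                  # (N, R)

def hybrid_epoch(X, y, C, delta, lr=1e-3, h=1e-5):
    """One forward (least squares) + backward (gradient descent) pass."""
    N, D = X.shape
    Wbar = firing_strengths(X, C, delta)                               # (N, R)
    # Forward pass: output is linear in the consequent parameters, so the
    # optimal consequents follow from an ordinary least squares solve.
    Phi = np.concatenate([Wbar[:, :, None] * X[:, None, :],
                          Wbar[:, :, None]], axis=2).reshape(N, -1)    # (N, R*(D+1))
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)

    def mse(Cc, dd):
        Wb = firing_strengths(X, Cc, dd)
        Ph = np.concatenate([Wb[:, :, None] * X[:, None, :],
                             Wb[:, :, None]], axis=2).reshape(N, -1)
        return np.mean((Ph @ theta - y) ** 2)

    # Backward pass: move the premise parameters against a numerically
    # estimated gradient of the error (the standard method back-propagates
    # the error analytically; finite differences keep this sketch short).
    for P in (C, delta):
        grad = np.zeros_like(P)
        for idx in np.ndindex(P.shape):
            orig = P[idx]
            P[idx] = orig + h; e_plus = mse(C, delta)
            P[idx] = orig - h; e_minus = mse(C, delta)
            P[idx] = orig
            grad[idx] = (e_plus - e_minus) / (2.0 * h)
        P -= lr * grad
    return C, delta, theta
```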

2. Adaptive neuro-fuzzy inference system

2.1. Structure

Once the ANFIS structure is set and learnt, the parameters are optimized and the network can be employed to approximate a function and its partial derivatives. Because the accuracy of approximating a function and its partial derivatives is closely influenced by the optimization of the network structure, an adaptive learning scheme is developed to tune the structure parameters according to their impact on the output. In this scheme, the sensitivity of the output with respect to each parameter determines the updating step size and the tuning factor. The ANFIS structure has been described in detail in several studies (Jang, 1992; Akcayol, 2004; Altug et al., 2004; Guler & Ubeyli, 2005; Chen & Linkens, 2006; Kruse et al., 2007; Studer & Masulli, 2007; Khayat et al., 2013; Khayat, 2014). The first order derivative of the ANFIS structure was proposed by Khayat et al. (2013) and Khayat (2014). For convenience of interpretation, the necessary expressions for the first order derivative of the ANFIS structure are also given in this paper (Khayat et al., 2013).

The structure of interest has five layers: layer 1, the fuzzification layer; layer 2, the fuzzy intersection layer; layer 3, the rule firing normalization layer; layer 4, the rule contribution calculation layer; and layer 5, the rule activation layer. The mathematical expressions of the five layers can be given as follows. It should be noted that some notations and mathematical expressions follow the MATLAB programming language; detailed explanations are given accordingly.

Assume an input vector X_i from a set of observations {X_i} is applied to the first layer of the network. A number of N_R Gaussian membership functions are set for each element x_i of the D-dimensional input vector X_i. The output of the first layer is

\[
Y_1(i,j) = f^{D}_{i,j}\left(x_i;\, C_{ij},\, \delta_{ij}\right) = \exp\!\left(-\frac{(x_i - C_{ij})^2}{2\,\delta_{ij}^2}\right) \qquad (1)
\]

where C_{ij} and δ_{ij} are the center and standard deviation of the Gaussian functions. These scalars are elements of the center and standard deviation vectors, which form the premise parameters of the neuro-fuzzy model.

The output of layer 2 is a column vector whose ((i_{d1} - 1)N_R + i_{d2})-th element is the product of the i_{d1}-th rule of the d1-th input variable and the i_{d2}-th rule of the d2-th input variable. The general description of the second layer's output is

\[
Y_2(r) = \prod_{i=1}^{D} f^{i,j_i}_{D}\left(x_i;\, C^{i}_{j_i},\, \delta^{i}_{j_i}\right), \qquad j_i = 1:N_R \qquad (2)
\]

in which r = \sum_{i=1}^{D} (j_i - 1)\, N_R^{D-i} + 1 and r = 1:N_R^{D} for the case where all input variables have the same number of fuzzy membership functions. It should be noted that the symbol ':' denotes the range of a variable.

The third layer, which acts as a rule firing normalization layer, gives

\[
Y_3(r) = \frac{w_r\, Y_2(r)}{\sum_{r=1}^{r_{\max}} w_r}, \qquad r = 1:r_{\max} \qquad (3)
\]

where w_r, r = 1:r_{\max}, is the row vector of weights applied to the nodes of layer 2 for rule firing normalization.

The output vector of the fourth layer can be calculated simply as the product of the normalized firing strength vector and a first order polynomial (for a first order Sugeno model):

\[
Y_4(r) = Y_3(r)\,\psi\!\left(Y_3(r)\right), \qquad r = 1:r_{\max} \qquad (4)
\]

where ψ(Y_3(r)) is a linear function in the first order Sugeno model. The parameters of this linear function constitute the consequent parameters of the neuro-fuzzy model.

The single node of layer five performs a simple summation:

\[
Y_5 = \sum_{r=1}^{r_{\max}} Y_4(r) \qquad (5)
\]

The output of the network is a scalar value which is a nonlinear function of the premise parameters and a linear function of the consequent parameters. Hence, the contribution of each structure parameter can be computed by calculating the first order partial derivative of the output with respect to that parameter.
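To make the layer-by-layer computation above concrete, the following is a minimal sketch of the forward pass of equations (1)-(5) for a first order Sugeno model with a full grid of N_R^D rules. The variable names (C, delta, w, A, b) and the array layout are assumptions introduced for illustration; layer 3 follows equation (3) as reconstructed above.

```python
# Minimal sketch of the five-layer forward pass of equations (1)-(5);
# names and shapes are assumptions, not the paper's notation.
import numpy as np
from itertools import product

def anfis_output(x, C, delta, w, A, b):
    """x: (D,) input; C, delta: (D, NR) premise centers/widths;
    w: (NR**D,) layer-3 weights; A: (NR**D, D), b: (NR**D,) consequent params."""
    D, NR = C.shape
    # Layer 1: Gaussian fuzzification, equation (1)
    Y1 = np.exp(-((x[:, None] - C) ** 2) / (2.0 * delta ** 2))        # (D, NR)
    # Layer 2: fuzzy intersection, product of one MF per input, equation (2)
    Y2 = np.array([np.prod(Y1[np.arange(D), j])
                   for j in product(range(NR), repeat=D)])             # (NR**D,)
    # Layer 3: weighted rule firing normalization, equation (3)
    Y3 = w * Y2 / np.sum(w)
    # Layer 4: rule contribution with first order Sugeno consequent, equation (4)
    Y4 = Y3 * (A @ x + b)
    # Layer 5: summation, equation (5)
    return np.sum(Y4)

# Tiny usage example: D = 2 inputs, NR = 3 membership functions -> 9 rules
rng = np.random.default_rng(0)
D, NR = 2, 3
y5 = anfis_output(rng.normal(size=D),
                  rng.normal(size=(D, NR)), np.ones((D, NR)),
                  np.ones(NR ** D),
                  rng.normal(size=(NR ** D, D)), np.zeros(NR ** D))
```

With D = 4 inputs and N_R = 3 membership functions per input, this rule grid gives the 81 rules used in the simulation study of Section 3.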

2.2. Learning procedure

The contribution of each parameter to the network performance determines the tuning factor for that parameter:

\[
\zeta_P = \mu_P \left.\frac{\partial Y_5}{\partial P}\right|_{x_i} \qquad (6)
\]

in which ζ_P is interpreted as the structure sensitivity with respect to parameter P, and μ_P is the sensitivity coefficient (initially set to unit value). Parameter P refers to the premise and consequent parameters of the model, which are determined and tuned during the learning procedure. The learning scheme for the structure is therefore proposed as an iterative parameter adjustment process, starting from a random initialization and proceeding with variable step-size tuning towards the globally optimal structure.

If parameter P refers to the premise parameters, then equation (6) can be rewritten as

\[
\zeta_P^{\mathrm{Premise}} = \mu_P \frac{\partial Y_5}{\partial P}
= \mu_P \left.\frac{\partial Y_5}{\partial Y_4(r)}\;\frac{\partial Y_4(r)}{\partial Y_3(r)}\;\frac{\partial Y_3(r)}{\partial Y_2(r)}\;\frac{\partial Y_2(r)}{\partial Y_1(i,j)}\;\frac{\partial Y_1(i,j)}{\partial P}\right|_{x_i} \qquad (7)
\]

Otherwise, if parameter P refers to the consequent parameters, it can be written as

\[
\zeta_P^{\mathrm{Consequent}} = \mu_P \frac{\partial Y_5}{\partial P}
= \mu_P \left.\frac{\partial Y_5}{\partial Y_4(r)}\;\frac{\partial Y_4(r)}{\partial P}\right|_{x_i} \qquad (8)
\]

Note that both equations (7) and (8) depend on the input vector and can be determined using equations (1)-(5). Suppose that in iteration t the output Y_5^t tends to the desired value \hat{Y}_5 as the parameter P tends to its optimum value \hat{P}. Therefore, it can be expressed as

\[
\Delta P^{t}\, \zeta_P^{t} = \Delta Y_5^{t} \qquad (9)
\]

which results in

\[
\left|\hat{P} - P^{t}\right| \zeta_P^{t} = \left|\hat{Y}_5 - Y_5^{t}\right| \qquad (10)
\]

On the other hand, for the previous iteration (t - 1), the difference between the desired output and the calculated output corresponding to the parameter state P^{t-1} can be written as

\[
\Delta P^{t-1}\, \zeta_P^{t-1} = \Delta Y_5^{t-1} \qquad (11)
\]

For convergence to the global optimum, the current output difference should be smaller than the difference in the previous iteration:

\[
\Delta Y_5^{t} < \Delta Y_5^{t-1} \qquad (12)
\]

Hence

\[
\left|\hat{P} - P^{t}\right| \zeta_P^{t} < \left|\hat{P} - P^{t-1}\right| \zeta_P^{t-1} \qquad (13a)
\]

or

\[
\left|\hat{P} - P^{t}\right| \zeta_P^{t} = \sigma\, \Delta Y_5^{t} \qquad (13b)
\]

Using equations (13a) and (13b), the following update relation is obtained:

\[
P^{t+1} = P^{t} + \frac{\sigma\, \Delta Y_5^{t}}{\zeta_P^{t} + \varepsilon} \qquad (14)
\]

in which σ and ε are the parameter updating step coefficient and the tuning threshold, respectively. Applying equation (14) for parameter updating, or structure learning, embodies two main principles. One is the effect of the tuning factor ζ_P on the parameter updating rate, which is an inverse relation: higher sensitivity of the structure to a parameter P results in a smaller updating step for that parameter. The other principle in equation (14) is the impact of the difference between the output and the desired value: outputs closer to the desired values also result in smaller updating steps for the parameter tuning. Fast convergence to the optimum structure with a dynamic learning procedure is the key feature of the proposed learning scheme. Parameters σ and ε are initially set to σ = 0.1 and ε = 0.1 ζ_P^t at t = 0, respectively.
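A hedged sketch of one iteration of the proposed scheme is given below. The paper evaluates the sensitivity ζ_P analytically through the chain rules of equations (7) and (8); here a central finite difference stands in for those derivatives to keep the sketch short, ε is taken as 0.1 ζ_P at every step rather than fixed at t = 0, and a tiny constant is added to the denominator of equation (14) purely as a numerical guard. The generic forward(params, x) interface and all names are assumptions.

```python
# Hedged sketch of one GBPDI iteration: sensitivity of equation (6) by a
# finite difference (stand-in for equations (7)-(8)), then the adaptive
# step of equation (14) for every scalar parameter.
import numpy as np

def gbpdi_iteration(forward, params, x, y_target, sigma=0.1, mu=1.0, h=1e-6):
    """forward(params, x) -> Y5; params: dict name -> float numpy array."""
    y5 = forward(params, x)
    dY5 = y_target - y5                              # output difference Delta Y5^t
    new_params = {k: v.copy() for k, v in params.items()}
    for name, arr in params.items():
        for idx in np.ndindex(arr.shape):
            # sensitivity zeta_P = mu_P * dY5/dP at the current input, equation (6)
            orig = arr[idx]
            arr[idx] = orig + h; y_plus = forward(params, x)
            arr[idx] = orig - h; y_minus = forward(params, x)
            arr[idx] = orig
            zeta = mu * abs(y_plus - y_minus) / (2.0 * h)
            eps = 0.1 * zeta                         # tuning threshold (assumed per-step)
            # equation (14): smaller steps for sensitive parameters and small errors;
            # the 1e-12 term is only a numerical guard against a zero denominator
            new_params[name][idx] = orig + sigma * dY5 / (zeta + eps + 1e-12)
    return new_params
```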

3. Simulation study

In this section, the approximation capability of the neuro-fuzzy inference system learnt by the proposed learning algorithm is investigated for some benchmark mathematical functions. The proposed structure parameter tuning algorithm is compared with some other learning algorithms in terms of the accuracy of nonlinear function approximation. Four nonlinear multivariable mathematical functions, commonly used as benchmarks, are used to verify the convergence speed and structure optimization capability of the proposed learning algorithm.

3.1. Nonlinear mathematical functions

The Mackey–Glass time series is the first case of comparison. A total of 1000 input–output data pairs of the form

\[
[x(t-18),\; x(t-12),\; x(t-6),\; x(t);\; x(t+6)] \qquad (15)
\]

are extracted from the following delay differential equation

\[
\frac{dx(t)}{dt} = \frac{0.2\, x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\, x(t) \qquad (16)
\]

for which τ = 17 and x(0) = 1.2. This means the embedding dimension and the lag are 4 and 6, respectively. The first 500 pairs are the training data set, while the remaining 500 pairs are the testing data set.
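For illustration only, the following sketch generates the Mackey–Glass pairs of equations (15) and (16) by integrating the delay differential equation with a simple Euler scheme and assembling the four-dimensional embedding with lag 6 and target x(t + 6); the integration step, warm-up handling and sampling are assumptions, as the paper does not state them.

```python
# Illustrative generation of the Mackey-Glass data pairs; step size and
# warm-up are assumptions made only for this sketch.
import numpy as np

def mackey_glass(n_points=1500, tau=17, dt=1.0, x0=1.2):
    history = int(tau / dt)
    x = np.full(n_points + history, x0)          # constant warm-up history
    for t in range(history, n_points + history - 1):
        x_tau = x[t - history]
        x[t + 1] = x[t] + dt * (0.2 * x_tau / (1.0 + x_tau ** 10) - 0.1 * x[t])
    return x[history:]

series = mackey_glass()
t = np.arange(18, 1000 + 18)                     # 1000 samples, as in the paper
inputs = np.stack([series[t - 18], series[t - 12],
                   series[t - 6], series[t]], axis=1)
targets = series[t + 6]
train_X, test_X = inputs[:500], inputs[500:]     # first 500 pairs train, rest test
train_y, test_y = targets[:500], targets[500:]
```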

The Rossler map is considered as the second case:

\[
\dot{x}_1 = -x_2 - x_3, \qquad \dot{x}_2 = x_1 + a x_2, \qquad \dot{x}_3 = b + x_3 (x_1 - c) \qquad (17)
\]

where x_i, for i = 1, 2, 3, is the state variable of the system, and a, b and c are positive constants set in this paper to a = 0.15, b = 0.2 and c = 10. We use it as a chaotic benchmark function to test our model. We estimated the Rossler map and the matrix of its first order partial derivatives based on 1000 observations with a sample rate of 0.1.

The third example is another well-known benchmark attractor, the Lorenz attractor, which is a three-dimensional continuous-time system

\[
\dot{x} = a(y - x), \qquad \dot{y} = x(b - z) - y, \qquad \dot{z} = xy - cz \qquad (18)
\]

where a, b and c are parameters set to a = 16, b = 45.92 and c = 4, and the approximation is performed for 1000 observations.

As the fourth function, a system can be described as

\[
y(t+1) = \frac{y(t)\, y(t-1)\, [y(t) + 2.5]}{1 + y^2(t) + y^2(t-1)} + u(t), \qquad y(0) = 0, \quad y(1) = 0, \quad u(t) = \sin\!\left(\frac{2\pi t}{25}\right), \quad t \in [1, 200] \qquad (19)
\]

The model is identified in series-parallel mode defined as

\[
\hat{y}(t+1) = f\left(y(t),\, y(t-1),\, u(t)\right) \qquad (20)
\]

It is a three-input–single-output fuzzy model. There are 200 input–target data pairs chosen as training data, and another 200 input–target pairs in the interval are chosen as the testing data.

3.2. Function modeling

In the first part of the experiment, the neuro-fuzzy model with four different learning algorithms is employed to approximate the aforementioned functions. Least square back propagation (LSBP), the genetic algorithm (GA), the chaotic genetic algorithm (CGA) and the proposed gradient-based back propagation dynamical iterative (GBPDI) algorithm are the cases of comparison for approximation of the Mackey–Glass and noisy Mackey–Glass time series. The LSBP, GA and CGA parameters are similar to those implemented in the work of Khayat (2014). Table 1 gives the structure parameters and the mean square error (MSE) of the training and testing data for the structures. It shows the results of Mackey–Glass time series approximation by the neuro-fuzzy model with the four different learning methods, evaluated on the mean square error of the training (first 500 pairs) and testing (remaining 500 pairs) sample approximations. Noisy data are also created with normal noise to investigate the model sensitivity to rapid input variations. The neuro-fuzzy model is constructed with three membership functions (Gaussian or trapezoidal) for each input variable; the total number of rules is then 81. Training epochs for the LSBP and GBPDI algorithms and population/generation sizes for the genetic and chaotic genetic algorithms are also given in Table 1. Learning time, as an important factor in online computations, is considered as a criterion in this part of the experiment.

The results given in Table 1 demonstrate two capabilities of the GBPDI algorithm compared with the other methods. One is the accuracy of fitting to the data pairs, shown as the MSE. The lower errors attained by the proposed method compared with the other learning algorithms, with the same structure employed, demonstrate the efficiency of the learning method in structure optimization and parameter tuning for the model. The other capability of the proposed learning algorithm is its shorter computation time. These capabilities are more apparent in the results shown in Table 2.
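Before turning to the results, the sketch below illustrates how the fourth benchmark of equation (19) might be simulated and how its series-parallel pairs of equation (20) could be split into the 200 training and 200 testing samples mentioned above; the simulation length and indexing conventions are assumptions.

```python
# Illustrative simulation of the fourth benchmark system (equation (19))
# and assembly of the series-parallel pairs (y(t), y(t-1), u(t)) -> y(t+1)
# of equation (20). The 400-step horizon and the 200/200 split are assumed.
import numpy as np

T = 400
u = np.sin(2.0 * np.pi * np.arange(T + 1) / 25.0)   # u(t) = sin(2*pi*t/25)
y = np.zeros(T + 2)                                  # y(0) = y(1) = 0
for t in range(1, T + 1):
    y[t + 1] = (y[t] * y[t - 1] * (y[t] + 2.5)) / (1.0 + y[t] ** 2 + y[t - 1] ** 2) + u[t]

X = np.stack([y[1:T + 1], y[0:T], u[1:T + 1]], axis=1)   # inputs (y(t), y(t-1), u(t))
target = y[2:T + 2]                                       # output y(t+1)
train_X, train_y = X[:200], target[:200]
test_X, test_y = X[200:], target[200:]
```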


Table 1: Results of the neuro-fuzzy model with four different learning algorithms for the Mackey–Glass and noisy Mackey–Glass time series with normal noise N(0, 0.05). (P: population, G: generation)

Structure | Learning algorithm | Membership function | Parameters (D, NR, rmax, Tmax / P, G) | MSE of train | MSE of test | MSE with noise | Learning time (s)
ANFIS | LSBP  | Gaussian    | (4,3,81,400)    | 4.189e-07 | 2.003e-06 | 2.008e-06 | 24
ANFIS | LSBP  | Gaussian    | (4,3,81,600)    | 4.021e-07 | 1.318e-06 | 1.452e-06 | 34
ANFIS | LSBP  | Trapezoidal | (4,3,81,400)    | 1.514e-06 | 2.922e-06 | 2.925e-06 | 26
ANFIS | LSBP  | Trapezoidal | (4,3,81,600)    | 8.451e-07 | 1.021e-06 | 1.215e-06 | 35
ANFIS | GA    | Gaussian    | (4,3,81,50,200) | 1.021e-06 | 1.318e-06 | 1.419e-06 | 61
ANFIS | GA    | Gaussian    | (4,3,81,50,600) | 4.419e-07 | 5.718e-07 | 5.924e-07 | 87
ANFIS | GA    | Trapezoidal | (4,3,81,50,200) | 3.329e-06 | 4.914e-06 | 6.012e-06 | 63
ANFIS | GA    | Trapezoidal | (4,3,81,50,600) | 7.208e-07 | 9.017e-07 | 9.518e-07 | 89
ANFIS | CGA   | Gaussian    | (4,3,81,50,200) | 8.951e-07 | 9.414e-07 | 9.756e-07 | 50
ANFIS | CGA   | Gaussian    | (4,3,81,50,600) | 2.201e-07 | 2.656e-07 | 2.854e-07 | 71
ANFIS | CGA   | Trapezoidal | (4,3,81,50,200) | 9.604e-07 | 9.894e-07 | 1.135e-06 | 54
ANFIS | CGA   | Trapezoidal | (4,3,81,50,600) | 5.258e-07 | 7.589e-07 | 8.160e-07 | 73
ANFIS | GBPDI | Gaussian    | (4,3,81,400)    | 4.012e-07 | 4.215e-07 | 4.416e-07 | 15
ANFIS | GBPDI | Gaussian    | (4,3,81,600)    | 9.981e-08 | 1.210e-07 | 1.413e-07 | 21
ANFIS | GBPDI | Trapezoidal | (4,3,81,400)    | 6.615e-07 | 7.490e-07 | 7.776e-07 | 17
ANFIS | GBPDI | Trapezoidal | (4,3,81,600)    | 2.085e-07 | 2.156e-07 | 2.328e-07 | 23

Table 2: Convergence speed and learning time for target error in approximation of four mathematical functions by the ANFIS structure with four different learning algorithms

Function | Parameters (D, NR, rmax) | MSE after 10 s | MSE after 20 s | MSE after 50 s | Time to MSE = 1e-05 | Time to MSE = 1e-06

ANFIS structure + LSBP learning algorithm
Function 1 | (4,3,81) | 8.589e-06 | 3.521e-06 | 8.664e-07 |  8.2 s | 17.6 s
Function 2 | (4,3,81) | 9.216e-06 | 6.541e-06 | 9.298e-07 |  9.1 s | 24.8 s
Function 3 | (4,3,81) | 9.051e-06 | 6.055e-06 | 8.716e-07 |  9.0 s | 21.2 s
Function 4 | (4,3,81) | 8.848e-06 | 4.841e-06 | 8.698e-07 |  8.4 s | 19.6 s

ANFIS structure + GA learning algorithm
Function 1 | (4,3,81) | 5.692e-05 | 3.148e-05 | 5.695e-06 | 37.1 s | 87.6 s
Function 2 | (4,3,81) | 9.154e-05 | 7.524e-05 | 8.654e-06 | 44.5 s | 94.6 s
Function 3 | (4,3,81) | 8.545e-05 | 6.326e-05 | 7.407e-06 | 41.6 s | 93.7 s
Function 4 | (4,3,81) | 6.066e-05 | 4.418e-05 | 6.021e-06 | 39.8 s | 90.8 s

ANFIS structure + CGA learning algorithm
Function 1 | (4,3,81) | 2.624e-05 | 7.518e-06 | 9.414e-07 | 14.2 s | 72.6 s
Function 2 | (4,3,81) | 6.518e-05 | 9.959e-06 | 1.524e-06 | 19.5 s | 77.9 s
Function 3 | (4,3,81) | 4.159e-05 | 9.241e-06 | 1.021e-06 | 17.7 s | 74.6 s
Function 4 | (4,3,81) | 2.974e-05 | 8.124e-06 | 9.897e-07 | 15.2 s | 74.4 s

ANFIS structure + GBPDI learning algorithm
Function 1 | (4,3,81) | 9.415e-07 | 1.202e-07 | 1.114e-07 |  3.1 s |  9.7 s
Function 2 | (4,3,81) | 9.812e-07 | 2.894e-07 | 3.365e-07 |  6.5 s | 11.2 s
Function 3 | (4,3,81) | 9.704e-07 | 2.659e-07 | 3.265e-07 |  5.8 s | 10.5 s
Function 4 | (4,3,81) | 9.515e-07 | 2.154e-07 | 1.907e-07 |  4.2 s | 10.1 s

The lowest error achieved in a preset time and the computation time required to achieve a preset error better show the convergence speed of the learning algorithms. Because the aim is to compare the performance of the learning methods, the results can be given in arbitrary units and the particular hardware used is not essential. It should nevertheless be noted that all implementations were performed in MATLAB R2008a (MATLAB, 1999) on an AMD FX 6100 six-core processor (1.39 GHz) with 3.25 GB of RAM, and the time scale is in seconds. The results given in Table 2 make the algorithms directly comparable under the same conditions (the same structures and the same targets). In short preset time durations, the error achieved by the proposed learning algorithm is almost an order of magnitude smaller than the errors achieved by LSBP, and almost two orders of magnitude smaller than the errors achieved by the GA-based algorithms. In longer preset time durations the superiority is still preserved. For the preset error targets, the computation times required by the proposed learning algorithm to optimize the neuro-fuzzy structure and to tune the structure parameters are considerably shorter than those of the other algorithms. Hence, it can be stated that in structure optimization of neuro-fuzzy models, the gradient-based back propagation dynamical iterative learning algorithm outperforms the GA-based and least square back propagation algorithms.


Table 3: Parameter investigation of the proposed learning algorithm

Structure | Learning algorithm | Parameters (Tmax, μp, σ, ε) | MSE of test | Learning time
ANFIS | GBPDI | (400, 1, 0.1, 0.1 ζp0) | 4.215e-07 | 15 s
ANFIS | GBPDI | (600, 1, 0.1, 0.1 ζp0) | 1.210e-07 | 21 s
ANFIS | GBPDI | (400, 1, 0.4, 0.1 ζp0) | 5.512e-07 | 15 s
ANFIS | GBPDI | (600, 1, 0.4, 0.1 ζp0) | 1.227e-07 | 21 s
ANFIS | GBPDI | (400, 1, 0.1, 0.5 ζp0) | 7.418e-07 | 15 s
ANFIS | GBPDI | (600, 1, 0.1, 0.5 ζp0) | 3.248e-07 | 21 s

For the proposed learning algorithm, some parameters were defined which govern the convergence speed and the step size of the tuning process. Default values were determined in Section 2 when these parameters were defined; however, a change in those parameters can modify the results, as shown in Table 3. The sensitivity coefficient μ_P is set to unit value by default for simplicity, and its effect can be compensated by the two other parameters. The parameter updating step coefficient σ determines the step size of the parameter update. Lower σ values with a larger number of training epochs result in more accurate tuning of the parameters. On the contrary, higher σ values with a smaller number of training epochs attain a faster convergence. Faster convergence here means that more accurate results are obtained without increasing the running time of the learning procedure. However, this holds only if both parameters are chosen appropriately and very large σ values do not lead to jumping away from the optimal points. A compromise should therefore be set between the σ value and the number of training epochs. As shown in Table 3, the default values of the parameters (μ_P, σ, ε) give a satisfactory result.
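For reference, the settings investigated in Table 3 can be written as a simple parameter sweep; the dictionary keys are illustrative names (eps_scale denotes the multiple of ζ_P^0 used for ε) and the loop body is only a placeholder for running the GBPDI training with each configuration.

```python
# Parameter sweep mirroring Table 3: (T_max, mu, sigma, eps as a multiple of
# the initial sensitivity zeta_P^0). Illustrative only.
configs = [
    {"T_max": 400, "mu": 1.0, "sigma": 0.1, "eps_scale": 0.1},
    {"T_max": 600, "mu": 1.0, "sigma": 0.1, "eps_scale": 0.1},
    {"T_max": 400, "mu": 1.0, "sigma": 0.4, "eps_scale": 0.1},
    {"T_max": 600, "mu": 1.0, "sigma": 0.4, "eps_scale": 0.1},
    {"T_max": 400, "mu": 1.0, "sigma": 0.1, "eps_scale": 0.5},
    {"T_max": 600, "mu": 1.0, "sigma": 0.1, "eps_scale": 0.5},
]
for cfg in configs:
    # placeholder: run the GBPDI learning with this configuration and record
    # the test MSE and learning time, as in Table 3
    print(cfg)
```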

4. Conclusion

In this paper a learning algorithm was proposed for structure parameter tuning of the adaptive neuro-fuzzy inference system for nonlinear function approximation. A gradient-based back propagation algorithm was used to adjust the model parameters towards lower fitting errors. The learning procedure dynamically adjusts the tuning coefficients, namely the tuning step size and the sensitivity coefficient, on the basis of the sensitivity of the output to the model parameters. Faster convergence to the optimum structure and more accurate fitting to the discrete data pairs are the key features we sought in our proposal. The implementations showed that the proposed learning algorithm is superior to the LSBP and GA-based learning methods in terms of learning time and approximation error of the model. However, for complex neuro-fuzzy structures with large input dimensions and a large number of fuzzy sets per input variable, the main issue of the learning procedure remains the computation time. Therefore, the authors are motivated to use a sensitivity-analyzed model, conceptually similar to the one presented in this paper, to develop a learning process with higher convergence speed and lower computational complexity, particularly designed for high-dimensional ANFIS networks.

References

ABIYEV, R.H. and KAYNAK, O. (2008) Fuzzy wavelet neural networks for identification and control of dynamic plants—a novel structure and a comparative study, IEEE Transactions on Industrial Electronics, 55, 3133–3140.
LAPEDES, A. and FARBER, R. (1987) Nonlinear signal processing using neural networks: prediction and system modeling, Los Alamos National Laboratory Technical Report, LA-UR-87-2662.
YAGER, R.R. and ZADEH, L.A. (Eds.) (1994) Fuzzy Sets, Neural Networks and Soft Computing, New York: Van Nostrand Reinhold.
KHAYAT, O., EBADZADEH, M.M., SHAHDOOSTI, H.R., RAJAEI, R. and KHAJEHNASIRI, I. (2009) A novel hybrid algorithm for creating self-organizing fuzzy neural networks, Neurocomputing, 73, 517–524.
KHAYAT, O., NEJAD, H.C., RAHATABAD, F.N. and ABADI, M.M. (2013) Differentiating adaptive neuro-fuzzy inference system for accurate function derivative approximation, Neurocomputing, 103, 232–238.
QIAOA, J. and WANG, H. (2008) A self-organizing fuzzy neural network and its applications to function approximation and forecast modeling, Neurocomputing, 71, 564–569.
MATLAB (1999) Users Guide: Fuzzy Logic Toolbox, The MathWorks Inc., Natick, MA.
JANG, J.S.R., SUN, C.T. and MIZUTANI, E. (1997) Neuro-Fuzzy and Soft Computing, Englewood Cliffs, NJ: Prentice-Hall.
JANG, J.S.R. (1992) Self-learning fuzzy controllers based on temporal back propagation, IEEE Transactions on Neural Networks, 3(5), 714–723.
WANG, W.P. and CHEN, Z. (2007) A neuro-fuzzy based forecasting approach for rush order control applications, Expert Systems with Applications.
AKCAYOL, M.A. (2004) Application of adaptive neuro-fuzzy controller for SRM, Advances in Engineering Software, 35, 129–137.
CHEN, M.Y. and LINKENS, D.A. (2006) A systematic neuro-fuzzy modeling framework with application to material property prediction, IEEE Transactions on Systems, Man and Cybernetics, Part B, 31, 781–790.
KRUSE, V., NAUCK, D., NURNBERGER, A. and MERZ, L. (2007) A neuro-fuzzy development tool for fuzzy controllers under MATLAB/SIMULINK, in Proceedings of the Fifth European Congress on Intelligent Techniques and Soft Computing (EUFIT'97), Aachen, Germany, 1029–1033.
STUDER, L. and MASULLI, F. (2007) Building a neuro-fuzzy system to efficiently forecast chaotic time series, Nuclear Instruments and Methods in Physics Research Section A, 389, 264–667.
ALTUG, S., CHOW, M.Y. and TRUSSELL, H.J. (2004) Fuzzy inference systems implemented on neural architectures for motor fault detection and diagnosis, IEEE Transactions on Industrial Electronics, 46, 1069–1079.
SUGENO, M. and TANAKA, K. (1991) Successive identification of a fuzzy model and its applications to predictions of complex systems, Fuzzy Sets and Systems, 42, 315–334.
SUGENO, M. and YASUKAWA, T. (1993) A fuzzy-logic-based approach to qualitative modeling, IEEE Transactions on Fuzzy Systems, 1, 7–31.
THIESING, F.M. and VORNBERGER, O. (1997) Sales forecasting using neural networks, International Conference on Neural Networks, 4, 2125–2128.
GULER, I. and UBEYLI, D.E. (2005) Adaptive neuro-fuzzy inference system for classification of EEG signals using wavelet coefficients, Journal of Neuroscience Methods, 148, 113–121.
KHAYAT, O. (2014) Structural parameter tuning of the first-order derivative of an adaptive neuro-fuzzy system for chaotic function modeling, Journal of Intelligent and Fuzzy Systems, 27, 235–245.

The authors

Hadi Chahkandi Nejad
Hadi Chahkandi Nejad received the BSc and MSc degrees in Electrical Engineering from I.A.U. (Birjand and Gonabad), Iran, in 2007 and 2010, respectively. He received his PhD degree in Electrical Engineering from Birjand University, Iran, in 2016. He is currently a faculty member of the Electrical Engineering Department of Islamic Azad University, Birjand Branch. His research interests are in Fuzzy Control, Adaptive Control, System Identification, Image Processing, Artificial Neural Networks, Evolutionary Algorithms and Nonlinear Optimization.

Mohsen Farshad
Mohsen Farshad was born in Birjand, Iran, in 1967. He received the BSc degree in Electrical Engineering from Sharif University of Technology, Tehran, Iran, in 1990 and the MSc degree in Electrical Engineering from the University of Tehran, Tehran, Iran, in 1994. He received the PhD degree from the Department of Electrical and Computer Engineering, University of Tehran, in 2006. His teaching and research interests are design, modeling and control of electrical machines and drives, system identification, and intelligent modeling and control. He is an assistant professor at the Department of Power Engineering, Faculty of Electrical and Computer Engineering, University of Birjand, Birjand, Iran.

Fereidoun Nowshiravan Rahatabad
Fereidoun Nowshiravan Rahatabad received his Master and PhD degrees in Biomedical Engineering from the Science and Research Branch, Islamic Azad University, Tehran, Iran. He is currently a faculty member of the Biomedical Engineering Department, Science and Research Branch, Islamic Azad University. His research interests include Biological Signal Processing, Artificial Neural Networks and Biomedical Image Processing.

Omid Khayat
Omid Khayat received his Bachelor and Master degrees in Biomedical Engineering and Nuclear Engineering from Amirkabir University of Technology, Iran, in 2008 and 2011, respectively. He is currently a PhD student in Nuclear Engineering at Amirkabir University of Technology. His research interests include Evolutionary Computation, Artificial Neural Networks, Chaotic Dynamical Systems and Fuzzy Logic.


