Research of load identification based on multiple-input multiple-output SVM model selection W Mao1*, M Tian2, and G Yan3 1 College of Computer and Information Technology, Henan Normal University, Xinxiang City, People’s Republic of China 2 Management Institute, Xinxiang Medical University, Xinxiang City, People’s Republic of China 3 MOE Key Laboratory of Strength and Vibration, Xi’an Jiaotong University, Xi’an, People’s Republic of China The manuscript was received on 23 March 2011 and was accepted after revision for publication on 24 August 2011. DOI: 10.1177/0954406211423454
Abstract: In this article, the problem of multiple-input multiple-output (MIMO) load identification is addressed. First, load identification is shown, from dynamic theory, to be a non-linear MIMO black-box modelling process. Second, considering the effect of hyper-parameters in small-size sample problems, a new MIMO support vector machine (SVM) model selection method based on multi-objective particle swarm optimization is proposed in order to improve the identification performance. The proposed method treats the model selection of MIMO SVM as a multi-objective optimization problem, in which the leave-one-out generalization errors of all output models are minimized simultaneously. Once the Pareto-optimal solutions are found, the SVM model with the best generalization ability is determined. The proposed method is evaluated in a dynamic load identification experiment on a cylinder stochastic vibration system, demonstrating its benefits over existing model selection methods in terms of identification accuracy and numerical stability, especially near the peaks.

Keywords: load identification, support vector machine, model selection, MIMO model

1 INTRODUCTION
As an important branch of inverse problems, dynamic load identification plays an important role in machine design, machinery fault diagnosis, and structural vibration control [1, 2]. The earliest research was carried out in the 1970s to optimize aircraft structural design. Because the actual loads acting on a structure are difficult to measure directly in practice, system features and dynamic responses of the structure, e.g. displacement, velocity, acceleration, and strain, are required to determine the corresponding loads indirectly [3]. More specifically, modal transform and direct inversion are two representative kinds of approach. The former needs to determine the modal matrix in advance, which is often hard to achieve in practical applications. The latter is widely used in many engineering fields. Its basic idea is first to establish the frequency response function (FRF) model in the frequency domain via the Laplace transform, and then to identify the actual loads from the system's response. This kind of method only requires prior information on the FRF matrix and the response spectrum. However, a high computational cost is generally unavoidable, and the generalized inverse of the FRF matrix must exist. Because measurement error is inevitable, the FRF matrix is often ill-conditioned in the resonance region, which causes inversion error. Traditionally, methods such as regularization, singular value decomposition, and vibration equation decoupling are used to reduce the inversion error, but the numerical precision and stability are usually not as good as expected [4].

*Corresponding author: College of Computer and Information Technology, Henan Normal University, Xinxiang City, 453007, People’s Republic of China. email: [email protected]

Proc. IMechE Vol. 226 Part C: J. Mechanical Engineering Science

It is worth noting that the relationship between load and response depends only on the structure itself, so the process of load identification can be viewed, from the perspective of data analysis, as non-linear black-box modelling. This motivated a new line of research: the application of machine learning algorithms to dynamic load identification [5]. Cao et al. [6] utilized artificial neural networks to identify the loads acting on aircraft wings. The key step is to establish a regression model from groups of observed response and load data. However, neural networks tend to over-fit and cannot give satisfactory results with small training sets. Because of their solid foundation in statistical learning theory [7], support vector machines (SVMs) are able to avoid over-fitting during training in small-sample problems [8]. With the kernel method [9], SVMs can theoretically generalize well to unseen data. At present, SVM regressions (SVRs) [10] are widely used in system identification and parameter estimation [11–13]. However, to the best of our knowledge, there are few successful applications of SVM to load identification. Yang et al. [14] utilized SVM to identify different typical loads acting on a cantilever beam and obtained results comparable to those of a neural network. Mao et al. [15] combined model selection of SVM with regression modelling, and utilized the solution path algorithm, a modified version of SVM, to obtain higher accuracy than other SVM-based methods. To obtain a model with better generalization performance, Hu et al. [16] proposed a new least squares SVM (LSSVM) model selection method based on particle swarm optimization (PSO), and evaluated it in a load identification experiment on a cylinder vibration system. Previous SVM-based solutions mostly addressed load identification from a single-input single-output or multiple-input single-output (MISO) perspective.
If multiple measuring points are considered simultaneously, load identification becomes a multiple-input multiple-output (MIMO) problem, whose input and output are the response and load data collected from the measuring points, respectively. In this MIMO system, all measuring points are considered at the same time instead of building a separate regression model for each point. Therefore, the MIMO system accords better with the nature of load identification. In pioneering research, Pérez-Cruz et al. [17] developed an efficient MIMO regression tool that has its roots in SVM. This MIMO SVM method, called M-SVR, has shown its benefits in non-linear channel estimation [18] and biophysical parameter evaluation [19]. However, M-SVR is sensitive to hyper-parameters in some small-size sample
problems. A small deviation in the hyper-parameters causes a large deviation in the prediction result, so how to choose optimal hyper-parameters is a key issue in M-SVR applications. From the perspective of model selection, this article proposes a new M-SVR model selection method, which utilizes a multi-objective optimization algorithm to minimize the leave-one-out (LOO) errors of all measuring points simultaneously. Applying this method to the load identification problem yields more stable and accurate results.

The rest of this article is organized as follows. In Section 2, a theoretical analysis of the MIMO load identification system based on dynamic theory is provided, together with a brief review of M-SVR. In Section 3, a new model selection method for M-SVR is presented. Section 4 is devoted to experiments, in which the proposed method is evaluated on the load identification problem of a cylinder stochastic vibration system, followed by the conclusion in the final section.

2 DYNAMIC LOAD IDENTIFICATION FOR MIMO SYSTEMS

2.1 Theoretical MIMO model of load identification

In previous works [15, 16], dynamic load identification was shown, from the perspective of dynamic theory, to be a black-box modelling process. In this section, the modelling process is generalized to the MIMO system in the frequency domain. For a linear deterministic system, the relationship between the excitation $F(\omega)$ and the response $x(\omega)$ is described as

$$x(\omega) = H(\omega)F(\omega) \tag{1}$$

where $H(\omega)$ is the FRF. Also, for a linear stationary stochastic system, the relationship between the cross-power spectral density matrices $\mathbf{S}_{FF}(\omega)$ and $\mathbf{S}_{xx}(\omega)$ of the loads and the response, respectively, can be described as

$$\mathbf{S}_{xx}(\omega) = \mathbf{H}(\omega)\,\mathbf{S}_{FF}(\omega)\,\mathbf{H}(\omega)^{T} \tag{2}$$

where $\mathbf{H}(\omega)$ is the FRF matrix

$$\mathbf{H}(\omega) = (\mathbf{K} - \omega^{2}\mathbf{M} + i\omega\mathbf{C})^{-1} \tag{3}$$

and $\mathbf{M}$, $\mathbf{K}$, and $\mathbf{C}$ are the mass, stiffness, and damping matrices, respectively. Given the response $x(\omega)$, the load $F(\omega)$ can be calculated from the following equation

$$F(\omega) = H(\omega)^{-1}x(\omega) \tag{4}$$
Similarly, the following equation can be obtained
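$$\mathbf{S}_{FF}(\omega) = [\mathbf{H}(\omega)^{T}\mathbf{H}(\omega)]^{-1}\mathbf{H}(\omega)^{T}\,\mathbf{S}_{xx}(\omega)\,\mathbf{H}(\omega)[\mathbf{H}(\omega)^{T}\mathbf{H}(\omega)]^{-1} \tag{5}$$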
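Inverting the FRF matrix in equations (4) and (5) is only as reliable as the matrix is well conditioned. The following minimal Python sketch evaluates the dynamic stiffness matrix $\mathbf{K} - \omega^{2}\mathbf{M} + i\omega\mathbf{C}$ of equation (3) for a hypothetical two-degree-of-freedom system and compares its 2-norm condition number at a resonance with that at an off-resonance frequency. The matrices `M`, `K`, and `C` and both frequencies are illustrative assumptions, not the cylinder structure studied in this article. Because $\mathbf{H}(\omega)$ is the inverse of the dynamic stiffness matrix, the two share the same condition number.

```python
import math

def dynamic_stiffness(omega, M, K, C):
    """Z(omega) = K - omega^2 M + i omega C for 2x2 nested-list matrices."""
    return [[K[r][c] - omega ** 2 * M[r][c] + 1j * omega * C[r][c]
             for c in range(2)] for r in range(2)]

def cond_2x2(A):
    """2-norm condition number of a complex 2x2 matrix.

    The squared singular values are the eigenvalues of A^H A; for a 2x2
    matrix they follow from trace(A^H A) and det(A^H A) = |det A|^2.
    """
    trace = sum(abs(A[r][c]) ** 2 for r in range(2) for c in range(2))
    det = abs(A[0][0] * A[1][1] - A[0][1] * A[1][0]) ** 2
    lam_max = (trace + math.sqrt(max(trace ** 2 - 4.0 * det, 0.0))) / 2.0
    lam_min = det / lam_max  # avoids cancellation in (trace - sqrt(...)) / 2
    return math.sqrt(lam_max / lam_min)

# Hypothetical 2-DOF system: unit masses, light stiffness-proportional damping.
M = [[1.0, 0.0], [0.0, 1.0]]
K = [[2.0, -1.0], [-1.0, 2.0]]
C = [[0.01 * k for k in row] for row in K]

omega_res = 1.0  # first undamped natural frequency of this toy system (rad/s)
omega_off = 0.5  # a frequency away from both resonances

cond_res = cond_2x2(dynamic_stiffness(omega_res, M, K, C))
cond_off = cond_2x2(dynamic_stiffness(omega_off, M, K, C))
print(f"condition number at resonance:  {cond_res:.1f}")
print(f"condition number off resonance: {cond_off:.1f}")
```

For this toy system the condition number is roughly 200 at the resonance against roughly 3.7 away from it, so the same relative noise in the response is amplified far more strongly near the peak.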
According to equations (4) and (5), a large condition number of $\mathbf{H}(\omega)$ causes numerical instability. For example, noise in the response $x(\omega)$ may cause the regularization method to fail. Under this condition, equations (4) and (5) cannot be used directly. Hence, it is natural to seek data-aided solutions of equations (4) and (5) by means of a regression tool. Let $\mathbf{F} = [F_1, \ldots, F_n]^{T}$ be a set of load signals acting on $n$ driving points, and $\mathbf{X} = [x_1, \ldots, x_m]^{T}$ be a set of response signals collected from $m$ measuring points. According to equation (4), $\mathbf{F}$ can be determined by the following equation

$$\mathbf{F} = \mathbf{H}_{mn}^{-1}\mathbf{X} \tag{6}$$

In the light of the definition of cross-power spectral density, $\mathbf{S}_{FF}(\omega)$ and $\mathbf{S}_{xx}(\omega)$ can be expressed by the following functionals, whose specific forms are not of interest here

$$\mathbf{S}_{FF}(\omega) = \psi(\mathbf{F}) \tag{7}$$

$$\mathbf{S}_{xx}(\omega) = \varphi(\mathbf{X}) \tag{8}$$

Therefore, equation (5) is converted to the following equation

$$\mathbf{F} = \psi^{-1}\left\{[\mathbf{H}(\omega)^{T}\mathbf{H}(\omega)]^{-1}\mathbf{H}(\omega)^{T}\,\varphi(\mathbf{X})\,\mathbf{H}(\omega)[\mathbf{H}(\omega)^{T}\mathbf{H}(\omega)]^{-1}\right\} \tag{9}$$

According to equations (6) and (9), there is certainly a functional relationship between the load $\mathbf{F}$ and the response $\mathbf{X}$ in a linear system. This relationship depends only on the system's characteristics and can be described as

$$\mathbf{F} = f(\omega, \mathbf{X}) \tag{10}$$

where $f(\cdot)$ is either a linear or a non-linear function. From the viewpoint of multivariate statistical analysis, $f(\cdot)$ can be regarded as a MIMO regression model whose input $\mathbf{X}$ and output $\mathbf{F}$ have $m$ and $n$ dimensions, respectively. The framework of MIMO load identification is illustrated in Fig. 1. The load set $\mathbf{F}$ and response set $\mathbf{X}$ mentioned above are used to construct the regression model, and the unknown load signal $F'(\omega)$ is identified by feeding a new response signal $x'(\omega)$ into this model.

Fig. 1 Framework of MIMO load identification

2.2 Non-linear techniques for MIMO modelling

At present, most SVM-based non-linear black-box modelling tools concentrate on MISO systems. For MIMO systems, a traditional modelling method is the neural network. As stated in Section 1, the
performance of a neural network generally depends on the initial values of the neurons and on the sample size. As an excellent MIMO regression tool, M-SVR, proposed by Pérez-Cruz et al. [17], extends Vapnik's $\varepsilon$-insensitive loss function [10] to the MIMO case and treats all outputs together, which improves the generalization of the model when only scarce samples are available. Therefore, M-SVR is chosen as the basic MIMO regression algorithm in Fig. 1. Here, a brief summary of M-SVR is provided. Given a set of i.i.d. training samples $\{(\mathbf{x}_1, \mathbf{y}_1), \ldots, (\mathbf{x}_n, \mathbf{y}_n)\} \subset R^{d} \times R^{Q}$, M-SVR is formulated as the minimization of the following functional [18]

$$\min_{\mathbf{W},\,\mathbf{b}} L_p = \frac{1}{2}\sum_{j=1}^{Q}\|\mathbf{w}^{j}\|^{2} + C\sum_{i=1}^{n}L(u_i) \tag{11}$$

where

$$L(u) = \begin{cases} 0, & u < \varepsilon \\ u^{2} - 2u\varepsilon + \varepsilon^{2}, & u \geq \varepsilon \end{cases} \tag{12}$$

$$u_i = \|\mathbf{e}_i\| = \sqrt{\mathbf{e}_i^{T}\mathbf{e}_i} \tag{13}$$

with $\mathbf{e}_i^{T} = \mathbf{y}_i^{T} - \phi^{T}(\mathbf{x}_i)\mathbf{W} - \mathbf{b}^{T}$, $\mathbf{W} = [\mathbf{w}^{1}, \ldots, \mathbf{w}^{Q}]$, and $\mathbf{b} = [b^{1}, \ldots, b^{Q}]^{T}$. By adopting the cost function $L(u)$ described in equations (12) and (13), M-SVR is capable of finding the dependencies between outputs, and can take advantage of the information of all outputs to obtain a robust solution. As equation (11) cannot be solved straightforwardly, an iterative method named iteratively reweighted least squares (IRWLS) is utilized in reference [18] to obtain the desired solution. By
introducing a first-order Taylor expansion of the cost function $L(u)$, the objective of equation (11) is approximated by the following equation

$$L_p'(\mathbf{W}, \mathbf{b}) = \frac{1}{2}\sum_{j=1}^{Q}\|\mathbf{w}^{j}\|^{2} + \frac{1}{2}\sum_{i=1}^{n}a_i u_i^{2} + CT \tag{14}$$

where

$$a_i = \begin{cases} 0, & u_i^{k} < \varepsilon \\ \dfrac{2C(u_i^{k} - \varepsilon)}{u_i^{k}}, & u_i^{k} \geq \varepsilon \end{cases}$$

$CT$ is a constant term that does not depend on $\mathbf{W}$ and $\mathbf{b}$, and the superscript $k$ denotes the $k$th iteration. To optimize equation (14), an IRWLS procedure is constructed, which searches linearly for the next-step solution along the descending direction based on the previous solution [18]. According to the Representer Theorem [9], the best solution of the minimization of equation (14) in feature space can be expressed as $\mathbf{w}^{j} = \sum_i \phi(\mathbf{x}_i)\beta_i^{j} = \Phi^{T}\boldsymbol{\beta}^{j}$, so the target of M-SVR is transformed into finding the best $\boldsymbol{\beta}$ and $\mathbf{b}$. The IRWLS of M-SVR can be summarized in the following steps [18].

Step 1: Set $k = 0$, $\boldsymbol{\beta}^{k} = 0$, $\mathbf{b}^{k} = 0$. Calculate $u_i^{k}$ and $a_i$.

Step 2: Compute the solution $\boldsymbol{\beta}^{s}$ and $\mathbf{b}^{s}$ according to the following equation

$$\begin{bmatrix} \mathbf{K} + \mathbf{D}_a^{-1} & \mathbf{1} \\ \mathbf{a}^{T}\mathbf{K} & \mathbf{1}^{T}\mathbf{a} \end{bmatrix} \begin{bmatrix} \boldsymbol{\beta}^{j} \\ b^{j} \end{bmatrix} = \begin{bmatrix} \mathbf{y}^{j} \\ \mathbf{a}^{T}\mathbf{y}^{j} \end{bmatrix}, \quad j = 1, \ldots, Q \tag{15}$$

where $\mathbf{a} = [a_1, \ldots, a_n]^{T}$, $(\mathbf{D}_a)_{ij} = a_i\,\delta(i - j)$, and $\mathbf{K}$ is the kernel matrix. Define the corresponding descending direction

$$\mathbf{P}^{k} = \begin{bmatrix} \mathbf{W}^{s} - \mathbf{W}^{k} \\ (\mathbf{b}^{s} - \mathbf{b}^{k})^{T} \end{bmatrix}$$

Step 3: Use a backtracking algorithm to compute $\boldsymbol{\beta}^{k+1}$ and $\mathbf{b}^{k+1}$, and further obtain $u_i^{k+1}$ and $a_i$. Go back to Step 2 until convergence.

The convergence proof of the above algorithm is given in reference [18]. Once convergence is reached, the $j$th columns of $\boldsymbol{\beta}$ and $\mathbf{b}$ construct the regressor of the $j$th output. Similar to standard support vector regression, the samples whose absolute error is equal to or greater than $\varepsilon$ are named support vectors. Because $u_i^{k}$ and $a_i$ are computed from every dimension of $\mathbf{y}$, each individual regressor contains the information of all outputs, which can improve the prediction performance.
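The cost function of equations (12) and (13) and the IRWLS weight $a_i$ of equation (14) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the sample values are hypothetical, and the sketch covers only the per-sample quantities, not the linear solve of equation (15) or the backtracking step.

```python
import math

def residual_norm(y, y_pred):
    """u_i = ||e_i||: Euclidean norm of the error over all Q outputs (eq. 13)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_pred)))

def msvr_loss(u, eps):
    """Quadratic epsilon-insensitive cost of eq. (12): u^2 - 2*u*eps + eps^2."""
    return 0.0 if u < eps else (u - eps) ** 2

def irwls_weight(u_k, eps, C):
    """Weight a_i of eq. (14), evaluated at the current iterate u_i^k."""
    return 0.0 if u_k < eps else 2.0 * C * (u_k - eps) / u_k

# Hypothetical two-output sample: each output is off by 0.08, which is below eps.
eps, C = 0.1, 10.0
y_true = [1.00, 2.00]
y_pred = [0.92, 1.92]

u = residual_norm(y_true, y_pred)
print("joint error norm:", u)
print("penalized:", u >= eps)
print("loss:", msvr_loss(u, eps), "IRWLS weight:", irwls_weight(u, eps, C))
```

In this example each output error (0.08) lies inside a per-output tube of width 0.1, yet the joint norm of about 0.113 exceeds it, so the sample is penalized. This is the sense in which the cost function treats all outputs together rather than one dimension at a time.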
3 M-SVR MODEL SELECTION BASED ON MULTI-OBJECTIVE OPTIMIZATION
Similar to other SVMs, M-SVR needs optimal hyper-parameters to be chosen because of their great effect on prediction. As the key part of model selection, choosing optimal parameters can essentially be considered as a global optimization problem in which the evaluation of generalization performance acts as the fitness function. In reference [17], eightfold cross-validation (CV) is adopted to carry out model selection of M-SVR. In this method, the mean square error $\mathrm{MSE} = \sum_{i=1}^{N_t}\|\mathbf{y}_i - \mathbf{W}^{T}\phi(\mathbf{x}_i) - \mathbf{b}\|^{2}$ on all outputs was used. However, it is difficult to consider the prediction performance of each output separately. For example, a low overall prediction error may be the mean of high errors on some outputs and lower errors on others. In addition, with search strategies for model selection such as grid search or PSO [20], the MSE on all outputs could complicate the fitness calculation, which further increases the risk of falling into a local extremum. Therefore, the present SVM model selection methods for MISO systems cannot be applied well to M-SVR. If the generalization estimates on all outputs are minimized simultaneously, a more effective M-SVR model can be obtained. In this section, a new model selection method for M-SVR is developed by introducing a multi-objective optimization algorithm.

3.1 Optimization strategy

When applying M-SVR to identify loads distributed on multiple points, there may exist conflicting identification results, i.e. the results at some points can be improved only at the cost of worse results at other points. The goal of a multi-objective optimization algorithm is to find a set of non-dominated solutions rather than the single global best individual sought in single-objective optimization. A solution is called non-dominated or Pareto-optimal if no objective can be improved without at least one of the others becoming worse. In this section, PSO is chosen as the basic algorithm due to its advantages such as few parameters, simple structure, and high convergence speed. More importantly, PSO can search for solutions in multiple parallel directions, so it is easy to extend PSO to multi-objective problems.
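The definition of a non-dominated solution can be made concrete with a short Python sketch. It assumes that every objective is to be minimized, and the candidate error pairs are made-up numbers standing in for the errors of two outputs under five hyper-parameter settings; the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (error on output 1, error on output 2) for five candidates.
candidates = [(0.21, 0.30), (0.25, 0.14), (0.23, 0.20), (0.30, 0.35), (0.25, 0.20)]
front = non_dominated(candidates)
print(front)
```

Here the candidates (0.30, 0.35) and (0.25, 0.20) are dominated and drop out, while the remaining three trade one output's error against the other's and therefore form the non-dominated set.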
In particular, a novel multi-objective PSO, named multi-objective comprehensive learning PSO (MOCLPSO) [21], is chosen as the practical realization. MOCLPSO can exploit all particles' historical best information effectively, and further adopts a crowding distance-based archive maintenance strategy to update the particles' velocity and position. Therefore, the diversity of the swarm is improved and premature convergence is avoided efficiently. Its basic idea is described as follows [21]: MOCLPSO uses an external archive $B$ to store the set of non-dominated solutions obtained at each generation and an archive $A$ to store the best set found so far, and then compares the two sets member by member. If a solution $x$ in $B$ is dominated by a member of $A$, $x$ is rejected; if $x$ dominates a subset $C$ of $A$, then $A = A \setminus C$ and $A = A \cup \{x\}$. Huang et al. [21] gave a detailed derivation of this algorithm.

3.2 Fitness function

According to Fig. 1, model selection of MIMO SVM for load identification aims at maximizing the generalization ability of all outputs. In general, CV errors are widely used as the generalization performance measure of SVR. As a specific case of CV, the LOO error is described as

$$\mathrm{LOO}(\theta) = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{f}_i(x_i)\right)^{2} \tag{16}$$
where $\theta$ is the hyper-parameter, $y_i$ the observation of the $i$th sample, and $\hat{f}_i$ the prediction of the $i$th sample calculated from the other $n - 1$ samples. As proved by Kohavi [22], the LOO estimator is approximately unbiased. Therefore, in this article, the LOO error is utilized to estimate the generalization performance of M-SVR for each output, and equation (16) is adopted as the fitness function of MOCLPSO.

3.3 Algorithm description

M-SVR has three hyper-parameters: the regularization parameter $C$, the kernel parameter $\sigma$, and the error parameter $\varepsilon$. Here, $C$ controls the trade-off between the empirical risk and the regularization term, $\sigma$ determines the similarity between samples in feature space, and $\varepsilon$ defines the width of the insensitive tube. Defining $\theta = (C, \sigma, \varepsilon)$, model selection of M-SVR aims at finding the parameter combination $\theta_0$ that minimizes the LOO errors of all $Q$ outputs simultaneously. It can be described as follows

$$\theta_0 = \arg\min_{\theta} E(\theta) = \arg\min_{\theta}\left\{\mathrm{LOO}_1(\theta), \ldots, \mathrm{LOO}_Q(\theta)\right\} \tag{17}$$

Obviously, equation (17) is a typical multi-objective optimization problem. In this section, MOCLPSO is used to solve equation (17). To obtain more stable solutions, a binary encoding strategy is adopted and each individual of the swarm is set as $\theta' = (\log C, \log\sigma, \log\varepsilon)$. After a set of non-dominated solutions has been obtained, the best parameter should be selected according to the specific requirement. The flowchart of the proposed algorithm is illustrated in Fig. 2.

Fig. 2 Flowchart of model selection of M-SVR based on MOCLPSO

Fig. 3 Experimental setup of cylinder stochastic vibration system: (a) cylinder structure assembled on the shaker, (b) four force sensors, and (c) control scheme of whole system

4 LOAD IDENTIFICATION BASED ON M-SVR MODEL SELECTION
In order to verify the feasibility of the proposed method, a simplified model is chosen as the research object. Without considering internal components, an aircraft can be simplified as a cylindrical structure. Therefore, the cylinder shell can be considered as a typical structure in the fields of aeronautics and mechanical manufacture. In this section, a number of experiments on a cylinder stochastic vibration system are conducted to show the benefits of the M-SVR model selection method in load identification. The Gaussian RBF kernel is used, defined as $k(x, x') = \exp\left(-\|x - x'\|^{2} / (2\sigma^{2})\right)$. The hyper-parameters to be optimized are then $\langle C, \sigma, \varepsilon\rangle$. Each of the input variables $x$ and the output $y$ is rescaled linearly to the range $[-1, +1]$.

Fig. 4 Finite element model of cylinder shell with steel clamp

Fig. 5 First three radial vibration modes of cylinder shell
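The Gaussian RBF kernel and the linear rescaling to $[-1, +1]$ stated at the start of this section can be sketched as follows. The function names and sample values are hypothetical, and the handling of a constant input (mapped to the midpoint of the range) is an assumption made only so that the sketch is well defined.

```python
import math

def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel k(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def rescale(values, lo=-1.0, hi=1.0):
    """Linearly map a list of numbers onto [lo, hi]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Assumption: a constant feature is mapped to the middle of the range.
        return [0.5 * (lo + hi) for _ in values]
    scale = (hi - lo) / (v_max - v_min)
    return [lo + (v - v_min) * scale for v in values]

print(rescale([2.0, 4.0, 6.0]))
print(rbf_kernel([1.0, 0.0], [0.0, 0.0], sigma=1.0))
```

A larger `sigma` makes distant samples look more similar, which is why the kernel parameter has to be tuned together with the other two hyper-parameters.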
4.1 Data source

In this section, a number of random vibration experiments on the cylinder shell system are conducted. Although the cylinder shell vibration system was introduced in previous work [16], the experimental setup and data collection process are described briefly here for completeness. The cylinder stochastic vibration system is composed of a cylinder shell, a clamp, and a shaker, as shown in Fig. 3. The cylinder shell shown in Fig. 3(a) is made of steel, with an outer diameter of 370 mm, an inner diameter of 365 mm, and a height of 370 mm. The clamp is made of steel, with a height of 10 mm and a diameter of 380 mm, and is connected to the cylindrical shell by 18 bolts. The whole structure is fixed on the shaker through four force sensors. The accelerations of the shaking table surface (Fig. 3(c)) are selected as feedback to control the shaking table, so that the excitation input to the cylinder is as close as possible to the expected value. To avoid collecting a low response signal from accelerometers placed on nodal lines, modal analysis is performed using both finite element simulation and modal impact testing. The modal impact testing is conducted using impact hammer excitation, and its purpose is to verify the natural frequencies and vibration modes determined by the finite element method. The finite element model of the cylinder shell is shown in Fig. 4, and the radial vibration modes are shown in Fig. 5.

Fig. 6 Locations of four accelerometers on cylinder shell

Fig. 7 One group of acceleration signals collected from node 4
As shown in Fig. 5, there exist some nodal lines on the cylinder shell that are hard to excite. Consequently, the response signal collected from accelerometers on these lines would be too low to analyse effectively. In many real-world engineering applications, some apparatuses are placed in the structure for their own needs. Here, the cylinder shell serves as a typical structure that simplifies a specific object, e.g. a driving cab or a pilot house. Moreover, in this article, the method of dynamic load identification is mainly studied from the theoretical perspective, so the accelerometers are located away from the nodal lines. The locations of the four accelerometers are shown in Fig. 6. After the accelerometer locations have been determined, the vibration accelerations in the axial direction are recorded from the four accelerometers and preprocessed by LMS Test.Lab. Thirty groups of stochastic loads (drive currents) with various spectrum patterns are applied to drive the shaking table. Correspondingly, 30 groups of accelerations are recorded as responses and then transformed into power spectral densities (PSDs) with a sampling frequency of 4096 Hz and a frequency interval of 1 Hz. For simplicity, one group of acceleration signals collected from node 4 is illustrated in Fig. 7. As mentioned above, the PSDs of the actual loads on the cylinder structure are recorded by the four force sensors, as shown in Fig. 3(b). Each group of PSDs contains four dimensions.
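As a quick check on the stated PSD settings, the frequency interval of a discrete Fourier transform is the sampling frequency divided by the number of samples per segment. Assuming the spectra are estimated from segments of $N$ samples each (the segment length is not given in the text and is an assumption here), the stated values imply

```latex
\Delta f = \frac{f_s}{N}
\quad\Longrightarrow\quad
N = \frac{f_s}{\Delta f} = \frac{4096\ \mathrm{Hz}}{1\ \mathrm{Hz}} = 4096\ \text{samples},
\qquad
f_{\max} = \frac{f_s}{2} = 2048\ \mathrm{Hz}
```

so each segment would span 1 s of data, and a one-sided spectrum would extend up to the Nyquist frequency of 2048 Hz.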
4.2 Identification framework

Adopting the collected PSD data of response and load as input and output observations, respectively, the load identification model shown in Fig. 1 is converted to an M-SVR model with $n = 4$ and $m = 4$. One group of observations is selected randomly as the target load, and the other 29 groups are used for training the M-SVR model. It is worth noting that equation (10) is based on a single frequency $\omega$, so the target load is identified frequency by frequency over the whole band. The response value of the $j$th node produced by the $i$th group of training loads at $k$ Hz is denoted by $x_{jik}$. The framework of load identification based on M-SVR model selection is illustrated in Fig. 8.

Fig. 8 Framework of load identification based on M-SVR model selection

4.3 Numerical results

In this section, the proposed method is compared with three methods of SVR model selection. The first is the common M-SVR model selection method, which utilizes the $\mathrm{MSE} = \sum_{i=1}^{N_t}\|\mathbf{y}_i - \mathbf{W}^{T}\phi(\mathbf{x}_i) - \mathbf{b}\|^{2}$ used in reference [17] to compute a uniform LOO error on all outputs and adopts PSO as the minimization strategy; it is called here PSO–MSVR. The other two methods both perform model selection of SVR per dimension, and were presented in references [16, 23], respectively. The key parts of these two methods are the adoption of the LOO error bound of LSSVM and the radius-margin (RM) bound of SVR as the criteria for model selection. In this section, they are called LSSVM and SVR, respectively. Correspondingly, the proposed method is called here MOPSO–MSVR. Three evaluation indices, root mean square error (RMSE), average percentage error (APE), and maximum percentage error (MPE), are adopted to test the proposed method, as listed in Table 1. The actual loads measured by the four force sensors are denoted by forces 1–4, respectively.

Table 1 Three error evaluation indices of load identification

Error    Expression
RMSE     $\sqrt{\sum_{i=1}^{n}(y_i - \hat{y}_i)^{2} / n}$
APE      $\frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i| / |y_i|$
MPE      $\max_i\ |y_i - \hat{y}_i| / |y_i|, \quad i = 1, \ldots, n$

The 29th load in the total of 30 groups of loads is selected randomly for identification, and the identification performances of PSO–MSVR, LSSVM, SVR, and MOPSO–MSVR for forces 1–4 are illustrated in Figs 9 to 12, respectively. The corresponding identification errors are listed in Table 2. As illustrated in Figs 9 to 12, the load curve identified by MOPSO–MSVR fits the true load curve more closely than those of PSO–MSVR, LSSVM, and SVR, especially at the main central peak and the two side peaks. Figure 9 shows that, although the prediction curve of PSO–MSVR approximates the true load to a certain extent, it fluctuates at the side peaks and at some frequencies. Similarly, the prediction curve of LSSVM in Fig. 10 is close to the true load curve at the peaks, but it also fluctuates at other frequencies, especially in the frequency range [600, 800] Hz. In addition, the curve of SVR deviates from the true load curve at the side peaks to a large extent. Clearly, the proposed method performs more stably than the traditional model selection methods.

The comparative results listed in Table 2 also verify the illustrative comparison. As shown in Table 2, the proposed method generally achieves higher precision than PSO–MSVR, LSSVM, and SVR in terms of two error indices, MPE and APE, and has results comparable with LSSVM in the RMSE index. In particular, for APE, which mainly represents modelling performance, the proposed method obtains smaller errors than all three traditional methods. In the light of the comparative results, the proposed method is superior to PSO–MSVR. The RMSE and APE of SVR are far worse than the others, which is probably caused by applying the RM bound. In reference [23], the RM bound was shown not to be a very tight approximation of LOO. As a result, model selection of SVR cannot obtain satisfactory performance. On the other hand, the APE values of LSSVM are slightly worse than those of MOPSO–MSVR. There are two possible reasons. One is that MOPSO–MSVR considers the generalization ability of every output model simultaneously, and its particles' searching ability is better than that of the traditional PSO. Note that PSO is utilized as the search strategy in LSSVM and SVR. The other is that M-SVR can exploit the relatedness between
all outputs and thereby obtain more information than a single SVR per dimension, which can compensate for the small sample size to some extent. Therefore, MOPSO–MSVR has more benefits than LSSVM in obtaining accurate solutions in small-size sample MIMO problems. It is worth noting that the solutions of LSSVM are all better than those of PSO–MSVR, and its RMSE on forces 3 and 4 is even the lowest of all four methods. This demonstrates that a single SVR per dimension can also obtain good solutions via complete model selection in MIMO problems. However, because it cannot exploit the relatedness between different dimensions or share the information of each output, LSSVM can hardly obtain good results on every output. Although LSSVM obtains comparable results, MOPSO–MSVR obtains more stable solutions than LSSVM in small-size sample MIMO problems.

The computational costs of the four methods are not tested in this experiment. Nevertheless, the model selection of a single SVR per dimension needs to run four times as often as MOPSO–MSVR if the numbers of fitness function evaluations are set equal. Also, M-SVR generally runs at very high speed. Therefore, it can be concluded that, although the improvement of the proposed method MOPSO–MSVR may not be statistically significant in some error indices, it is still competitive in terms of accuracy, stability, and computational cost.

To show the benefit of the proposed method in detail, a side peak (447 Hz) is chosen randomly for microscopic analysis. First, the necessity of model selection for M-SVR is verified. The contour map of the APE of M-SVR with respect to the regularization parameter $C$ and the kernel parameter $\sigma$ is illustrated in Fig. 13. Owing to space limitations, only the distribution on force 1 is provided. As shown in Fig. 13, the APE of M-SVR has a typical multimodal distribution, which implies that choosing the optimal hyper-parameters is a multimodal optimization problem. Therefore, it is necessary to perform model selection by introducing an optimization algorithm, e.g. PSO or MOPSO.

Fig. 9 Identification performance of PSO–MSVR for (a) force 1, (b) force 2, (c) force 3, and (d) force 4

Fig. 10 Identification performance of LSSVM for (a) force 1, (b) force 2, (c) force 3, and (d) force 4
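The vector-valued fitness of equation (17) needs one LOO error per output, computed as in equation (16). The sketch below shows the bookkeeping in plain Python. To stay self-contained it uses a simple RBF-weighted average as a stand-in regressor, which is an assumption for illustration only and is not M-SVR; the data, the function names, and the candidate values of `sigma` are likewise hypothetical.

```python
import math

def kernel_smoother(train_X, train_Y, x, sigma):
    """Stand-in multi-output regressor (NOT M-SVR): RBF-weighted average."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(xi, x)) / (2.0 * sigma ** 2))
               for xi in train_X]
    total = sum(weights)
    Q = len(train_Y[0])
    if total == 0.0:
        # All weights underflowed: fall back to the plain mean of each output.
        return [sum(y[j] for y in train_Y) / len(train_Y) for j in range(Q)]
    return [sum(w * y[j] for w, y in zip(weights, train_Y)) / total for j in range(Q)]

def loo_errors(X, Y, sigma):
    """Per-output leave-one-out mean squared error (one value per output)."""
    n, Q = len(X), len(Y[0])
    sq_err = [0.0] * Q
    for i in range(n):
        # Train on all samples except the i-th, then predict the i-th.
        X_rest = X[:i] + X[i + 1:]
        Y_rest = Y[:i] + Y[i + 1:]
        pred = kernel_smoother(X_rest, Y_rest, X[i], sigma)
        for j in range(Q):
            sq_err[j] += (Y[i][j] - pred[j]) ** 2
    return [e / n for e in sq_err]

# Hypothetical data: one input, two outputs with different smoothness.
X = [[i / 9.0] for i in range(10)]
Y = [[math.sin(2.0 * math.pi * x[0]), x[0] ** 2] for x in X]

for sigma in (0.05, 0.2, 1.0):
    print(sigma, loo_errors(X, Y, sigma))
```

Each call returns a list with one error per output, which is the objective vector that a multi-objective optimizer would compare by Pareto dominance instead of collapsing it into a single number.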
Fig. 11 Identification performance of SVR for (a) force 1, (b) force 2, (c) force 3, and (d) force 4
To verify the necessity of applying MOPSO, contour maps of the LOO estimates of M-SVR with respect to the regularization parameter $C$ and the kernel parameter $\sigma$ are plotted in Fig. 14. For simplicity, only the LOO data on forces 1 and 4 are presented. In Fig. 14, magnified images are provided so that the conflict between the two outputs can be seen clearly. As shown in Fig. 14, the shapes around the minimal points of the two outputs are different. Specifically, the minimum of LOO in Fig. 14(a) is 0.2076 with $\sigma = 2.72$ ($\exp(1)$) and $C = 2.72$, while the minimum of LOO in Fig. 14(b) is 0.1382 with $\sigma = 20.08$ ($\exp(3)$) and $C = 403.42$ ($\exp(6)$). Clearly, the optimal hyper-parameters of the two dimensions are not the same, which implies a degree of conflict in determining uniform optimal hyper-parameters for all outputs. Therefore, it is necessary to introduce multi-objective optimization to solve this problem. For the sake of illustration, the LOO errors obtained by MOPSO–MSVR and PSO–MSVR for forces 1 and 4 at 447 Hz are drawn in Fig. 15.

As shown in Fig. 15, MOPSO–MSVR obtains three non-dominated solutions, which construct an approximate Pareto-optimal front. That is, the LOO error of force 1 cannot be improved without the error of force 4 getting worse. In other words, the generalization errors of the two outputs are minimized at the same time. In addition, the LOO error obtained by PSO–MSVR is higher than that of MOPSO–MSVR, which means that adopting the LOO errors of all four outputs as a single fitness value increases the risk of falling into a local minimum. This is also confirmed by the corresponding prediction results at 447 Hz. On force 1, the APE of MOPSO–MSVR is 10.90 per cent while that of PSO–MSVR is 28.61 per cent, and on force 4, the APE of MOPSO–MSVR is 24.79 per cent while that of PSO–MSVR is 40.72 per cent. The best hyper-parameter $\langle C, \sigma, \varepsilon\rangle$ obtained by PSO–MSVR is $\langle 222.7, 1.61, 0.0013\rangle$, while the one obtained by MOPSO–MSVR is $\langle 180.02, 1.48, 0.0047\rangle$. As an important evaluation of a model's generalization ability, the number of support vectors (NSV) of the four methods is also examined. To evaluate the
1406
W Mao, M Tian, and G Yan
(a)
(b)
(d)
(c)
Fig. 12
Identification performance of MOPSO–MSVR for (a) force 1, (b) force 2, (c) force 3, and (d) force 4 Table 2
PSO–MSVR LSSVM SVR MOPSO–MSVR
RMSE APE (%) MPE (%) RMSE APE (%) MPE (%) RMSE APE (%) MPE (%) RMSE APE (%) MPE (%)
Comparative results of load identification Force 1
Force 2
Force 3
Force 4
0.8481 22.98 385.72 0.3867 19.24 377.36 1.608 29.04 685.73 0.3638 18.33 387.79
1.0843 22.05 345.91 0.3101 17.85 321.10 2.490 26.44 623.86 0.3058 17.29 310.08
1.6671 21.50 335.90 0.6247 17.27 293.97 2.692 26.81 559.94 0.6861 16.67 304.41
0.7340 23.94 460.59 0.3433 20.47 453.22 1.171 35.12 791.28 0.3646 18.55 429.68
The minimal errors are stressed in bold.
performance as unbiased as possible, the target load in Fig. 8 is selected sequentially in total 30 groups of loads, and for each target load, NSV of four methods are calculated. Specially, NSV of SVR is the mean of NSV obtained by single SVR on four dimensions. The comparative results of NSV are illustrated in Fig. 16. Proc. IMechE Vol. 226 Part C: J. Mechanical Engineering Science
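The NSV bookkeeping described above can be sketched as follows. The sketch assumes that a support vector is a training sample with a non-zero dual coefficient, which is the usual convention for ε-insensitive SVR. The function names, the tolerance, and the coefficient values are hypothetical and are not results from the article.

```python
import numpy as np

def count_support_vectors(dual_coef, tol=1e-8):
    """Count training samples whose dual coefficient is non-zero (|coef| > tol)."""
    coef = np.abs(np.asarray(dual_coef, dtype=float))
    return int(np.count_nonzero(coef > tol))

def mean_nsv(dual_coefs_per_output, tol=1e-8):
    """Mean NSV over several single-output models, as done for SVR above."""
    counts = [count_support_vectors(c, tol) for c in dual_coefs_per_output]
    return float(np.mean(counts))

# Hypothetical dual coefficients for four single-output SVRs, 29 samples each.
rng = np.random.default_rng(1)
n_samples = 29
zeros_per_output = [9, 7, 5, 3]  # samples that are not support vectors, per output
svr_coefs = []
for n_zero in zeros_per_output:
    coef = rng.uniform(0.1, 1.0, size=n_samples)
    coef[:n_zero] = 0.0
    svr_coefs.append(coef)

print(mean_nsv(svr_coefs))  # 23.0, the mean of 20, 22, 24 and 26
```

A model in which every training sample carries a non-zero coefficient would report an NSV equal to the training set size under the same count.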
As shown in Fig. 16, MOPSO–MSVR obtains roughly the same NSV as PSO–MSVR and SVR. In most cases, the NSV of MOPSO–MSVR is slightly lower than that of PSO–MSVR, which suggests that the proposed method can obtain better generalization ability than PSO–MSVR. However, even though the NSV of SVR is also lower than that of MOPSO–MSVR, its identification performance shown in Fig. 11 is much worse. One possible reason is that SVR has good sparsity but cannot make use of the information in the other dimensions. As a result, a lower NSV on each dimension is inferior to a slightly higher uniform NSV on all four dimensions in this experiment. In addition, the NSV of LSSVM is the same as the number of training samples (29), because LSSVM lacks sparsity owing to its equality constraints. Apart from LSSVM, the other three methods also obtain large NSV, so that they can extract as much information as possible from the small training set.

It is worth noting that in Table 2 the MPEs of all four methods are large, but MOPSO–MSVR can weaken the extrema to some extent. The large MPE values are perhaps related to the modelling method shown in Fig. 4. The present modelling data are small-scale, which not only makes M-SVR sensitive to hyper-parameters but also fails to provide enough information for modelling at extremum points. If adjacent frequency points can be introduced to increase the useful information in the modelling procedure, the prediction performance at extremum points should improve further. This will be studied in future work.

Fig. 13 Distribution of APE of M-SVR on force 1

Fig. 14 Distribution of LOO of M-SVR on (a) force 1 and (b) force 4

Fig. 15 LOO errors obtained by MOPSO–MSVR and PSO–MSVR

Fig. 16 NSV of four methods for different target loads

5 CONCLUSION
The load identification methods based on machine learning rest on the following premise: once a dynamical system has been defined, the relationship between load and response is intrinsic and is fixed as long as the structure and boundary conditions are fixed. Existing research on SVM-based load identification has mainly focused on the MISO model. In this article, first, considering the superior performance of M-SVR, M-SVR is introduced to establish a MIMO identification model. In the numerical experiments, M-SVR also requires a much lower computational cost than single-output SVR. Second, considering the sensitivity of M-SVR to hyper-parameters, a new M-SVR model selection method based on multi-objective optimization is presented to further decrease the error of load identification. By minimizing the generalization errors of all output models at the same time, a lower prediction error than that of the traditional model selection methods can be obtained. From a practical point of view, this method can meet the needs of load identification effectively and outperforms the existing methods in terms of identification accuracy and numerical stability.

It is worth noting that the responses and loads at different measurement points on the cylinder shell look somewhat similar in the figures. This is caused by the symmetry of the cylinder geometry and the location of the force sensors. The aim of this article is to present a new efficient method of load identification from a theoretical perspective. In engineering applications, specific structures and different excitation points will produce more varied response and load shapes, and will show the performance of the proposed method more clearly.

FUNDING
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

ACKNOWLEDGEMENTS
This study was supported by the National Natural Science Foundation of China (60873104). The author Suganthan from reference 21 is also thanked for providing the implementation of MOCLPSO.

© Authors 2011
REFERENCES
1 Chan, T. H. T., Yu, L., and Law, S. S. Comparative studies on moving force identification from bridge strains in laboratory. J. Sound. Vib., 2000, 235(1), 87–104.
2 Möller, P. W. M. Load identification through structural modification. J. Appl. Mech., 1999, 66(1), 236–241.
3 Stevens, K. K. Force identification problems - an overview. In Proceedings of SEM Spring Conference on Experimental mechanics, Houston, June 14–19 1987, pp. 838–844.
4 Uhl, T. The inverse identification problem and its technical application. Arch. Appl. Mech., 2007, 77, 325–337.
5 Sjoberg, J., Zhang, Q., and Ljung, L. Nonlinear black-box modeling in system identification: a unified overview. Automatica, 1995, 31(12), 1691–1724.
6 Cao, X., Sugiyama, Y., and Mitsui, Y. Application of artificial neural networks to load identification. Comput. Struct., 1998, 69(1), 63–78.
7 Vapnik, V. N. The nature of statistical learning theory, 1995 (Springer-Verlag, New York).
8 Bishop, C. M. Pattern recognition and machine learning, 2006 (Springer-Verlag, New York).
9 Schölkopf, B. and Smola, A. J. Learning with kernels, 2002 (MIT Press, Cambridge, Massachusetts).
10 Smola, A. J. and Schölkopf, B. A tutorial on support vector regression. Stat. Comput., 2004, 14, 199–222.
11 Li, J. and Liu, J. Identification of dynamic systems using support vector regression neural networks. J. Southeast Univ., 2006, 2, 228–233.
12 Drezet, P. M. L. and Harrison, R. F. Support vector machines for system identification. In UKACC International Conference on CONTROL'98, 1–4 September 1998, pp. 688–692.
13 Shen, S., Wang, G., and Chen, H. Support vector machine based identification of inverse dynamic model of thermal system and its application. In Proceedings of the International Conference on Power engineering, vol. 10, Hangzhou, People's Republic of China, 23–27 October 2007, pp. 914–917.
14 Yang, J., Min, L., and Zhou, C. Dynamic load identification using support vector regression machine. Zhendong Ceshi Yu Zhenduan, 2006, 26, 258–261 (in Chinese).
15 Mao, W., Hu, D., and Yan, G. A new SVM regression approach for mechanical load identification. Int. J. Appl. Electrom., 2010, 33, 1001–1008.
16 Hu, D., Mao, W., and Zhao, J. Application of LSSVM-PSO to load identification in frequency domain. In Proceedings of Artificial Intelligence and Computational Intelligence, Shanghai, People's Republic of China, 7–8 November 2009, pp. 231–240.
17 Pérez-Cruz, F., Camps-Valls, G., Soria-Olivas, E., José Pérez-Ruixo, J., Figueiras-Vidal, A. R., and Artés-Rodríguez, A. Multi-dimensional function approximation and regression estimation. In ICANN '02 Proceedings of the International Conference on Artificial Neural Networks, Lecture Notes in Computer Science, vol. 2415, Madrid, Spain, 28–30 August 2002, pp. 757–762.
18 Sánchez-Fernández, M., De-Prado-Cumplido, M., Arenas-García, J., and Pérez-Cruz, F. SVM multiregression for nonlinear channel estimation in multiple-input multiple-output systems. IEEE Trans. Signal. Process., 2004, 52(8), 2298–2307.
19 Tuia, D., Verrelst, J., Alonso, L., Pérez-Cruz, F., and Camps-Valls, G. Multioutput support vector regression for remote sensing biophysical parameter estimation. Geosci. Rem. Sens. Lett. IEEE, 2011, 8, 804–808.
20 Guo, X. C., Yang, J. H., Wu, C. G., Wang, C. Y., and Liang, Y. C. A novel LS-SVMs hyper-parameter selection based on particle swarm optimization. Neurocomputing, 2008, 71, 3211–3215.
21 Huang, V. L., Suganthan, P. N., and Liang, J. J. Comprehensive learning particle swarm optimizer for solving multiobjective optimization problems. Int. J. Intell. Syst., 2006, 21, 209–226.
22 Kohavi, R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In the 14th International Conference on Artificial Intelligence (IJCAI), Montréal, Canada, August 20–25, 1995, pp. 1137–1143.
23 Chang, M. W. and Lin, C. J. Leave-one-out bounds for support vector regression model selection. Neur. Comput., 2005, 17(5), 1188–1222.
Proc. IMechE Vol. 226 Part C: J. Mechanical Engineering Science