Prediction of Automotive Engine Power and Torque Using Least Squares Support Vector Machines and Bayesian Inference

CHI-MAN VONG 1*, PAK-KIN WONG 2, YI-PING LI 1

1 (Department of Computer and Information Science, University of Macau, P.O. Box 3001, Macau, China)
2 (Department of Electromechanical Engineering, University of Macau, P.O. Box 3001, Macau, China)

* E-mail: [email protected], Phone: (853) 3974476

Engineering Applications of Artificial Intelligence, vol. 19(3), 2006

Abstract: Automotive engine power and torque are significantly affected by engine tune-up. Current tune-up practice relies on the experience of the automotive engineer: the engine is tuned by trial and error and then run on a dynamometer to measure the actual output power and torque. This practice costs a large amount of time and money, and may still fail to tune the engine optimally because no formal power and torque functions of the engine have been determined. With an emerging technique, Least Squares Support Vector Machines (LS-SVM), approximate power and torque functions of a vehicle engine can be determined by training on sample data acquired from the dynamometer. The number of dynamometer tests for an engine tune-up can therefore be reduced, because the estimated engine power and torque functions can replace the dynamometer tests to a certain extent. In addition, a Bayesian framework is applied to infer the hyper-parameters used in LS-SVM, eliminating the work of cross-validation and leading to a significant reduction in training time. In this paper, the construction, validation and accuracy of the functions are discussed. The study shows that the predicted results are in good agreement with the actual test results. To illustrate the significance of the LS-SVM methodology, the results are also compared with those regressed using a multilayer feedforward neural network.

Keywords: Automotive engine setup, Least squares support vector machines, Bayesian inference, Engine power and torque.

1. INTRODUCTION

Modern automotive gasoline engines are controlled by an electronic control unit (ECU). The engine output power and torque are significantly affected by the setup of control parameters in the ECU. Many parameters are stored in the ECU as look-up tables or maps (Fig. 1). Normally, the power and torque data of a car engine are obtained through dynamometer tests. An example of engine output horsepower and torque against speed is shown in Fig. 2. The engine power and torque reflect the dynamic performance of an engine. Traditionally, the setup of the ECU is done by the vehicle manufacturer. In recent years, however, programmable ECUs and ECU read-only memory (ROM) editors have been widely adopted in many passenger cars. These devices allow non-OEM engineers to tune their engines according to different add-on components and drivers' requirements.

Current practice of engine tune-up relies on the experience of the automotive engineer, who must handle a huge number of combinations of engine control parameters. The relationship between the input and output parameters of a modern car engine is a complex multi-variable nonlinear function that is very difficult to determine, because a modern automotive engine is an integration of thermo-fluid, electromechanical and computer control systems. Consequently, engine tune-up is usually done by trial and error. First, the engineer guesses an ECU setup based on his/her experience and stores the setup values in the ECU. Then the engine is run on a dynamometer to test the actual engine power and torque. If the performance is unsatisfactory, the engineer adjusts the ECU setting and repeats the procedure until the performance is satisfactory. That is why vehicle manufacturers normally spend many months tuning an ECU optimally for a new car model. Moreover, the power and torque functions are engine dependent, so every engine requires a similar tune-up procedure. By knowing the power and torque functions, the automotive engineer can predict whether a trial ECU setup is a gain or a loss. The car engine then only needs to go through a dynamometer test for verification after a satisfactory setup has been estimated from the functions. Hence the number of unnecessary dynamometer tests for trial setups can be drastically reduced, saving a large amount of testing time and money.

Recent research (Brace, 1998; Traver et al., 1999; Su et al., 2002; Yan et al., 2003; Liu et al., 2004) has described the use of neural networks for modeling diesel engine emission performance based on experimental data. It is well known that a neural network (Bishop, 1995; Haykin, 1999) is a universal estimator. It has, however, two main drawbacks (Smola et al., 1996; Schölkopf and Smola, 2002): (1) the architecture, including the number of hidden neurons, has to be determined a priori or modified during training by heuristics, which results in a not necessarily optimal network structure; (2) neural networks can easily get stuck in local minima. Various ways of preventing local minima, such as early stopping and weight decay, are employed, but those methods greatly affect the generalization of the estimated function, i.e., its capacity for handling new input cases.

Traditional mathematical methods of nonlinear regression (Sen and Srivastava, 1990; Ryan, 1996; Harrell, 2001; Tabachnick and Fidell, 2001; Seber and Wild, 2003) may be applied to construct the engine power and torque functions. However, an engine setup involves many parameters and much data, and constructing the functions in such a high-dimensional, nonlinear data space is very difficult for traditional regression methods. With an emerging technique, Support Vector Machines (SVM) (Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002; Suykens et al., 2002), the issues of high dimensionality as well as the aforementioned drawbacks of neural networks are overcome. Using SVM, the regressed engine power and torque functions can be used for precise prediction, so that the number of dynamometer tests can be significantly reduced. Dynamometer tests normally cost a large amount of money and time; moreover, a dynamometer is not always available, particularly in the case of on-road fine tune-up. Research on the prediction of modern gasoline engine output power and torque subject to various parameter setups in the ECU is still quite rare, so the use of SVM for modeling engine output power and torque is, to the authors' knowledge, a first attempt.

2. SUPPORT VECTOR MACHINES

SVM is an interdisciplinary field of machine learning, optimization, statistical learning and generalization theory. It can be used for both pattern classification and nonlinear regression (Gunn, 1998). In either application, SVM casts the problem as a Quadratic Programming (QP) problem for the weights, with a regularization factor included. Since the QP objective is convex, its solution is global (and often unique) rather than merely local. The advantages of SVM (Smola et al., 1996) over neural networks are: (1) the architecture of the system does not have to be determined before training, and input data of arbitrary dimensionality can be treated with only linear cost in the number of input dimensions; (2) SVM treats regression as a QP problem of minimizing the data-fitting error plus a regularization term, which produces a global (and often unique) solution with minimum fitting error while also providing high generalization of the estimated function.

2.1 Least squares support vector machines

Least squares support vector machines (LS-SVM) (Suykens et al., 2002) is a variant of SVM that employs least-squares errors in the objective function of the optimization problem. SVM solves nonlinear regression problems by means of convex quadratic programs, and sparseness is obtained as a result of the QP problem. However, QP problems are inherently difficult to solve. Although many commercial packages exist for solving QP problems, a simpler formulation is still preferable. LS-SVM modifies the original SVM formulation so that only a set of linear equations has to be solved, which is easier than solving a QP problem. In addition, LS-SVM requires only two hyper-parameters for the Radial Basis Function kernel, whereas SVM requires three. Moreover, the threshold b is returned as part of the LS-SVM solution, while in SVM the threshold b must be calculated separately.

2.2 LS-SVM for nonlinear function estimation

Consider a data set D = {(x1, y1), …, (xN, yN)} with N data points, where xk ∈ R^n and yk ∈ R, k = 1, …, N. LS-SVM deals with the following optimization problem in the primal weight space:

\[
\min_{w,b,e} \; J_P(w,e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{k=1}^{N} e_k^2
\quad \text{such that} \quad e_k = y_k - [w^T \varphi(x_k) + b], \; k = 1,\dots,N
\tag{1}
\]

where w ∈ R^{n_h} is the weight vector of the target function, e = [e1; …; eN] is the residual vector, φ: R^n → R^{n_h} is a nonlinear mapping, n is the dimension of xk, and n_h is the dimension of the unknown feature space. Solving the dual of Eq. (1) avoids the high (and unknown) dimensionality of w. The LS-SVM dual formulation for nonlinear function estimation is then expressed as follows (Suykens et al., 2002). Solve in α, b:

\[
\begin{bmatrix} 0 & 1_v^T \\ 1_v & \Omega + \frac{1}{\gamma} I_N \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ y \end{bmatrix}
\tag{2}
\]

where I_N is the N-dimensional identity matrix, y = [y1, …, yN]^T, 1_v = [1, …, 1]^T is an N-dimensional all-ones vector, α = [α1, …, αN]^T, and γ ∈ R is a scalar regularization factor (a hyper-parameter to be tuned). The kernel trick is employed as follows:

\[
\Omega_{k,l} = \varphi(x_k)^T \varphi(x_l) = K(x_k, x_l), \qquad k, l = 1, \dots, N
\tag{3}
\]

where K is a predefined kernel function. The resulting LS-SVM model for function estimation becomes

\[
y = M(x) = \sum_{k=1}^{N} \alpha_k \varphi(x_k)^T \varphi(x) + b
= \sum_{k=1}^{N} \alpha_k K(x_k, x) + b
= \sum_{k=1}^{N} \alpha_k \exp\!\left(-\frac{\|x_k - x\|^2}{\sigma^2}\right) + b
\tag{4}
\]

where αk, b ∈ R are the solutions of Eq. (2), xk is a training data point, x is the new input case, and the Radial Basis Function (RBF) is chosen as the kernel function K. From the viewpoint of the current application, some parameters in Eq. (2) are specified as:

N: total number of engine setups (data points)
xk: engine input control parameters in the kth sample data point, k = 1, 2, …, N (i.e., the kth engine setup)
yk: engine output torque in the kth sample data point
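To make the formulation concrete, the following is a minimal NumPy sketch of training and prediction with Eqs. (2)–(4). It is our illustration, not the authors' implementation (which uses the LS-SVMlab MATLAB toolbox described in Section 4), and all function names are ours:

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # K(x_k, x_l) = exp(-||x_k - x_l||^2 / sigma^2), the RBF kernel of Eq. (4)
    d2 = (np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-d2 / sigma**2)

def lssvm_train(X, y, gamma, sigma):
    # Solve the (N+1) x (N+1) linear system of Eq. (2) for (b, alpha)
    N = X.shape[0]
    Omega = rbf_kernel(X, X, sigma)          # kernel matrix, Eq. (3)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0                           # 1_v^T
    A[1:, 0] = 1.0                           # 1_v
    A[1:, 1:] = Omega + np.eye(N) / gamma    # Omega + (1/gamma) I_N
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]                   # alpha, b

def lssvm_predict(X_train, alpha, b, X_new, sigma):
    # M(x) = sum_k alpha_k K(x_k, x) + b, Eq. (4)
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```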

3. APPLICATION OF LS-SVM TO GASOLINE ENGINE MODELLING

In the current application, M(x) in Eq. (4) is the torque function of an automotive engine. The power of the engine is calculated from the engine torque as discussed in Section 4. The issues of LS-SVM for this application domain are discussed in the following sub-sections.

3.1 Schema

The training data set is expressed as D = {dk} = {(xk, yk)}, k = 1, …, N. In practice, there are many input control parameters, and they are also ECU and engine dependent.

Moreover, the engine power and torque curves are normally obtained at full-load condition. To demonstrate the LS-SVM methodology, the following common adjustable engine parameters and environmental parameter are selected as the input (i.e., the engine setup) at engine full-load condition:

x = <Ir, O, tr, f, Jr, d, a, p> and y = <Tr>

where
r: Engine speed (RPM), r ∈ {1000, 1500, 2000, 2500, …, 8000}
Ir: Ignition spark advance at the corresponding engine speed r (degrees before top dead centre)
O: Overall ignition trim (± degrees before top dead centre)
tr: Fuel injection time at the corresponding engine speed r (millisecond)
f: Overall fuel trim (± %)
Jr: Timing for stopping the fuel injection at the corresponding engine speed r (degrees before top dead centre)
d: Ignition dwell time at 15 V (millisecond)
a: Air temperature (°C)
p: Fuel pressure (bar)
Tr: Engine torque at the corresponding engine speed r (kg-m)

The engine speed range for this project is from 1000 RPM to 8000 RPM. Although the engine speed r is a continuous variable, in a practical ECU setup the engineer normally fills in the setup parameters for each category of engine speed in a map format. The map usually divides the speed range discretely with an interval of 500 RPM, as shown in Fig. 1, i.e., r ∈ {1000, 1500, 2000, 2500, …}. It is therefore unnecessary to build a function across all speeds. For this reason, r is manually categorized with a specified interval of 500 RPM instead of taking any integer value from 0 to 8000. As some data are engine speed dependent, another notation Dr is used to specify the subset of data for a specific r. For example, D1000 contains the parameters <I1000, O, t1000, f, J1000, d, a, p, T1000>, while D8000 contains <I8000, O, t8000, f, J8000, d, a, p, T8000> (Fig. 3). Consequently, D is separated into fifteen subsets, namely D1000, D1500, …, D8000. An example of the training data (engine setups) in D1000 is shown in Table 1. Each subset Dr is passed to the LS-SVM regression module, Eq. (2), one by one in order to construct fifteen torque functions Mr(x) with respect to engine speed r, i.e., Mr ∈ {M1000, M1500, …, M8000}. In this way, the LS-SVM module is run fifteen times; in every run, a different subset Dr is used as the training set to estimate its corresponding torque function. An engine torque against engine speed curve is then obtained by fitting a curve that passes through all the data points generated by M1000, M1500, M2000, …, M8000.

4. DATA SAMPLING AND IMPLEMENTATION

In a practical engine setup, the automotive engineer determines an initial setup that can basically start the engine, and the engine is then fine-tuned by adjusting the parameters around the initial setup values. Therefore, the input parameters are sampled around an initial setup supplied by the engine manufacturer. In the experiment, a sample data set D of 200 different engine setups along with output torque was acquired from a Honda B16A DOHC engine controlled by a programmable ECU, a MoTeC M4 (Fig. 4), running on a chassis dynamometer (Fig. 5) at wide-open throttle. Only the torque against engine speed is recorded, because the horsepower HP of the engine is calculated using:

\[
HP = \frac{2\pi \times r \times 9.81 \times T}{746 \times 60}
\tag{5}
\]

where
HP: Engine horsepower (HP)
r: Engine speed (RPM: revolutions per minute)
T: Engine torque (kg-m)
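As a quick worked example of Eq. (5), the conversion can be written as the following helper (our own illustration, not from the paper):

```python
import math

def horsepower(r_rpm, torque_kgm):
    # Eq. (5): 9.81 converts kg-m torque to N-m, dividing by 60 gives
    # revolutions per second, and 746 W = 1 HP
    return 2.0 * math.pi * r_rpm * 9.81 * torque_kgm / (746.0 * 60.0)

# e.g., 20 kg-m at 1000 RPM gives roughly 27.5 HP
```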

After collection of the sample data set D, every data subset Dr ⊂ D is randomly divided into two sets, TRAINr for training and TESTr for testing, such that Dr = TRAINr ∪ TESTr, where TRAINr contains 80% of Dr and TESTr holds the remaining 20% (Fig. 6). Every TRAINr is then sent to the LS-SVM module for training, which has been implemented using LS-SVMlab (Pelckmans et al., 2003), a MATLAB toolbox, under MS Windows XP. The implementation and other important issues are discussed in the following sub-sections.

4.1 Data pre-processing and post-processing

To obtain a more accurate regression result, the data set is conventionally normalized before training (Pyle, 1999). This prevents any single parameter from dominating the output value. All input and output values are normalized to the range [0, 1] through the following transformation:

\[
N(v) = v^{*} = \frac{v - v_{\min}}{v_{\max} - v_{\min}}
\tag{6}
\]

where vmin and vmax are the minimum and maximum domain values of the input or output parameter v, respectively. For example, for v ∈ [7, 39], vmin = 7 and vmax = 39. The limits for each input and output parameter of an engine should be predetermined from a number of experiments, expert knowledge or manufacturer data sheets. As all input values are normalized, the output torque value v* produced by the LS-SVM is not the actual value; it must be de-transformed using the inverse N^{-1} of Eq. (6) to obtain the actual output value v.
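A small sketch of this pre- and post-processing step (our helper names, assuming the per-parameter limits are known in advance):

```python
def normalize(v, v_min, v_max):
    # Eq. (6): map v in [v_min, v_max] to v* in [0, 1]
    return (v - v_min) / (v_max - v_min)

def denormalize(v_star, v_min, v_max):
    # inverse N^-1 of Eq. (6): recover the actual torque value
    return v_star * (v_max - v_min) + v_min

# e.g., with v in [7, 39]: normalize(23, 7, 39) == 0.5
```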

4.2 Error function

To verify the accuracy of each function Mr, an error function is established. For a certain function Mr, the corresponding validation error is:

\[
E_r = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\frac{y_k - M_r(x_k)}{y_k}\right)^{2}}
\tag{7}
\]

where xk ∈ R^n is the vector of engine input parameters of the kth data point in a test or validation set, yk is the true torque value in the data point dk (dk = <xk, yk> represents the kth data point), and N is the number of data points in the test or validation set. The error Er is a root mean square of the difference between the true torque value yk of a test point dk and its corresponding estimated torque value Mr(xk). The difference is divided by the true torque yk so that it is normalized to the range [0, 1], which ensures that the error Er also lies in that range. Hence the accuracy rate for each torque function Mr is calculated using the following formula:

\[
\text{Accuracy}_r = (1 - E_r) \times 100\%
\tag{8}
\]
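In code, Eqs. (7) and (8) amount to the following sketch (assuming the square root stated in the surrounding text; names are ours):

```python
import numpy as np

def validation_error(y_true, y_pred):
    # Eq. (7): root mean square of the relative prediction error
    return np.sqrt(np.mean(((y_true - y_pred) / y_true) ** 2))

def accuracy(y_true, y_pred):
    # Eq. (8): accuracy rate in percent
    return (1.0 - validation_error(y_true, y_pred)) * 100.0
```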

4.3 Procedure for selection of hyper-parameters

According to Eqs. (2) and (4), the user has to adjust two hyper-parameters (γ, σ), where γ is the regularization factor and σ specifies the kernel width. Without good values for these hyper-parameters, the engine torque functions cannot perform well. To select these values, 10-fold cross-validation is usually applied, but it takes a very long time. Recently, a more sophisticated Bayesian framework (Suykens et al., 2002; Van Gestel et al., 2001a) has become available that can infer the hyper-parameter values for LS-SVM. As Bayesian inference is beyond the scope of this research, it is not discussed in detail; only the basic inference procedure is given. The procedure is based on a modified version of the LS-SVM formulation, where µ is now the regularization factor instead of γ, and ζ relates to the variance of the noise in the residuals ek (assuming constant variance):

\[
\min_{w,b,e} \; J_P(w,e) = \mu E_W + \zeta E_D
\quad \text{such that} \quad e_k = y_k - [w^T \varphi(x_k) + b], \; k = 1,\dots,N
\tag{9}
\]

with

\[
E_W = \frac{1}{2} w^T w, \qquad
E_D = \frac{1}{2}\sum_{k=1}^{N} e_k^2 = \frac{1}{2}\sum_{k=1}^{N}\left(y_k - [w^T \varphi(x_k) + b]\right)^2
\tag{10}
\]

whose dual program is the same as Eq. (2), where w ∈ R^{n_h} is the weight vector of the target function and e = [e1; …; eN] is the residual vector. The relationship of γ with µ and ζ is γ = ζ/µ.

The Bayesian inference for LS-SVM regression has three inference levels, as described below. Each level infers different parameters.



Level 1: [inference of parameters w, b]

\[
p(w, b \mid D, \mu, \zeta, M_\sigma) = \frac{p(D \mid w, b, \mu, \zeta, M_\sigma)\, p(w, b \mid \mu, \zeta, M_\sigma)}{p(D \mid \mu, \zeta, M_\sigma)}
\tag{11}
\]

where D is the training data set and Mσ = M(x) is the regressed function, similar to Eq. (4), with a specified (or guessed) kernel width σ. After derivation, the posterior probability of Eq. (11) is expressed as (Seeger, 2004):

\[
p(w, b \mid D, \mu, \zeta, M_\sigma) \propto \exp\!\left(-\frac{\mu}{2} w^T w - \frac{\zeta}{2}\sum_{k=1}^{N} e_k^2\right)
\tag{12}
\]

Minimizing \(\frac{\mu}{2} w^T w + \frac{\zeta}{2}\sum_{k=1}^{N} e_k^2\) over w and b gives the solution for the function to be estimated, which is exactly what LS-SVM does (Eqs. (9) and (10)). Level 1 is therefore not needed for inferring the hyper-parameters; it is included only to show the complete Bayesian framework.



Level 2: [inference of hyper-parameters µ, ζ]

\[
p(\mu, \zeta \mid D, M_\sigma) = \frac{p(D \mid \mu, \zeta, M_\sigma)\, p(\mu, \zeta \mid M_\sigma)}{p(D \mid M_\sigma)}
\tag{13}
\]

It was shown (MacKay, 1995; Van Gestel et al., 2001b) that evaluating Eq. (13) is equivalent to an optimization problem in the hyper-parameter γ = ζ/µ:

\[
\min_{\gamma} \; J(\gamma) = \sum_{i=1}^{N-1} \log\!\left(\lambda_{G,i} + \frac{1}{\gamma}\right) + (N-1)\log\left(E_W + \gamma E_D\right)
\tag{14}
\]

where

\[
E_W + \gamma E_D \approx \frac{1}{2}\,(y - \hat{m}_y 1_v)^T V_G \left(D_G + \frac{1}{\gamma} I_{N_{\mathrm{eff}}}\right)^{-1} V_G^T (y - \hat{m}_y 1_v)
\tag{15}
\]

with the empirical mean \(\hat{m}_y = \frac{1}{N}\sum_{i=1}^{N} y_i\), \(D_G = \mathrm{diag}([\lambda_{G,1}, \dots, \lambda_{G,N_{\mathrm{eff}}}])\) and \(V_G = [v_{G,1}; \dots; v_{G,N_{\mathrm{eff}}}]\). The unknowns Neff, vG,i and λG,i are obtained by solving a symmetric eigenvalue problem for the centered Gram matrix \(G = M \Omega M^T\), with \(M = (I_N - \frac{1}{N} 1_v 1_v^T)\) and Ω as defined in Eq. (3):

\[
G v_{G,i} = \lambda_{G,i} v_{G,i}, \qquad i = 1, \dots, N_{\mathrm{eff}} \le N-1
\tag{16}
\]

In Eq. (16), Neff is the number of non-zero eigenvalues, and λG,i and vG,i are the eigenvalues and eigenvectors of G, respectively. Once the best hyper-parameter value γMP is found (MP stands for Maximum Posterior) using a simple one-variable optimization (e.g., a line search) with Eq. (14) as the objective function, the corresponding hyper-parameters µMP and ζMP can be found as well, using Eq. (17), the relationship γMP = ζMP/µMP, and Eq. (15):

\[
\mu_{MP} = \frac{N-1}{2(E_W + \gamma_{MP} E_D)}
\tag{17}
\]

These hyper-parameters µMP and ζMP are used in the next level, although the user is actually interested in γMP only.
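As a rough illustration of the Level 2 computation, here is a NumPy/SciPy sketch of the cost of Eq. (14) and the one-variable search for γMP, under our reconstruction of Eqs. (14), (15) and (17); the helper names and the search bounds are our assumptions:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ew_plus_gamma_ed(gamma, lam, yc_proj):
    # Eq. (15): E_W + gamma*E_D ~ 0.5 * yc^T V_G (D_G + I/gamma)^-1 V_G^T yc,
    # where lam holds the nonzero eigenvalues of G and yc_proj = V_G^T yc
    return 0.5 * np.sum(yc_proj ** 2 / (lam + 1.0 / gamma))

def level2_cost(log_gamma, lam, yc_proj, N):
    # Eq. (14), searched over log(gamma) for numerical stability;
    # the sum is taken over the retained (nonzero) eigenvalues only
    gamma = np.exp(log_gamma)
    return (np.sum(np.log(lam + 1.0 / gamma))
            + (N - 1) * np.log(ew_plus_gamma_ed(gamma, lam, yc_proj)))

def infer_gamma(lam, yc_proj, N):
    # line search for gamma_MP, then mu_MP and zeta_MP via Eq. (17)
    res = minimize_scalar(level2_cost, bounds=(-10.0, 10.0), method='bounded',
                          args=(lam, yc_proj, N))
    gamma_mp = np.exp(res.x)
    mu_mp = (N - 1) / (2.0 * ew_plus_gamma_ed(gamma_mp, lam, yc_proj))
    zeta_mp = gamma_mp * mu_mp            # from gamma = zeta / mu
    return gamma_mp, mu_mp, zeta_mp
```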



Level 3: [inference of kernel parameter σ]

\[
p(M_\sigma \mid D) = \frac{p(D \mid M_\sigma)\, p(M_\sigma)}{p(D)}
\tag{18}
\]

where Mσ is the hypothesis, or function, using the RBF kernel parameterized with kernel width σ. The purpose of this step is to optimize the value of σ so that the regressed function Mσ using σ as the kernel width best fits the training data set D. It was shown (MacKay, 1995; Van Gestel et al., 2001b) that:

\[
p(M_\sigma \mid D) \propto p(D \mid M_\sigma) \propto
\sqrt{\frac{\mu_{MP}^{\,N_{\mathrm{eff}}}\, \zeta_{MP}^{\,N-1}}
{(\gamma_{\mathrm{eff}} - 1)(N - \gamma_{\mathrm{eff}})\prod_{i=1}^{N_{\mathrm{eff}}}\left(\mu_{MP} + \zeta_{MP}\lambda_{G,i}\right)}}
\tag{19}
\]

where

\[
\gamma_{\mathrm{eff}} = 1 + \sum_{i=1}^{N_{\mathrm{eff}}} \frac{\gamma_{MP}\lambda_{G,i}}{1 + \gamma_{MP}\lambda_{G,i}}
\tag{20}
\]

To optimize the value of σ, another line search is usually invoked with Eq. (19) as the objective function. This can easily be done with a commercial optimization package such as the MATLAB Optimization Toolbox.

4.4 Training

The training data are first preprocessed using Eq. (6). The hyper-parameters (γ, σ) for the target torque functions are then estimated. Since there are fifteen target torque functions, fifteen individual sets of hyper-parameters (γr, σr) are inferred with respect to r. The detailed inference procedure for a certain training data set TRAINr is listed in Fig. 7 and described below.

To find the best hyper-parameters, a line search on σ is performed. A value of σr is guessed, and the objective function for σ is evaluated as follows. First, considering the training data set TRAINr = {(xk, yk)}, k = 1, …, N, a matrix Ω is prepared with the guessed σr (the RBF kernel width), where Ωkl = K(xk, xl), together with the centering matrix \(M = (I_N - \frac{1}{N} 1_v 1_v^T)\), where xk and xl are the kth and lth data points in TRAINr and N is the number of data points in TRAINr. The centered Gram matrix \(G = M \Omega M^T\) can then be calculated. After that, the eigenvalue problem in Eq. (16) is solved using a commercial package such as MATLAB, returning the eigenvalues λG,i and eigenvectors vG,i of G, where i = 1, …, Neff and Neff is the number of non-zero eigenvalues. Knowing Neff, vG,i and λG,i, the matrices DG and VG in Eq. (15) can be constructed. The empirical mean \(\hat{m}_y = \frac{1}{N}\sum_{i=1}^{N} y_i\) can also be calculated, where yi is found in the ith data point of TRAINr. Thus EW + γED can be estimated as in Eq. (15), where γ is still an unknown scalar, and all the information needed for Eq. (14) is available at this stage.

To find the best hyper-parameter γMP, another line search is carried out with Eq. (14) as the objective function. After γMP is obtained, the hyper-parameter µMP can be calculated easily from Eq. (17), and from the relationship γMP = ζMP/µMP, ζMP is known as well. At this point γMP, ζMP, µMP, λG,i, Neff and N are all known, so γeff can be calculated from Eq. (20), and the posterior of the function using the guessed σ can be evaluated from Eq. (19). Up to this stage, only one iteration of the line search on σ is done. The algorithm then goes back, guesses another value for σ, and repeats the whole procedure until the best σ is found and returned as σMP; γMP is also returned as part of the solution. This pair of values (γMP, σMP) is considered the best set of hyper-parameters for the training data set TRAINr and is denoted (γMP,r, σMP,r). After obtaining the inferred hyper-parameters (γMP,r, σMP,r), the training data set TRAINr is used to calculate the support values α and the threshold b in Eq. (2). Finally, the target function Mr can be constructed using Eq. (4). A sketch of this whole procedure in code appears below.
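The following NumPy sketch wires the above steps together for one candidate σ and wraps them in a simple grid-style search over σ (Step F of Fig. 7). It reuses rbf_kernel and infer_gamma from the earlier sketches, relies on our reconstructions of Eqs. (19) and (20), and is only illustrative; the LS-SVMlab toolbox cited above provides its own Bayesian inference routines.

```python
import numpy as np

def log_evidence(sigma, X, y):
    # Steps A-E of Fig. 7 for one guessed sigma
    N = len(y)
    Omega = rbf_kernel(X, X, sigma)               # Eq. (3)
    M = np.eye(N) - np.ones((N, N)) / N           # centering matrix
    G = M @ Omega @ M                             # centered Gram matrix
    lam, V = np.linalg.eigh(G)                    # Eq. (16)
    nz = lam > 1e-10                              # keep the N_eff nonzero eigenvalues
    lam, V = lam[nz], V[:, nz]
    yc_proj = V.T @ (y - y.mean())                # V_G^T (y - m_y 1_v)
    gamma_mp, mu_mp, zeta_mp = infer_gamma(lam, yc_proj, N)
    gamma_eff = 1.0 + np.sum(gamma_mp * lam / (1.0 + gamma_mp * lam))  # Eq. (20)
    # Eq. (19) in log form, up to an additive constant
    log_ev = 0.5 * (len(lam) * np.log(mu_mp) + (N - 1) * np.log(zeta_mp)
                    - np.log(gamma_eff - 1.0) - np.log(N - gamma_eff)
                    - np.sum(np.log(mu_mp + zeta_mp * lam)))
    return log_ev, gamma_mp

def infer_hyperparams(X, y, sigma_grid):
    # Step F: keep the (gamma_MP, sigma) pair with the highest posterior
    best = max(((log_evidence(s, X, y), s) for s in sigma_grid),
               key=lambda t: t[0][0])
    (_, gamma_mp), sigma_mp = best
    return gamma_mp, sigma_mp
```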

5. RESULTS

To illustrate the advantage of LS-SVM regression, the results are compared with those obtained by training a multilayer feedforward neural network (MFN) with backpropagation. Since the MFN is a well-known universal estimator, its results can be considered a reasonably standard benchmark.

5.1 LS-SVM results

After all torque functions for the engine are obtained, their accuracies are evaluated one by one against their own test sets TESTr using Eqs. (7) and (8). According to the accuracies in Table 2, the predicted results are in good agreement with the actual test results under the hyper-parameters (γMP,r, σMP,r) inferred using the procedure described in Fig. 7. It is nevertheless believed that the function accuracy could be improved by increasing the amount of training data. An example comparison between the predicted and actual engine torque and horsepower under the same ECU configuration is shown in Fig. 8.

5.2 MFN results

Fifteen neural networks NETr ∈ {NET1000, NET1500, …, NET8000} with respect to engine speed r are built from the fifteen training sets TRAINr = TRr ∪ VALIDr. TRr is used for training the corresponding network NETr, whereas VALIDr is used as a validation set for early stopping of training so as to provide better network generalization.

Every neural network consists of 8 input neurons (the parameters of an engine setup at a certain engine speed r), 1 output neuron (the output torque value Tr), and 50 hidden neurons; the latter number is an initial guess, but 50 hidden neurons normally provide enough capacity to approximate a highly nonlinear function. The activation function in the hidden neurons is the tan-sigmoid transfer function, while the output neuron uses a pure linear filter (Fig. 9). The training method is the standard backpropagation algorithm (i.e., gradient descent along the negative gradient), so the MFN results can be considered a standard baseline (a rough code equivalent is sketched at the end of this section). The learning rate for the weight updates is 0.05, and each network is trained for 300 epochs. The training results of all NETr are shown in Table 3. The same test sets TESTr are used so that the accuracies of the engine torque functions built by LS-SVM and MFN can be compared fairly. The average accuracy of each NETr shown in Table 3 is calculated using Eqs. (7) and (8).

5.3 Comparison of results

With reference to Tables 2 and 3, LS-SVM outperforms the MFN by about 7.07% in overall accuracy on the same test sets TESTr. The issues of hyper-parameters and training time have also been compared. LS-SVM requires two hyper-parameters (γMP, σMP), which can be inferred using the Bayesian framework, totally eliminating this burden on the user. For the MFN, the learning rate and the number of hidden neurons must be supplied by the user. These parameters could also be selected by 10-fold cross-validation, but the user would have to predetermine a grid of candidate values, and the grid may not cover the best values. For this reason, LS-SVM can often produce a better generalization rate than the MFN, as indicated in Tables 2 and 3. The MFN produces a smaller training error than LS-SVM because it has no regularization factor controlling the tradeoff between training error and generalization; in contrast, LS-SVM generalizes much better, by about 7.07%, owing to the regularization factor γMP in its objective function.

Another issue is the time required for training. On an 800 MHz Pentium III PC with 512 MB of RAM, LS-SVM takes about 25 minutes to train once on 200 data points of 8 attributes, and the Bayesian inference of the two hyper-parameters takes about 35 minutes. In other words, fifteen engine torque functions require (25 + 35) × 15 = 900 minutes of training time. For the MFN, an epoch takes about 0.5 minutes and each network is trained for 300 epochs, so fifteen neural networks take about (300 × 0.5) × 15 = 2250 minutes. By this estimate, LS-SVM takes only 40% of the MFN training time; the major reduction comes from performing a single optimization in LS-SVM as opposed to 300 training epochs per network in the MFN. Even compared with standard SVM, LS-SVM requires less training time because the 10-fold cross-validation for choosing hyper-parameters is eliminated.
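For reference, a rough scikit-learn equivalent of the MFN baseline described in Section 5.2 might look as follows; the original networks were trained in MATLAB, so this mapping of settings (tanh hidden units, plain SGD, 10% of TRAINr held out for early stopping) is our approximation:

```python
from sklearn.neural_network import MLPRegressor

# 8 inputs -> 50 tan-sigmoid hidden neurons -> 1 linear output,
# gradient descent with learning rate 0.05 for up to 300 epochs,
# 10% of TRAIN_r held out as VALID_r for early stopping
net = MLPRegressor(hidden_layer_sizes=(50,), activation='tanh',
                   solver='sgd', momentum=0.0, learning_rate_init=0.05,
                   max_iter=300, early_stopping=True,
                   validation_fraction=0.1)
# net.fit(X_train_r, y_train_r); y_hat = net.predict(X_test_r)
```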

6. CONCLUSIONS

The LS-SVM method is applied to produce a set of torque functions for an automotive engine at different engine speeds; according to Eq. (5), the engine power is calculated from the engine torque. In this research, the torque functions are separately regressed from fifteen sets of sample data acquired from an automotive engine on a chassis dynamometer. The engine torque functions developed are very useful for vehicle fine tune-up, because the effect of any trial ECU setup can be predicted to be a gain or a loss before running the vehicle engine on a dynamometer or in a road test. If the engine performance with a trial ECU setup is predicted to be a gain, the vehicle engine is then run on a dynamometer for verification; if it is predicted to be a loss, the dynamometer test is unnecessary and another engine setup should be tried. Hence the prediction function can greatly reduce the number of expensive dynamometer tests, saving not only the time taken for optimal tune-up but also a large amount of expenditure on fuel, spare parts, lubricants, etc. The function can also let the automotive engineer predict whether a new engine setup is a gain or a loss during road tests, where a dynamometer is unavailable. Moreover, experiments have been carried out to assess the accuracy of the torque functions, and the results are highly satisfactory. In comparison with the traditional neural network method, the LS-SVM method achieves about 7.07% higher overall accuracy, and its training time is approximately 60% less than that of the neural networks. From the perspective of automotive engineering, the construction of modern automotive gasoline engine power and torque functions using LS-SVM is a new attempt, and the methodology can also be applied to other kinds of vehicle engines.

REFERENCES

Bishop, C., 1995. Neural Networks for Pattern Recognition. Oxford University Press, New York.
Brace, C., 1998. Prediction of Diesel Engine Exhaust Emission using Artificial Neural Networks. IMechE Seminar S591, Neural Networks in Systems Design, U.K.
Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, U.K.
Gunn, S., 1998. Support Vector Machines for Classification and Regression. ISIS Technical Report ISIS-1-98, Image Speech & Intelligent Systems Research Group, University of Southampton, May 1998, U.K.
Harrell, F., 2001. Regression Modelling Strategies with Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer-Verlag, New York.
Haykin, S., 1999. Neural Networks: A Comprehensive Foundation, second ed. Prentice Hall, USA.
Liu, Z., Fei, S., 2004. Study of CNG/diesel dual fuel engine's emissions by means of RBF neural network. Journal of Zhejiang University SCIENCE, 5(8): 960-965.
MacKay, D., 1995. Probable Networks and Plausible Predictions – A Review of Practical Bayesian Methods for Supervised Neural Networks. Network: Computation in Neural Systems, 6: 469-505.
Pelckmans, K., Suykens, J., Van Gestel, T., De Brabanter, J., Lukas, L., Hamers, B., De Moor, B., Vandewalle, J., 2003. LS-SVMlab: a MATLAB/C toolbox for Least Squares Support Vector Machines. Available at http://www.esat.kuleuven.ac.be/sista/lssvmlab
Pyle, D., 1999. Data Preparation for Data Mining. Morgan Kaufmann, USA.
Ryan, T., 1996. Modern Regression Methods. Wiley-Interscience, USA.
Schölkopf, B., Smola, A., 2002. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, USA.
Seber, G., Wild, C., 2003. Nonlinear Regression, new ed. Wiley-Interscience, USA.
Seeger, M., 2004. Gaussian processes for machine learning. International Journal of Neural Systems, 14(2): 1-38.
Sen, A., Srivastava, M., 1990. Regression Analysis: Theory, Methods, and Applications. Springer-Verlag, New York.
Smola, A., Burges, C., Drucker, H., Golowich, S., Van Hemmen, L., Muller, K., Scholkopf, B., Vapnik, V., 1996. Regression Estimation with Support Vector Learning Machines. Available at http://www.first.gmd.de/~smola
Su, S., Yan, Z., Yuan, G., Cao, Y., Zhou, C., 2002. A Method for Prediction In-Cylinder Compound Combustion Emissions. Journal of Zhejiang University SCIENCE, 3(5): 543-548.
Suykens, J., Van Gestel, T., De Brabanter, J., De Moor, B., Vandewalle, J., 2002. Least Squares Support Vector Machines. World Scientific, Singapore.
Tabachnick, B., Fidell, L., 2001. Using Multivariate Statistics, fourth ed. Allyn and Bacon, USA.
Traver, M., Atkinson, R., Atkinson, C., 1999. Neural Network-based Diesel Engine Emissions Prediction Using In-Cylinder Combustion Pressure. SAE Paper 1999-01-1532.
Van Gestel, T., Suykens, J., De Moor, B., Vandewalle, J., 2001a. Automatic relevance determination for least squares support vector machine classifiers. In: Proc. of the European Symposium on Artificial Neural Networks (ESANN'2001), Bruges, Belgium, Apr. 2001: 13-18.
Van Gestel, T., Suykens, J., Lambrechts, D., Lanckriet, A., Vandaele, G., De Moor, B., Vandewalle, J., 2001b. Predicting Financial Time Series using Least Squares Support Vector Machines within the Evidence Framework. IEEE Trans. on Neural Networks, Special Issue on Financial Engineering, 12(4): 809-821.
Yan, Z., Zhou, C., Su, S., Liu, Z., Wang, X., 2003. Application of Neural Network in the Study of Combustion Rate of Natural Gas/Diesel Dual Fuel Engine. Journal of Zhejiang University SCIENCE, 4(2): 170-174.


Fig. 1 Example of fuel map in a typical ECU setup where the engine speed (RPM) is discretely divided

Fig. 2 Example of engine output horsepower and torque curves


D     = <I1000 … I8000, O, t1000 … t8000, f, J1000 … J8000, d, a, p, T1000 … T8000>
D1000 = <I1000, O, t1000, f, J1000, d, a, p, T1000>
D1500 = <I1500, O, t1500, f, J1500, d, a, p, T1500>
…
D8000 = <I8000, O, t8000, f, J8000, d, a, p, T8000>

Fig. 3 Division of data set D into 15 subsets Dr according to various engine speeds

Fig. 4 Adjustment of engine input parameters using MoTeC M4 programmable ECU


Fig. 5 Car engine performance data acquisition on a chassis dynamometer

D1000 → 80% TRAIN1000 (90% TR1000 + 10% VALID1000) + 20% TEST1000
⋮
D8000 → 80% TRAIN8000 (90% TR8000 + 10% VALID8000) + 20% TEST8000

Fig. 6 Further division of data randomly into training sets (TRAINr) and test sets (TESTr)


Optimize σr with respect to p(TRAINr | Mσr). For each candidate σr, calculate the optimal µMP,r, ζMP,r and γMP,r as follows:

A. Solve the eigenvalue problem in Eq. (16), obtaining λG,i, vG,i and Neff.
B. Find the optimal value γMP,r using a line search with Eq. (14) as the objective.
C. Given γMP,r, calculate µMP,r and ζMP,r using Eq. (17) and the relationship γMP,r = ζMP,r/µMP,r.
D. Given γMP,r and λG,i, calculate γeff from Eq. (20).
E. Using Eq. (19), evaluate the posterior p(TRAINr | Mσr) for σr.
F. If the current σr produces the highest posterior, return the values (γMP,r, σr). Otherwise, choose another value for σr and go back to Step A. The choice of the next σr can be made by well-known optimization methods (e.g., a line search); Steps A to E merely prepare the objective function to be optimized.

Fig. 7 Inference procedure for hyper-parameters (γ, σ)

Fig. 8 Example of comparison between predicted and actual engine torque and power


Fig. 9 Architecture (layer diagram) of every MFN


      I1000   O    t1000   f    J1000   d     a    p     T1000
d1      8     0     7.1    0     385    3     25   2.8    20
d2     10     2     6.5    0     360    3     25   2.8    11
...    ...   ...    ...   ...    ...   ...   ...   ...    ...
dN     12     0     7.5    3     360    2.7   30   2.8    12

Table 1 Example of training data di in data set D1000


Torque       γMP,r   σMP,r   Mean square error with   Average accuracy with
function Mr                  training set TRAINr      test set TESTr

M1000        0.28     2.32        0.43%                   91.2%
M1500        0.31     8.77        0.65%                   91.1%
M2000        0.22     4.91        0.89%                   90.5%
M2500        1.14     5.64        0.44%                   91.2%
M3000        0.59     2.42        0.32%                   91.3%
M3500        0.74     4.37        0.27%                   91.6%
M4000        0.98     3.38        0.08%                   92.5%
M4500        1.33     5.89        1.25%                   84.2%
M5000        0.10    10.71        2.10%                   81.1%
M5500        0.49     6.87        1.89%                   83.2%
M6000        0.59    10.92        1.24%                   88.7%
M6500        1.23     7.43        0.58%                   90.0%
M7000        0.43     3.05        0.77%                   91.3%
M7500        0.75     6.34        0.66%                   90.5%
M8000        0.61     3.28        0.39%                   90.4%
Overall                           0.80%                   90.32%

Table 2 Accuracy of different functions Mr and their corresponding hyper-parameter values


Neural network   Training error        Average accuracy with
NETr             (mean square error)   test set TESTr

NET1000              0.01%                 86.1%
NET1500              0.03%                 87.2%
NET2000              0.07%                 87.9%
NET2500              0.25%                 83.4%
NET3000              0.04%                 85.5%
NET3500              0.24%                 81.4%
NET4000              0.04%                 86.3%
NET4500              0.08%                 79.3%
NET5000              0.12%                 75.2%
NET5500              0.85%                 77.3%
NET6000              0.23%                 82.9%
NET6500              0.45%                 85.3%
NET7000              0.07%                 82.4%
NET7500              0.12%                 84.8%
NET8000              0.21%                 83.8%
Overall              0.19%                 83.25%

Table 3 Training errors and average accuracy of the fifteen neural networks
