Genetic Algorithms Solve Combinatorial Optimisation Problems in the Calibration of Combustion Engines
K. Knoedler1, J. Poland1, A. Mitterer2 and A. Zell1
1 University of Tuebingen, Computer Science Department, D-72076 Tuebingen, Germany
[email protected]
2 BMW Group Munich, D-80788 Muenchen, Germany
[email protected]
Abstract
Several combinatorial optimisation problems occur during the calibration of combustion engines. This work shows that three particular process steps benefit from genetic algorithms. First, the construction of D-optimal experimental designs is improved by an appropriate crossover operator, with the DETMAX or k-exchange heuristics performing the local search. The second problem concerns optimal test bed scheduling for a more efficient and thus less expensive execution of measurements. This higher-dimensional variant of the Travelling Salesman Problem (TSP) is solved by a hybrid genetic algorithm using adjacency-coded individuals and a 2-opt heuristic as local search. Finally, well-defined look-up tables that lead to smooth maps are composed from multiple-valued look-up tables. Again, a genetic algorithm finds better solutions than local search heuristics.
1 Introduction
Constantly decreasing legal upper bounds for exhaust emissions and fuel consumption as well as customer requests for economy and performance constitute a permanent challenge for car manufacturers. Without the introduction of additional engine functions, there is no way to meet both the legal and the customer demands. As a consequence, the number of adjustable engine actuators and parameters increases vastly. These variables are controlled by an electronic control unit. In order to manage the calibration of this unit with acceptable expenditure of time and manpower, more tasks need to be performed automatically. In this work, (hybrid) genetic algorithms (see e.g. [1],[2],[3]) are used to solve three combinatorial problems that occur during the calibration process of combustion engines. Figure 1 visualises the main parts of a modern calibration workflow. In the first phase, design of experiments (DOE) techniques are used to determine measuring points that lead to an optimal polynomial engine model (the shaded block in the upper left corner of Figure 1). The calculation of an optimal test bed schedule achieves a faster measuring process. It takes into account how the
engine and the test bed behave when the parameter settings are changed (the shaded block in the middle of Figure 1). This problem is a higher dimensional TSP variant, where a best way through the engine's parameter space is calculated. In the following step, the engine is modelled by artificial neural networks on one side and multivariate regression models on the other. The input variables for these models are the engine speed, the relative air-mass flow, the inlet and the exhaust valve spread, and the ignition timing angle. The outputs are the fuel consumption and the exhaust emissions. The engine speed and the air-mass flow define the operating range. The other parameters are controlled by an electronic control unit with respect to the presently detected position on the operating range. In the next step, the input combinations that lead to the best outputs are determined using evolutionary strategies (see e.g. [3]) or sequential quadratic programming techniques (see e.g. [4]) with constraint handling (see e.g. [5]). Verification measurements at the test bed compare the quality of the modelling and optimisation results with the real physical engine behaviour. In the final step, the validated optimal parameters are converted to look-up tables that are stored within electronic control units (the shaded block on the right side in Figure 1).
Figure 1: Main parts of a calibration process for modern combustion engines. The tasks corresponding to the shaded blocks, i.e. design of experiments, optimal test bed scheduling, and look-up table calculation, are automated using genetic algorithms.
A deeper insight into the calibration workflow is given in [6], [7], and [8]. The tasks related to the shaded blocks in Figure 1 are explained in the next sections. In section 2, a suitable crossover operator is described which assists the DETMAX or the k-exchange heuristics in constructing better D-optimal designs of experiments ([9]). The problem described in section 3 is to arrange measuring points in an optimal execution order before starting the measurement process itself. A genetic algorithm combined with the 2-opt local search ([10],[11]) solves this TSP variant ([12]). The fine tuning of look-up tables for electronic control units is described in section 4. Different approaches for the smoothing of maps defined by multiple-valued look-up tables are suggested ([13],[14]).
2 D-optimal Design of Experiments
Since the number of adjustable engine actuators constantly increases, it is no longer economical to execute a full factorial grid of measurements. Instead, parts of the system are modelled and optimised in an off-line process at the computer. The goal of the design of experiments (DOE) is to select a set of measuring points in such a way that an optimal estimate for a polynomial model of a predetermined order is achieved ([15],[6]). Assume there are $n$ candidates $x_1, \dots, x_n$ defined by $n$ points $u_1, \dots, u_n$ in the parameter space and by a polynomial model. For the 2-dimensional case, the candidates $x_j = (1, u_{1j}, u_{2j}, u_{1j}^2, u_{2j}^2, u_{1j} u_{2j})^t$, $j = 1, \dots, n$, define a 2nd order model for the points $(u_{1j}, u_{2j})_{j=1}^{n}$. By the choice of $p < n$ candidates indicated by $\xi = (j_1, \dots, j_p) \in \{1, \dots, n\}^p$, the design matrix is defined by $X_\xi = (x_{j_1}, \dots, x_{j_p})^t$.
Consider a model
$$ y = X_\xi \beta + \varepsilon $$
which is linear in the coefficients, where $\varepsilon$ is a random vector with distribution $N(0, \sigma^2 \cdot \mathrm{Id})$ and $y$ is the observation vector of size $p$. The least squares estimate
$$ \hat{\beta} = (X_\xi^t X_\xi)^{-1} X_\xi^t y $$
is optimised by minimising its covariance matrix $(X_\xi^t X_\xi)^{-1} \sigma^2$ through an appropriate choice of the candidates $j_1, \dots, j_p$. There exist several alternative criteria for minimising a matrix. Here, the D-optimality criterion is considered, which is characterised by a minimised $\det((X_\xi^t X_\xi)^{-1})$ or, equivalently, a maximised $\det(X_\xi^t X_\xi)$. This work describes how a genetic algorithm can improve the common heuristics DETMAX ([16],[17]) and k-exchange (see e.g. [18]) for the construction of experimental designs.
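The D-optimality criterion can be illustrated with a minimal Python sketch. It builds the 2nd order candidates for the 2-dimensional case and evaluates $\det(X_\xi^t X_\xi)$ exactly with rational arithmetic. The function names, the 3 x 3 candidate grid and the use of plain Gaussian elimination are assumptions made for this illustration, not part of the method described in this paper.

```python
from fractions import Fraction


def quadratic_row(u1, u2):
    """Candidate x_j = (1, u1, u2, u1^2, u2^2, u1*u2) of a 2nd order model."""
    return [1, u1, u2, u1 * u1, u2 * u2, u1 * u2]


def information_matrix(rows):
    """Return X^t X for a design matrix X given as a list of rows."""
    m = len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) for b in range(m)] for a in range(m)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over rational numbers."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def d_criterion(points, design):
    """det(X_xi^t X_xi) for a design xi given as a list of candidate indices."""
    rows = [quadratic_row(*points[j]) for j in design]
    return determinant(information_matrix(rows))


# Hypothetical candidate set: the 3 x 3 grid on {-1, 0, 1}^2.
grid = [(u1, u2) for u1 in (-1, 0, 1) for u2 in (-1, 0, 1)]
full = d_criterion(grid, list(range(9)))
without_centre = d_criterion(grid, [j for j in range(9) if grid[j] != (0, 0)])
too_small = d_criterion(grid, [0, 1, 2, 3, 4])  # fewer points than coefficients
```

In this toy example, dropping the centre point lowers the determinant, and a design with fewer points than model coefficients yields a singular matrix with determinant zero.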
2.1 The Genetic Algorithm
All of the heuristics used to construct D-optimal designs are based on the idea of sequentially exchanging bad candidates for better ones (see [19] for a comparison). Depending on the current size $p$ and the desired final size $p_0$ of the design $\xi$, the DETMAX algorithm either adds the candidate $x_j$ that increases the determinant most or removes the candidate that decreases it least. The new determinant can be expressed in terms of the old one, e.g. for the addition process:
$$ \det\left( \begin{pmatrix} X_\xi^t & x_j \end{pmatrix} \begin{pmatrix} X_\xi \\ x_j^t \end{pmatrix} \right) = \det(X_\xi^t X_\xi) \cdot \left( 1 + x_j^t (X_\xi^t X_\xi)^{-1} x_j \right). $$
The candidate that maximises the sum in brackets is added. For the removal process, the candidate $x_j$ that maximises $1 - x_j^t (X_\xi^t X_\xi)^{-1} x_j$ is chosen, so that the determinant decreases as little as possible. The k-exchange algorithm considers the case that it might not be optimal to add the candidate for which $1 + x_j^t (X_\xi^t X_\xi)^{-1} x_j$ attains its maximum if afterwards the formula $1 - x_j^t (X_\xi^t X_\xi)^{-1} x_j$ forces the removal of the wrong candidate. The addition and the removal form one step, which requires $N \cdot p$ instead of $p$ examinations. These heuristics often get stuck in local optima for large numbers of candidates. A suitable crossover operation can combine two designs into one design that inherits only their good properties. The left side of Figure 2 shows a 2-dimensional illustration of this mechanism.
2.1.1 The Individual Coding
One possible representation of a design is a bit string $b$ of fixed length $n$, $b = b_1 \dots b_n$, that contains the candidate $j$ if $b_j = 1$ (see Figure 2, right side). Using this binary coding together with standard crossover types (see e.g. [1],[2]) yields good results for medium-sized designs. Another coding that is capable of including repetitions is an ordered list containing the numbers of all points of a design. Design sizes of up to $2 \cdot p_0$ points (unused entries are filled with 0) offer more variety during the optimisation process. An adapted crossover operation on list-coded individuals is used. Either the DETMAX or the k-exchange algorithm performs a local search.
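The determinant update used by DETMAX in section 2.1 follows from the matrix determinant lemma. As a short worked step, write $A = X_\xi^t X_\xi$ (a shorthand introduced here only for brevity) and assume that $A$ is invertible. Appending the row $x_j^t$ to $X_\xi$ adds the rank-one term $x_j x_j^t$ to $A$, and removing it subtracts the same term:

```latex
\begin{align*}
\det\left(A + x_j x_j^t\right)
  &= \det(A)\,\det\left(\mathrm{Id} + A^{-1} x_j x_j^t\right)
   = \det(A)\left(1 + x_j^t A^{-1} x_j\right), \\
\det\left(A - x_j x_j^t\right)
  &= \det(A)\,\det\left(\mathrm{Id} - A^{-1} x_j x_j^t\right)
   = \det(A)\left(1 - x_j^t A^{-1} x_j\right).
\end{align*}
```

Since $\det(A)$ is the same for every candidate, comparing candidates only requires the quadratic form $x_j^t A^{-1} x_j$ and no new determinant evaluation.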
Figure 2: Left: 2-dimensional illustration of a crossover operation on design individuals. The upper offspring inherits the good properties of the parents. Right: Visualisation of binary coded design individuals. An individual is a binary string, where every bit corresponds to one candidate.
2.1.2 The Fitness Function
Since designs with $|\xi| > p_0$ may have larger determinants, this value should not be used directly as fitness function. An initial estimate $d_0$ of the optimum obtained by a single heuristic run helps to define a non-stationary fitness function (see e.g. [5] for this topic):
$$ \Phi(\xi) = \det\left(X_\xi^t X_\xi\right)^{-1} + C(t) \cdot 1_{|\xi| > p_0} \cdot (|\xi| - p_0) \cdot d_0 $$
2.2 An Application Example
As application example, a data set with 1280 points in the 4-dimensional parameter space and a 3rd order polynomial model is used ([6],[9]). The desired size of the design is $p_0 = 130$, where 66 candidates are already fixed and the algorithm has to choose the remaining 64 candidates. Figure 3 shows the results of 20 runs of the genetic algorithm using the k-exchange and the DETMAX algorithms as local search operations compared to the results of the pure heuristics. The plot uses a logarithmic (base 2) scale, where the values are relative to the fitness of the pure k-exchange algorithm using the list coding. In this application, the genetic algorithm always yields an improved design compared to the corresponding heuristic. Moreover, list coding yields better results than binary coding.
Figure 3: Relative performance of the algorithms for D-optimal DOE.
3 Test Bed Scheduling as TSP Variant
As indicated by Figure 1, the measuring process holds a dominant place in the calibration process. The parameter setting at test beds often results in undesired system oscillations which slow down the data recording. An optimised measuring order can achieve a faster measuring process ([12]). In addition, the robustness and the reproducibility of the measurements are improved. Changing the operating point, i.e. the engine speed and the relative air-mass flow, is more critical than changing the valve spreads. Therefore, in a set of 4-dimensional measuring points $\{x_1^i, x_2^i, y_1^i, y_2^i\}_{i=1}^{N}$, more weight lies on the x-range, i.e. the operating range, than on the y-range. The set of $N$ measuring points is separated into $j = 1, \dots, M$ ($M < N$) subsets, each defined by $k = 1, \dots, n_j$ points $y_k^j = (y_{1k}^j, y_{2k}^j)$ on the y-range at the same operating point $x^j$. Figure 4 displays the $y_1$ components for some operating points. To reduce oscillations, a separation into two subproblems, both TSP variants, may be helpful. In the first step, only the $M$ different x-points are ordered. In the second step, the order of the $n_j$ y-points within each of the $M$ blocks is optimised.
Figure 4: Visualisation of the application data set for optimal test bed scheduling. Here the $y_1$ components (filled circles) at some operating points $x$ (stars) are displayed. [Plot omitted; axes: $x_1$, $x_2$, $y_1$.]
3.1 Subproblem 1: Minimising paths on the x-range
The ordering of the x-points differs in two respects from the original TSP: First, one-way paths instead of cycles are of interest, where the x-point with minimal $x_1$ and $x_2$ values is used as starting point. Second, the path must be driven in certain directions in order to achieve sufficiently short relaxation times. A common heuristic for the TSP is the 2-opt edge exchange algorithm combined with single point insertion. Especially for large problems, a genetic algorithm using this heuristic as local search yields better results.
The genetic algorithm
Adjacency coding for list representations of TSP paths was originally suggested in [20] and afterwards extended to matrix representations in [21] and [22]. This locus-based coding is much more appropriate for genetic algorithms than a time-based coding. In order to introduce an adjacency coding that uses the natural dimension of the TSP, the ideas of multidimensional embeddings of graphs presented in [23] and [24] are used. For this purpose, rectangular grids with a minimal number of grid points are constructed using the algorithms given in [25]. Infeasible offspring created by n-point crossover on adjacency-coded individuals are repaired by replacing or inserting multiple or missing points according to the order within the parents ([21],[22]). A strong local search is performed by the above-mentioned heuristic. The distance between two points $x$ and $\tilde{x}$ is defined by their Euclidean distance plus the weighted $x_1$ and $x_2$ distances:
$$ d(x, \tilde{x}) = \|x - \tilde{x}\|_2 + g_{x_1} \cdot |x_1 - \tilde{x}_1| + g_{x_2} \cdot |x_2 - \tilde{x}_2|. $$
Now an appropriate fitness function for an individual $p$ is defined by
$$ \Phi(p) = \sum_{j=1}^{M-1} d\left(x^{p(j)}, x^{p(j+1)}\right), $$
Figure 5: Paths on the operating range for different weight factors. [Three plots omitted; axes: $x_1$, $x_2$.]
where $p$ is a permutation of $\{1, \dots, M\}$ representing a path. The weights $g_{x_1}$ and $g_{x_2}$ help to prefer paths that run mainly parallel to the coordinate axis $x_1$ or $x_2$ and that change the most critical parameter less frequently.
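The weighted distance, the path fitness $\Phi(p)$ and a 2-opt local search for one-way paths can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the four example points are hypothetical, the start point is simply kept fixed at the first position, and the single point insertion and the direction constraints mentioned above are omitted.

```python
import math


def weighted_distance(a, b, g1=1.0, g2=1.0):
    """d(x, x~) = ||x - x~||_2 + g1 * |x1 - x~1| + g2 * |x2 - x~2|."""
    dx1, dx2 = abs(a[0] - b[0]), abs(a[1] - b[1])
    return math.hypot(dx1, dx2) + g1 * dx1 + g2 * dx2


def path_length(points, order, g1=1.0, g2=1.0):
    """Fitness Phi(p): length of the open path visiting the points in 'order'."""
    return sum(
        weighted_distance(points[order[j]], points[order[j + 1]], g1, g2)
        for j in range(len(order) - 1)
    )


def two_opt(points, order, g1=1.0, g2=1.0):
    """Plain 2-opt for an open path; the start point order[0] stays fixed."""
    best = list(order)
    best_len = path_length(points, best, g1, g2)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for k in range(i + 1, len(best)):
                # Reversing the segment best[i..k] exchanges the edges at its ends.
                candidate = best[:i] + best[i:k + 1][::-1] + best[k + 1:]
                cand_len = path_length(points, candidate, g1, g2)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best, best_len


# Hypothetical operating points on the corners of the unit square.
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
order, length = two_opt(points, [0, 3, 1, 2])

# With g_x1 = 10 every change of x1 is expensive, so the path that changes
# x1 only once is much shorter than the one that changes it twice.
few_x1_changes = path_length(points, [0, 2, 3, 1], g1=10.0, g2=1.0)
many_x1_changes = path_length(points, [0, 1, 3, 2], g1=10.0, g2=1.0)
```

The last two values show the effect of the weights: both paths have the same Euclidean length, but the larger weight on $x_1$ favours the path that changes this coordinate less often.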
3.2 Subproblem 2: Minimising paths on the y-range
The sequence of the $n_j$ points within each of the $M$ blocks $y_k^j$ is determined by searching the shortest path between all $N$ y-points while preserving the order of the x-points. The length of a complete path $P = (p_1, p_2, \dots, p_M)$, where $p_j$ is a permutation of the numbers $\{1, \dots, n_j\}$, is given by the function $\Psi$:
$$ \Psi(P) = \sum_{j=1}^{M} \sum_{k=1}^{n_j - 1} d\left(y^j_{p_j(k)}, y^j_{p_j(k+1)}\right) + \sum_{j=1}^{M-1} d\left(y^j_{p_j(n_j)}, y^{j+1}_{p_{j+1}(1)}\right). $$
The first term in $\Psi$ calculates the total length of all individual sub-paths, the second adds the connections between them. Accurate results are achieved using the Euclidean distance $d(y, \tilde{y}) = \|y - \tilde{y}\|_2$ between two points $y$ and $\tilde{y}$.
3.2.1 Orientation switching
The shortest path between the $n_j$ y-points within each block $j$ is calculated by minimising the $j$-th sum over $k$ in $\Psi$. For small $n_j$, this can be done by evaluating all possible permutations; in other cases, the heuristics easily find the shortest connection. A genetic algorithm is used to find the best orientation for each optimised path by minimising the second term in $\Psi$. The individuals are binary coded bit strings of length $M$ with 0 or 1 at each bit position for the two orientations respectively.
3.2.2 Pure genetic approach
Here, a generalisation of the latter method for small $n_j$ is considered. At each point $x^j$ there are $n_j$ y-points, hence $n_j!$ possible permutations. Remember that the former method only used the two permutations representing the shortest path with different orientations. Variable alphabet coding suggested in [12] and [14] is now used to generate individuals, which will again be treated by a genetic algorithm. An individual $v$ has the form $v = (v_j)_{j=1}^{M} \in \bigotimes_{j=1}^{M} \{1, \dots, n_j!\}$. This method uses the total function $\Psi$ as fitness function for the individuals. Note that the performance of a genetic algorithm using variable alphabet coding rapidly decreases when the number of possible values at the bits increases (in this case when $n_j \ge 5$ for at least one $j$).
Figure 6: Relative algorithm performance of the orientation switching and of the pure genetic algorithm approach for different weight factors. [Plot omitted; legend: MC Orientation Switching, GA Orientation Switching, GA Variable Alphabet Coding; horizontal axis: weight factor combinations (1,1), (5,1), (10,1).]
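The path length $\Psi$ and the orientation switching of section 3.2.1 can be sketched in a few lines. The code below is only an illustration: the function names and the example blocks are hypothetical, and an exhaustive search over all $2^M$ bit strings stands in for the genetic algorithm, which is feasible only for very small $M$.

```python
import math
from itertools import product


def euclid(a, b):
    """Euclidean distance between two 2-dimensional y-points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def psi(blocks):
    """Psi(P): sub-path lengths inside the blocks plus the links between blocks.

    blocks[j] lists the y-points of block j in their visiting order.
    """
    inner = sum(euclid(b[k], b[k + 1]) for b in blocks for k in range(len(b) - 1))
    links = sum(
        euclid(blocks[j][-1], blocks[j + 1][0]) for j in range(len(blocks) - 1)
    )
    return inner + links


def apply_orientation(blocks, bits):
    """Bit 1 reverses the sub-path of a block, bit 0 keeps its orientation."""
    return [b[::-1] if bit else list(b) for b, bit in zip(blocks, bits)]


def best_orientation(blocks):
    """Exhaustive search over all bit strings (stand-in for the genetic algorithm)."""
    return min(
        product((0, 1), repeat=len(blocks)),
        key=lambda bits: psi(apply_orientation(blocks, bits)),
    )


# Hypothetical data: three blocks whose optimised sub-paths all run left to right.
blocks = [[(0, 0), (1, 0)], [(0, 0), (1, 0)], [(0, 0), (1, 0)]]
before = psi(blocks)
bits = best_orientation(blocks)
after = psi(apply_orientation(blocks, bits))
```

Reversing a sub-path leaves the first term of $\Psi$ unchanged, so only the connections between consecutive blocks are affected by the bit string.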
3.3 An application example
For the data set partly displayed in Figure 4, minimised paths on the x- and the y-range are calculated. There are $N = 200$ measuring points at $M = 72$ x-points. Figure 5 shows the results for the first subproblem using three different weight factor combinations $(g_{x_1}, g_{x_2}) \in \{(1,1), (5,1), (10,1)\}$. The described methods for the second subproblem use the standard crossover and mutation operators. For comparison, the orientation switching was also performed using a simple heuristic. The results of 100000 runs are presented, each of which sequentially switches the orientations at 1000 randomly chosen x-points. Figure 6 shows the results of 30 runs of the algorithms for different weight factor combinations.
4 Final look-up table design
At the end of the calibration process, a final selection task has to be performed in order to fix look-up tables for the electronic control unit. The use of different engine models and optimisation methods normally leads to several optimum parameter candidates at each operating point. These candidates usually vary only slightly in their quality, i.e. the resulting fuel consumption and exhaust emissions, but may differ significantly in their parameter combinations. Figure 7 shows an example look-up table. Since the look-up tables that are finally stored in the electronic control unit need to have well-defined parameter values at each operating point, a candidate selection is necessary. Thereby the following problem has to be considered: There are parameters that are mechanically adjusted by their
actuators, e.g. the valve spreads are set by the camshaft. Mechanical adjustment needs some time. Therefore it is important that the maps defined by a look-up table are sufficiently smooth to ensure fast transitions. Other actuators or parameters are adjusted electrically or electronically, e.g. the ignition timing angle. In this case there is less need for smooth transitions. At each operating point the candidate has to be selected, which together with the other chosen candidates leads to the smoothest map. In [13] the problem of composing the smoothest map from such a set of candidates was shown to be NP-hard for a certain abstract smoothness criterion. Here, this problem is solved by a genetic algorithm using variable alphabet coding and standard crossover operations. For larger data sets, a local search performed by a simple neighbourhood heuristic can be helpful ([26]).
Figure 7: Visualisation of the $y_1$ components of the candidates at a grid of operating points. For one example operating point x13 the candidates $y_{11}^3, y_{12}^3, y_{13}^3$ are named. [Plot omitted; axes: $x_1$, $x_2$.]
4.1 The genetic algorithm
At the starting point, there are $j = 1, \dots, M$ operating points $(x_1^j, x_2^j)$ with $k = 1, \dots, n_j$ optimal parameter candidates $y_k^j$. To solve the described problem of choosing the appropriate candidate at each operating point, a genetic algorithm with variable alphabet coding is used. An individual $v$ is then represented as:
$$ v = (v_j)_{j=1}^{M} \in \bigotimes_{j=1}^{M} \{1, \dots, n_j\}. $$
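Variable alphabet coding means that every gene has its own range of admissible values. A minimal sketch of such individuals with uniform crossover and a simple mutation is given below. The function names, the mutation scheme, the alphabet sizes and the random seed are assumptions chosen for this illustration.

```python
import random


def random_individual(sizes, rng):
    """Gene j is drawn uniformly from its own alphabet {1, ..., n_j}."""
    return [rng.randint(1, n) for n in sizes]


def uniform_crossover(a, b, rng):
    """Each gene is inherited from one of the parents, so it stays in its alphabet."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(v, sizes, rate, rng):
    """With probability 'rate', gene j is redrawn from {1, ..., n_j}."""
    return [rng.randint(1, n) if rng.random() < rate else g for g, n in zip(v, sizes)]


# Hypothetical numbers of candidates n_j at M = 5 operating points.
sizes = [1, 3, 6, 2, 4]
rng = random.Random(0)
parent_a = random_individual(sizes, rng)
parent_b = random_individual(sizes, rng)
child = mutate(uniform_crossover(parent_a, parent_b, rng), sizes, 0.2, rng)
```

Because crossover only exchanges genes at the same position, standard crossover operators never produce infeasible individuals under this coding, and no repair step is needed.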
The integrated (Euclidean) norm of the gradients is a suitable smoothness criterion for a map $f_y$ and thereby a suitable fitness function for the genetic algorithm:
$$ \Phi(f_y) = \int_x \|\nabla f_y(x)\| \, dx. $$
In some cases, e.g. for the design of look-up tables for the valve spreads, two or more maps need to be smoothed simultaneously. Choosing one constellation within one look-up table already fixes the corresponding constellation in the other. Here, a simple aggregation method $\Phi(f_{y_1}, f_{y_2}) = \Phi(f_{y_1}) + \Phi(f_{y_2})$ allows the simultaneous smoothing of the maps $f_{y_1}$ and $f_{y_2}$. Variable alphabet coding allows the application of standard crossover operations, like uniform crossover or n-point crossover. For larger data sets, the genetic algorithm can be combined with a local search operation using a neighbourhood heuristic. In a certain number of steps, at a randomly chosen operating point $x$, the candidate $y$ is selected that is most similar to its presently chosen neighbours. The similarity between the candidates is characterised by the Euclidean distance between the candidate and the mean of the surrounding y-points.
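The smoothness criterion and one step of the neighbourhood heuristic can be sketched for a single scalar map on a rectangular grid. Several assumptions are made here that are not taken from this paper: the integral is replaced by a sum of forward-difference gradient norms on a unit grid, the neighbourhood consists of the four adjacent grid points, and all function names and example values are hypothetical.

```python
import math


def smoothness(values):
    """Discrete stand-in for the integral of ||grad f||, using forward differences."""
    rows, cols = len(values), len(values[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            dx = values[r][c + 1] - values[r][c] if c + 1 < cols else 0.0
            dy = values[r + 1][c] - values[r][c] if r + 1 < rows else 0.0
            total += math.hypot(dx, dy)
    return total


def decode(candidates, selection):
    """Map built from the chosen candidate (1-based gene) at every operating point."""
    return [
        [candidates[r][c][selection[r][c] - 1] for c in range(len(candidates[0]))]
        for r in range(len(candidates))
    ]


def neighbourhood_step(candidates, selection, r, c):
    """Select at (r, c) the candidate closest to the mean of the chosen neighbours."""
    values = decode(candidates, selection)
    neigh = [
        values[i][j]
        for i, j in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        if 0 <= i < len(values) and 0 <= j < len(values[0])
    ]
    if not neigh:  # a 1 x 1 grid has no neighbours
        return selection
    mean = sum(neigh) / len(neigh)
    options = candidates[r][c]
    selection[r][c] = 1 + min(range(len(options)), key=lambda k: abs(options[k] - mean))
    return selection


# Hypothetical 3 x 3 grid: one candidate everywhere, two candidates at the centre.
candidates = [[[0.0] for _ in range(3)] for _ in range(3)]
candidates[1][1] = [5.0, 0.5]
selection = [[1] * 3 for _ in range(3)]
before = smoothness(decode(candidates, selection))
selection = neighbourhood_step(candidates, selection, 1, 1)
after = smoothness(decode(candidates, selection))
```

Accepting the new selection only when `after < before` would correspond to a variant that takes the global shape of the map into account.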
Figure 8: An application example for the problem of simultaneously smoothing two look-up tables. The candidates at each operating point are marked by different symbols. The best results for the look-up tables are visualised by the meshes. [Two plots omitted; axes: $x_1$, $x_2$ and $y_1$ or $y_2$ respectively.]
4.2 An application example
As application example, two look-up tables are constructed using a data set of 431 points in 4 dimensions. At $j = 1, \dots, 121$ different operating points $(x_1^j, x_2^j)$ there are $k = 1, \dots, n_j$ parameter candidates $(y_{1k}^j, y_{2k}^j)$, where $n_j \in \{1, \dots, 6\}$. Figure 8 visualises the candidates for each look-up table. In Figure 9 the performances of the genetic algorithm using 30 runs and of the heuristic using 1000 runs with 1000 selection steps each are displayed. The heuristic performs the candidate selection either with or without consideration of the global shape of the map. In the former case, a candidate is only selected if the fitness is improved. The best result for the final look-up tables is visualised by the meshes in Figure 8.
Figure 9: Relative performance of the smoothing algorithms: The genetic algorithm with standard mutation for variable alphabet coded individuals finds the best results. The elite neighbourhood heuristic, where a new candidate is only chosen if global improvement is achieved, and the standard neighbourhood heuristic perform worse. [Plot omitted; min, mean and max values are shown for the genetic algorithm, the elite heuristic and the heuristic.]
5 Conclusions Three combinatorial optimisation problems occurring in the engine calibration are solved with genetic algorithms. The DETMAX and the k-exchange heuristics for the construction of D-optimal experimental designs can be improved. A multi-step optimisation for optimal test bed schedules to reduce the overall measuring time uses genetic algorithms. Genetic algorithms also give benefits to the fine tuning of look-up tables which has been performed manually in the past.
Acknowledgments We thank Thomas Fleischhauer and Frank Zuber-Goos for helpful discussions. This research has been supported by the BMBF (grant no. 01 IB 805 A/1).
References
1 J. Holland. Adaptation in Natural and Artificial Systems. Ann Arbor: The University of Michigan Press, 1975.
2 D. Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley, 1989.
3 T. Baeck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, 1996.
4 R. Fletcher. Practical Methods of Optimization. Chichester, Wiley & Sons, 2nd edition, 1991.
5 Z. Michalewicz. A survey of constraint handling techniques in evolutionary computation methods. Fourth Annual Conference on Evolutionary Programming, Cambridge, MA, 1995.
6 A. Mitterer. Optimierung vielparametriger Systeme in der Antriebsentwicklung, Statistische Versuchsplanung und Künstliche Neuronale Netze in der Steuergeraeteauslegung zur Motorabstimmung. VDI Verlag GmbH Duesseldorf, 2000. VDI Fortschritt-Berichte Number 434, Reihe 12.
7 A. Mitterer and F. Zuber-Goos. Modellgestuetzte Kennfeldoptimierung - ein neuer Ansatz zur Steigerung der Effizienz in der Steuergeraeteapplikation. ATZ-Automobiltechnische Zeitschrift, 102, 2000.
8 K. Weicker, A. Mitterer, T. Fleischhauer, F. Zuber-Goos, and A. Zell. Einsatz von Softcomputing-Techniken zur Kennfeldoptimierung elektronischer Motorsteuergeraete. at-Automatisierungstechnik, 48, 2000.
9 J. Poland, A. Mitterer, K. Knoedler, and A. Zell. Genetic algorithms can improve the construction of d-optimal experimental designs. In Advances In Fuzzy Systems and Evolutionary Computation, Proceedings of WSES Conference, 2001.
10 G. Croes. A method for solving traveling salesman problems. Operations Research, 5:791–812, 1958.
11 S. Lin. Computer solutions of the traveling salesman problem. Bell System Technical Journal, 44:2245–2269, 1965.
12 K. Knoedler, J. Poland, A. Mitterer, and A. Zell. Optimizing data measurements at test beds using multi-step genetic algorithms. In Advances In Fuzzy Systems and Evolutionary Computation, Proceedings of WSES Conference, 2001.
13 J. Poland. Finding smooth maps: A new NP-complete problem from engineering. Preprint, 2001.
14 J. Poland and K. Knoedler et al. A genetic algorithm with variable alphabet coding for a new NP-complete problem from engineering. Preprint, 2001.
15 H. Bandemer and A. Bellmann. Statistische Versuchsplanung. B. G. Teubner, Stuttgart, 1994.
16 T. J. Mitchell. An algorithm for the construction of "d-optimal" experimental designs. Technometrics, 16(2), May 1974.
17 Z. Galil and J. Kiefer. Time- and space-saving computer methods, related to Mitchell's DETMAX, for finding d-optimum designs. Technometrics, 22(3), 1980.
18 M. E. Johnson and C. J. Nachtsheim. Some guidelines for constructing exact d-optimal designs on convex design spaces. Technometrics, 25, 1983.
19 R. D. Cook and C. J. Nachtsheim. A comparison of algorithms for constructing exact d-optimal designs. Technometrics, 22(3), Aug 1980.
20 J. Grefenstette, R. Gopal, B. Rosmaita, and D. Van Gucht. Genetic algorithms for the travelling salesman problem. Proceedings of the first International Conference on Genetic Algorithms and Application, 1985.
21 A. Homaifar, S. Guan, and G. E. Liepins. A new approach on the traveling salesman problem by genetic algorithms. 5th International Conference on Genetic Algorithms, 1993.
22 T. N. Bui and B. R. Moon. A new genetic approach for the traveling salesman problem. International Conference on Evolutionary Computation, 1994.
23 T. N. Bui and B. R. Moon. On multi-dimensional encoding/crossover. 6th International Conference on Genetic Algorithms, 1995.
24 B. R. Moon and C. K. Kim. A two-dimensional embedding of graphs for genetic algorithms. 7th International Conference on Genetic Algorithms, 1997.
25 J. Poland, K. Knoedler, and A. Zell. On the efficient arrangement of given points in a rectangular grid. EvoCOP, 2001.
26 H. Boettle. Methoden zur Glaettung von Motorkennfeldern. Master's thesis, Universitaet Tuebingen, 2000.