Int J Adv Manuf Technol (2008) 37:641–648 DOI 10.1007/s00170-007-0999-7

ORIGINAL ARTICLE

Comparing statistical models and artificial neural networks on predicting the tool wear in hard machining D2 AISI steel

Ramón Quiza & Luis Figueira & J. Paulo Davim

Received: 18 October 2006 / Accepted: 6 March 2007 / Published online: 28 March 2007
© Springer-Verlag London Limited 2007

Abstract The current article presents an investigation into predicting tool wear in hard machining D2 AISI steel using neural networks. An experimental investigation was carried out using ceramic cutting tools, composed approximately of Al2O3 (70%) and TiC (30%), on cold work tool steel D2 (AISI) heat treated to a hardness of 60 HRC. Two models were fitted to predict tool wear for different values of cutting speed, feed and time, one of them based on statistical regression, and the other based on a multilayer perceptron neural network. Parameters of the design and the training process for the neural network were optimised using the Taguchi method. Outcomes from the two models were analysed and compared. The neural network model showed a better capability to make accurate predictions of tool wear under the conditions studied.

Keywords Hard steel turning · Neural networks

R. Quiza
Department of Mechanical Engineering, University of Matanzas, Autopista a Varadero km 31/2, Matanzas 44740, Cuba

L. Figueira · J. Paulo Davim (*)
Department of Mechanical Engineering, University of Aveiro, Campus Santiago, 3810-193 Aveiro, Portugal
e-mail: [email protected]

1 Introduction

Hard turning has certain advantages over the traditional turning/hardening/cylindrical grinding practice in terms of increased productivity and reduced power consumption. Hard turning is generally performed without a coolant, which is an additional environmental advantage.

Several authors [1–8] have reported on the performance of ceramic tools in hard turning. Obikawa et al. [1] studied the crater wear mechanism of an alumina (Al2O3) ceramic tool based on the stress and temperature on the rake face. The wear mechanism obtained indicated that sintered alumina ceramic had several times higher crater wear than a CVD layer of alumina coated on cemented carbide under the same conditions (temperature and stress). D’Errico et al. [2] investigated the cutting performance of different ceramic grades (oxide, nitride, mixed and whisker-reinforced ceramics) in terms of wear resistance, toughness and resistance to thermal shock. The experimental results are discussed with application to the turning of several materials (for example, nickel alloys, hard steels and grey cast iron). Xu et al. [3] showed in detail the effect of yttrium on the mechanical properties and machining performance of an Al2O3/Ti(C, N) ceramic tool. Their results show that an adequate addition of yttrium improves the mechanical properties of the ceramic tool material. Barry and Byrne [4, 5] investigated the mechanism of Al2O3/TiC cutting tool wear in the finish turning of hard steels, with particular attention to the inclusion content of the workpiece material. The rate of tool wear appears to be determined by the hard inclusion or carbide content of the workpiece material. A new mechanism is proposed to account for the superior wear resistance of CBN/TiC composites in comparison to high-content CBN tools in the finish machining of 4340 (AISI) steel of 52 HRC. Chou and Song [6] presented thermal modelling for white layer predictions in finish hard turning. This work showed that, when cutting with a worn tool, flank wear has a decisive effect on the temperature of the machined surface: the maximum temperature increases two- to threefold as flank wear grows from 0 to 0.2 mm.


Grzesik et al. [7] presented an analysis of part surface roughness in continuous dry turning of hardened construction steel when using mixed alumina inserts. The work characterises the resulting surface profiles and microstructures, and reports arithmetic surface roughness values of 0.25 μm, comparable to those produced by cylindrical grinding. Recently, Davim and coworkers [8, 9] investigated the machinability of AISI D2 cold work tool steel using ceramic inserts. The results indicated that turning AISI D2 steel with mixed alumina inserts allowed a surface finish as good as that produced by cylindrical grinding.

Process parameter optimization in machining operations must be undertaken in two stages: modelling of input-output and in-process parameter relationships, and determination of optimal cutting conditions [10]. Thus, adequate modelling of cutting parameters is an important task from both a theoretical and a practical point of view. Both statistical and neural approaches have been proposed in order to model the relationship between tool wear and cutting parameters in turning processes. Statistical models have been widely used from the first reported research [11] to some recent works, such as [12, 13]. However, due to the high complexity and non-linearity of cutting phenomena, especially when modern tool materials are used, prediction of tool wear by means of traditional statistical regressions, which use linear or linearized models, has severe limitations [14].

In recent years, artificial neural networks have been successfully used in many fields for complex classification and prediction tasks. Beyond any biological analogy, a neural network can be seen as a function constructed from compositions of weighted sums of bounded monotone functions [15]. Some neural network topologies, such as the multilayer perceptron, can approximate continuous functions arbitrarily well. This is the so-called universal approximation property [16].
Comparing statistical regressions with neural networks, Mukherjee and Ray [10] point out that although statistical regression may work well for modelling, this technique may not describe precisely the underlying nonlinear complex relationship between the decision variables and responses. Moreover, a prior assumption regarding the functional relationship (linear, quadratic, higher-order polynomial, exponential, etc.) between output and input decision variables is a prerequisite for regression equation-based modelling. However, neural networks have some drawbacks, including: an inability to interpret model parameters for non-linear relationships; dependence on the availability of voluminous datasets in order to attain successful training; and difficulties in the identification of influential observations, outliers, and the significance of various predictors. Thus, neural networks must be attempted only when regression techniques fail to provide an adequate model.

In order to predict the tool wear in the turning process, several models involving neural networks have been proposed [17–22]. In these works, neural networks have proved their ability to predict the tool wear successfully for different cutting conditions. However, a lack of comparisons between the outcomes of neural networks and statistical models is frequently noted. The current article investigates the influence of cutting parameters (cutting speed V and feed f) on flank wear (VBC) in the turning of hardened cold work tool steel with ceramic tools, using statistical multiple regression analysis and neural networks, and then compares their outcomes.

2 Experimental procedure

The experiment was carried out by the MACTRIB-Machining & Tribology Research Group at the Department of Mechanical Engineering of the University of Aveiro, Portugal. Machining experiments were performed using a highly rigid Kingsbury 50 CNC lathe with 18 kW spindle power and a maximum spindle speed of 4,500 rpm. The following cutting parameters were used: cutting speed V of 80, 115 and 150 m/min; feed rates f of 0.05, 0.10 and 0.15 mm/rev; and a constant depth of cut of 0.2 mm. Workpieces of AISI D2 tool steel, with the following chemical composition, were tested: 1.55% C; 0.30% Si; 0.40% Mn; 11.80% Cr; 0.80% Mo and 0.80% V. After heat treatment (quenching in a vacuum atmosphere at 1000–1040°C) an average hardness of 60 HRC was obtained.

Mixed alumina inserts with ref. CC650 (ISO code CNGA 120408 T01020) were used to machine the tool steel, with the following geometry: rake angle −6° (negative), clearance angle 6°, major cutting edge angle 95° and cutting edge inclination angle −6°. A type DCLNL2020K12 (ISO) tool holder was used. The flank wear of the tool was evaluated with a Mitutoyo TM-500 shop microscope with 30× magnification and 1 μm resolution. The admissible wear was established according to the ISO 3685 standard and measured at the corner radius (VBC).

The data obtained are shown in Table 1 and plotted in Fig. 1. For every combination of feed and speed, as expected, wear grows with time. However, there is a complex relation between cutting parameters and tool wear. Presumably, the combination of feed and speed has an influence on the tool wear as significant as that of each single factor.


Table 1 Experimental data

No.   Cutting speed V (m/min)   Feed f (mm/rev)   Time t (min)   Wear VB (mm)
1     80                        0.05              5              0.032
2     80                        0.05              10             0.068
3     80                        0.05              15             0.112
4     80                        0.10              5              0.086
5     80                        0.10              10             0.134
6     80                        0.10              15             0.190
7     80                        0.15              5              0.089
8     80                        0.15              10             0.148
9     80                        0.15              15             0.156
10    115                       0.05              5              0.081
11    115                       0.05              10             0.097
12    115                       0.05              15             0.146
13    115                       0.10              5              0.102
14    115                       0.10              10             0.131
15    115                       0.10              15             0.177
16    115                       0.15              5              0.090
17    115                       0.15              10             0.111
18    115                       0.15              15             0.152
19    150                       0.05              5              0.104
20    150                       0.05              10             0.137
21    150                       0.05              15             0.179
22    150                       0.10              5              0.085
23    150                       0.10              10             0.151
24    150                       0.10              15             0.242
25    150                       0.15              5              0.144
26    150                       0.15              10             0.262
27    150                       0.15              15             0.333

Fig. 1 Experimental data. Wear plotted versus time and feed for different values of cutting speed

3 Results and discussion

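3.1 Statistical modelling

Three regressions were fitted to the experimental data: linear, quadratic and potential (power-law). With V the cutting speed in m/min, f the feed in mm/rev, t the cutting time in min and VB the wear in mm, the statistical models obtained were:

$$VB = -0.1309 + 9.873 \times 10^{-4}\, V + 0.5878\, f + 9.711 \times 10^{-3}\, t \qquad (R^2 = 0.74) \tag{1a}$$

$$VB = 0.2939 + 2.168 \times 10^{-5}\, V^2 - 4.656 \times 10^{-3}\, V + 6.573 \times 10^{-4}\, V f t \qquad (R^2 = 0.79) \tag{1b}$$

$$VB = 2.020 \times 10^{-3} \cdot V^{0.7737} \cdot f^{0.4154} \cdot t^{0.6700} \qquad (R^2 = 0.79) \tag{1c}$$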

In the quadratic model, the independent terms with a probability value (associated with the t-statistic) greater than 0.10 were removed in order to simplify the equation. In the linear and potential models no removal was needed. Therefore, in the aforementioned models, all independent terms are statistically significant at the 90% confidence level. In all of them, the probability value in the ANOVA table is lower than 0.01; hence, there is a statistically significant relationship between the variables at the 99% confidence level.

The quadratic and potential models have a higher R-squared statistic than the linear one. However, it is less than 0.8 in both cases, which indicates that the fitted models explain less than 80% of the variability in the dependent variable VB.

In order to select the best statistical model, the normality of the residuals was analysed. In Figs. 2 and 3, residuals are plotted versus predicted values for the quadratic and potential models. Additionally, four tests were performed to analyse the normality of the residuals: chi-square, Shapiro-Wilk, standardized skewness and standardized kurtosis. For each one, the respective probability value (P-value) was computed. For both models, these values are shown in Table 2.

For the quadratic model, the lowest P-value amongst the performed tests equalled 0.3721. Thus, the hypothesis that the residuals come from a normal distribution cannot be rejected with 90% or higher confidence. On the contrary, for the potential model the lowest P-value is 0.0800, so the hypothesis that the residuals come from a normal distribution can be rejected with 90% confidence. Since residuals from the quadratic model seem to have a normal distribution and residuals from the potential model do not, the quadratic model was selected as the most convenient to describe the wear. Nevertheless, as can be seen in Fig. 4, where this model is shown graphically, there is a serious divergence between experimental data and predicted values for several points.

Table 2 Normality tests for residuals of quadratic and potential models

Test                             Statistic                    Quadratic model   Potential model
Chi-square test                  Goodness-of-fit statistic    11.8889           10.8519
                                 P-value                      0.3721            0.4558
Shapiro-Wilk                     W-statistic                  0.9626            0.9310
                                 P-value                      0.4494            0.0800
Standardized skewness test       Z score for skewness         0.8015            1.1450
                                 P-value                      0.4228            0.2522
Standardized kurtosis test       Z score for kurtosis         −0.1167           −0.0953
                                 P-value                      0.9071            0.9241
Minimal P-value for the model                                 0.3721            0.0800

Fig. 2 Residuals vs. predicted values for the quadratic model

Fig. 3 Residuals vs. predicted values for the potential model

4 Neural network based modelling

To establish a useful relationship between tool wear and cutting parameters, a multilayer perceptron (MLP) type neural network was selected. The neural network has two layers: one hidden layer and one output layer. The hidden layer uses a transfer function of sigmoid type:

$$f(x) = \frac{1}{1 + \exp\left[-\left(b + \sum_i w_i x_i\right)\right]} \tag{2a}$$

In contrast, the output layer uses a linear function:

$$f(x) = b + \sum_i w_i x_i \tag{2b}$$

To carry out the training process, three evenly distributed points were selected from the experimental data to form the validation set: the points numbered 6, 16 and 20. The remaining data form the training set. Both the input variables and the outputs were normalised to the range [0, 1] in order to facilitate the training of the neural network. The network was trained using the gradient descent with momentum backpropagation algorithm. In this algorithm, four parameters must be tuned: learning rate LR; momentum constant MC; training epochs E; and number of

hidden nodes HN. To obtain the most convenient values for these parameters, the Taguchi method was used. An orthogonal array L9 (3^4), formed by four factors and three levels, was used. The values for the three levels were selected through several preliminary coarse training processes; they are shown in Table 3.

Table 3 Values for the three levels of network training parameters

Factor                  Minimum   Medium   Maximum
Learning rate LR        0.01      0.02     0.03
Momentum constant MC    0.1       0.5      0.9
Epochs E                5,000     25,000   45,000
Hidden nodes HN         1         2        3

Fig. 4 Graphical representation of quadratic statistical model

The orthogonal array built with these values is shown in Table 4. To evaluate the performance of the neural network, two parameters were used. The first is the root mean square error RMSE of the predicted value, which represents the goodness of the network prediction. The second parameter appraises the generalization capability GC of the network, and is evaluated by the probability value of a Student's t-test comparing the means of the residuals for the training and validation sets. The closer the probability value is to one, the better the generalization capability of the network. For each point in the orthogonal array, the corresponding network was trained and evaluated. The training process was carried out three times for each point in order to verify the convergence to the same weights and biases. In Table 4 the RMSE and GC values are shown for the nine points in the array. On the basis of these outcomes, a graphical representation of the two performance parameters for each factor was made. In Fig. 5a–d, the four graphs are shown. Analysing these graphs, the most convenient value for each factor can be selected. As the two performance

parameters (RMSE and GC) are usually conflicting, a compromise value must be chosen. For the learning rate (Fig. 5a) the most convenient value is 0.03, because the lowest RMSE is achieved and the value of GC is high enough. For the momentum constant (Fig. 5b), a value of 0.7 was selected because the highest GC is attained, while the RMSE does not show a significant reduction. The number of epochs was established as 25,000 because beyond this point the GC shows a noteworthy decrease (Fig. 5c). Finally, three nodes were selected for the hidden layer because this is an acceptable compromise value for GC and RMSE (Fig. 5d).

Table 4 Orthogonal array

         Design factors                    Response
Point    LR      MC     E        HN        RMSE     GC
1        0.01    0.1    5,000    2         0.132    0.777
2        0.01    0.5    25,000   3         0.095    0.948
3        0.01    0.9    45,000   4         0.068    0.715
4        0.02    0.1    25,000   4         0.058    0.104
5        0.02    0.5    45,000   2         0.094    0.621
6        0.02    0.9    5,000    3         0.132    0.627
7        0.03    0.1    45,000   3         0.070    0.292
8        0.03    0.5    5,000    4         0.085    0.581
9        0.03    0.9    25,000   2         0.090    0.897

Fig. 5 Graphs of RMSE and generalization capability vs studied factors

Using the aforementioned values, the final training process was carried out. Figure 6 shows the behaviour of the RMSE of the predictions along the network training process. The trained network achieves an RMSE value of 0.022 and a GC (probability value) equal to 0.670. The Appendix shows the network model obtained, programmed as a C function.

In Fig. 7, the distributions of residuals are shown. Analysing the sets of residuals for training and validation data, it can be seen that the confidence interval for the difference between the means, which extends from −0.016 to 0.0071, contains the value 0. Therefore, there is no statistically significant difference between the means of the two samples. Furthermore, the confidence interval for the ratio of the variances of both sets extends from 0.00685372 to 1.17607. Since the interval contains the value 1, there is no statistically significant difference between the standard deviations of the two samples. Finally, a Kolmogorov-Smirnov test found no statistically significant difference between the two distributions. All the above-mentioned tests were carried out at the 95% confidence level.

In Fig. 8, the neural network model is shown graphically. As can be noted, the surfaces corresponding to different levels of cutting speed have a complex form, which is in correspondence with the experimental data.
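The factor-by-factor analysis of Table 4 can be sketched in C as below, assuming that the graphs in Fig. 5 are the usual Taguchi main-effect plots, that is, the mean response at each level of a factor. The function name `main_effect` and the level encoding (0, 1, 2 for the minimum, medium and maximum of each factor as they appear in Table 4) are hypothetical choices made for this illustration.

```c
#define N_RUNS    9
#define N_FACTORS 4   /* 0 = LR, 1 = MC, 2 = E, 3 = HN */

/* Level index of each factor for the nine runs, transcribed from Table 4
   (0 = lowest value used, 1 = middle value, 2 = highest value). */
static const int L9[N_RUNS][N_FACTORS] = {
    {0, 0, 0, 0}, {0, 1, 1, 1}, {0, 2, 2, 2},
    {1, 0, 1, 2}, {1, 1, 2, 0}, {1, 2, 0, 1},
    {2, 0, 2, 1}, {2, 1, 0, 2}, {2, 2, 1, 0}
};

/* Responses of the nine runs, transcribed from Table 4. */
static const double RMSE[N_RUNS] = {0.132, 0.095, 0.068, 0.058, 0.094,
                                    0.132, 0.070, 0.085, 0.090};
static const double GC[N_RUNS]   = {0.777, 0.948, 0.715, 0.104, 0.621,
                                    0.627, 0.292, 0.581, 0.897};

/* Mean of `response` over the runs in which `factor` is at `level`. */
double main_effect(const double response[N_RUNS], int factor, int level)
{
    double sum = 0.0;
    int count = 0;
    for (int run = 0; run < N_RUNS; ++run) {
        if (L9[run][factor] == level) {
            sum += response[run];
            ++count;
        }
    }
    return count > 0 ? sum / count : 0.0;
}
```

Under this assumption, `main_effect(RMSE, 0, 2)` gives the mean RMSE of the three runs with a learning rate of 0.03, which is the lowest of the three learning-rate levels, in line with the choice described in the text.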

4.1 Comparison between statistical and neural network models

For the neural network model, an R-squared statistic equal to 0.979 was obtained. This value is considerably greater than that obtained for the statistical model (0.799). The mean absolute error was 0.0070 for the neural network model, while it was 0.0231 for the statistical one. The maximum absolute errors were 0.0228 for the neural network model and 0.0633 for the statistical one. In Fig. 9, residuals are plotted versus predictions for both models. As can be seen, the neural network model has a less dispersed distribution of residuals.

Fig. 6 Behaviour of the RMSE along the training process

Fig. 7 Residuals vs predicted values for training and validation sets in a neural network model


Fig. 8 Graphical representation of neural network model

Furthermore, comparing the graphical representation of both models (see Figs. 4 and 8), it can be noted that surfaces from the neural network model have a shape closer to those that are expected from the experimental data (Fig. 1). Considering all the above-mentioned points, the conclusion can be reached that the neural network model describes with more precision the relationship between the tool wear and the cutting speed, feed and cutting time.
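The error figures quoted for the statistical model can be checked against Table 1 with a short C sketch. It assumes the quadratic model of Eq. (1b) with the signs and exponents as read here (an assumption about the printed equation); the names `FitStats` and `quadratic_fit_stats` are hypothetical. With these assumptions the computed values should come out close to the reported mean absolute error of 0.0231, maximum absolute error of 0.0633 and R-squared of 0.799.

```c
#include <math.h>

#define N_POINTS 27

/* Table 1: V (m/min), f (mm/rev), t (min), measured VB (mm). */
static const double DATA[N_POINTS][4] = {
    { 80, 0.05,  5, 0.032}, { 80, 0.05, 10, 0.068}, { 80, 0.05, 15, 0.112},
    { 80, 0.10,  5, 0.086}, { 80, 0.10, 10, 0.134}, { 80, 0.10, 15, 0.190},
    { 80, 0.15,  5, 0.089}, { 80, 0.15, 10, 0.148}, { 80, 0.15, 15, 0.156},
    {115, 0.05,  5, 0.081}, {115, 0.05, 10, 0.097}, {115, 0.05, 15, 0.146},
    {115, 0.10,  5, 0.102}, {115, 0.10, 10, 0.131}, {115, 0.10, 15, 0.177},
    {115, 0.15,  5, 0.090}, {115, 0.15, 10, 0.111}, {115, 0.15, 15, 0.152},
    {150, 0.05,  5, 0.104}, {150, 0.05, 10, 0.137}, {150, 0.05, 15, 0.179},
    {150, 0.10,  5, 0.085}, {150, 0.10, 10, 0.151}, {150, 0.10, 15, 0.242},
    {150, 0.15,  5, 0.144}, {150, 0.15, 10, 0.262}, {150, 0.15, 15, 0.333}
};

/* Quadratic regression, Eq. (1b). ASSUMPTION: signs and powers of ten. */
static double quadratic_model(double V, double f, double t)
{
    return 0.2939 + 2.168e-5 * V * V - 4.656e-3 * V + 6.573e-4 * V * f * t;
}

typedef struct {
    double mae;      /* mean absolute error */
    double max_abs;  /* maximum absolute error */
    double r2;       /* coefficient of determination */
} FitStats;

FitStats quadratic_fit_stats(void)
{
    double mean = 0.0;
    for (int i = 0; i < N_POINTS; ++i)
        mean += DATA[i][3];
    mean /= N_POINTS;

    double sum_abs = 0.0, max_abs = 0.0, ss_res = 0.0, ss_tot = 0.0;
    for (int i = 0; i < N_POINTS; ++i) {
        double err = DATA[i][3] - quadratic_model(DATA[i][0], DATA[i][1], DATA[i][2]);
        double dev = DATA[i][3] - mean;
        sum_abs += fabs(err);
        if (fabs(err) > max_abs)
            max_abs = fabs(err);
        ss_res += err * err;   /* residual sum of squares */
        ss_tot += dev * dev;   /* total sum of squares about the mean */
    }

    FitStats s = { sum_abs / N_POINTS, max_abs, 1.0 - ss_res / ss_tot };
    return s;
}
```

The largest residual occurs at point 18 of Table 1 (V = 115 m/min, f = 0.15 mm/rev, t = 15 min), where the quadratic model overestimates the measured wear.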

5 Conclusions

A neural network of multilayer perceptron type has been successfully used in predicting flank wear for several cutting parameters in hardened D2 (AISI) steel turning.

Fig. 9 Residuals vs predicted values for statistical and neural network models


The Taguchi method made it possible to select the most convenient parameters for the neural network. The optimized network model achieved a high precision (R-squared equal to 0.979) and a homogeneous distribution of residuals. Furthermore, it showed a good generalization capability. Comparing the neural network model with the statistical multiple regression showed that the neural network gives more accurate predictions of the tool wear. The model obtained can be used in optimizing cutting conditions for turning hardened D2 steel. Furthermore, neural networks have shown, again, their ability to model non-linear relationships. Considering the modern computing equipment that is accessible to researchers and engineers, there is no obstacle to the wider use of neural networks in the modelling of machining processes.

Appendix: C code for the trained neural network

double Wear (double Speed, double Feed, double Time)
{
  double W1[3][3], W2[3], B1[3], B2;
  double X[3], S, L[3];
  int i, j;
  X[0]=(Speed - 80)/(150 - 80);
  X[1]=(Feed - 0.05)/(0.15 - 0.05);
  X[2]=(Time - 5)/(15 - 5);
  W1[0][0]=1.8753; W1[0][1]=2.6878; W1[0][2]=0.9498;
  W1[1][0]=-1.1571; W1[1][1]=-2.8293; W1[1][2]=-1.3605;
  W1[2][0]=-1.6989; W1[2][1]=-1.2206; W1[2][2]=-0.4847;
  W2[0]=-5.4640; W2[1]=-3.5099; W2[2]=-5.7351;
  B1[0]=-3.5050; B1[1]=3.1480; B1[2]=3.4430;
  B2=9.1174;
  for (i=0; i