An ASABE Meeting Presentation Paper Number: 096547

Modeling Unknown Relationships with Polynomial Networks

Dennis G. Watson, Associate Professor, Plant, Soil & Agricultural Systems, Southern Illinois University, Carbondale, IL. [email protected]

Tony V. Harrison, Chief Operating Officer, JPT Integrated Solutions, Inc., Gainesville, FL. [email protected]

Written for presentation at the 2009 ASABE Annual International Meeting
Sponsored by ASABE
Grand Sierra Resort and Casino, Reno, Nevada
June 21 – June 24, 2009

Abstract. Nonlinear polynomial network (NPN) modeling is described and presented as a method of modeling unknown relationships in data. NPN is compared with linear multiple regression (LMR) for predicting CO emissions of a diesel engine. Models for each method were developed from data collected from ISO 8178-4 (1996) test cycle B-type tests (ISO) and an expanded set of tests (EXP) to predict carbon monoxide (CO) emissions from a diesel engine. LMR using the ISO training data (R2 = 0.93) resulted in over-training of the model, as applied to the evaluation data (R2 = 0.54). LMR based on the expanded data (R2 = 0.68) was a better LMR model when applied to the evaluation data (R2 = 0.67). NPN using the ISO training data (R2 = 0.96) resulted in a considerable improvement over the LMR models for the evaluation data (R2 = 0.82). NPN using the EXP training data (R2 = 0.97) resulted in the best model when applied to the evaluation data (R2 = 0.89). When applied to the evaluation data, the mean absolute error of the NPN EXP based model was significantly less than that of the NPN ISO based model. The results of these models demonstrate the potential for NPN methods to determine relationships among variables with a relatively small set of observations.

Keywords. modeling, regression, polynomial network, emissions, diesel engine

The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the American Society of Agricultural and Biological Engineers (ASABE), and its printing and distribution does not constitute an endorsement of views which may be expressed. Technical presentations are not subject to the formal peer review process by ASABE editorial committees; therefore, they are not to be presented as refereed publications. Citation of this work should state that it is from an ASABE meeting paper. EXAMPLE: Author's Last Name, Initials. 2009. Title of Presentation. ASABE Paper No. 09----. St. Joseph, Mich.: ASABE. For information about securing permission to reprint or reproduce a technical presentation, please contact ASABE at [email protected] or 269-429-0300 (2950 Niles Road, St. Joseph, MI 49085-9659 USA).

Introduction

Nonlinear polynomial network (NPN) modeling is a non-parametric, self-organizing approach in which underlying relationships among variables are discovered automatically by the NPN algorithm. In this context, a network is a function represented by the composition of many functions (Barron & Barron, 1988). NPN is closely related to the group method of data handling (GMDH) algorithm developed in Kiev and first published by Ivakhnenko (1968). Barron et al. (1984) described early polynomial network software development in the United States as based on the GMDH described by Ivakhnenko (1971). Ivakhnenko was influenced by the concept of the perceptron as described by Rosenblatt (1958). According to Farlow (1984), Ivakhnenko's work was prompted by the fact that many mathematical models require information about the system that is generally impossible to obtain. If modelers are expected to make wild guesses for certain variables, they cannot expect the resulting model to have a high level of reliability. A method was needed that relied on objective criteria rather than the biases of the researcher (Farlow, 1984).

GMDH Algorithm

The GMDH algorithm has been described in detail by Ivakhnenko (1968, 1971) and Farlow (1981, 1984). The basic GMDH algorithm is a procedure for relating input variables to a single output variable by constructing a high order polynomial of the following form (Farlow, 1981):

y = a + Σ bi xi + ΣΣ cij xi xj + ΣΣΣ dijk xi xj xk + …   (1)

The data used for modeling with GMDH consist of observations of independent variables and one dependent variable. The observations are divided into training and checking sets. Equations are developed from the training set, and the checking set is used to compute least squares errors for those equations. GMDH consists of multiple iterations of two steps. The first step constructs new variables from each pair of independent variables. For example, a simple data set with three input variables (x1, x2, and x3) would have new variables z1 derived from x1 and x2, z2 derived from x1 and x3, and z3 derived from x2 and x3. Each new variable z is calculated from the least squares polynomial of the form

z = A + B xi + C xj + D xi^2 + E xj^2 + F xi xj   (2)

that best fits the dependent variable (Farlow, 1981). The second step is to sum the squared error (S) produced by each of the z variable equations. The z variables with S less than some prescribed threshold become new x variables for the next iteration. The variable with the minimum S (Smin) from this iteration is noted. Additional iterations (or layers) of the two steps are completed until the Smin from an iteration is greater than the Smin from the prior iteration. The equation for the variable from the prior iteration with Smin is the resulting model of the dependent variable.

GMDH is typically considered a heuristic (trial and error) procedure since it is not based on a solid theoretical foundation (Farlow, 1981). The algorithm is considered self-organizing since it uses mathematical methods for combining inputs and a series of self-selections based on heuristic criteria (Ivakhnenko, 1971). Ivakhnenko (1971) described the starting point of this heuristic, self-organizing approach as, "I know that I know nothing; let us generate and compare all possible input-output combinations." Malone (1984) described GMDH as a tool for nonparametric statistical regression without models, saying "if we have no other information, except perhaps some idea of the 'smoothness' of the relationship, and do not in any way limit our study a priori, this is regression without models (or nonparametric regression)." Scott and Hutchinson (1976) compared the GMDH process to plant breeding. Suppose a plant breeder desires a seed oil producing plant with a certain set of traits. The breeder begins with seed from promising plants (the original x variables), grows them, and cross-pollinates each combination of pairs. The resulting seeds (z variables) are examined for desirable traits. The seeds that represent an improvement are kept for further breeding. This represents one iteration of the GMDH algorithm. The breeding process is repeated until a plant with the desired traits is produced. Scott and Hutchinson (1984) reported that GMDH offered a significant advance in mathematical modeling, in that it could determine the inherent structure of extremely complex and highly nonlinear systems. Stepanov (1974) compared the GMDH algorithm to the least squares regression method for predicting the long-term demand for fifty consumer goods. Stepanov (1974) concluded that even with a considerable lack of information, GMDH was much more efficient than a prediction based on regression (trend) models.

Variations of GMDH have been developed. Ikeda et al. (1976) developed a modified GMDH procedure they called sequential GMDH. Ikeda (1984) presented a modified version of GMDH that included clustered input and output data which were separated into training and checking sets. Tamura and Kondo (1984) proposed heuristics-free GMDH methods using partial polynomials and intermediate polynomials, without the need to split the data into training and checking sets. Hayashi and Tanaka (1990) introduced fuzzy GMDH as the use of GMDH applied to possibility models. Nikolaev and Iba (2003) introduced polynomial harmonic GMDH with back-propagation for time series modeling.
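To make the two-step procedure concrete, the following is a minimal Python sketch of one GMDH layer, assuming the quadratic partial polynomial of equation (2) and a simple checking-set squared-error selection. It is an illustration only, not the software referenced in this paper; the function names (fit_pair, gmdh_layer) and the keep parameter are ours.

```python
import itertools
import numpy as np

def fit_pair(u, v, y):
    """Least-squares fit of z = A + B*u + C*v + D*u^2 + E*v^2 + F*u*v (equation 2)."""
    X = np.column_stack([np.ones_like(u), u, v, u**2, v**2, u * v])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def eval_pair(coef, u, v):
    """Evaluate a fitted pairwise polynomial on new observations of u and v."""
    X = np.column_stack([np.ones_like(u), u, v, u**2, v**2, u * v])
    return X @ coef

def gmdh_layer(x_train, y_train, x_check, y_check, keep=4):
    """One GMDH iteration: build a z variable from every pair of x variables,
    score each on the checking set, and keep the best `keep` as next-layer inputs."""
    candidates = []
    for i, j in itertools.combinations(range(x_train.shape[1]), 2):
        coef = fit_pair(x_train[:, i], x_train[:, j], y_train)
        z_check = eval_pair(coef, x_check[:, i], x_check[:, j])
        s = np.sum((y_check - z_check) ** 2)        # checking-set squared error S
        z_train = eval_pair(coef, x_train[:, i], x_train[:, j])
        candidates.append((s, z_train, z_check))
    candidates.sort(key=lambda c: c[0])
    best = candidates[:keep]
    s_min = best[0][0]
    new_x_train = np.column_stack([c[1] for c in best])  # become x's for the next layer
    new_x_check = np.column_stack([c[2] for c in best])
    return s_min, new_x_train, new_x_check
```

In the full algorithm, layers would be added until Smin stops decreasing, and the equation from the previous layer with the smallest S would be retained as the model.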

Polynomial Network Software Development

Barron et al. (1984) described early polynomial network software development in the United States. Beginning in 1971, a polynomial network training routine (PNETTR I) was developed to implement the GMDH algorithm. In 1972, PNETTR II incorporated features not found in GMDH, including a clustering algorithm (Mucciardi & Gose, 1972), retention of the original inputs for subsequent layers, and optimization of coefficients. A weakness of GMDH and PNETTR II was the need to divide the data into training and checking sets, which reduced the number of observations available for training. PNETTR IV incorporated the predicted squared error (PSE) criterion of Barron (1984), which eliminated the need for the checking data set. The PSE selection criterion consisted of a squared error term based on the training data and an overfit penalty term as follows:

PSE = TSE + 2 σp^2 (K / N)   (3)

where:
K = number of coefficients estimated to minimize TSE
N = number of training observations
σp^2 = prior estimate of the true error variance
TSE = training squared error
PSE = predicted squared error

As each coefficient was added to reduce the error of the polynomial network, the overfit penalty increased. The overfit penalty was designed to keep a model from overfitting the training data to the extent that it performs poorly on future observations. Once PSE for a layer increased from the prior layer, the polynomial network from the prior layer was selected as the best model. PNETTR IV extended the basic GMDH beyond pairs of inputs to include all combinations of single, double, or triple inputs, all expressed as third order polynomials. Saved elements from preceding layers were candidate inputs for a given layer (Barron & Barron, 1988). Montgomery (1989) depicted an example polynomial network as consisting of inputs, normalizers, unitizers, and single, double, and triple elements (see Figure 1). Montgomery (1989) described the Abductory Induction Mechanism (AIM) by AbTech Corporation as an algorithm based on the work of Barron et al. (1984). Drake (1992) described a preliminary step in the modeling process, added with AIM, of normalizing each variable to a mean of zero and a standard deviation of unity. The AIM software was later incorporated into a newer product called ModelQuest (MarketMiner, 2004).
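As a minimal illustration of the selection criterion in equation (3), the sketch below computes PSE for a candidate element. It is not PNETTR or ModelQuest code; the function and argument names are ours, and TSE is taken here as the mean squared error over the training observations.

```python
import numpy as np

def predicted_squared_error(y_true, y_pred, n_coef, sigma2_prior):
    """PSE = TSE + 2 * sigma_p^2 * K / N (Barron, 1984).

    y_true, y_pred : training observations and the model's predictions for them
    n_coef         : K, number of coefficients estimated to minimize TSE
    sigma2_prior   : prior estimate of the true error variance
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = len(y_true)
    tse = np.mean((y_true - y_pred) ** 2)          # training squared error (mean form assumed)
    overfit_penalty = 2.0 * sigma2_prior * n_coef / n
    return tse + overfit_penalty
```

Candidate elements and layers with lower PSE are preferred; as described above, training stops once PSE rises from one layer to the next.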

N

Single Double

Input B

N

Input C

N

Triple

U

Output

Double Input D

N

Figure 1. Example of a polynomial network, with N symbolizing a normalizing function, U symbolizing a unitizing function, and single, double, and triple indicating the number of inputs to a network node.

Since PNETTR IV did not use data groups (training and checking), it was not, strictly speaking, a GMDH algorithm, but it is considered a GMDH-type algorithm (Barron et al., 1984). Rather than GMDH, other terms have been used to describe the algorithm, including: polynomial network (Barron et al., 1984; Drake et al., 1994; Griffin et al., 1994; Kleinsteuber & Sepehri, 1996), adaptive learning network (ALN) (Barron et al., 1984), abductory induction (Montgomery, 1989), abductive reasoning networks (Montgomery & Drake, 1991), abductive polynomial network (Drake & Kim, 1997), polynomial neural networks (Aksyonova et al., 2003; Gardner, 1992), and statistical learning network (Drake, 1992; Muller et al., 1998; Barron & Barron, 1988). The terms abductory and abductive are derived from Peirce (1955) and Fann (1970) and refer to reasoning from general principles and initial facts to new facts under uncertainty. Drake and Hess (1990) described abductive machine learning as integrating non-parametric regression and neural network methods to discover numeric rules, which consist of a network of nodes of high order polynomial functions. More recently, polynomial network software has been classed as a data mining tool (Agarwal, 1999; Kim, 2002; King et al., 1998; Pyo et al., 2002). The term nonlinear polynomial network (NPN) is used in this paper when referencing the modeling procedure.

Polynomial Network Applications

Ivakhnenko's applications of GMDH included control of water levels in reservoirs (Ivakhnenko & Ovchinnikov, 1975), the thermal state of a blast furnace (Ivakhnenko, Mikryukov et al., 1975), river runoff (Ivakhnenko & Stepashko, 1975), a plankton ecological system (Ivakhnenko, Vysotskiy et al., 1975), the economy (Parks et al., 1975), and long-range forecasting (Ivakhnenko, 1978). Ivakhnenko and Ivakhnenko (1995) presented an extensive list of problems solvable by GMDH. Barron et al. (1984) summarized an extensive list of early applications of PNETTR, including: process control, radar, passive acoustic and seismic analysis, infrared, x-ray, ultrasonic and acoustic emission, eddy currents, missile guidance, materials, multisensor signal processing, biomedical, and econometric forecasting. Additional applications of NPN include defense (Montgomery et al., 1990; Drake, 1992; Drake et al., 1994; Gardner, 1992), education (El-Alfy & Abdel-Aal, 2008), energy (Abdel-Aal, 2006), environmental (Abdel-Aal, 2007; Duffy & Franklin, 1975; Ikeda et al., 1976), financial (Agarwal, 1999; Cerullo & Cerullo, 2006; Drake & Kim, 1997; Stepanov, 1974; Kim, 2002, 2005, 2007; Kim & Nelson, 1996), machinery (Kleinsteuber & Sepehri, 1996), medical (Abdel-Aal & Mangoud, 1997; Abdel-Aal, 2005; Griffin et al., 1994), process control (Shewhart, 1991; Silis & Rozenblit, 1976), tourism (Pyo et al., 2002), and agriculture (Duffy & Franklin, 1975; Ivakhnenko et al., 1977; Lebow et al., 1984; Pachepsky & Rawls, 1999; Reddy & Pachepsky, 2000).

Example Application

The ISO 8178-4 (1996) standard for emissions measurement includes universal test cycle B, which specifies 11 test modes of engine load and speed combinations for emissions measurement: 10%, 25%, 50%, 75%, and 100% torque at both rated speed and an intermediate speed, plus no load at low idle. Overall emission values are determined by averaging the emissions of the test modes. The goal of the ISO standard was to minimize test modes while ensuring test cycles were representative of actual engine operation (ISO, 1996). Hansson et al. (2001) found emissions of a tractor to be more than 50% higher than values determined according to ISO 8178-4. They concluded it was not possible to design one set of emission factors that produced representative results for all types of tractors and work operations (Hansson et al., 2001).

Besides using the ISO 8178-4 test modes to compute a set of overall emission values, additional data from the tests may be useful for developing a mathematical model to predict emissions at various load and speed combinations. Predicting emissions of a stationary or portable engine used to power a relatively constant load, such as an irrigation pump, would be an initial test of this hypothesis. In some cases, engines have been sized to operate at rated continuous power, but in other cases, engines operate at less than rated power and may be considerably over-powered for an application. Data from ISO 8178-4 tests may be insufficient to model emissions of engines operating at speeds different from the two tested speeds or at loads between the tested loads. Emission data from additional loads and speeds may produce a better model of the range of potential operating conditions.

Models have been developed for diesel-powered, heavy duty, on-road vehicles. Ramamurthy et al. (1998) fit a polynomial curve to emissions based on axle power of a heavy duty diesel vehicle. Krijnsen et al. (2000) successfully modeled NOx emissions from a diesel engine using an artificial neural net, a split-and-fit algorithm, and a nonlinear polynomial model. Yanowitz et al. (2002) used test data from a heavy duty transient test to predict diesel emissions based on engine power and found a good linear correlation between rates of horsepower increase and particulate matter emissions.

Objective

NPN and linear multiple regression (LMR) modeling methods were compared for predicting carbon monoxide (CO) emissions for a diesel irrigation engine, using two data sets. The data sets consisted of data obtained from tests similar to the ISO 8178-4 B test cycle and an expanded set of tests with additional loads and speeds. The data included engine operating conditions from the engine's controller area network (CAN) and torque, emissions, and ambient condition sensors. The results of the study provide comparative data on the relative capability of NPN and LMR for predicting CO emissions of a diesel engine running at a constant load and speed.

Equipment and Procedures

Data for this study was collected from a turbocharged diesel engine. The equipment used for data collection and storage included a CAN protocol adapter, programmable automation controller (PAC), ambient condition sensors, and a computer with LabVIEW (National Instruments, 2003), previously described by Hogan et al. (2007) and Watson et al. (2008). Exhaust emissions were collected and analyzed by an Infrared Industries FGA4000XD gas analyzer (GA). The GA used non-dispersed infrared light to measure CO. The GA also measured exhaust temperature, pressure, and air fuel ratio. The GA measured CO in parts per million (ppm), and units of g/kWh were calculated as specified by Infrared Industries. SAE standard J1939-71 (SAE, 2002) defined variables potentially available on the CAN. Eight variables were available that were related to engine performance. These were included in the 18 variables measured or calculated for the emissions tests (see Table 1).

Table 1. Variables measured and calculated during engine emissions tests and used for modeling.

Variable Name                            Source
Engine speed (rpm)                       CAN
Percent torque                           CAN
Percent load                             CAN
Percent friction                         CAN
Fuel rate (l/h)                          CAN
Engine fuel temperature (deg C)          CAN
Coolant temperature (deg C)              CAN
Intake manifold temperature (deg C)      CAN
Torque (Nm)                              Lebow TMS 9000
Power (kW)                               Calculated
Ambient temperature (deg C)              Opto 22 ICTD sensor
Relative humidity (%)                    Honeywell HIH3610
Barometric pressure (mbar)               Novalynx WS16BP
Exhaust temperature (deg C)              GA
Exhaust pressure (kPa)                   GA
Air to fuel ratio                        GA
CO (ppm)                                 GA
CO (g/kWh)                               Calculated

Two data sets, each from a different experiment, were compared to determine which produced the better prediction model. The first data set, called ISO, was based on ISO 8178-4 (1996) test cycle B. The rated speed was 2500 rpm, and an intermediate speed of 1500 rpm was selected. No-load tests were substituted for the 10% torque tests, since the test equipment would not support the low 10% torque setting at both speeds. The 11 torque and engine speed combinations were replicated four times for a total of 44 sets of observations.


The second data set, called EXP (expanded), extended the ISO test modes. It was designed to provide more data points by testing loads at 10% intervals between 40% and 100% torque and by using additional speed settings. Emissions data were collected while the diesel engine was operated at 0%, 25%, 40%, 50%, 60%, 70%, 80%, 90%, and 100% torque for each engine speed of 1500, 1750, 2000, 2250, and 2500 (rated power) rpm. The 45 torque and engine speed combinations were replicated four times for a total of 180 sets of observations. Other than the torque and speed combinations, there were no differences in the equipment and procedures used to collect the ISO and EXP data sets.

The data were combined into three files for modeling. All data from replications one, two, and three of the ISO tests were combined into one file for the ISO training data. Likewise, all data from replications one, two, and three of the EXP tests were combined into one file for the EXP training data. Data from replication four of both the ISO and EXP tests were combined into one file for the evaluation data set. The first 16 variables from Table 1 were used as independent variables (inputs), and CO in g/kWh was the dependent (output) variable. Although the resulting training sample sizes (n = 33 for ISO and n = 135 for EXP) were relatively small for LMR and were expected to result in over fitting, the LMR models were included as a comparison to the NPN models, which have been found to be more efficient than LMR with small sample sizes (Stepanov, 1974).

SAS® 9.1 (SAS, 2007) was used to compute correlation and regression coefficients for the 16 inputs to CO. Two LMR models were developed, one each for the ISO and EXP training data. The form of the regression equation was:

Y′ = a + b1x1 + b2x2 + … + bkxk   (4)

where:
Y′ = predicted CO (g/kWh)
a = intercept constant
bk = regression coefficient for the kth predictor variable
xk = the kth predictor variable

Both of the LMR models were evaluated by using the resulting equations to predict CO with inputs from the evaluation data set.

Two NPN models were developed, one each for the ISO and EXP training data. ModelQuest® (MarketMiner, 2004) software was used to complete the steps to derive the NPN model. ModelQuest software has been used by other researchers, including Abdel-Aal and Mangoud (1997), Agarwal (1999), Cerullo and Cerullo (2006), Kim (2002), and Reddy and Pachepsky (2000). The NPN was calculated one layer at a time. The initial (or input) layer consisted of normalizing the 16 inputs to a mean of zero and standard deviation of one. For each subsequent layer of the network, each possible combination of inputs from the prior layer was combined into third order polynomial equations with each combination of single, double, and triple inputs, using the following equations (Montgomery, 1989):

Single = w0 + w1x1 + w2x1^2 + w3x1^3   (5)

Double = w0 + w1x1 + w2x2 + w3x1^2 + w4x2^2 + w5x1x2 + w6x1^3 + w7x2^3   (6)

Triple = w0 + w1x1 + w2x2 + w3x3 + w4x1^2 + w5x2^2 + w6x3^2 + w7x1x2 + w8x1x3 + w9x2x3 + w10x1x2x3 + w11x1^3 + w12x2^3 + w13x3^3   (7)

where:
wi = coefficient
xi = input variable
Single = equation with one input variable
Double = equation with two input variables
Triple = equation with three input variables

PSE (Barron, 1984) was used as the selection criterion. PSE was applied to each of the single, double, and triple equations, along with inputs from the prior layer, to select the best predictors for input to the next layer. PSE was also applied to the resulting network for each layer. Additional layers were added to the network, using inputs calculated in the prior layer, until PSE was minimized. The resulting value (normalized CO) was converted to units of g/kWh. Each of the NPN models was evaluated with the same evaluation data set as the LMR models. The models were compared based on coefficient of determination (R2), mean absolute error, and maximum absolute error. Paired t-tests were also used to determine whether absolute error differed between paired models.
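To illustrate the layer construction described above, the sketch below normalizes two inputs and fits the third-order, two-input "double" element of equation (6) by least squares. This is an illustrative reimplementation, not ModelQuest code; the function names (normalize, fit_double) are ours.

```python
import numpy as np

def normalize(x):
    """Input layer: scale a variable to a mean of zero and standard deviation of one."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def double_features(x1, x2):
    """Design matrix for equation (6): third order polynomial of two inputs."""
    return np.column_stack([
        np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2, x1**3, x2**3
    ])

def fit_double(x1, x2, y):
    """Least-squares estimate of the coefficients w0..w7 of a 'double' element."""
    w, *_ = np.linalg.lstsq(double_features(x1, x2), y, rcond=None)
    return w

def predict_double(w, x1, x2):
    """Evaluate a fitted 'double' element on new (normalized) inputs."""
    return double_features(x1, x2) @ w
```

In the full procedure, single and triple elements (equations 5 and 7) are built the same way, each candidate is scored with PSE, and the surviving elements feed the next layer; the final normalized output is then rescaled back to g/kWh.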

Results and Discussion

Linear Multiple Regression (LMR) Models

LMR was used to fit an equation to the combination of the 16 input variables from the ISO training data to predict CO. The resulting equation accounted for approximately 93% of observed variance in CO in the training data (F(16,16) = 13.70, p < 0.0001, adjusted R2 = 0.86). Table 2 lists the regression coefficients and standardized coefficients for each input. Inputs of power, fuel rate, percent torque, and torque had the highest weights.

Table 2. Regression coefficients and standardized coefficients for each of the ISO and EXP training data sets.

                                     Regression Coefficients    Standardized Coefficients
Variable Name                        ISO          EXP           ISO          EXP
y-intercept                          1021.602     209.904       0            0
Engine speed (rpm)                   -0.003       0.017*        -0.377       1.157
Percent torque                       -0.620       -0.262        -4.195       -1.182
Percent load                         -0.041       0.087         -0.322       0.451
Percent friction                     0.527        -4.823**      0.395        -2.047
Fuel rate (l/h)                      3.983        -0.080        7.419        -0.097
Engine fuel temperature (deg C)      -0.951*      -0.247*       -1.545       -0.226
Coolant temperature (deg C)          -0.510       -0.474        -0.329       -0.164
Intake manifold temp (deg C)         0.182        0.593**       1.798        3.956
Torque (Nm)                          0.110        -0.040        3.730        -0.942
Power (kW)                           -1.211       -0.541        -8.596       -2.717
Ambient temperature (deg C)          0.369        -0.109        0.323        -0.071
Relative humidity (%)                0.543        -0.060        0.262        -0.057
Atmospheric pressure (mbar)          -1.157       0.052         -0.353       0.027
Exhaust temperature (deg C)          -0.325       -0.127        -0.195       -0.081
Exhaust pressure (kPa)               0.298        -0.222        0.072        -0.114
Air to fuel ratio                    -0.319       -0.192*       -0.929       -0.408

* Regression coefficient significant with p < 0.05.
** Regression coefficient significant with p < 0.0001.
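For readers reproducing the standardized coefficients in Table 2, one common definition rescales each raw regression coefficient by the ratio of the predictor's standard deviation to the response's standard deviation (the intercept standardizes to zero). The paper does not state the exact formula its software uses, so this sketch is an assumption; the function name is ours.

```python
import numpy as np

def standardized_coefficients(b, X, y):
    """beta_k = b_k * s_xk / s_y, where b_k are the raw regression coefficients
    (excluding the intercept), s_xk the sample standard deviations of the
    predictors (columns of X), and s_y the standard deviation of the response."""
    s_x = np.std(X, axis=0, ddof=1)
    s_y = np.std(y, ddof=1)
    return np.asarray(b, dtype=float) * s_x / s_y
```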


LMR was also used with the EXP training data to fit an equation to CO. The resulting equation accounted for approximately 68% of observed variance in CO (F(16,118) = 15.77, p < 0.0001, adjusted R2 = 0.64). Table 2 lists the regression coefficients and standardized coefficients for each input. Inputs of intake manifold temperature, power, and percent friction had the three highest weights.

The respective regression equations of the ISO and EXP training data were applied to the evaluation data to predict CO. Table 3 summarizes the R2, mean absolute error, and maximum absolute error of each model applied to the training data and evaluation data. When applied to the evaluation data, the R2 for the LMR model based on the ISO training data dropped from 0.93 to 0.54, while the R2 for the model based on EXP data decreased only slightly from 0.68 to 0.67. For the ISO based model applied to the evaluation data, mean absolute error and maximum absolute error increased by multiples of 4.2 and 4.3, respectively. Mean and maximum absolute error for the EXP based model applied to the evaluation data increased by multiples of only 1.2 and 1.3, respectively. When comparing mean absolute error of the LMR models applied to the evaluation data, the model based on the EXP data had a significantly lower error (t(55) = 3.078; p = 0.0032).

Table 3. Comparative performance of LMR and NPN models derived from each of the ISO and EXP data sets and the evaluation data.

                                                -------- Training Data** --------    ------- Evaluation Data*** -------
Model Strategy                       Data Set   R2*     Mean Abs.    Max. Abs.       R2*     Mean Abs.    Max. Abs.
                                                        Error        Error                   Error        Error
Linear Multiple Regression (LMR)     ISO        0.93    0.83         2.93            0.54    3.45         12.51
Nonlinear Polynomial Network (NPN)   ISO        0.96    0.56         2.51            0.82    1.82         15.74
Linear Multiple Regression (LMR)     EXP        0.68    1.96         14.03           0.67    2.44         18.04
Nonlinear Polynomial Network (NPN)   EXP        0.97    0.30         7.70            0.89    0.69         12.79

* Coefficient of determination between actual and predicted CO for the respective data set.
** Actual CO (g/kWh) values in the ISO data set ranged from 0.23 to 14.55 with a mean of 2.85 and standard deviation of 4.18. Actual CO values in the EXP data set ranged from 0.19 to 29.58 with a mean of 2.09 and a standard deviation of 5.09.
*** The actual CO (g/kWh) values in the evaluation data set ranged from 0.21 to 26.55 with a mean of 3.11 and a standard deviation of 6.86.
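The comparisons in Table 3 rest on three statistics plus a paired t-test on absolute errors. A minimal sketch of those calculations is shown below (the function and variable names are ours), taking R2 as the squared correlation between actual and predicted values, per the Table 3 footnote.

```python
import numpy as np
from scipy import stats

def compare_models(y_actual, pred_a, pred_b):
    """Table-3 style statistics for two models evaluated on the same data,
    plus a paired t-test on their absolute errors."""
    def r2(y, p):
        # Coefficient of determination taken as the squared correlation
        # between actual and predicted values.
        return np.corrcoef(y, p)[0, 1] ** 2

    abs_a = np.abs(y_actual - pred_a)
    abs_b = np.abs(y_actual - pred_b)
    t_stat, p_val = stats.ttest_rel(abs_a, abs_b)   # paired t-test on absolute errors
    return {
        "R2": (r2(y_actual, pred_a), r2(y_actual, pred_b)),
        "mean_abs_error": (abs_a.mean(), abs_b.mean()),
        "max_abs_error": (abs_a.max(), abs_b.max()),
        "paired_t": (t_stat, p_val),
    }
```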


Nonlinear Polynomial Network (NPN) Models

The ISO training data were used to develop an NPN model to fit an equation to CO based on the 16 input variables. The resulting NPN is depicted in Figure 2. Of the 16 input variables, only power and ambient temperature were used by the resulting polynomial network. The predicting network accounted for approximately 96% of the observed variance in CO and consists of the following network of equations:

Pn = -1.2318 + 0.0337 * P   (8)

ATn = -7.2504 + 0.273 * AT   (9)

DB = -0.7563 + 0.7013 * Pn^2 - 0.3649 * Pn^3 + 0.1721 * ATn^2 - 0.1294 * ATn^3   (10)

CO = 2.8506 + 4.1803 * DB   (11)

where:
Pn = normalized power
P = observed power (kW)
ATn = normalized ambient temperature
AT = observed ambient temperature (°C)
DB = double, intermediate output of network node with two inputs
CO = carbon monoxide emissions (g/kWh)
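Equations (8) through (11) compose directly into a prediction function. The sketch below simply transcribes the published coefficients; the function name is ours, and the result is only meaningful within the operating range covered by the training data.

```python
def predict_co_iso(power_kw, ambient_temp_c):
    """CO (g/kWh) from the ISO-trained polynomial network, equations (8)-(11)."""
    pn = -1.2318 + 0.0337 * power_kw            # normalized power, eq. (8)
    atn = -7.2504 + 0.273 * ambient_temp_c      # normalized ambient temperature, eq. (9)
    db = (-0.7563 + 0.7013 * pn**2 - 0.3649 * pn**3
          + 0.1721 * atn**2 - 0.1294 * atn**3)  # double element, eq. (10)
    return 2.8506 + 4.1803 * db                 # unitizer back to g/kWh, eq. (11)
```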

Figure 2. Polynomial network generated from ISO training data.

The EXP training data were likewise used to develop an NPN model to fit an equation to CO based on the 16 input variables. The resulting polynomial network is depicted in Figure 3. The only input variables used in the model were torque and air fuel ratio. The predicting network accounted for approximately 97% of the observed variance in CO and consists of the following network of equations:

Tn = -1.9012 + 0.0083 * T   (12)

AFRn = -3.1372 + 0.0925 * AFR   (13)

DB1 = -0.4517 - 0.1958 * Tn + 0.7117 * Tn^2 - 0.3458 * Tn^3 - 0.2493 * AFRn + 0.4003 * Tn * AFRn - 0.05 * Tn^2 * AFRn + 0.1297 * Tn * AFRn^2 + 0.0398 * AFRn^3   (14)

DB2 = -0.6526 - 2.5311 * DB1 - 4.0485 * DB1^2 + 1.1273 * DB1^3 + 0.396 * AFRn + 1.5691 * DB1 * AFRn + 1.527 * DB1^2 * AFRn - 0.1099 * AFRn^2 - 0.3378 * DB1 * AFRn^2   (15)

CO = 2.0919 + 5.093 * DB2   (16)

where:
Tn = normalized torque
T = observed torque (Nm)
AFRn = normalized air fuel ratio
AFR = observed air fuel ratio
DB1 = double, intermediate output of network node with two inputs
DB2 = double, intermediate output of network node with two inputs
CO = carbon monoxide emissions (g/kWh)

Figure 3. Polynomial network generated from EXP training data.

The NPNs of the ISO and EXP training data were applied to the evaluation data to predict CO. Table 3 summarizes the R2, mean absolute error, and maximum absolute error of each model applied to the training data and evaluation data. When applied to the evaluation data, the R2 for the NPN model based on the ISO training data dropped from 0.96 to 0.82. The R2 for the EXP based model dropped from 0.97 to 0.89. For the ISO based model applied to the evaluation data, mean absolute error and maximum absolute error increased by multiples of 3.3 and 6.3, respectively. Mean and maximum absolute error in predicting CO for the evaluation data with the EXP based model were multiples of 2.3 and 1.7, respectively, of the training data values. When applied to the evaluation data, the mean absolute error of the NPN ISO based model was significantly less than that of the LMR EXP based model (t(55) = -2.159; p = 0.0352). The mean absolute error of the NPN EXP based model was significantly lower than that of the NPN ISO based model (t(55) = 4.721; p < 0.0001), based on the evaluation data.

The relative accuracy of the LMR and NPN models based on the ISO data, at predicting the evaluation data, is illustrated in Figure 4. When actual CO was below 1.4 g/kWh, the LMR model predicted CO values ranging from -4.0 g/kWh to 11.68 g/kWh. In contrast, the NPN model predicted values ranging from 0.23 g/kWh to 7.5 g/kWh. Both models under-predicted when actual CO was above 1.5 g/kWh (no load condition), and each model had its maximum absolute error at the highest actual output of over 26 g/kWh.


Figure 4. Predicted CO values for evaluation data, from LMR and NPN models, derived from ISO training data.

The relative accuracy of the LMR and NPN models based on the EXP data, at predicting the evaluation data, is depicted in Figure 5. When actual CO was below 1.5 g/kWh, the LMR model predicted CO values ranging from -4.1 g/kWh to 5.1 g/kWh. In contrast, the NPN model predicted values ranging from 0.19 g/kWh to 1.1 g/kWh. Both the LMR and NPN models encountered their maximum absolute error when actual CO was highest (26.5 g/kWh).


Figure 5. Predicted CO values for evaluation data, from LMR and NPN models, derived from EXP training data.

Summary

NPN and LMR models were compared for predicting CO emissions of an engine operating at a range of constant loads and speeds, based on ISO and EXP data sets from diesel engine emissions tests. Models were evaluated with data consisting of a fourth replication of the same tests used for the ISO and EXP training data.

The LMR model developed with the ISO training data (R2 = 0.93) was able to explain only 54% of the variation in the evaluation data (R2 = 0.54). Although the ISO training data alone produced a strong relationship, the resulting model was over-trained and was not as effective at predicting the evaluation data, which included additional engine operating conditions. The relatively small number of observations (n = 33) compared to the 16 inputs contributed to this problem. Results of this model indicated that ISO 8178-4 test cycle B test modes alone were not sufficient to model the target range of engine operation.

A second LMR model was developed using the EXP training data (R2 = 0.68). Performance was similar when applied to the evaluation data (R2 = 0.67), and this model out-performed the ISO-based LMR model. The additional test modes (n = 135) in the EXP training data improved the LMR model evaluation, but still left 33% of the variation of CO unexplained.


An NPN model was developed with the ISO training data (R2 = 0.96). When applied to the evaluation data (R2 = 0.82), it explained 82% of the variance of CO, compared to 54% of the variance with the LMR model. The second NPN model, developed with the EXP training data (R2 = 0.97), had the best performance of all the models on the evaluation data (R2 = 0.89). Its mean and maximum absolute errors of 0.69 g/kWh and 12.79 g/kWh were also better than those of the other models. The expanded data of the EXP test modes would be preferable for modeling emissions. In the absence of EXP data, data collected from ISO 8178-4 test cycle B and modeled with NPN would result in a better model than using LMR with either the ISO or EXP training sets. The NPN method was superior to LMR in predicting CO of the evaluation data, even when trained with the smaller sample size (n = 33) of the ISO training data.

Conclusions

This comparison of NPN and LMR for predicting CO emissions of a diesel engine demonstrated that NPN resulted in significantly better models (based on mean absolute error) than LMR. The NPN modeling method should be considered in circumstances where the researcher does not know the mathematical relationship of multiple inputs to the output and prefers to use a mathematical technique to find the relationships rather than inject his or her own bias into the model.

References

Abdel-Aal, R. E. 2005. Improved classification of medical data using different feature subsets. Computer Methods and Programs in Biomedicine 80: 141-153.
Abdel-Aal, R. E. 2006. Modeling and forecasting electric daily peak loads using abductive networks. Electrical Power and Energy Systems 28: 133-141.
Abdel-Aal, R. E. 2007. Predictive modeling of mercury speciation in combustion flue gases using GMDH-based abductive networks. Fuel Processing Technology 88: 483-491.
Abdel-Aal, R. E., and A. M. Mangoud. 1997. Modeling Obesity Using Abductive Networks. Computers and Biomedical Research 30: 451-471.
Agarwal, A. 1999. Abductive Networks for Two-Group Classification: A Comparison with Neural Networks. The Journal of Applied Business Research 15(2): 1-12.
Aksyonova, T. I., V. V. Volkovich, and I. V. Tetko. 2003. Robust Polynomial Neural Networks in Quantitative-Structure Activity Relationship Studies. Systems Analysis Modelling Simulation 43(10): 1331-1339.
Barron, A. R. 1984. Predicted Squared Error: A Criterion for Automatic Model Selection. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (Vol. 54, p. 350). New York, NY: Marcel Dekker, Inc.
Barron, A. R., and R. L. Barron. 1988. Statistical Learning Networks: A Unifying View. In E. J. Wegman, D. T. Gantz, and J. J. Miller (Eds.), Computing Science and Statistics: Proceedings of the 20th Symposium on the Interface (pp. 192-203). Alexandria, VA: American Statistical Association.
Barron, R. L., A. N. Mucciardi, F. J. Cook, J. N. Craig, and A. R. Barron. 1984. Adaptive Learning Networks: Development and Applications in the United States of Algorithms Related to GMDH. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (Vol. 54, p. 350). New York, NY: Marcel Dekker, Inc.


Cerullo, M. J., and M. V. Cerullo. 2006. Using Neural Network Software as a Forensic Accounting Tool. Information Systems Control Journal 2.
Drake, K. C. 1992. Highly-automated, non-parametric statistical learning for autonomous target recognition. Proceedings of the 20th AIPR Workshop. 1623, pp. 184-193. SPIE.
Drake, K. C., and P. Hess. 1990. Abduction - A Numeric Knowledge Acquisition Approach. PC AI 4(5): 58-61.
Drake, K. C., and R. Y. Kim. 1997. Abductive Information Modeling Applied to Financial Time Series Forecasting. Nonlinear Financial Forecasting, Finance and Technology (pp. 95-108).
Drake, K. C., R. Y. Kim, T. Y. Kim, and O. D. Johnson. 1994. Comparison of polynomial network and model-based target recognition. Proceedings for the Sensor Fusion and Aerospace Applications II (pp. 2-11). SPIE.
Duffy, J. J., and M. A. Franklin. 1975. A Learning Identification Algorithm and Its Application to an Environmental System. IEEE Transactions on Systems, Man, and Cybernetics 5(2): 226-240.
El-Alfy, E.-S. M., and R. E. Abdel-Aal. 2008. Construction and analysis of educational tests using abductive machine learning. Computers & Education 51: 1-16.
Fann, K. T. 1970. Peirce's Theory of Abduction. The Hague: Martinus Nijhoff.
Farlow, S. J. 1981. The GMDH Algorithm of Ivakhnenko. The American Statistician 35(4): 210-215.
Farlow, S. J. 1984. The GMDH Algorithm. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (Vol. 54, p. 350). New York, NY: Marcel Dekker, Inc.
Gardner, S. 1992. Polynomial Neural Nets for Signal Detection in Chaotic Backgrounds. Southcon/92 Conference Record (pp. 247-252). Orlando, FL.
Griffin, M. P., D. F. Scollan, and J. R. Moorman. 1994. The Dynamic Range of Neonatal Heart Rate Variability. Journal of Cardiovascular Electrophysiology 5: 112-124.
Hansson, P.-A., M. Lindgren, and O. Noren. 2001. A Comparison between Different Methods of Calculating Average Engine Emissions for Agricultural Tractors. Journal of Agricultural Engineering Research 80(1): 37-43.
Hayashi, I., and H. Tanaka. 1990. The Fuzzy GMDH Algorithm by Possibility Models and Its Applications. Fuzzy Sets and Systems 36: 245-258.
Hogan, J. A., D. G. Watson, and T. V. Harrison. 2007. Data Points and Duration for Estimating Fuel Consumption of a LPG Engine. Agricultural Engineering International IX.
Ikeda, S. 1984. Nonlinear Prediction Models for River Flows and Typhoon Precipitation by Self-Organizing Methods. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (p. 350). New York: Marcel Dekker.
Ikeda, S., M. Ochiai, and Y. Sawaragi. 1976. Sequential GMDH Algorithm and Its Application to River Flow Prediction. IEEE Transactions on Systems, Man, and Cybernetics 6(7): 473-479.
ISO. 1996. ISO 8178-4; Reciprocating internal combustion engines, exhaust emission measurements, Part 4: test cycles for different engine applications. Genève, Switzerland: International Organization for Standardization.
Ivakhnenko, A. G. 1968. The Group Method of Data Handling -- A Rival of the Method of Stochastic Approximation. Soviet Automatic Control 13(3): 43-55.
Ivakhnenko, A. G. 1971. Polynomial Theory of Complex Systems. IEEE Transactions on Systems, Man, and Cybernetics 1(4): 364-378.


Ivakhnenko, A. G. 1978. The Group Method of Data Handling in Long-Range Forecasting. Technological Forecasting and Social Change 12: 213-227.
Ivakhnenko, A. G., and G. A. Ivakhnenko. 1995. The Review of Problems Solvable by Algorithms of the Group Method of Data Handling (GMDH). Mathematical Theory of Pattern Recognition 5(4): 527-535.
Ivakhnenko, A. G., and V. A. Ovchinnikov. 1975. Control of Dniepr Hydroelectric Power Station Water Reservoirs with Two Optimality Criterion Based on Self-Organization. Soviet Automatic Control 8: 41-49.
Ivakhnenko, A. G., and V. S. Stepashko. 1975. Self-Organization of Models and Long-Term Prediction of River Runoff by the Balance Criterion. Soviet Automatic Control 8(5): 27-33.
Ivakhnenko, A. G., B. G. Mikryukov, A. N. Boshnyakov, and B. K. Svetal'skiy. 1975. Objective Identification of Thermal State of Blast Furnace by Self-Organization Methods. Soviet Automatic Control 8: 35-39.
Ivakhnenko, A. G., V. N. Vysotskiy, V. D. Fedorov, V. M. Maksimov, and S. A. Sokolova. 1975. Simulation of the Dynamics of the Environment-Plankton Ecological System of the White Sea and Analysis of Its Stability. Soviet Automatic Control 8: 6-13.
Ivakhnenko, A. G., V. S. Stepashko, M. G. Khomovnenko, and Y. P. Galyamin. 1977. Self-Organization of Dynamic Models of Growth of Agricultural Crops for Control of Irrigated Crop Rotation. Soviet Automatic Control 10: 23-33.
Kim, K. S. 2002. Value management and common accounting performance measures for corporations. Expert Systems with Applications 22: 331-336.
Kim, K. S. 2005. Predicting bond ratings using publicly available information. Expert Systems with Applications 29: 75-81.
Kim, K. S. 2007. Critical factors for determining market prices of stocks traded on the Korean Stock Exchanges (KSE). International Journal of Intelligent Systems Technologies and Applications 2(1): 32-40.
Kim, K. S., and W. A. Nelson. 1996. Assessing the Rental Value of Residential Properties: An Abductive Learning Networks Approach. The Journal of Real Estate Research 12(1): 63-77.
King, M. A., J. F. Elder, B. Gomolka, E. Schmidt, M. Summers, and K. Toop. 1998. Evaluation of Fourteen Desktop Data Mining Tools. IEEE International Conference on Systems, Man, and Cybernetics. San Diego, CA.
Kleinsteuber, S., and N. Sepehri. 1996. A Polynomial Network Modeling Approach to a Class of Large-Scale Hydraulic Systems. Computers & Electrical Engineering 22(2): 151-168.
Krijnsen, H. C., R. Bakker, W. E. van Kooten, H. P. Calis, R. P. Verbeek, and C. M. van den Bleek. 2000. Evaluation of Fit Algorithms for NOx Emission Prediction for Efficient DeNOx Control of Transient Diesel Engine Exhaust Gas. Industrial & Engineering Chemistry Research 39: 2992-2997.
Lebow, W. M., R. K. Mehra, and P. M. Toldalagi. 1984. Forecasting Applications of GMDH in Agricultural and Meteorological Time Series. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (p. 350). New York: Marcel Dekker.
Malone, J. M. 1984. Regression Without Models: Directions in the Search for Structure. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (Vol. 54, p. 350). New York, NY: Marcel Dekker, Inc.
MarketMiner. 2004. MarketMiner ModelQuest Analyst Version 6.0 User's Guide. Charlottesville, VA: MarketMiner Inc.


Montgomery, G. J. 1989. Abductive Diagnostics. AIAA Computers in Aerospace Conference, 7th (pp. 267-275). Washington, DC: American Institute of Aeronautics and Astronautics.
Montgomery, G. J., and K. C. Drake. 1991. Abductive reasoning networks. Neurocomputing 2: 97-104.
Montgomery, G. J., P. Hess, and J. S. Hwang. 1990. Abductive Networks Applied to Electronic Combat. Applications of Artificial Neural Networks. 1294, pp. 454-465. Bellingham, WA: SPIE.
Mucciardi, A. N., and E. E. Gose. 1972. An Automatic Clustering Algorithm and Its Properties in High-Dimensional Spaces. IEEE Transactions on Systems, Man, and Cybernetics 2(2): 247-254.
Muller, J.-A., A. G. Ivachnenko, and F. Lemke. 1998. GMDH Algorithms for Complex Systems Modelling. Mathematical and Computer Modelling of Dynamical Systems 4(4): 275-316.
National Instruments. 2003. LabVIEW 7 Express User Manual. Austin, TX: National Instruments.
Nikolaev, N. Y., and H. Iba. 2003. Polynomial harmonic GMDH learning networks for time series modeling. Neural Networks 16: 1527-1540.
Pachepsky, Y. A., and W. J. Rawls. 1999. Accuracy and Reliability of Pedotransfer Functions as Affected by Grouping Soils. Soil Science Society of America Journal 63: 1748-1757.
Parks, P. C., A. G. Ivakhnenko, L. M. Boichuk, and B. K. Svetalsky. 1975. A Self-Organizing Model of the British Economy for Control with Optimal Prediction Using the Balance-of-Variables Criterion. International Journal of Computer and Information Sciences 4(4): 349-379.
Peirce, C. S. 1955. Abduction and Induction. In J. Buchler (Ed.), Philosophical Writings of Peirce (p. 386). New York, NY: Dover Publications, Inc.
Pyo, S., M. Uysal, and H. Chang. 2002. Knowledge Discovery in Database for Tourist Destinations. Journal of Travel Research 40: 396-403.
Ramamurthy, R., N. N. Clark, C. M. Atkinson, and D. W. Lyons. 1998. Models for Predicting Transient Heavy Duty Vehicle Emissions. SAE Paper 982652. Warrendale, PA: SAE.
Reddy, V. R., and Y. A. Pachepsky. 2000. Predicting crop yields under climate change conditions from monthly GCM weather projections. Environmental Modeling & Software 15: 79-86.
Rosenblatt, F. 1958. The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. Psychological Review 65(6): 386-408.
SAE. 2002. SAE J1939-71 AUG2002 Recommended Practice for Truck and Bus Control and Communications Network—Vehicle Application Layer. Troy, MI: SAE.
SAS. 2007. SAS 9.1. Cary, NC: SAS Institute, Inc.
Scott, D. E., and C. E. Hutchinson. 1976. The GMDH Algorithm -- A Technique for Economic Modeling. In W. G. Vogt and M. H. Mickle (Eds.), Modeling and Simulation, Proceedings of the Seventh Annual Pittsburgh Conference 7, pp. 729-733. University of Pittsburgh.
Scott, D. E., and C. E. Hutchinson. 1984. An Application of the GMDH Algorithm to Economic Modeling. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (p. 350). New York: Marcel Dekker.
Shewhart, M. 1991. Application of Machine Learning and Expert System to Statistical Process Control (SPC) Chart Interpretation. 2nd CLIPS Conference Proceedings Volume 1 (pp. 123-135). Houston, TX: NASA.


Silis, Y. Y., and A. B. Rozenblit. 1976. Algorithm for Construction of Decision Function in the Form of a Complex Logic Proposition. Soviet Automatic Control 9(2): 1-5.
Stepanov, V. A. 1974. Results of Mass Checking of GMDH Efficiency in Long-Term Prediction of Demand for Consumer Goods. Soviet Automatic Control 7: 46-51.
Tamura, H., and T. Kondo. 1984. On Revised Algorithms of GMDH With Applications. In S. J. Farlow (Ed.), Self-Organizing Methods in Modeling: GMDH Type Algorithms (p. 350). New York: Marcel Dekker.
Watson, D. G., T. V. Harrison, and R. W. Steffen. 2008. Data Points and Duration for Estimating Fuel Consumption of a Diesel Engine. Agricultural Engineering International X.
Yanowitz, J., M. S. Graboski, and R. I. McCormick. 2002. Prediction of In-Use Emissions of Heavy-Duty Diesel Vehicles from Engine Testing. Environmental Science & Technology 36(2): 270-275.

