Automation in Construction 54 (2015) 106–115
Contents lists available at ScienceDirect
Automation in Construction journal homepage: www.elsevier.com/locate/autcon
Optimized artificial intelligence models for predicting project award price Jui-Sheng Chou ⁎, Chih-Wei Lin, Anh-Duc Pham, Ji-Yao Shao Department of Civil and Construction Engineering, National Taiwan University of Science and Technology, 43, Sec. 4, Keelung Rd., Taipei 106, Taiwan
Article info
Article history: Received 2 September 2014; Received in revised form 28 November 2014; Accepted 22 February 2015; Available online xxxx
Keywords: Artificial intelligence; Optimization; Project management; Cost estimation; Bid award amount; Regression analysis; Artificial neural network; Case-based reasoning; Genetic algorithm
Abstract
Bridges are essential components of transportation systems. The bidding process is the main determinant of whether a contractor is commissioned to complete a construction project. Therefore, contractors must rapidly and precisely estimate construction costs and the bid award amount. This study involved optimizing artificial intelligence models to forecast bid award amounts for bridge construction projects. A genetic algorithm is used in several forecasting models, including models based on multiple regression analysis, artificial neural networks (ANNs), and case-based reasoning (CBR). Data for public bridge construction projects were collected from the Taiwan government e-procurement system. The cross-validation results show that the mathematical model for the ANNs provides more reliable simulations and has a superior fit compared with the regression methods, CBR, and the conventional approach. This study provides an optimization process for estimating project award prices that improves the construction and evaluation of AI-based models, as well as an auxiliary tool that contractors can use to make bidding decisions. © 2015 Elsevier B.V. All rights reserved.
1. Introduction Bridges are major structures within transportation systems and serve as lifelines that connect people to economic activities. These structures enable the public to cross rivers, valleys, and terrains of all types. Taiwan is a country with many rivers and streams of all sizes. In particular, Taiwan has more than 9699 bridges constituting a total length of 502,021.8 m [33]. The bridges vary in their styles, spanning rivers and valleys or intersections of freeways, roads, and highways. Thus, bridges are essential transportation components in Taiwan. The bidding process is a crucial determinant of whether construction firms receive project contracts. Because the main objective of construction companies is to expand their business volume by being commissioned to complete various projects, preparing realistic and accurate bids is essential [3]. The Government Procurement Law in Taiwan classifies project bidding invitations as open, selective, or restricted and divides the bidding invitation process into two or three stages. Bidding award methods include the lowest bid, most advantageous bid, and multiple-awards bid [38]. The government has jurisdiction over public construction in Taiwan, and contracts are generally
⁎ Corresponding author. Tel.: +886 2 2737 6321; fax: +886 2 2737 6606. E-mail addresses:
[email protected] (J.-S. Chou),
[email protected] (C.-W. Lin),
[email protected] (A.-D. Pham),
[email protected] (J.-Y. Shao).
http://dx.doi.org/10.1016/j.autcon.2015.02.006 0926-5805/© 2015 Elsevier B.V. All rights reserved.
awarded through open bidding, in which the lowest bid wins the contract. Because general bidding prices depend on cost and profit, competitive bidding requires vendors to consider the bidding prices of other vendors and the base price acceptable to the client before proposing their own bid. However, bidding firms usually rely on subjective judgment for initial cost estimates. The subjective judgments of many people may produce estimation errors that do not provide a sound basis for setting actual bid amounts. Estimation errors may also distort the bid amount. Overestimation of costs can cause loss of the contract, whereas underestimation can cause a financial loss for the company. Therefore, construction firms must rapidly and precisely estimate construction costs and bid award amounts. A limitation of bidding models used in the construction industry [7,27] is that they require sensitive information on competitors. Many models have been developed for estimating construction costs [11–13,26,44]. These methods are more suitable for use at the early stages of a construction project, such as at the conceptual and schematic design stages, rather than at the bidding stage. Previous studies have asserted that, because the knowledge-based economy has major implications for the construction industry, construction firms should apply tools that can improve their performance and competitiveness [37]. Most approaches based on artificial intelligence (AI) have focused on developing algorithms for improving modeling accuracy or the speed of model building [15,28,34,46]. Therefore,
J.-S. Chou et al. / Automation in Construction 54 (2015) 106–115
AI-based models are potential tools for solving civil engineering and management problems. This study applied techniques based on data mining and AI, namely genetic algorithm (GA), multiple regression, artificial neural network (ANN), and case-based reasoning (CBR) models, to forecast bid award amounts on the basis of limited available information and to provide a reference for vendors when making bid submission decisions. For validation, the proposed modeling system uses a k-fold cross-validation algorithm [31]. A case study method was used to analyze cases in the government e-procurement system involving bridge bidding for public roads within a budget range of NT$1 billion. The bridges were either beam or rigid-frame bridges. The remainder of this paper is organized as follows: Section 2 summarizes the literature on this topic; Section 3 introduces the theories and research methodologies employed in the study; Section 4 elucidates the data prearrangement process and model construction (specifically, the combination of predictive models and a GA are performed to estimate the award bid amounts); finally, Section 5 draws conclusions and provides suggestions for future studies. 2. Literature review For construction contracts, the conventional approach used by government agencies is competitive bidding. Many studies of strategies for bidding and identifying the most appropriate projects have addressed the determination of whether to submit a bid and the optimal bid value (or markup) [47]. The “lowest bidding price for a given base price” method is currently used in the bidding process for Taiwan construction projects. This study focuses only on the second decision (i.e., bid award method). Many studies have addressed bidding models, which can be classified into statistical models [7,20,39,41], multicriteria models [6,10], game theory [24], and AI models [9]. Conventional bidding models are based on statistical and probability theory. 
The Friedman model (1956) provides the optimal set of bids on contracts in a competitive bidding scenario [20]. The general bidding model developed by Carr (1982) estimates the probability distribution of contractor costs and bids by opponents [7]. Ahmad (1990) used a utility-value-based approach and developed a bidding methodology of decision analysis for solving the bidding problem [1]. The correlation analysis of the lowest bidding price and the final project completion price performed by Williams (2003) showed that a natural logarithm conversion increases the correlation between the lowest bidding price and final project completion price [49]. However, making a bidding decision is a complex process that involves many factors; the level of complexity may not be adequately reflected in these models. Several bidding-related studies have focused on determining bid markups in relation to an estimated project cost [16,17]. Dozzi et al. (1996) applied a multicriteria utility theory to bid-markup decisions for construction projects [19]. Wang et al. (2007) integrated a simulation-based cost model and a multicriteria evaluation model to reflect bidder preferences regarding decision criteria [45]. Cheng et al. (2011) developed a multicriteria model for making bidding-related decisions to assist contractors in making bidding decisions and to determine the scale of markup for the submitted bid [10]. Another major line of research is estimating project costs during bidding [44]. Chou (2009a) used a generalized linear model for accurately and reliably estimating public road construction costs and for continually tracking construction expenses [13]. Several other studies have applied or evaluated AI-based models for estimating construction prices and costs [5,23,50].
Williams (2002) used bidding data in ANN and regression models to predict the completed cost of competitively bid highway projects and found that a natural log transformation of the data strengthened the linear relationship between a low bid and completed cost [48]. Another AI-based forecasting technique is CBR, which is a subbranch of AI. This computational method involves solving current problems on
the basis of previous solutions. Chou (2009b) applied CBR in analyzing road-paving maintenance costs and forecasting maintenance costs at the initial project planning stages [14]. Yao and Yang (1998) employed CBR to forecast the work period and costs of a construction project [51]. Ji et al. (2010) developed a CBR revision model in which a “revise” phase is implemented in the CBR cycle to improve the accuracy of the predicted costs for multifamily housing projects [28]. Recently, a growing tendency is to enhance performance by using hybrid models. Křivý et al. (2000) used a randomized algorithm to identify parameter values in a nonlinear regression (NLR) [32]. Cheng et al. (2010) combined AI with fuzzy neural network concepts to enhance the precision of cost estimates at the early stages of a construction project [11]. To study early-planning practice and its relationship to final project outcomes in the Taiwan building construction industry, Wang et al. (2012) used ANN ensemble and support vector machine classification models to predict construction costs and scheduling successes [46]. Studies have proven that hybrid methods that combine AI and GA are powerful. A GA [21] is a stochastic search algorithm inspired by the mechanics of natural evolution, including survival of the fittest, reproduction, crossover, and mutation. To improve performance, GAs are often combined with other AI techniques. To optimize ANN parameters, Seo (2006) proposed a hybrid method in which product life cycle costs are estimated using a GA for simultaneous optimization of ANN parameters [42]. Hegazy and Ayed (1998) substituted a GA for neuron weighting in ANNs to establish a road item and project estimation model [23]. Evolving hybrid systems combining CBR and a GA have also been used in various fields [2,8,18,29]. 
However, for cost estimation in bridge projects, AI models have been used only in the maintenance and management phases; the forecasting techniques and accuracy rates remain inadequate. Although erecting a bridge is among the most vital construction projects in Taiwan, bridge projects are complex to classify, and a basis of calculation remains to be developed. Therefore, construction companies often experience difficulty in calculating times and costs when estimating bidding prices. Providing bidders with improved bid award estimation techniques would increase the accuracy of award price predictions for competitive bidding. Generally, related studies have focused on estimating construction costs rather than bid award amounts. By contrast, this study focused on predicting bid prices in instances in which the bid has already been awarded (i.e., this study did not involve analyzing whether a bid amount should be accepted or rejected). This study used hybrid models in which AI techniques are integrated to enhance accuracy in predicting bid award prices.
3. Methodology
3.1. Statistical analysis
Many statistical methods for describing, testing, and forecasting data have been developed. Analysis of variance and curve evaluations are used for factor testing and verification, and regression analysis is used for forecasting. However, because the values used in regression analysis must conform to a normal distribution, models that do not meet this normality require a variable transformation [30,40] such as the Box–Cox transformation to convert original data into a normal distribution, as shown in Eq. (1):
$$ y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda} - 1}{\lambda}, & \text{if } \lambda \neq 0 \\ \log(y), & \text{if } \lambda = 0 \end{cases} \tag{1} $$
where λ can be identified through a GA search so that the posttransformation data conform to a normal distribution according to the chi-square test. After the variable transformation, the data can be subjected to regression analysis.
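The transformation in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation: the function name `box_cox` is hypothetical, NumPy is assumed to be available, and the GA search for λ against the chi-square statistic is not reproduced here.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform of Eq. (1); y must be strictly positive."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        # Eq. (1) is undefined for zero or negative values.
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(y)               # limiting case, lambda = 0
    return (y ** lam - 1.0) / lam      # general case, lambda != 0

# Example: transform two amounts with lambda = 0.5
transformed = box_cox([1.0, 4.0], 0.5)
```

A search procedure would call such a function repeatedly with candidate λ values and keep the value whose transformed data look most nearly normal.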
Regression analysis is a statistical method for exploring the correlations between independent ($x_i$) and dependent ($y$) variables. Its purpose is to identify the equation that most accurately predicts the dependent variable as a linear function of two or more independent variables according to the causal relationships among changes in internal factors. The optimization method can also be used to identify linear combinations of independent variables (known as the regression line) and can quantitatively interpret variability in the dependent variables. The general regression equation can be expressed as Eq. (2):
$$ y = \beta_0 + \beta_i x_i + \varepsilon \tag{2} $$
where $y$ is the dependent variable; $\beta_0$ is the y intercept; $\beta_i$ is the slope associated with $x_i$; and $\varepsilon$ is the error term.
3.2. Artificial neural network
ANN models are powerful tools that can solve extremely complex problems. The processing elements of a neural network resemble neurons in the human brain and can be considered simple computational elements arranged in layers. In a multilayer perceptron (MLP) neural network, the input layer contains a set of sensory input nodes representing influential bid price components, one or more hidden layers contain computation nodes, and an output layer contains one computation node representing the bid award amount. Like any intelligence model, ANNs have the capability to learn. The most widely used and effective learning algorithm for training an MLP neural network is the back-propagation (BP) algorithm. The activation of a neuron in a hidden or output layer is calculated as follows:
$$ net_k = \sum_j w_{kj}\, o_j \quad \text{and} \quad y_k = f(net_k) \tag{3} $$
where $net_k$ is the activation of the k-th neuron, $j$ is the set of neurons in the preceding layer, $w_{kj}$ is the weight of the connection between the neurons $k$ and $j$, $o_j$ is the output of the neuron $j$, and $y_k$ is the output of the k-th neuron, obtained using the sigmoid or logistic-transfer function:
$$ f(net_k) = \frac{1}{1 + e^{-net_k}} \tag{4} $$
The formula for training and updating weights $w_{kj}$ in each cycle $t$ is
$$ w_{kj}(t) = w_{kj}(t-1) + \Delta w_{kj}(t) \tag{5} $$
The change value $\Delta w_{kj}(t)$ is calculated as
$$ \Delta w_{kj}(t) = \eta\, \delta_{pj}\, o_{pj} + \alpha\, \Delta w_{kj}(t-1) \tag{6} $$
where $\eta$ is the learning rate parameter, $\delta_{pj}$ is the propagated error, $o_{pj}$ is the output of the neuron $j$ for the record $p$, $\alpha$ is the momentum parameter, and $\Delta w_{kj}(t-1)$ is the change in $w_{kj}$ in the previous cycle. BP networks learn by storing nonlinear information such as influential factors and the strength of influences. Connection weights are adjusted during training to match predictions for target values in specific records; therefore, the outcomes generated by the network improve as the network “learns.” BP networks use the steepest-descent method as a basis for learning. This method involves using computation methods to perform a descending search for optimization. However, the solutions are confined to the local optimum. The solution to this problem is to use GAs instead of the steepest-descent method [42].
3.3. Case-based reasoning
CBR is an AI method that entails establishing a set of systematic reasoning or inference processes for forecasting based on previous experience [43]. This study used weighted CBR for modeling, in which historical and validation data were selected for comparisons. The comparison analysis results showed a set of similarity values between each validation and historical data item. The historical data item with the highest similarity value was used to forecast the validation data. The similarity value was based on the sum of each factor multiplied by the weight. The factor similarity calculations were divided into categorical variables and numerical variables. The following rule was used to calculate similarity in the categorical variables: when the kth categorical variable in the ith case and the kth categorical variable in the jth case are identical, the output is 1; otherwise the output is 0, as shown in Eq. (7):
$$ sim^{k}_{i,j} = \begin{cases} 1, & \text{if } P^{k}_{i} = P^{k}_{j} \\ 0, & \text{if } P^{k}_{i} \neq P^{k}_{j} \end{cases} \tag{7} $$
where $sim$ is the similarity value, $P$ is the case or project, $i$ is the ith historical datum, $j$ is the jth validation datum, and $k$ is the kth categorical variable. Therefore, $sim^{k}_{i,j}$ can be interpreted as the similarity value of the kth categorical variable of the ith historical datum and the jth validation datum. To calculate the similarity of numerical variables, the general rule is to divide the values of the lth numerical variables of the ith case and the jth case by each other. The higher and lower values are set as the denominator and numerator, respectively, as shown in Eq. (8):
$$ sim^{l}_{i,j} = \frac{\min\left(P^{l}_{i}, P^{l}_{j}\right)}{\max\left(P^{l}_{i}, P^{l}_{j}\right)} \tag{8} $$
where $l$ represents the lth numerical variable, and the remaining symbols are identical to those in the categorical variable equation. Therefore, $sim^{l}_{i,j}$ can be interpreted as the similarity value of the lth numerical variable for the ith historical datum and the jth validation datum. After the similarity of the factor variables is calculated, the overall similarity is calculated using a weighted-mean method, as shown in Eq. (9):
$$ sim_{i,j} = \frac{\sum_{k=1}^{m} w_k\, sim^{k}_{i,j} + \sum_{l=1}^{n} w_l\, sim^{l}_{i,j}}{\sum_{k=1}^{m} w_k + \sum_{l=1}^{n} w_l} \tag{9} $$
where $sim_{i,j}$ is the overall value of the similarity between the ith historical datum and the jth validation datum; and $w_k$ and $w_l$ are the weights of the categorical variable and numerical variable, respectively. After the overall similarity value is calculated, the historical case with the highest overall similarity value is used to forecast the jth validation datum.
3.4. Genetic algorithm
GAs, which are adaptive techniques inspired by the mechanics of natural evolution, are widely used to solve search and optimization problems. Here, GAs were used to identify the Box–Cox λ value and parameter values for models based on NLR, the BP network, and CBR. GAs emulate the genetic inheritance process of selecting genes suitable for preservation, which includes crossover and mutation [21]. The GA computation mechanism increases convergence speed, and its mutation mechanism avoids confinement to local optimal solutions. This study used Evolver, a spreadsheet-based GA optimization tool from Decision Tools software (Palisade) for GA modeling [36]. The population size was set as 500, the crossover rate was set as 0.8, and the mutation rate was set as 0.1. An appropriate reference range must be set during computations. Setting an excessively narrow range prevents identifying the ideal solution, and setting an excessively wide range causes computation to be time consuming. Because of the multiple parameters used to weight the aforementioned models, the use of a dynamic range obviated the need to set boundaries for each parameter. Fig. 1 shows the procedure for the GA.
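The GA procedure described in this section can be illustrated with a small Python sketch. It uses the population size (500), crossover rate (0.8), and mutation rate (0.1) stated above together with roulette-wheel selection, but everything else is an assumption made for illustration: the function name `run_ga`, the single real-valued parameter, the arithmetic crossover, the Gaussian mutation, and the fixed number of generations. It does not reproduce the internals of the Evolver software.

```python
import random

def run_ga(error, lower, upper, pop_size=500, crossover_rate=0.8,
           mutation_rate=0.1, generations=30, seed=0):
    """Minimize error(x) for one real-valued parameter x in [lower, upper]."""
    rng = random.Random(seed)
    population = [rng.uniform(lower, upper) for _ in range(pop_size)]
    best = min(population, key=error)
    for _ in range(generations):
        errors = [error(x) for x in population]
        worst = max(errors)
        # Roulette wheel: a smaller error earns a larger slice of the wheel.
        slices = [worst - e + 1e-12 for e in errors]
        parents = rng.choices(population, weights=slices, k=2 * pop_size)
        children = []
        for p1, p2 in zip(parents[0::2], parents[1::2]):
            child = p1
            if rng.random() < crossover_rate:
                mix = rng.random()
                child = mix * p1 + (1.0 - mix) * p2    # arithmetic crossover (assumed)
            if rng.random() < mutation_rate:
                child += rng.gauss(0.0, 0.1 * (upper - lower))  # Gaussian mutation (assumed)
                child = min(upper, max(lower, child))  # keep inside the search range
            children.append(child)
        population = children
        candidate = min(population, key=error)
        if error(candidate) < error(best):
            best = candidate                           # update the best solution found
    return best

# Example: search [-10, 10] for the minimizer of a simple quadratic error
best_x = run_ga(lambda x: (x - 3.0) ** 2, -10.0, 10.0)
```

In the study's setting, `error` would be replaced by the fitting error of the model whose parameters are being searched, and the search range corresponds to the reference range discussed above.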
3.5. Cross-validation algorithm
The k-fold cross-validation algorithm is often used to minimize bias associated with the random sampling of training and holdout data samples [22]. Kohavi (1995) confirmed that 10-fold validation testing can optimize computation time and variance [31]. Thus, a stratified 10-fold cross-validation approach was used to assess model performance in this study. After randomization, the dataset was divided into 10 subsets of equal size. Next, one subset was removed, and the learning scheme was trained using the remaining nine subsets. The error rate was then calculated on the basis of the holdout set. The learning procedure was executed 10 times for different training sets, with one subset removed (Fig. 2). The algorithm accuracy is expressed as the average accuracy achieved by the 10 models in 10 validation rounds.
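The holdout loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names `make_folds` and `cross_validate` are hypothetical, the folds are formed by plain shuffling and chunking rather than by stratification, and `train_and_score` stands for any user-supplied function that fits a model on the training indices and returns an error or accuracy value for the holdout indices.

```python
import random

def make_folds(n_cases, k=10, seed=0):
    """Shuffle case indices and cut them into k consecutive chunks."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    size = -(-n_cases // k)  # ceiling division
    return [indices[f * size:(f + 1) * size] for f in range(k)]

def cross_validate(n_cases, train_and_score, k=10, seed=0):
    """Average the score obtained when each fold is held out once."""
    folds = make_folds(n_cases, k, seed)
    scores = []
    for f, holdout in enumerate(folds):
        # Training set: every case that is not in the holdout fold.
        training = [i for g, fold in enumerate(folds) if g != f for i in fold]
        scores.append(train_and_score(training, holdout))
    return sum(scores) / len(scores)

# Example with 98 cases, the sample size reported in Section 4.1
folds = make_folds(98)
```

With 98 cases and k = 10, this chunking yields nine folds of 10 cases and one fold of 8 cases; sample sizes much smaller than k × k may need a different splitting rule.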
4. Data organization and model construction
4.1. Data collection and prearrangement This study collected bridge bid invitations and bid award data from a government e-procurement system by downloading bid invitation documents regarding bridge construction bid awards from June 2008 to May 2009. The data obtained from the Taiwan Public Construction Commission database included bids for 275 bridge construction projects. Data related to incomplete projects or projects beyond the scope of this research were eliminated, leaving 101 projects of interest. After the exclusion of three cases that exceeded NT$1 billion, the final sample contained 98 cases. In the cross-fold validation process, 98 projects were divided into 10 folds, with nine folds of 10 projects and one fold of eight projects. After the validation process, nine new cases were collected from June to August 2009 for assessing the model's prediction power.
Fig. 2. Tenfold cross-validation algorithm (each of the 10 folds serves once as testing data while the remaining nine folds serve as training data).
The downloaded bid invitation documents included design drawings, unit price analysis tables, detailed price quotes, and construction contracts. Valid data and information were obtained from each
Fig. 1. GA flow chart. Flow-chart steps: initialize generation (population size, crossover rate, mutation rate); fitness evaluation of each individual in population; update best fitness value; termination criteria met? (yes: finish with the best solution; no: use roulette wheel to choose parents (selection), generate children (crossover), mutate children, and form a new population).
Table 1 Data fields and related information.
| Fields | Units | Range | Type | Note |
|---|---|---|---|---|
| 1. Definition | | | | |
| 1.1 Project No. | – | 1–98 | Value | |
| 1.2 Document No. | – | 1–98 | Value | |
| 2. Bid information | | | | |
| 2.1 Project name | – | | | |
| 2.2 Bid year | Year | 2008–2009 | Value | |
| 2.3 Price index | % | 106.00–131.03 | Value | Base year, 2006 (2006 = 100%) |
| 2.4 Bid award price | NT$ | 330,000–2,531,000,000 | Value | |
| 2.5 Bid award price & price index | NT$ | 311,262–2,387,285,418 | Value | |
| 2.6 Contingency reserve | NT$ | 342,000–3,060,000,000 | Value | |
| 2.7 Contingency reserve & price index | NT$ | 322,581–2,886,247,877 | Value | |
| 2.8 Budget amount | NT$ | 368,000–3,187,900,000 | Value | |
| 2.9 Budget amount & price index | NT$ | 347,104–3,006,885,493 | Value | |
| 2.10 Bid authority | – | 0–4 | Category | 0: Central; 1: East; 2: South; 3: North; 4: Ministry of Transportation and Communications |
| 2.11 Compliance period | Day | 10–1,093 | Value | |
| 2.12 Bid times | – | 1–11 | Value | |
| 3. Bid details | | | | |
| 3.1 No. of unit price analysis table | – | 6–243 | Value | |
| 3.2 No. of detailed estimate | – | 8–443 | Value | |
| 4. Illustration | | | | |
| 4.1 Construction type | – | 0–3 | Category | 0: Construction; 1: Conversion; 2: Rebuild; 3: Build |
| 4.2 Bridge maintenance | – | 1–13 | Value | |
| 4.3 No. of bridge abutment and piers | | 0–30 | Value | |
| 4.4 Road constructed area | m² | 0–24,059.7 | Value | |
| 4.5 Upper constructed area | m² | 0–24,187 | Value | |
| 4.6 Lower bridge structure | m³ | 0–34,954.24 | Value | |
| 4.7 Construction section | – | 0–4 | Category | 0: All; 1: Bridge; 2: Pier; 3: Bridge foundation; 4: Bridge foundation + pier |
| 4.8 Additional bridge job | – | 0–4 | Category | 0: None; 1: Bank protection; 2: Road; 3: River bed remediation; 4: Bank protection + river bed remediation + road |
document. Table 1 shows the collected factor data and relevant information. Monetary amounts were transformed using the 2006 price index as the standard. All bid award amounts, contingency reserves, and budget amounts were post-price-index-transformation prices. For data prearrangement, regression analysis was used to set appropriate values when the collected data were incomplete; these values were then entered in the empty fields. Plots of information or data distributions for each factor exhibited nonnormal distributions. A Box–Cox transformation was used to convert the nonnormal distribution results into a normal distribution by using Eq. (10) for optimized transformations. The Evolver software was used to identify the λ value for the Box–Cox transformation [36].
$$ \text{Minimize: } \chi^2 \qquad \text{Constraints: } -10 < \lambda < 10 $$
where
$$ \chi^2 = \sum_{i=1}^{10} \frac{(o_i - e_i)^2}{e_i} \tag{10} $$
The parameter $o_i$ is the quantity for which the actual x belongs to the ith fold; $e_i$ is the quantity for which the theoretical x should belong to the ith fold; and $\forall x = (x^{\lambda} - 1)/\lambda$. Several factors considered in this study (e.g., road construction area, upper construction area, and lower bridge structure) were set to 0 because these work items did not exist in every set of bid data. When a value is set to 0, the Box–Cox transformation is not continuous and cannot effectively convert the data into a normal distribution. However, approximately normal distributions were achieved for transformations of the remaining data. Next, curve evaluations (i.e., linear, logarithm, reciprocal, quadratic, cubic, compound, power law, S, and index) of
the independent and dependent variables were performed. In the curve evaluation, the posttransformation bid award amount showed an increased correlation with the posttransformation factor data, especially the posttransformation budget and bid award amounts. Factor data that were not transformed showed declining trends.
4.2. Predictive model construction
4.2.1. Multiple regressions
4.2.1.1. Generalized linear regression. A generalized linear regression (GLR) model was constructed using SPSS statistical software with numerical and categorical variables as the substituted variables. The ANOVA test results showed that the substituted variables influenced the dependent variables; therefore, a stepwise regression was selected as the regression method. In addition, the linear regression method rests on the statistical assumptions of normality, consistency, and independence [35]. Normality indicates that the residuals follow a normal distribution. Consistency indicates that the residual variance is the same as that of the dependent variables, and independence indicates that the residual of each data item is independent.
Table 2 Stepwise regression factor selected in each fold.
| Field name | Selected (one ● per fold, out of folds 1–10, in which the factor was chosen) |
|---|---|
| Budget amount (after transformation) | ● ● ● ● ● ● ● ● ● ● |
| Compliance period (after transformation) | ● ● ● ● ● ● |
| No. of detailed estimate | ● ● ● ● ● |
| Compliance period (before transformation) | ● ● ● ● |
| Bid authority | ● ● ● ● |
| Constructed area | ● ● ● ● |
| Budget amount (before transformation) | ● ● |
| Construction section | ● |
Fig. 3. Neural network architecture diagram (input layer: X1, X2, X3; hidden layer: three neurons with connection weights w1jk and biases b11, b12, b13; output layer: one neuron with connection weights w211, w221, w231 and bias b21, producing the output).
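The three-input, three-hidden-neuron, one-output layout of Fig. 3, together with the activation rule of Eqs. (3) and (4) and the momentum update of Eqs. (5) and (6), can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, NumPy is assumed, bias terms are added as drawn in Fig. 3 even though Eq. (3) omits them, and a sigmoid output neuron is assumed (which presumes the target is scaled to the interval from 0 to 1).

```python
import numpy as np

def sigmoid(net):
    """Logistic transfer function of Eq. (4)."""
    return 1.0 / (1.0 + np.exp(-net))

def forward(x, w1, b1, w2, b2):
    """Forward pass: inputs -> hidden layer -> single output neuron."""
    hidden = sigmoid(w1 @ x + b1)        # Eq. (3) per hidden neuron, plus bias
    output = sigmoid(w2 @ hidden + b2)   # same rule for the output neuron
    return hidden, output

def momentum_update(w_prev, dw_prev, delta, o, eta=0.1, alpha=0.9):
    """Weight update with momentum, following Eqs. (5) and (6)."""
    dw = eta * np.outer(delta, o) + alpha * dw_prev   # Eq. (6)
    return w_prev + dw, dw                            # Eq. (5)

# Example: all-zero hidden weights give hidden activations of 0.5
x = np.zeros(3)
hidden, output = forward(x, np.zeros((3, 3)), np.zeros(3),
                         np.ones((1, 3)), np.array([-1.5]))
```

In the GA-based variant discussed in this paper, the gradient-based update would be replaced by a GA search over the weight and bias values; `momentum_update` is shown only to make Eqs. (5) and (6) concrete.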
After the transformation, the residual distribution and normal probability plots of each fold before and after the conversion were compared. Both distribution plots were used to determine whether the residual was normal. A normal probability plot was used to determine when data and trend lines nearly overlap, which indicates a normal distribution. Otherwise, a distribution is nonnormal. Analysis of the residuals before transformation showed that only fold eight conformed to normality. However, all posttransformation residuals approximated normality. The only general rule established for interpreting residual scatter plots is that a horn shape indicates a violation of consistency. In addition, an increase in residual errors with an increase in dependent variables indicates a violation of independence. After transformation, no residuals of the folds violated consistency or independence assumptions. Nevertheless, the posttransformation residuals were superior to the residuals before transformation regarding these conditions. Thus, the posttransformation model conformed to statistical assumptions. Table 2 sequentially lists the variables obtained through posttransformation stepwise regression according to the number of times that they appear. Because the F test was employed as a basis for selecting variables in stepwise regression, the low p values produced by the variable combinations of each fold were considered preferable. Table 2 shows that the posttransformation budget amount was selected for each fold, indicating that the posttransformation budget amount tended to generate low p values in the model. This suggests that the posttransformation budget amount was highly correlated with the posttransformation bid award amount. The second highest correlation was that between the posttransformation compliance period and number of bid estimates.
4.2.1.2. Genetic-algorithm-based nonlinear regression. GAs from the Decision Tools software were used for the NLR analysis and modeling [36]. The analysis results showed that the relationship between the pretransformation bid award amount and budget amount was highly linear, quadratic, and cubic and in a tetration form. This study used the polynomial or tetration forms of the budget amount to forecast the bid award amount. Using only one variable of the influential factors for forecasting may have increased variation in the model output when outliers were present. However, subjecting all of the linear regression variables to NLR may have caused overfitting. Thus, influential factors with strong correlations with the bid award amount (Z), such as the budget amount (X1), compliance period (X2), and unit-price analysis table quantity (X3), were used to establish the following three equations for regression (Eqs. 11–13):
$$ Z_i^{forecast} = \beta_0 + \beta_1 X_1 + \beta_2 X_2^{\beta_3} + \beta_4 X_3^{\beta_5} \tag{11} $$
$$ Z_i^{forecast} = \beta_0 + \beta_1 X_1 + \beta_2 X_2^{\beta_3} \tag{12} $$
$$ Z_i^{forecast} = \beta_0 + \beta_1 X_1 + \beta_2 X_3^{\beta_3} \tag{13} $$
After the mathematical model was constructed, the solutions were optimized. GAs are typically used to (a) minimize the mean absolute percentage error (MAPE) and (b) apply robust-optimization concepts to identify minimized solutions for the linear combinations of the MAPE and its standard deviation. Solutions with fewer variations at a determined MAPE value were identified to reduce model variation.
Fig. 4. GA-ANN process flow chart. Flow-chart labels: net configuration; input; randomly set weight; calculation; output; error; fit? (yes/no); weight (GA); verify (accepted/not accepted); optimized model parameters; stop regression.
Fig. 5. Optimization of weighted CBR flow chart. Flow-chart labels: k-fold cross-validation algorithm (testing data, training data); input case-base; attribute similarity calculation; calculate case similarity; retrieved case verification; most similar project; test case verification; error estimation; adjust weights; GA convergence?; record weight; apply the optimized weights to CBR model.
The optimization objective or target can be expressed as follows:

$$\text{Minimize: } \alpha\,\mathrm{MAPE} + (1-\alpha)\,\sigma(\mathrm{MAPE})$$

$$\text{Constraints: } -10^{11} < \beta_0 < 10^{11}, \qquad -10 < \beta_{1,2,3,4,5} < 10$$

where

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|Z_i^{\text{forecast}} - Z_i^{\text{actual}}\right|}{Z_i^{\text{actual}}} \tag{14}$$

where i represents the ith case; $Z_i^{\text{forecast}}$ is the NLR output value of the ith case; $Z_i^{\text{actual}}$ is the actual value of the ith case; and $\beta_i$ represents the regression parameter. An i equal to 0 denotes a regression constant. When i = 1, 2, or 4, $\beta_i$ denotes a regression coefficient. When i = 3 or 5, it denotes the power term, and α is the weight allocation of the MAPE and its standard deviation during minimization. Two separate models with α = 0.9 and 1 were constructed in the study.
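The robust objective and its minimization can be sketched in a few lines. This is not the Evolver implementation used in the study; it is a hand-written, elitist GA on synthetic data, assuming that σ(MAPE) is the sample standard deviation of the per-case percentage errors and that all four parameters of Eq. (12) are searched within ±10 (the β0 range is narrowed here purely for illustration). All function names are hypothetical.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (Eq. 14)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(forecast - actual) / actual))

def robust_objective(actual, forecast, alpha):
    """alpha * MAPE + (1 - alpha) * sigma; sigma is assumed to be the
    sample standard deviation of the per-case percentage errors."""
    actual = np.asarray(actual, dtype=float)
    ape = np.abs(np.asarray(forecast, dtype=float) - actual) / actual
    return float(alpha * ape.mean() + (1.0 - alpha) * ape.std(ddof=1))

def predict_eq12(beta, x1, x2):
    return beta[0] + beta[1] * x1 + beta[2] * x2 ** beta[3]

def run_ga(fitness, lower, upper, pop_size=40, generations=50, seed=0):
    """Tiny elitist GA: truncation selection, blend crossover, Gaussian mutation."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lower, upper, size=(pop_size, len(lower)))
    history = []
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        pop = pop[np.argsort(scores)]
        history.append(float(scores.min()))
        elite = pop[: pop_size // 2]
        pa = elite[rng.integers(0, len(elite), size=pop_size - 1)]
        pb = elite[rng.integers(0, len(elite), size=pop_size - 1)]
        mix = rng.random((pop_size - 1, 1))
        children = mix * pa + (1.0 - mix) * pb
        children += rng.normal(0.0, 0.05 * (upper - lower), children.shape)
        children = np.clip(children, lower, upper)
        pop = np.vstack([pop[:1], children])   # keep the best individual unchanged
    scores = np.array([fitness(ind) for ind in pop])
    history.append(float(scores.min()))
    return pop[np.argmin(scores)], history

# Synthetic data for illustration only.
data_rng = np.random.default_rng(1)
x1 = data_rng.uniform(50.0, 500.0, 60)
x2 = data_rng.uniform(100.0, 400.0, 60)
z = 0.9 * x1 + 0.05 * x2 ** 1.2

lower, upper = np.full(4, -10.0), np.full(4, 10.0)
best, history = run_ga(
    lambda b: robust_objective(z, predict_eq12(b, x1, x2), alpha=0.9),
    lower, upper)
```

Setting `alpha=1.0` reduces the objective to the plain MAPE, which corresponds to the second of the two model variants described above.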
4.2.2. Genetic-algorithm-based artificial neural network
ANN models were constructed using the Decision Tools software [36]. Fig. 3 shows an ANN framework with an input layer, a hidden layer (numbered as the first layer), and an output layer (numbered as the second layer). The input variables include the budget amount, compliance period, and unit-price analysis table quantity. The output value Z is the bid award amount; $w_{i,j,k}$ is the weight value, where i denotes the ith layer receiving the input, j is the neuron number in the previous layer, and k is the neuron number in the next layer; and $b_{i,j}$ is the bias value. Several
Table 3
GA search for the weight factors of each fold.

| Field name | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Fold 6 | Fold 7 | Fold 8 | Fold 9 | Fold 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Budget amount | 85.51 | 75.67 | 80.58 | 71.07 | 49.84 | 78.90 | 45.73 | 49.38 | 83.45 | 63.22 |
| Compliance period | 9.51 | 0.06 | 0.27 | 9.36 | 0.12 | 7.44 | 0.05 | 45.26 | 7.76 | 1.03 |
| Unit price analysis table | 2.83 | 6.03 | 0.91 | 0.08 | 0.34 | 0.05 | 9.82 | 0.00 | 0.01 | 8.30 |
| No. of detailed estimate | 0.11 | 3.65 | 0.13 | 0.39 | 32.94 | 0.09 | 0.42 | 3.62 | 6.13 | 0.00 |
| Road constructed area | 0.43 | 6.16 | 2.49 | 0.19 | 0.55 | 0.01 | 1.21 | 0.00 | 0.93 | 7.61 |
| Upper constructed area | 0.00 | 0.53 | 4.61 | 13.86 | 11.30 | 0.08 | 16.00 | 0.00 | 0.26 | 0.00 |
| Bid authority | 0.04 | 2.01 | 2.38 | 2.34 | 0.01 | 0.00 | 0.17 | 0.00 | 0.96 | 5.85 |
| No. of bridge abutment and piers | 0.00 | 0.54 | 8.54 | 2.56 | 0.19 | 2.99 | 0.07 | 0.35 | 0.00 | 8.78 |
| Constructed section | 1.58 | 5.34 | 0.09 | 0.16 | 4.72 | 10.44 | 26.53 | 1.40 | 0.50 | 5.20 |
Table 4
MAPE rate for the weight in each fold.

| Weight \ Verification fold | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Weight 1 | – | 10.28 | 11.96 | 13.30 | 9.84 | 7.33 | 7.58 | 15.92 | 8.70 | 7.48 | 9.24 |
| Weight 2 | 7.80 | – | 9.02 | 13.20 | 10.16 | 10.25 | 7.34 | 14.79 | 9.48 | 6.23 | 8.83 |
| Weight 3 | 8.54 | 8.10 | – | 11.36 | 10.51 | 11.31 | 7.96 | 13.79 | 10.79 | 8.51 | 9.09 |
| Weight 4 | 7.19 | 16.30 | 12.11 | – | 11.22 | 10.10 | 6.92 | 13.47 | 13.17 | 7.52 | 9.80 |
| Weight 5 | 10.65 | 11.76 | 12.36 | 13.68 | – | 17.95 | 9.58 | 19.09 | 15.87 | 11.09 | 12.20 |
| Weight 6 | 6.00 | 8.64 | 9.28 | 12.01 | 10.54 | – | 7.44 | 21.77 | 8.64 | 6.04 | 9.03 |
| Weight 7 | 15.35 | 11.66 | 11.50 | 20.70 | 24.82 | 40.72 | – | 18.52 | 12.45 | 32.21 | 18.79 |
| Weight 8 | 9.61 | 15.80 | 16.01 | 16.23 | 23.18 | 10.33 | 10.09 | – | 13.68 | 7.67 | 12.26 |
| Weight 9 | 5.65 | 11.56 | 12.63 | 12.85 | 11.39 | 7.65 | 8.30 | 15.92 | – | 7.48 | 9.34 |
| Weight 10 | 12.75 | 10.82 | 9.07 | 16.34 | 16.28 | 32.15 | 8.32 | 20.95 | 15.10 | – | 14.18 |
researchers have constructed suitable architectures for ANNs [25,52]. As a rule of thumb, the number of hidden neurons (n1) can be calculated using Eq. (15) [4]:

$$n_1 = \frac{N_I + N_O}{2} + \tau \tag{15}$$
where τ equals 1 or 2. In this study, because NI = 3 and NO = 1, n1 should be equal to 3 or 4. Therefore, two ANN model frameworks were used, the first with three neurons in the hidden layer and the second with four neurons in the hidden layer. Fig. 4 shows that the optimization was identical to the use of GAs in NLR. The optimization objective can be expressed as follows:

$$\text{Minimize: } \alpha\,\mathrm{MAPE} + (1-\alpha)\,\sigma_{\mathrm{MAPE}}$$

$$\text{Constraints: } \text{upper boundary} < w_{i,j,k} < \text{lower boundary}$$

where

$$\mathrm{MAPE} = \frac{1}{n}\sum_{l=1}^{n}\frac{\left|Z_l^{\text{forecast}} - Z_l^{\text{actual}}\right|}{Z_l^{\text{actual}}} \tag{16}$$
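The sizing rule and the two network frameworks can be illustrated with a small forward pass. This is a minimal sketch, assuming Eq. (15) reads (NI + NO)/2 + τ and assuming a tanh hidden activation with a linear output, neither of which is specified in the text; the flat parameter vector `theta` stands in for the weights a GA would search over, and all names are hypothetical.

```python
import numpy as np

def hidden_neurons(n_inputs, n_outputs, tau):
    """Rule of thumb read from Eq. (15): (NI + NO) / 2 + tau."""
    return (n_inputs + n_outputs) / 2 + tau

def n_parameters(n_in, n_hidden, n_out):
    """Number of weights and biases in a single-hidden-layer network."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out

def unpack(theta, n_in, n_hidden, n_out):
    """Split a flat parameter vector into layer weights and biases."""
    theta = np.asarray(theta, dtype=float)
    i = 0
    w1 = theta[i:i + n_in * n_hidden].reshape(n_in, n_hidden)
    i += n_in * n_hidden
    b1 = theta[i:i + n_hidden]
    i += n_hidden
    w2 = theta[i:i + n_hidden * n_out].reshape(n_hidden, n_out)
    i += n_hidden * n_out
    b2 = theta[i:i + n_out]
    return w1, b1, w2, b2

def ann_forward(theta, x, n_hidden, n_out=1):
    w1, b1, w2, b2 = unpack(theta, x.shape[1], n_hidden, n_out)
    hidden = np.tanh(x @ w1 + b1)   # activation choice is an assumption
    return hidden @ w2 + b2         # linear output layer (assumption)

# NI = 3 and NO = 1, as in the study; tau = 1 or 2.
sizes = [int(hidden_neurons(3, 1, tau)) for tau in (1, 2)]

# Hypothetical scaled inputs: budget, compliance period, table quantity.
x_demo = np.array([[0.2, 0.5, 0.1], [0.8, 0.3, 0.6]])
theta_demo = np.zeros(n_parameters(3, sizes[0], 1))
theta_demo[-1] = 0.5                # only the output bias is non-zero
z_demo = ann_forward(theta_demo, x_demo, sizes[0])
```

Under these assumptions the three-neuron framework has 16 adjustable parameters and the four-neuron framework has 21, which is the length of the vector a GA would need to optimize in each case.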
The MAPE is a statistical measure that is useful for evaluating the performance of predictive models because it provides relative values; n is the number of data samples; l denotes the lth case in the data samples; $Z_l^{\text{forecast}}$ is the ANN output value of the lth case; $Z_l^{\text{actual}}$ is the actual value of the lth case; α = 0.9 or 1; $\sigma_{\mathrm{MAPE}}$ is the standard deviation of the MAPE; the upper boundary = $c_3 \times w_{i,j,k} - |c_2 \times w_{i,j,k}| - c_1$; the lower boundary = $c_3 \times w_{i,j,k} + |c_2 \times w_{i,j,k}| + c_1$; $c_1$, $c_2$, and $c_3$ are the set parameters, with $c_3 > c_2$; $w_{i,j,k}$ is the weight value; i is the ith layer; j is the input neuron; k is the output neuron; $c_i$ represents the parameter of the dynamic range; $c_1$ ensures that the range is not 0; and $c_2$ and $c_3$ denote the rates of increases in the adjustment range. A large disparity
between c2 and c3 indicates a high range expansion rate, and a small disparity indicates a low expansion rate.
4.2.3. Optimization of weighted case-based reasoning
To perform CBR, the attribute weights must first be obtained. Learning the local and global weights of case features is one of the most common applications of a GA in CBR. Fig. 5 shows that CBR was performed using a GA to set the weights and to minimize the validation error. Cross-validation was then used to obtain the weights for each of the 10 folds. For the data of each fold, the similarity value of each influential factor was calculated between every validation case and the historical data. The overall similarity value was calculated as a weighted mean using the attribute weights. After the weights were obtained, the GA used an iterative method to identify the weights with the lowest MAPE. During analysis, the bid award amount of the historical case with the highest overall similarity to a validation case was used as the forecast bid award amount for that validation case. The absolute error rate in comparison to the actual bid award amount was also calculated. The MAPE of these validation cases was the minimization objective of the GA. Next, the CBR weight was determined. Table 3 shows the normalized weights, which were derived by dividing each weighting value by the overall weight of each fold; the budget amount had the highest weight in every fold. These results and the regression analysis showed a significant correlation between the bid award and budget amounts. The modeling results of this study indicated that the weight obtained from the data for the ith fold was appropriate for that fold; however, the same analysis was required for all other folds. The weights were set under the assumption that the objective was global optimization. Thus, the weight of each fold was substituted into the other folds for verification by using CBR to determine which fold weight had the minimal error rate.
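The retrieval step described above can be sketched as a weighted nearest-case lookup. The attribute similarity measure is not defined in the text, so this sketch assumes one minus the range-normalized absolute difference, restricted to numerical attributes; the case base, query, and function names are hypothetical, and the three weights are the fold 1 values for the first three fields of Table 3, used only as an example.

```python
import numpy as np

def weighted_similarity(query, cases, weights, value_range):
    """Overall similarity: weighted mean of per-attribute similarities."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                    # normalize the weights
    attr_sim = 1.0 - np.abs(cases - query) / value_range
    return attr_sim @ w

def cbr_forecast(query, cases, awards, weights, value_range):
    """Return the award of the most similar historical case and its index."""
    sim = weighted_similarity(query, cases, weights, value_range)
    best = int(np.argmax(sim))
    return float(awards[best]), best

# Hypothetical case base: budget amount, compliance period, table quantity.
cases = np.array([[100.0, 120.0, 40.0],
                  [300.0, 200.0, 80.0],
                  [500.0, 365.0, 150.0]])
awards = np.array([88.0, 270.0, 455.0])
value_range = cases.max(axis=0) - cases.min(axis=0)

weights = [85.51, 9.51, 2.83]          # illustrative attribute weights
query = np.array([320.0, 180.0, 70.0])
actual = 290.0

forecast, idx = cbr_forecast(query, cases, awards, weights, value_range)
ape = abs(forecast - actual) / actual  # absolute error rate for this case
```

Averaging `ape` over all validation cases gives the MAPE that a GA would minimize when searching for the weights.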
Table 4 shows that the weight of the second fold yielded the lowest average MAPE rate, 8.83%. Hence, the results obtained with this fold weight were compared with those of other models.
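The selection of the second fold weight can be re-derived from the entries of Table 4. The sketch below transcribes the table (rows are weights, columns are verification folds, with the unverified diagonal left as NaN) and ranks the weights by their off-diagonal row sums; the variable names are hypothetical.

```python
import numpy as np

nan = np.nan
# Rows: Weight 1..10; columns: verification fold 1..10 (values from Table 4).
table4 = np.array([
    [nan, 10.28, 11.96, 13.30, 9.84, 7.33, 7.58, 15.92, 8.70, 7.48],
    [7.80, nan, 9.02, 13.20, 10.16, 10.25, 7.34, 14.79, 9.48, 6.23],
    [8.54, 8.10, nan, 11.36, 10.51, 11.31, 7.96, 13.79, 10.79, 8.51],
    [7.19, 16.30, 12.11, nan, 11.22, 10.10, 6.92, 13.47, 13.17, 7.52],
    [10.65, 11.76, 12.36, 13.68, nan, 17.95, 9.58, 19.09, 15.87, 11.09],
    [6.00, 8.64, 9.28, 12.01, 10.54, nan, 7.44, 21.77, 8.64, 6.04],
    [15.35, 11.66, 11.50, 20.70, 24.82, 40.72, nan, 18.52, 12.45, 32.21],
    [9.61, 15.80, 16.01, 16.23, 23.18, 10.33, 10.09, nan, 13.68, 7.67],
    [5.65, 11.56, 12.63, 12.85, 11.39, 7.65, 8.30, 15.92, nan, 7.48],
    [12.75, 10.82, 9.07, 16.34, 16.28, 32.15, 8.32, 20.95, 15.10, nan],
])
tabulated_avg = np.array([9.24, 8.83, 9.09, 9.80, 12.20,
                          9.03, 18.79, 12.26, 9.34, 14.18])

# Ranking by row sum equals ranking by mean, since every row has nine entries.
row_sum = np.nansum(table4, axis=1)
best_weight = int(np.argmin(row_sum)) + 1   # 1-based weight index

# Dividing each row sum by the number of folds (10) reproduces the
# Average column to within rounding.
recomputed_avg = row_sum / 10
```

Because every row has the same number of entries, the choice of divisor does not affect which weight ranks first.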
Table 5
MAPE rate of each fold in each model.

| No. | Predictive method | α | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Fold 6 | Fold 7 | Fold 8 | Fold 9 | Fold 10 | Average (%) | Standard deviation (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Conventional estimate | – | 3.61 | 10.52 | 10.83 | 6.47 | 9.08 | 8.74 | 5.65 | 8.36 | 6.71 | 7.04 | 7.701 | 2.24 |
| 2 | GLR | – | 8.66 | 9.34 | 19.41 | 6.76 | 8.82 | 10.97 | 3.45 | 8.58 | 7.11 | 8.39 | 9.149 | 4.11 |
| 3 | GA-NLR (Eq. 11) | 1.0 | 5.10 | 9.98 | 10.41 | 6.22 | 8.56 | 9.31 | 5.38 | 8.17 | 6.45 | 6.50 | 7.608 | 1.93 |
| 4 | GA-NLR (Eq. 12) | 1.0 | 4.81 | 11.17 | 9.65 | 6.43 | 8.81 | 8.97 | 5.56 | 8.51 | 6.60 | 6.75 | 7.726 | 2.00 |
| 5 | GA-NLR (Eq. 13) | 1.0 | 3.59 | 10.88 | 10.01 | 6.40 | 9.97 | 9.17 | 5.22 | 7.93 | 6.68 | 7.11 | 7.696 | 2.33 |
| 6 | GA-NLR (Eq. 11) | 0.9 | 4.79 | 9.77 | 10.07 | 6.28 | 8.93 | 9.25 | 5.29 | 8.17 | 6.33 | 6.86 | 7.574 | 1.91 |
| 7 | GA-NLR (Eq. 12) | 0.9 | 4.57 | 9.80 | 10.09 | 6.58 | 8.91 | 8.95 | 5.55 | 8.64 | 6.58 | 6.79 | 7.646 | 1.88 |
| 8 | GA-NLR (Eq. 13) | 0.9 | 4.01 | 9.68 | 10.51 | 6.66 | 9.57 | 8.78 | 5.25 | 7.98 | 6.85 | 7.51 | 7.680 | 2.06 |
| 9 | GA-ANN (Model 1) | 1.0 | 4.53 | 10.62 | 8.92 | 6.58 | 9.58 | 8.81 | 5.59 | 8.70 | 6.75 | 6.50 | 7.658 | 1.94 |
| 10 | GA-ANN (Model 1) | 0.9 | 4.52 | 10.33 | 9.55 | 6.84 | 8.78 | 8.66 | 4.83 | 8.73 | 6.63 | 6.39 | 7.526 | 1.98 |
| 11 | GA-ANN (Model 2) | 1.0 | 4.21 | 9.89 | 8.93 | 6.51 | 9.41 | 8.81 | 5.54 | 8.72 | 6.76 | 6.50 | 7.528 | 1.88 |
| 12 | GA-ANN (Model 2) | 0.9 | 4.69 | 9.72 | 9.34 | 6.71 | 8.99 | 8.65 | 5.50 | 8.73 | 6.64 | 6.43 | 7.540 | 1.76 |
| 13 | GA-CBR | – | 7.80 | – | 9.02 | 13.20 | 10.16 | 10.25 | 7.34 | 14.79 | 9.48 | 6.23 | 8.830 | 4.04 |
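Table 6
Verification results for the error rate.

| Model | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 | Case 7 | Case 8 | Case 9 | Average (%) | Standard deviation (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No. 1 | 14.28 | 8.09 | 29.40 | 6.44 | 22.86 | 16.28 | 17.35 | 7.16 | 3.89 | 13.97 | 8.46 |
| No. 10 | 9.96 | 6.43 | 24.71 | 9.08 | 20.57 | 14.35 | 16.47 | 8.99 | 7.28 | 13.09 | 6.37 |
| No. 11 | 10.62 | 8.00 | 25.55 | 8.32 | 20.78 | 15.15 | 17.07 | 9.65 | 6.61 | 13.53 | 6.53 |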
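The summary columns of Table 6 and the case-by-case comparisons drawn from it can be checked directly from the nine per-case error rates. The sketch below assumes that the reported standard deviation is the sample standard deviation (n − 1 in the denominator), which reproduces the tabulated values; the variable names are hypothetical.

```python
import numpy as np

# Per-case error rates (%) transcribed from Table 6.
errors = {
    "No. 1": np.array([14.28, 8.09, 29.40, 6.44, 22.86, 16.28, 17.35, 7.16, 3.89]),
    "No. 10": np.array([9.96, 6.43, 24.71, 9.08, 20.57, 14.35, 16.47, 8.99, 7.28]),
    "No. 11": np.array([10.62, 8.00, 25.55, 8.32, 20.78, 15.15, 17.07, 9.65, 6.61]),
}

means = {name: float(v.mean()) for name, v in errors.items()}
stds = {name: float(v.std(ddof=1)) for name, v in errors.items()}  # sample std

# Number of cases in which one model has a lower error than another.
wins_10_vs_1 = int(np.sum(errors["No. 10"] < errors["No. 1"]))
wins_11_vs_1 = int(np.sum(errors["No. 11"] < errors["No. 1"]))
wins_10_vs_11 = int(np.sum(errors["No. 10"] < errors["No. 11"]))

# Relative reduction in average error of No. 10 versus the conventional rule.
relative_gain = 100.0 * (means["No. 1"] - means["No. 10"]) / means["No. 1"]
```

The relative reduction of No. 10 against No. 1 comes out at roughly 6.3%.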
4.3. Validation of results
The MAPE rate was used as the standard for model testing. The MAPE is a statistical measure that is commonly used in quantitative forecasting methods because it indicates the relative overall fit.

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|Z_i^{\text{forecast}} - Z_i^{\text{actual}}\right|}{Z_i^{\text{actual}}} \tag{17}$$
This equation yields the absolute difference between the forecast value and the actual value as well as the proportion of this difference to the actual value. The equation also provides an average of the absolute error proportion for each case. Through the aforementioned process, the mean error rate of the models for the overall data was obtained. As noted previously, cross-validation was also performed. The data were divided before modeling, and the training data for each group were used for model construction. In Table 5, α denotes the weight allocated between the MAPE and its standard deviation. The numbered models are explained as follows.
• No. 1: Budget amount multiplied by 0.9 (experiential rules)
• No. 2: GLR
• No. 3–5: GA-based NLR (GA-NLR) for Eqs. (11)–(13) and α = 1
• No. 6–8: GA-NLR for Eqs. (11)–(13) and α = 0.9
• No. 9: GA-based ANN (GA-ANN) with three neurons in the hidden layer and α = 1
• No. 10: GA-ANN with three neurons in the hidden layer and α = 0.9
• No. 11: GA-ANN with four neurons in the hidden layer and α = 1
• No. 12: GA-ANN with four neurons in the hidden layer and α = 0.9
• No. 13: GA-CBR
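The cross-validation procedure behind these comparisons can be sketched for the simplest model, the experiential rule of No. 1. The data below are synthetic and the helper names are hypothetical; the second model, a ratio fitted on the training folds, is included only to show where a trainable model would plug in and is not one of the models evaluated in the study.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_validated_mape(fit, predict, budget, award, k=10, seed=0):
    """Train on k - 1 folds, test on the held-out fold, and return each fold's MAPE."""
    folds = k_fold_indices(len(award), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(budget[train_idx], award[train_idx])
        forecast = predict(model, budget[test_idx])
        ape = np.abs(forecast - award[test_idx]) / award[test_idx]
        scores.append(ape.mean())
    return np.array(scores)

# Synthetic data for illustration only: awards at 80%-98% of the budget.
rng = np.random.default_rng(42)
budget = rng.uniform(10.0, 500.0, size=100)
award = budget * rng.uniform(0.80, 0.98, size=100)

# No. 1: fixed coefficient of 0.9, so nothing is learned from the training folds.
conventional = cross_validated_mape(
    lambda b, a: 0.9, lambda m, b: m * b, budget, award)

# Illustrative trainable alternative: mean award/budget ratio of the training folds.
fitted_ratio = cross_validated_mape(
    lambda b, a: float(np.mean(a / b)), lambda m, b: m * b, budget, award)

summary = {"mean": float(conventional.mean()), "std": float(conventional.std(ddof=1))}
```

The per-fold scores correspond to the fold columns of Table 5, and their mean and standard deviation correspond to its last two columns.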
After cross-validation, Nos. 10 and 11 provided superior results. Finally, after nine additional cases were collected and organized, a final validation was performed using these two models. Table 6 shows the analysis results. In six of the nine validation cases, Nos. 10 and 11 each showed superior results compared with the conventional approach (No. 1). In seven cases, No. 10 was superior to No. 11. No. 10 was also superior regarding both the final mean value and the standard deviation. In addition to a higher MAPE, the conventional method (fixed coefficient) showed larger estimating variability.
5. Conclusion
This study investigated the efficacy of AI techniques in forecasting bid award amounts on the basis of limited available information to provide a baseline for vendors. The analytical results confirmed that the GA-ANN model exhibited the best performance, indicating that ANNs are superior mathematical models for realistic simulations. NLR had the second highest accuracy, followed by the conventional method and CBR. It was not unexpected that linear regression exhibited the poorest performance. For the test data, the average MAPE value for the GA-ANN was 7.526%; the optimal GA-NLR result was 7.574%; the GA-CBR result was 8.83%; and the linear regression rate was 9.149%. Notably, although the MAPE value for the conventional method was 7.701% in this case study, the conventional method, unlike the proposed AI techniques, lacks the reliability and flexibility needed to fit the variability of scaled-up datasets. After the AI models were trained and evaluated, nine additional cases were collected to assess the efficiency of the models. The results showed that the GA-ANN (No. 10), with three neurons in the hidden layer and α = 0.9, achieved the highest performance, yielding an average MAPE value of 13.09%. Compared with the conventional method (13.97%), the GA-ANN (No. 10) reduced the average MAPE by approximately 6.3% in relative terms. The analytical errors of the models did not significantly differ according to statistical testing. Therefore, for bridge construction projects involving high monetary amounts (i.e., procurements with large sums of money), the aforementioned science-based models are recommended. For construction projects with low monetary amounts, the simplified scale factor or experiential rule can be used. Statistical validation and testing showed that the numerical variables, including budget amount, compliance period, unit-price analysis table, number of bid estimates, road constructed area, and upper constructed area, exhibited a relatively high correlation with the bid award amount. Categorical variables that affected the bid award amount were the bid authority, bridge abutment construction, and the construction section. These results indicate that the bid award amount is affected by regional differences, whether a bridge abutment is built, and bridge construction. However, the analysis in this study was conducted under normal conditions in Taiwan, and the influence of macroeconomic concerns was not considered. Therefore, future studies can collect data regarding the economic situation, expected inflation, and interest rates to enhance prediction results for complex bidding activities.
This study compared the forecasting accuracy of models based on linear regression, NLR, ANN, and CBR in combination with GA and k-fold cross-validation methods. Future studies can evaluate the use of other algorithms for verification and improvements. For optimized statistical solutions, further work is required to identify the lowest p value in the F test of regression analysis as an alternative to this study's approach (i.e., transforming the singular factors into normal distributions for performing optimization computations). Further directions include differentiating critical factors in the NLR model from those in the ANN model and investigating GA-ANN models with different alpha values and fitness functions for the optimization algorithm.
References
[1] I. Ahmad, Decision-support system for modeling bid/no-bid decision problem, J. Constr. Eng. Manag. 116 (4) (1990) 595–608.
[2] H. Ahn, K.-j. Kim, Bankruptcy prediction modeling with hybrid case-based reasoning and genetic algorithms approach, Appl. Soft Comput. 9 (2) (2009) 599–607.
[3] G. Arslan, M. Tuncan, M.T. Birgonul, I. Dikmen, E-bidding proposal preparation system for construction projects, Build. Environ. 41 (10) (2006) 1406–1413.
[4] B. Kermanshahi, Design and Application of Neural Networks, Chapter 3, Shokodo Publishing Company, Tokyo, Japan, 1999.
[5] Y. Baalousha, T. Çelik, An integrated web-based data warehouse and artificial neural networks system for unit price analysis with inflation adjustment, J. Civ. Eng. Manag. 17 (2) (2011) 157–167.
[6] E. Cagno, F. Caron, A. Perego, Multi-criteria assessment of the probability of winning in the competitive bidding process, Int. J. Proj. Manag. 19 (6) (2001) 313–324.
[7] R.I. Carr, General bidding model, J. Constr. Div. 108 (4) (1982) 639–650.
[8] P.-C. Chang, C.-Y. Lai, K.R. Lai, A hybrid system by evolving case-based reasoning with genetic algorithm in wholesaler's returning book forecasting, Decis. Support. Syst. 42 (3) (2006) 1715–1729.
[9] W.A. Chaovalitwongse, W. Wang, T. Williams, P. Chaovalitwongse, Data mining framework to optimize the bid selection policy for competitively bid highway construction projects, J. Constr. Eng. Manag. 138 (2) (2012) 277–286.
[10] M.-Y. Cheng, C.-C. Hsiang, H.-C. Tsai, H.-L. Do, Bidding decision making for construction company using a multi-criteria prospect model, J. Civ. Eng. Manag. 17 (3) (2011) 424–436.
[11] M.-Y. Cheng, H.-C. Tsai, E. Sudjono, Conceptual cost estimates using evolutionary fuzzy hybrid neural network for projects in construction industry, Expert Syst. Appl. 37 (6) (2010) 4224–4231.
[12] J.-S. Chou, Cost simulation in an item-based project involving construction engineering and management, Int. J. Proj. Manag. 29 (6) (2011) 706–717.
[13] J.-S. Chou, Generalized linear model-based expert system for estimating the cost of transportation projects, Expert Syst. Appl. 36 (3, Part 1) (2009) 4253–4267.
[14] J.-S. Chou, Web-based CBR system applied to early cost budgeting for pavement maintenance project, Expert Syst. Appl. 36 (2, Part 2) (2009) 2947–2960.
[15] J.-S. Chou, M.-Y. Cheng, Y.-W. Wu, C.-C. Wu, Forecasting enterprise resource planning software effort using evolutionary support vector machine inference model, Int. J. Proj. Manag. 30 (8) (2012) 967–977.
[16] D. Chua, D. Li, Key factors in bid reasoning model, J. Constr. Eng. Manag. 126 (5) (2000) 349–357.
[17] I. Dikmen, M.T. Birgonul, A.K. Gur, A case-based decision support tool for bid mark-up estimation of international construction projects, Autom. Constr. 17 (1) (2007) 30–44.
[18] S. Doğan, D. Arditi, H. Günaydın, Determining attribute weights in a CBR model for early cost prediction of structural systems, J. Constr. Eng. Manag. 132 (10) (2006) 1092–1098.
[19] S. Dozzi, S. AbouRizk, S. Schroeder, Utility-theory model for bid markup decisions, J. Constr. Eng. Manag. 122 (2) (1996) 119–124.
[20] L. Friedman, A competitive-bidding strategy, Oper. Res. 4 (1) (1956) 104–112.
[21] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley Longman Publishing Co., Inc., 1989.
[22] J. Han, M. Kamber, Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann, San Diego CA, 2006.
[23] T. Hegazy, A. Ayed, Neural network model for parametric cost estimation of highway projects, J. Constr. Eng. Manag. 124 (3) (1998) 210–218.
[24] S. Ho, L. Liu, Analytical model for analyzing construction claims and opportunistic bidding, J. Constr. Eng. Manag. 130 (1) (2004) 94–104.
[25] K.I. Hoi, K.V. Yuen, K.M. Mok, Improvement of the multilayer perceptron for air quality modelling through an adaptive learning scheme, Comput. Geosci. 59 (2013) 148–155.
[26] T. Hong, C. Hyun, H. Moon, CBR-based cost prediction model-II of the design phase for multi-family housing projects, Expert Syst. Appl. 38 (3) (2011) 2797–2808.
[27] P. Ioannou, S. Leu, Average-bid method—competitive bidding strategy, J. Constr. Eng. Manag. 119 (1) (1993) 131–147.
[28] C. Ji, T. Hong, C. Hyun, CBR revision model for improving cost prediction accuracy in multifamily housing projects, J. Manag. Eng. 26 (4) (2010) 229–236.
[29] S.-H. Ji, M. Park, H.-S. Lee, Cost estimation model for building projects using case-based reasoning, Can. J. Civ. Eng. 38 (5) (2011) 570–581.
[30] M. Kim, R.C. Hill, The Box–Cox transformation-of-variables in regression, Empir. Econ. 18 (2) (1993) 307–319.
[31] R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, Proceedings of the 14th International Joint Conference on Artificial Intelligence, vol. 2, Morgan Kaufmann Publishers Inc., 1995, pp. 1137–1143.
[32] I. Křivý, J. Tvrdík, R. Krpec, Stochastic algorithms in nonlinear regression, Comput. Stat. Data Anal. 33 (3) (2000) 277–290.
[33] M.O.T.C., Vol. 2013, Directorate General of Highways, http://www.thb.gov.tw/TM/Webpage.aspx?entry=702013 (Accessed on June 18, 2013).
[34] A. Movahedian Attar, M. Khanzadi, S. Dabirian, E. Kalhor, Forecasting contractor's deviation from the client objectives in prequalification model using support vector regression, Int. J. Proj. Manag. 31 (6) (2013) 924–936.
[35] M. Norusis, SPSS 16.0 Guide to Data Analysis, Prentice Hall Press, 2008.
[36] Palisade, Evolver—The Genetic Algorithm Solver for Microsoft Excel, Palisade Corporation, N.Y., 2009.
[37] Y.-H. Perng, C.-L. Chang, Data mining for government construction procurement, Build. Res. Inf. 32 (4) (2004) 329–338.
[38] Public Construction Commission, Government Procurement Law, Executive Yuan, Taiwan, 1998.
[39] H. Sackrowitz, S. Lakshminarayanan, T.P. Williams, Analyzing bidding statistics to predict completed project cost, Comput. Civ. Eng. (2005) 1–10.
[40] R.M. Sakia, The Box–Cox transformation technique: a review, J. R. Stat. Soc. Ser. D (Stat.) 41 (2) (1992) 169–178.
[41] M.A. Salem Hiyassat, Construction bid price evaluation, Can. J. Civ. Eng. 28 (2) (2001) 264–270.
[42] K.-K. Seo, A methodology for estimating the product life cycle cost using a hybrid GA and ANN model, in: S. Kollias, A. Stafylopatis, W. Duch, E. Oja (Eds.), Artificial Neural Networks — ICANN 2006, vol. 4131, Springer Berlin Heidelberg, 2006, pp. 386–395.
[43] S.K. Shiu, S. Pal, Case-based reasoning: concepts, features and soft computing, Appl. Intell. 21 (3) (2004) 233–238.
[44] R. Sonmez, B. Ontepeli, Predesign cost estimation of urban railway projects with parametric modeling, J. Civ. Eng. Manag. 15 (4) (2009) 405–409.
[45] W.-C. Wang, R.-J. Dzeng, Y.-H. Lu, Integration of simulation-based cost model and multi-criteria evaluation model for bid price decisions, Comput. Aided Civ. Infrastruct. Eng. 22 (3) (2007) 223–235.
[46] Y.-R. Wang, C.-Y. Yu, H.-H. Chan, Predicting construction cost and schedule success using artificial neural networks ensemble and support vector machines classification models, Int. J. Proj. Manag. 30 (4) (2012) 470–478.
[47] M. Wanous, A.H. Boussabaine, J. Lewis, To bid or not to bid: a parametric solution, Constr. Manag. Econ. 18 (4) (2000) 457–466.
[48] T.P. Williams, Predicting completed project cost using bidding data, Constr. Manag. Econ. 20 (3) (2002) 225–235.
[49] T.P. Williams, Predicting final cost for competitively bid construction projects using regression models, Int. J. Proj. Manag. 21 (8) (2003) 593–599.
[50] C. Wilmot, B. Mei, Neural network modeling of highway construction costs, J. Constr. Eng. Manag. 131 (7) (2005) 765–771.
[51] N.-J. Yau, J.-B. Yang, Case-based reasoning in construction management, Comput. Aided Civ. Infrastruct. Eng. 13 (2) (1998) 143–150.
[52] K.-V. Yuen, H.-F. Lam, On the complexity of artificial neural networks for smart structures monitoring, Eng. Struct. 28 (7) (2006) 977–984.