BRIDGE DESIGN EFFORT AND COST ESTIMATION MODELS – PHASE I Final Report July 2016 Principal Investigator H. David Jeong, Ph.D. Associate Professor Civil, Construction, and Environmental Engineering Iowa State University 394 Town Engineering Ames, Iowa 50011 Co-Principal Investigator Doug Gransberg, PhD. P.E. Professor Civil, Construction, and Environmental Engineering Iowa State University 394 Town Engineering Ames, IA 50011 Research Assistant Ahmed Abdelaty Joseph Shrestha Tejas Ostwal Authors Primary Author, Second Author, and Third Author Sponsored by the Iowa Highway Research Board and the Iowa Department of Transportation (IHRB Project TR-xxx) Preparation of this report was financed in part through funds provided by the Iowa Department of Transportation through its Research Management Agreement with the Institute for Transportation (InTrans Project yy-xxx)
A report from Institute for Transportation Iowa State University 2711 South Loop Drive, Suite 4700 Ames, IA 50010-8664 Phone: 515-294-8103 Fax: 515-294-0467 www.intrans.iastate.edu
Technical Report Documentation Page
1. Report No. IHRB Project TR-xxx
2. Government Accession No.
3. Recipient’s Catalog No.
4. Title and Subtitle Bridge Design Effort and Cost Estimation Models – Phase I
5. Report Date July 2016
6. Performing Organization Code
7. Author(s) Jeong, D.H. and Gransberg, D.D.
8. Performing Organization Report No. InTrans Project yy-xxx
9. Performing Organization Name and Address Institute for Transportation Iowa State University 2711 South Loop Drive, Suite 4700 Ames, IA 50010-8664
10. Work Unit No. (TRAIS)
11. Contract or Grant No.
12. Sponsoring Organization Name and Address Iowa Highway Research Board Iowa Department of Transportation 800 Lincoln Way Ames, IA 50010
13. Type of Report and Period Covered Final Report
14. Sponsoring Agency Code IHRB Project TR-xxx
15. Supplementary Notes Visit www.intrans.iastate.edu for color pdfs of this and other research reports.
16. Abstract The estimation of design effort and cost plays a vital role in authorizing funds and controlling budgets during the project development process. Typically, the design phase consists of various engineering activities that require substantial effort to deliver final construction documents for bid preparation. Estimating this effort accurately and efficiently is critical for transportation agencies in properly allocating funds and assigning appropriate time and resources. Previous studies have reported several problems associated with the estimation of design effort, such as a lack of predictive tools, inaccurate forecasts, and misallocation of effort. Thus, there is a need for a proactive scheme that produces more accurate and reliable estimates of design effort and cost in order to improve the confidence of the design office at the negotiation table with consulting firms and, ultimately, enhance the accountability and transparency of funding decisions. This Phase I study first develops a master database that includes various attributes of historical pretensioned prestressed concrete beam (PPCB) bridge design projects completed by consultants. The master database is used to develop PPCB design effort and cost estimation models using multivariate linear regression and artificial neural networks. The prediction models are then tested to evaluate their performance using the mean absolute percentage error (MAPE). This research also develops a case-based reasoning tool to support logical inference when estimating design effort and cost for a new PPCB project. The case-based reasoning approach uses similarity scores to retrieve the records of the most similar historical bridge design projects. Additionally, a spreadsheet tool is developed to automate the retrieval process. In addition to using the regression and neural network models developed in this study, users can make reasonable judgments about the effort and cost required for a new PPCB bridge design project by comparing it with historically similar projects.
17. Key Words Bridge design effort, preconstruction services, linear regression, neural networks, case-based reasoning
18. Distribution Statement No restrictions.
19. Security Classification (of this report) Unclassified.
20. Security Classification (of this page) Unclassified.
21. No. of Pages xxx
22. Price NA
TABLE OF CONTENTS
ACKNOWLEDGMENTS
EXECUTIVE SUMMARY
1. INTRODUCTION
1.1. Objectives
1.2. Research Approach
1.3. Report Organization
2. LITERATURE REVIEW
2.1. Design Cost vs Construction Cost Growth
2.2. Bridge Design Cost Estimation Practices
3. DATA EXTRACTION
3.1. Available Data
3.2. Data Extraction Assumptions
4. DESCRIPTIVE STATISTICS AND DATA TRANSFORMATION
5. QUESTIONNAIRE SURVEY
6. DATA PROCESSING
6.1. Missing value imputation
6.2. Exclusion of Attributes
6.3. Outlier Detection and Exclusion
6.4. Inflation Adjustment
7. PREDICTION MODELS
7.1. Multivariate Linear Regression
7.2. Artificial Neural Network
7.3. Performance Comparison
8. CASE-BASED REASONING
8.1. Case-Based Reasoning Spreadsheet Tool
9. SUMMARY AND CONCLUSION
REFERENCES
APPENDIX A
APPENDIX B
LIST OF FIGURES
Figure 1. Contracting Process of Design by Consultant
Figure 2. Overall research approach
Figure 3. Non-availability of individual Workhours
Figure 4. Availability of individual Workhours
Figure 5. Type of work and route type effects on average contract value
Figure 6. Bridge width variability and horizontal curve effects on average contract value
Figure 7. Beam spacing variability and sidewalk effects on average contract value
Figure 8. Beam type and pier type effects on average contract value
Figure 9. Abutment type and construction staging effects on average contract value
Figure 10. Significant attributes for actual fee paid
Figure 11. Typical neural network configuration (Gransberg et al. 2015)
Figure 12. CBR main tasks
Figure 13. CBR retrieval process
Figure 14. User inputs worksheet
Figure 15. Most similar projects worksheet
LIST OF TABLES
Table 1. BRIS Bridge Data Attributes
Table 2. Adjusted actual fee paid example
Table 3. Summary of the questionnaire survey responses
Table 4. Consumer price indexes
Table 5. Design fee and Workhours
Table 6. Performance Indicators
Table 7. Significant attributes for Consultant’s proposed fee
Table 8. Significant attributes for Consultant’s proposed workhours
Table 9. Significant attributes for Iowa DOT’s proposed fee
Table 10. Significant attributes for Iowa DOT’s proposed workhours
Table 11. Significant attributes for Contract fee
Table 12. Significant attributes for Contract workhours
Table 13. Significant attributes for actual fee paid
Table 14. MAPE for both MLR and ANN models
Table 15. Classification of variables and weights of importance
Table 16. Weights of importance and variable output type
ACKNOWLEDGMENTS
The authors would like to thank the Iowa Department of Transportation for sponsoring this research.
EXECUTIVE SUMMARY
The estimation of design effort and cost plays a vital role in authorizing funds and controlling budgets during the project development process. Typically, the design phase consists of various engineering activities that require substantial effort to deliver final construction documents for bid preparation. Estimating this effort accurately and efficiently is critical for transportation agencies to properly allocate funds and assign appropriate time and resources. Previous studies have reported several problems associated with the estimation of design effort, such as a lack of predictive tools, inaccurate forecasts, and misallocation of effort. Thus, there is a need for a proactive scheme that produces more accurate and reliable estimates of design effort and cost in order to improve the confidence of the design office at the negotiation table with consulting firms and, ultimately, enhance the accountability and transparency of funding decisions.

This study develops advanced design effort and cost estimation models using multivariate linear regression (MLR) and artificial neural networks (ANN). First, the study develops a master database that consolidates data on historical pretensioned prestressed concrete beam (PPCB) bridge design projects conducted by external consultants. The master database includes bridge design attributes, various physical attributes of the bridges, the consultant’s proposed fees and workhours, the Iowa DOT’s proposed fees and workhours, the contracted fee and workhours, and the actual fee paid after completion of the design. Seven MLR models are developed to estimate design fees and workhours at different negotiation and design stages. Horizontal curvature and beam type are found to be statistically significant attributes that are common across the different models. Additionally, the number of different spans, number of piers, bridge width variability, sidewalk requirement, aesthetic items requirement, bridge length, beam spacing variability, and pier type are statistically significant parameters. The adjusted R² values of the MLR models range from 0.56 to 0.69, which is higher than those of bridge design cost estimation models found in the literature. The performance of the MLR models is also measured using the mean absolute percentage error (MAPE). The MAPEs range from 22% to 37%.

Similarly, seven ANN models are developed using a commercial Excel add-in program. The ANN models use 15 bridge attributes to predict the design fees and workhours. Their performance is also measured by calculating the MAPEs, which range from 20% to 40%.

In addition to developing MLR and ANN models, the study develops a case-based reasoning (CBR) tool to assist the Iowa DOT in making logical inferences when estimating PPCB bridge design effort. The case-based reasoning approach uses similarity scores to retrieve the most similar historical PPCB bridge design projects. A spreadsheet-based CBR tool is also developed to automate the retrieval process.
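The retrieval idea behind the CBR tool can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute names, weights, and normalization ranges below are hypothetical and are not the values used in this study, and a weighted average of per-attribute similarities is one common way to form a similarity score rather than the formula confirmed by this summary.

```python
# Hypothetical weights of importance (illustrative only, not the study's values).
WEIGHTS = {"bridge_length_ft": 4, "num_spans": 3, "beam_type": 2, "horizontal_curve": 1}

# Assumed value ranges used to scale numeric differences into [0, 1].
RANGES = {"bridge_length_ft": 1000.0, "num_spans": 10.0}


def attribute_similarity(name, new_value, old_value):
    """Return a similarity in [0, 1] for one attribute."""
    if name in RANGES:
        # Numeric attribute: 1 minus the range-scaled absolute difference.
        return max(0.0, 1.0 - abs(new_value - old_value) / RANGES[name])
    # Categorical attribute: exact match or no match.
    return 1.0 if new_value == old_value else 0.0


def similarity_score(new_project, past_project):
    """Weighted average of the per-attribute similarities."""
    weighted = sum(
        weight * attribute_similarity(name, new_project[name], past_project[name])
        for name, weight in WEIGHTS.items()
    )
    return weighted / sum(WEIGHTS.values())


def most_similar(new_project, case_base, top_n=3):
    """Return the top_n historical projects ranked by similarity score."""
    ranked = sorted(
        case_base,
        key=lambda case: similarity_score(new_project, case),
        reverse=True,
    )
    return ranked[:top_n]
```

Given a new project and a list of historical projects stored as dictionaries with the same keys, `most_similar` returns the closest records, whose fees and workhours can then be reviewed by the estimator.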
1. INTRODUCTION
The estimation of design effort and cost plays a vital role in authorizing funds and controlling budgets during the project development process. Typically, the design phase consists of various engineering activities that require substantial effort to deliver final construction documents for bid preparation. Estimating this effort accurately and efficiently is critical for transportation agencies in properly allocating funds and assigning appropriate time and resources. However, due to the relatively small proportion of project cost consumed by design service activities, “there has generally been little effort invested in improving management of those activities in the preconstruction phase” (Persad et al. 1995). Additionally, researchers who have focused on preconstruction management have reported a lack of predictive tools to estimate design costs (Knight and Fayek 2002; Woldesenbet and Jeong 2012). Inaccurate forecasts of design costs or misallocation of design effort can result in a) increased design errors and low-quality work, leading to project delays and increased construction costs, or b) lost opportunities to authorize other projects due to unrecognized over-expenditure. Today, the rapid increase in the number and complexity of transportation projects has confronted agencies with accelerated project delivery, variation in workload, and overall changes in legal and policy requirements that necessitate specialized skills (Persad et al. 2010). To meet these changing project management needs, agencies are augmenting their design and engineering activities with private consulting firms, with which they must negotiate before authorizing funding for consulting agreements. Figure 1 shows the typical process when a consulting company is hired for design and engineering at the Iowa DOT. First, the Iowa DOT invites the most qualified consulting company to negotiate, based on the company’s expertise, capabilities, and previous record.
Then, the consulting company proposes a design fee and anticipated workhours based on the scope of work for the design project. The Iowa DOT reviews the consulting company’s proposal and typically makes a counteroffer before both parties finally agree on the scope of work, design fee, and workhours. The actual design fee paid may differ from the contracted amount because of scope changes and actual workhours that differ from the workhours estimated in the agreement.

[Figure 1: timeline showing, during the negotiation period, the design fee and workhours proposed by the consultant, the design fee and workhours proposed by the Iowa DOT, and the design fee and workhours agreed by the Iowa DOT and the consultant, followed during the final design period by the actual design fee paid to the consultant and the actual workhours.]
Figure 1. Contracting Process of Design by Consultant

Federal regulation stipulates that a detailed independent cost estimate must be prepared when engineering work is contracted to an external consultant. Such an estimate is “an important
baseline for negotiations with the Consultant” (UDOT 2014) and must include “an appropriate breakdown of specific types of labor required, workhours, and an estimate of the consultant’s fixed fee…for use during negotiations” (23 CFR Section 172.7). However, outsourcing design services efficiently and reliably, and determining reliable costs for the associated engineering activities, has been a major challenge because most highway agencies do not have a structured and well-defined procedure to reasonably estimate design effort and cost. The limited number of studies in this area reveal wide variation in design cost estimates. A study by the Washington State DOT (WSDOT 2002) reported that the design costs of a sample bridge estimated by AASHTO Subcommittee members from 25 states ranged from 4% to 20% of construction costs. A recent study of 344 bridge projects constructed by the North Carolina DOT (NCDOT) found that preliminary engineering costs ranged from 0.04% to 138% of construction costs. A similar study by Kuprenas (2003) found design costs ranging from 25% to 50%, while Liu et al. (2013) found that the design costs of roadway projects were 7% to 16% of the construction cost. Lastly, Gransberg et al. (2007) found that using historic construction cost percentages for Oklahoma Turnpike Authority bridge projects caused consultant design fees to be underestimated in almost every case and resulted in statistically significant (R² = 0.92) construction cost growth due to change orders required to correct design errors and omissions. Thus, there is a need for a proactive scheme that produces more accurate and reliable estimates of design effort and cost in order to meet the federal regulations, improve the confidence of the design office at the negotiation table with consulting firms, and, ultimately, enhance the accountability and transparency of funding decisions.
The current system used by the Office of Bridges and Structures at the Iowa DOT is working, but the office intends to improve and upgrade it to take advantage of technological advances and the historical data it has accumulated.

1.1. Objectives
The goal of this study is to design a practical approach that can systematically estimate, track, and negotiate bridge design effort and cost to support the best possible fund allocation and control decisions. This study will achieve this goal by developing parametric estimating tools based on historical bridge design contract data. The sub-objectives include:
• Develop a master database for historical PPCB bridge design projects
• Develop parametric bridge design effort and cost estimation models and tools
• Develop a case-based reasoning tool
• Validate the developed models and tools
1.2. Research Approach
Figure 2 shows the overall research approach. First, the agreement files of PPCB design projects conducted by external consultants are studied, and the data points important for this study are extracted. Also, bridge characteristics data that are stored in the bridge information system
(BRIS) at the Iowa DOT are imported into a spreadsheet file. Additional data points that are stored on a personal computer in the Office of Bridges and Structures are also imported and consolidated into the master database. The master database consists of various data attributes for each PPCB bridge design project, including the consultant’s proposed design fee and workhours, the Iowa DOT’s proposed design fee and workhours, the contracted design fee and workhours, the actual design fee paid, and several bridge attributes. The design fees are adjusted to account for inflation, and data cleaning techniques are applied to address missing values and other data issues. Three estimation methods are developed in this research as follows:
• Multivariate Linear Regression (MLR)
• Artificial Neural Network (ANN)
• Case-Based Reasoning (CBR)
Figure 2. Overall research approach

1.3. Report Organization
The rest of the report is organized as follows. Chapter Two presents a brief literature review of previous studies. Chapter Three discusses the process of extracting and integrating the data from different sources in developing a master database. Chapter Four presents the results of a high level descriptive statistical analysis on PPCB design projects conducted by consultants. Chapter Five summarizes the results of a brief questionnaire survey to capture the level of importance for each bridge attribute. Chapter Six presents data processing steps and inflation adjustment procedures. Chapter Seven discusses various prediction models developed in this study. Chapter Eight presents the case based reasoning tool and its usage procedure. Finally, Chapter Nine summarizes the overall report and key findings. Additionally, Appendix A presents a step by step process on how to train and use an artificial neural network by using the Microsoft add-in data mining tool.
2. LITERATURE REVIEW
Design fees are typically calculated as a percentage of the construction contract value. However, the American Society of Civil Engineers (ASCE) discourages the use of this method to determine design fees (Carr and Beyor 2005). Carr and Beyor (2005) investigated the effects of using “outdated percentage of construction fees schedules” to determine design fees. The study found that design fees were significantly underestimated because of the use of outdated and stagnant fee schedules. Additionally, they reported a significant need to develop a guide for estimating appropriate design fees (Carr and Beyor 2005). Lowe et al. (2006) sought to predict the early stage construction cost of buildings by developing linear regression models. The data were collected from 286 construction projects in the United Kingdom. Forty-one independent variables were identified for developing the models. The input variables were categorized as project strategic variables, site related variables, and design related variables. They used regression to predict log(cost), cost per unit area ($/m²), and log(cost per unit area) instead of the cost at completion of construction. The researchers recommended reviewing the error spread in each model by plotting the dependent variable against the error on scatter plots. Model performance was evaluated using R² and the mean absolute percentage error (MAPE). The best-performing regression model had an R² value of 0.66 and a MAPE of 19.3%. They reported that the models underestimated the costs of very expensive projects and overestimated the costs of inexpensive projects when scatter plots of cost/m² were compared against error. Bubshait et al. (1998) used data for 60 large building projects in Saudi Arabia to study the relationship between design fee and design deficiencies.
The study also developed a statistical model that predicted the level of design quality based on the design fee. The outcome of this study revealed that design fee was inversely proportional to design deficiency. Woldesenbet and Jeong (2012) used historical data provided by the Oklahoma DOT to estimate the preliminary engineering (PE) costs of roadway projects. Using data mining techniques, they identified influential factors that affected PE costs and developed decision tree and regression models to estimate them. Jianxiong (2015) developed a bottom-up design hour estimation template for the California Department of Transportation (Caltrans). This study proposed a work plan template development procedure to simplify the workhour estimation process. The bottom-up approach is considered to provide a better estimate of design hours, but it is generally labor intensive. Turner and Miller (2015) developed regression models to estimate PE costs. The researchers used project length, project duration, level of environmental review, and whether or not the project was administered by the Virginia DOT to develop the regression models. Initially, only construction cost was used to develop the regression models. Later, other project characteristics were added, which reduced the mean percentage error of estimation from 134% to 44%. Turochy et al. (2001) surveyed several DOTs in order to understand common procedures for estimating the PE costs of highway projects. The survey results revealed that most
DOTs use the percentage of construction costs method to determine PE costs, but with different percentages. For example, the Kentucky Transportation Cabinet (KYTC) applied approximately 10% of the estimated construction cost to determine the design costs. Pennsylvania DOT applied between 10% and 20% of estimated construction costs to determine the initial PE costs. Tennessee DOT usually applied 10%, and West Virginia DOT typically applied 8%, of estimated construction costs to determine the PE costs. Gransberg et al. (2015) developed a holistic framework that included both top-down and bottom-up approaches based on case studies performed in nine state DOTs. This study published a national guidebook on estimating preconstruction services (PCS) costs, which describes a systematic step-by-step process for developing PCS cost estimating models with data driven approaches such as multivariate regression modeling, decision tree analysis, and neural networks. The report recommended a top-down approach for estimating PCS costs during the planning phase, when project information is very limited, and a bottom-up approach during the detailed design stage, when a work breakdown structure is available at each functional level. Furthermore, the report strongly emphasized the need to maintain and update a PCS cost database in order to continuously improve the estimation results, which would reduce project cost uncertainties.

2.1. Design Cost versus Construction Cost Growth
Gransberg et al. (2007) used data on 31 projects, including roadways and bridges, provided by the Oklahoma Turnpike Authority to analyze the relationship between the design fee expressed as a percentage of construction costs and construction cost growth from the initial estimate. Using linear regression analysis, the study concluded that cost growth from the initial estimate had an inverse relationship with design cost.
The study emphasized that the relationship was stronger in bridge design projects (R² = 0.93). They argued that appropriate estimation of the design fee reduces cost overruns. As such, agencies should consider design fees an investment in controlling the project’s budget. However, the study concluded that there was a breakeven point between the design fee and design quality, which indicated that increasing the design fee above a certain limit would not enhance the design quality. Similarly, Shrestha and Mani (2014) investigated the effect of design cost on project performance for roadway projects using data for projects constructed in Nevada and Texas between 1991 and 2009. The study found a negative correlation between design cost percentage and the project’s total cost growth.

2.2. Bridge Design Cost Estimation Practices
Generally, there are two methods to estimate the cost of a bridge design. The first method is easy and straightforward: it applies a percentage to the estimated construction cost. The second method is more rigorous, as it depends on preparing a detailed work breakdown structure for each design task (Nelson and Waterhouse 2016). It is worth noting that the first method carries considerable inherent risk, since the construction budget is fairly uncertain. The fixed percentage method also tends to overestimate the design cost for large construction projects
and underestimate the design costs for small construction projects (Nelson and Waterhouse 2016). The detailed scope breakdown method, by contrast, produces a more accurate and realistic design cost estimate. However, it is very time consuming and is therefore considered impractical in many cases (Nelson and Waterhouse 2016).

Pretensioned prestressed concrete beam (PPCB) bridges are the most prevalent type of bridge in Iowa (Nelson and Waterhouse 2016). The Iowa DOT developed a multivariate regression model to estimate the design hours of a PPCB bridge design project based on a database that included the following parameters: length, width, number of spans, number of beams, horizontal curve, design methodology, span arrangement, pier type, expansion joint type, skew, construction staging, abutment type, abutment foundation type, and beam type. The regression model was developed from historical data on 45 projects constructed between 2000 and 2015. The prediction of design hours used the following parameters as significant variables:
a) Number of spans,
b) Standard span arrangement,
c) Pier type,
d) Expansion joint type,
e) Skew, and
f) Construction staging.
Using these parameters, the tool predicted the expected, lower 95%, and upper 95% estimates for the bridge design hours. However, the tool needed to be enhanced by considering other factors such as bridge aesthetics, deep foundation type, accelerated bridge construction, deck area, and barrier rail type (Nelson and Waterhouse 2016).

New York DOT has developed an MS Access based tool to search for projects with given characteristics (Williams et al. 2013). A regression model was also developed to predict the design hours based on the following project characteristics:
a) Complexity,
b) Project type,
c) Number of sub-consultants,
d) Construction costs,
e) Number of lanes,
f) Number of plan sheets,
g) State Environmental Quality Review (SEQR) classification,
h) National Environmental Policy Act (NEPA) classification,
i) Predominant bridge type,
j) Number of bridges,
k) Highway classification, and
l) Length of project.
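For comparison with the regression-based tools above, the percentage-of-construction-cost method described at the start of this section reduces to a single multiplication. The short Python sketch below uses the percentages reported in the Turochy et al. (2001) survey cited earlier in this chapter; the function name and the example construction cost are hypothetical and serve only as an illustration.

```python
# Percentages of estimated construction cost reported in the survey by
# Turochy et al. (2001), stored as (low, high) whole-number percentages.
# Kentucky's value is "approximately" 10% and the Tennessee and West Virginia
# values are typical figures, so these are indicative only.
PE_PERCENTAGES = {
    "Kentucky": (10, 10),
    "Pennsylvania": (10, 20),
    "Tennessee": (10, 10),
    "West Virginia": (8, 8),
}


def pe_cost_range(construction_cost, agency):
    """Return the (low, high) PE cost implied by the agency's percentage."""
    low_pct, high_pct = PE_PERCENTAGES[agency]
    return construction_cost * low_pct / 100, construction_cost * high_pct / 100


# Hypothetical example: a bridge with a $2,000,000 estimated construction cost.
for agency in PE_PERCENTAGES:
    low, high = pe_cost_range(2_000_000, agency)
    print(f"{agency}: ${low:,.0f} to ${high:,.0f}")
```

For the same hypothetical bridge, the implied design budget runs from $160,000 (West Virginia) to $400,000 (upper end for Pennsylvania), which illustrates how sensitive this method is to the chosen percentage.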
Hollar (2012) studied the estimation of PE costs and PE duration for North Carolina DOT bridge projects. In this research, data were acquired from ten sources to form a
database on 461 bridge projects. Twenty-eight independent variables were identified through correlation analysis and analysis of variance (ANOVA). Regression models were developed for PE cost ratio and PE duration. The modeling strategies included multivariate linear regression (MLR), hierarchical linear models (HLM), Dirichlet process linear models (DPLM), and multilevel Dirichlet process linear models (MDPLM). For goodness-of-fit testing, the mean absolute percentage error (MAPE) was used to rank predictive performance when each candidate model was applied to a validation set. The MLR model with eight variables achieved a MAPE of 18.9%. The HLM approach was investigated because four of the eight selected variables were categorical. The HLM allows the data to be divided into subgroups based on the values of the categorical variables; a model is then fit to each subgroup using MLR. However, when applied to the validation set, the HLM results for PE cost ratio prediction failed to outperform the baseline MLR (MAPE of 24.2% for HLM compared with 18.9% for MLR). The recommended MLR model was incorporated in the user interface application and included the following eight variables (four numerical and four categorical): Right of Way Cost to Statewide Transportation Improvement Program (STIP) Estimated Construction Cost, Roadway Percentage of Construction Cost, STIP Estimated Construction Cost, Bypass Detour Length, Project Construction Scope, North Carolina DOT Division, Geographical Area of State, and Planning Document Responsible Party.
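Ranking candidate models by MAPE on a validation set, as in the studies reviewed above, can be sketched as follows. The function and variable names are hypothetical, and any numbers passed in would be illustrative rather than taken from the cited studies.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is nonzero, since each error is divided by it.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rank_models(actual, predictions_by_model):
    """Return (model name, MAPE) pairs sorted from best (lowest MAPE) to worst."""
    scores = {
        name: mape(actual, predicted)
        for name, predicted in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])
```

Calling `rank_models` with the validation-set actuals and a dictionary of each model's predictions returns the models ordered by MAPE, which is the comparison used to choose between approaches such as MLR and HLM.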
3. DATA EXTRACTION

3.1. Available Data Sources

The data available for developing the PPCB bridge design effort estimation models come from three sources:
- Bridge Information System (BRIS) data
- Design fee and workhours data
- Agreement files
Bridge Information System (BRIS) is a database of bridge projects in the Iowa DOT's mainframe computer system that is accessible only from within the Iowa DOT. The database contains the characteristics of all kinds of bridges that have been let by the Iowa DOT. The categories of bridge data attributes in the BRIS database are shown in Table 1.

Table 1. BRIS Bridge Data Attribute Categories
- Bridge General Information
- Bridge Location
- Bridge Geometrical Features
- Bridge Design Method
- Bridge Misc. Features
- Bridge Misc. Information
- Bridge Work Description
- Bridge Deck Description
- Bridge Joints Description
- Bridge Beams Description
- Bridge Bearing Description
- Bridge Abutments Description
- Bridge Piers Description
- Bridge Aesthetics Description
Design Fee and Workhours data include the design fee and workhours proposed by the consultant, the design fee and workhours proposed by the Iowa DOT, and the contracted design fee and workhours. In this study, the total design fee without contingency is used for estimation model development. Another spreadsheet in this data source contains billing data and the actual amount paid to the consultant each month until the end of the project, which gives the actual design fee received to date by the consultant.

Agreement Files consist of the contract documents for a particular project. These files include the scope of work, the itemized engineering hours, and the design fee with a detailed breakdown of direct labor costs, overhead costs, direct expenses, contingency, and sub-consultant expenses.

3.2. Data Extraction Assumptions

The following assumptions were made during data extraction from the agreement files:

a) Multi-bridge fees – When a multi-bridge project is encountered in the agreement files or fee data sheets (Excel files), the fees are split proportionately based on the available workhours data. For example, in Contract #5020, individual workhours for each lane are not available, as shown in Figure 3.
Figure 3. Non-availability of individual Workhours

The fee data for this contract are merged for both lanes in the north and south directions, i.e., ISSN-035-2(295)37--IT-20 and ISSN-035-2(296)37--IT-20. For projects of this kind, with the same design characteristics, the design fee and workhours are split equally (50:50) between both lanes:

$$\text{Workhours} = \frac{\text{Total workhours}}{2} = 704.5 \text{ hours}$$

$$\text{Design Fee} = \frac{\text{Total design fee}}{2} = \$73{,}765.31$$
Another example with a different case is Contract #5500, in which the workhours for each lane are available as shown in Figure 4.
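Figure 4. Availability of individual Workhours

The workhours and design fee for the I-35 SB Median are calculated by applying its share of the contract totals:

$$\text{Workhours} = (\text{I-35 SB Median share}) \times 2592 = 345 \text{ hours}$$

$$\text{Design Fee} = (\text{I-35 SB Median share}) \times \$270{,}165 = \$35{,}932.36$$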
b) Increased fees resulting from a change of scope are not included in the database.

c) Only one record from each BRIS table is used to develop the preliminary database for modeling. For example, Contract #7904G, Project No. ESIMX-035-1(105)33--1S-20, appears twice in the BRIS datasheet, once for eastbound and once for westbound, with the same attributes apart from lane direction. In such cases only one lane is considered, because all the technical specifications are identical and the second record would merely duplicate the data.

d) If the consultant's or Iowa DOT's proposed workhours and design fee are missing, the actual fee paid is assumed to be the consultant's or Iowa DOT's proposed fee. The corresponding workhours are interpolated based on the contract hours.

e) If the actual fee paid is missing, it is assumed to be the same as the contracted design fee.

f) If the percentage of design completion is greater than 90% and the actual fee paid to date is less than the contract amount, the adjusted actual fee paid is assumed to be the same as the contract amount, as shown in Table 2. The adjusted actual fee paid is a value assigned in place of the actual fee paid for design projects that have not been completed.

g) If the actual fee paid to date is higher than the contract amount and the design is close to completion (> 90%), the adjusted actual fee paid is taken to be the same as the actual fee paid, as shown in Table 2.

Table 2. Adjusted actual fee paid example
Project | Contract Fee | Actual Fee Paid | Adjusted Actual Fee Paid | Design Percentage Completion
BRF-006-3(67)--38-25 | $131,012.30 | $125,202.00 | $131,012.30 | 99%
BRFIMX-080-4(54)132--14-77 | $216,979.93 | $235,515.00 | $235,515.00 | 96.90%
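Assumptions (a), (e), (f), and (g) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the `threshold` parameter, and the fallback of returning the actual fee paid for projects below 90% completion are assumptions made here, not part of the reported extraction procedure.

```python
def split_fee(total_fee, hours_by_bridge):
    """Split a merged design fee across the bridges of one contract.

    hours_by_bridge maps a bridge label to its workhours, or to None when
    individual workhours are unavailable. This sketch assumes the hours
    are either all known or all unknown.
    """
    if all(h is None for h in hours_by_bridge.values()):
        # No individual workhours: split equally, as in assumption (a)
        # for bridges with the same design characteristics.
        share = total_fee / len(hours_by_bridge)
        return {name: share for name in hours_by_bridge}
    total_hours = sum(hours_by_bridge.values())
    # Individual workhours available: split in proportion to workhours.
    return {name: total_fee * h / total_hours
            for name, h in hours_by_bridge.items()}


def adjusted_actual_fee(contract_fee, actual_fee_paid, pct_complete,
                        threshold=90.0):
    """Apply assumptions (e), (f), and (g) to one project."""
    if actual_fee_paid is None:
        # (e) missing actual fee: use the contracted design fee.
        return contract_fee
    if pct_complete > threshold and actual_fee_paid < contract_fee:
        # (f) nearly complete but underpaid: use the contract amount.
        return contract_fee
    # (g) paid at or above the contract amount; other cases are left
    # unchanged (an assumption of this sketch).
    return actual_fee_paid


# The two projects of Table 2.
row1 = adjusted_actual_fee(131012.30, 125202.00, 99.0)
row2 = adjusted_actual_fee(216979.93, 235515.00, 96.9)
```

Applied to the two rows of Table 2, the function returns the contract fee for the first project and the actual fee paid for the second, which matches the adjusted values listed in the table.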
4. DESCRIPTIVE STATISTICS AND DATA TRANSFORMATION

This section discusses the descriptive statistics of the master database of 67 PPCB bridge design projects. The results assume that all bridge design attributes other than those being compared remain constant.

Another technically important issue is that most of the data attributes describing bridge properties are string or text data, which cannot be used directly in regression models. For example, the abutment type attribute is recorded as integral abutment or stub abutment. String attributes are therefore converted to numerically coded categorical attributes by comparing the average contracted design fee of each category. For example, the average contracted design fee is $100,785 for projects with an integral abutment (category 0) and $141,980 for projects with a stub abutment (category 1). The other string attributes are converted using the same method. Where categories have very similar average fees, they are combined into one category. The beam type and construction staging attributes, however, are categorized by the beam type itself and by whether construction staging is applied.

Figure 5(a) shows the average contract values of bridge design for new structures and replacement structures. The type of work has little impact, as the difference between the average contract values is small. Figure 5(b) shows the average contract values of bridge design for various route types. Route type significantly affects the average contract value: designs for bridges on Interstates cost 21% and 28% more than designs for bridges on US Highways and Iowa Roads, respectively. The difference between the design fees for bridges on US Highways and Iowa Roads is relatively small (5.5%), so both are assigned the same variable value (0).
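The conversion of string attributes into coded categories can be illustrated with a short Python sketch. The report does not state a numeric rule for when two categories are "very similar", so the `merge_within` threshold below is a hypothetical parameter, and the record field names and the route averages are invented for illustration. Only the abutment averages ($100,785 and $141,980) come from the text.

```python
from collections import defaultdict


def average_fee_by_category(records, attribute, fee_key="contract_fee"):
    """Average contracted design fee for each value of a string attribute."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec[attribute]] += rec[fee_key]
        counts[rec[attribute]] += 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


def encode_by_average_fee(averages, merge_within=0.06):
    """Assign integer codes in ascending order of average fee.

    A category shares the code of the current group when its average is
    within merge_within (relative difference) of the group's lowest
    average. merge_within is a hypothetical threshold.
    """
    codes = {}
    code = 0
    group_base = None
    for cat, avg in sorted(averages.items(), key=lambda kv: kv[1]):
        if group_base is None:
            group_base = avg
        elif (avg - group_base) / group_base > merge_within:
            code += 1
            group_base = avg
        codes[cat] = code
    return codes


# Abutment averages quoted in the text.
abutment_codes = encode_by_average_fee(
    {"integral": 100785.0, "stub": 141980.0})

# Hypothetical route averages with a 5.5% and a 28% gap over the lowest.
route_codes = encode_by_average_fee(
    {"Iowa": 100.0, "US": 105.5, "Interstate": 128.0})
```

With these inputs the integral and stub abutments receive codes 0 and 1, while the two route types that differ by 5.5% share code 0 and the Interstate category receives code 1, mirroring the grouping described above.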
Figure 5. Type of work and route type effects on average contract value: (a) type of work categories; (b) route type categories

Figure 6(a) shows the average contract values of bridge design with constant and variable widths. A variable width can increase the average design fee by up to 48% compared to a constant width. Figure 6(b) shows the average contract values of bridge design with and without horizontal curvature. A horizontal curve increases design fees by 65% on average compared to bridges with no curves. Hence, width variability and horizontal curves may be important drivers of the average bridge design fee.
Figure 6. Bridge width variability and horizontal curve effects on average contract value: (a) bridge width variability type; (b) horizontal curve categories

Figure 7(a) shows the average contract values of bridge design with constant and variable beam spacing. Beam spacing variability has a small impact on bridge design fee: designs with constant beam spacing have fees 12.5% higher than designs with variable beam spacing. This difference may be attributed mainly to horizontal curvature among the projects with constant beam spacing. Forty percent of the projects with horizontal curvature are among the 15 projects with constant beam spacing, and the average contract fee of the projects with both constant beam spacing and horizontal curvature is $152,278.87, which is a major factor in raising the average contract fee. Figure 7(b) shows the average contract values of bridge designs with and without sidewalks. Sidewalks can increase design fees by 40% compared to bridges without sidewalks, because the average engineering effort is 1,409 workhours for bridges with sidewalks compared to 1,175 workhours for bridges without sidewalks. Hence, beam spacing and sidewalks both affect the bridge design fee.
Figure 7. Beam spacing variability and sidewalk effects on average contract value: (a) beam spacing variability; (b) sidewalk availability

Figure 8(a) shows the average contract values of bridge design with different kinds of beams. Beam type can have a large impact: bridges with tee beams have design fees at least 26% and up to 86.5% higher than bridges with non-tee beams. Because the average contract values of the bulb-tee beam types differ little, they are assigned the same variable value (1), and the other beams are grouped under the value (0).

Figure 8(b) shows the average contract values of bridge design with different pier types. Compared with bridges with bent pile, a T-pier can increase the design fee by 72%, a frame pier by 88%, a solid concrete diaphragm by 122%, and no piers by 154.5%. As a result, pier type has a large impact on average bridge design fees. The high value for bridges with no piers probably arises because only two projects have been designed with no piers, and both have variable width and constant beam spacing, which are major drivers of the design fee. Likewise, only four projects use bent pile; they have no horizontal curvature, no sidewalks, integral abutments, and variable beam spacing, which might have drastically reduced the design fee. Furthermore, the frame pier and T-pier are merged under the same variable value (1) because of the small difference in their average contracted design fees.
Figure 8. Beam type and pier type effects on average contract value: (a) beam type categories; (b) pier type categories

Figure 9(a) shows the average contract values of bridge design with stub and integral abutments. Stub abutments can increase design fees by 41% compared to integral abutments. Hence, the type of abutment may have a considerable impact on bridge design fee.

Figure 9(b) shows the average contract values of bridge designs with and without staging. Construction staging does not appear to have a significant impact on bridge design fees: designs without staging cost 15% more than designs with staging. However, it is important to note that 86% of the projects with horizontal curvature, 85% of the projects with variable width, and 75% of the projects with sidewalks involve no staging. The average contracted design fee is $157,654.69 for projects with no staging and horizontal curvature, $149,854.91 for projects with no staging and variable width, and $156,130.91 for projects with no staging and sidewalks. Thus, bridge characteristics such as horizontal curvature, variable width, and sidewalks probably account for much of the higher design fee of projects with no staging.
Figure 9. Abutment type and construction staging effects on average contract value: (a) abutment type categories; (b) construction staging involvement
5. QUESTIONNAIRE SURVEY RESULTS

A short questionnaire was distributed to three bridge design experts at the Iowa DOT to determine the level of importance of the bridge attributes for design effort and cost. Respondents were first asked to indicate whether each attribute is well known, somewhat known, or unknown before receiving a consultant's proposal. The respondents reported that most attributes are quite well known before the final design starts, except for the following two: a) Abutment: Abutment Pile Type and b) Pier: Pile Type. These attributes are therefore excluded from design effort estimation model development.

In addition, respondents were asked to assign each attribute an importance score on a scale of one to three, where one indicates the lowest effect on design fee and three the highest. The responses are summarized in Table 3. As shown in Table 3, the experts believe that type of work, number of construction stages, bridge width variable, number of piers, and pier type are the most influential factors for design fee and workhours. This appears to be quite accurate based on the results of Chapter 4, except for the type of work, which is not a significant design cost differential when assessed in terms of average contracted design fees. Conversely, abutment type and beam type are significant design cost differentials as shown in Chapter 4, yet the experts' scores suggest that these attributes may not be significant factors (total scores of 6 and 5, respectively).

Table 3. Summary of the questionnaire survey responses
Attribute | Response #1 | Response #2 | Response #3 | Total score
Type of work | 3 | 3 | 3 | 9
Number of construction stages | 3 | 3 | 3 | 9
Bridge width variable | 3 | 3 | 3 | 9
Number of piers | 3 | 3 | 3 | 9
Pier type | 3 | 3 | 3 | 9
Number of spans | 2 | 3 | 3 | 8
Beam spacing variable | 3 | 3 | 2 | 8
Bridge length | 1 | 3 | 3 | 7
Aesthetic item | 3 | 2 | 2 | 7
Aesthetic description | 3 | 2 | 2 | 7
Horizontal curve radius | 1 or 3 | 3 | 3 | 7
Skew | 2 | 2 | 2 | 6
Sidewalk | 2 | 2 | 2 | 6
Joint expansion type | 2 | 2 | 2 | 6
Abutment type | 2 | 2 | 2 | 6
Abutment foundation | 2 | 2 | 2 | 6
Foundation type | - | 3 | 2 | 5
Feature crossed | 2 | 2 | 1 | 5
Bridge width | 1 | 2 | 2 | 5
Bridge out to out width | 1 | 2 | 2 | 5
Number of beam lines | 1 | 2 | 2 | 5
Beam type | 1 | 1 | 3 | 5
Route | 2 | 1 | 1 | 4
Span lengths | 1 | 1 | 2 | 4
Beam spacing | 1 | 2 | 1 | 4
Design load parameters (LRFD, LFD, and WSD) | 1 | 1 | 1 | 3
Vertical curve type | 1 | 1 | 1 | 3
Number of abutments | 1 | 1 | 1 | 3
Rail – inside and outside, left and right | 1 | 1 | 1 | 3
Deck reinforcement type | 1 | 1 | 1 | 3
Beam description | 1 | 1 | 1 | 3
Abutment pile type | 1 | 1 | 1 | 3
Pier foundation | 2 | - | - | 2
Pile type | 1 | 1 | 1 | 3
6. DATA PROCESSING

6.1. Missing value imputation

Some data attributes have missing values, most of which are filled in using reasonable assumptions. A missing value in one or more fields makes the entire record of a bridge design project unusable when developing estimation models such as regression models or neural networks. It is therefore important to fill in those missing values to retain the highest possible number of records. Missing values are imputed using the following rules:
- No construction staging is assumed if the number of construction stages is zero.
- The feature crossed field is assumed based on the "Over" field, which contains the following values: creeks, rivers, roads, multiple, and railroad.
- Route type (Interstate, US, or Iowa) is determined from the route number stored in the project number.
- Bridges with a blank "bridge width variable" record are assumed to have a constant bridge width.
- Number of spans is calculated as the count of entries in the span lengths field.
- Number of piers is calculated as the number of spans minus one.
- Beam spacing variable is assumed based on the span lengths. If a bridge has different span lengths, then beam spacing is assumed to vary.
- Bridges with a blank "sidewalk" record are assumed to have no sidewalk.
- Bridges with a blank "Aesthetic items" record are assumed to have no aesthetic items.
- The aesthetic items are counted, and the count is added as a separate field (number of aesthetic items).
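Several of these rules are mechanical enough to express directly. The following Python sketch applies them to a single record stored as a dictionary; all field names (`span_lengths`, `num_spans`, `staging`, and so on) are hypothetical labels chosen for this illustration and do not correspond to actual BRIS column names. The feature crossed and route type rules are omitted because they depend on the format of the "Over" field and the project number.

```python
def impute_bridge_record(record):
    """Return a copy of a bridge record with missing fields filled in."""
    rec = dict(record)
    span_lengths = rec.get("span_lengths") or []

    # Number of spans is the count of span lengths.
    if rec.get("num_spans") is None and span_lengths:
        rec["num_spans"] = len(span_lengths)

    # Number of piers is the number of spans minus one.
    if rec.get("num_piers") is None and rec.get("num_spans") is not None:
        rec["num_piers"] = rec["num_spans"] - 1

    # Different span lengths imply variable beam spacing.
    if rec.get("beam_spacing_variable") is None and span_lengths:
        rec["beam_spacing_variable"] = len(set(span_lengths)) > 1

    # Zero construction stages means no construction staging.
    if rec.get("staging") is None and rec.get("num_construction_stages") == 0:
        rec["staging"] = False

    # Blank width-variable and sidewalk records default to "no".
    for field in ("width_variable", "sidewalk"):
        if rec.get(field) in (None, ""):
            rec[field] = False

    # Blank aesthetic items means none; the count becomes a new field.
    rec["num_aesthetic_items"] = len(rec.get("aesthetic_items") or [])
    return rec


example = impute_bridge_record({
    "span_lengths": [50.0, 80.0, 50.0],
    "num_construction_stages": 0,
    "sidewalk": "",
    "aesthetic_items": None,
})
```

For the example record, the sketch derives three spans, two piers, variable beam spacing, no staging, no sidewalk, constant width, and zero aesthetic items.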
6.2. Exclusion of Attributes

Based on the questionnaire survey responses and the availability of the data, the following attributes are excluded:
- Design load parameter, because the majority of PPCB bridges are designed using LRFD
- Bridge out to out width, because of missing data
- Number of abutments, because of missing data
- Deck reinforcement type, because of missing data
- Beam description, because of missing data
- Abutment foundation, because all but one of the bridges have piles (the exception has a drilled shaft)
- Pile type, because of missing data
- Outside/inside rail, because of missing data
- Work complexity and urgency, because almost all PPCB design projects in the database have the same value and the differences in values are negligible
6.3. Outlier Detection and Exclusion

A modified z-score, calculated from the median and the median absolute deviation (MAD) of a dataset, is used to identify outliers (Iglewicz and Hoaglin 1993). The modified z-score is calculated as shown in equation (1):

$$M_i = \frac{0.6745\,(x_i - \tilde{x})}{\mathrm{MAD}} \qquad (1)$$

where $x_i$ is the observation value, $\tilde{x}$ is the median of the data, and MAD is the median of the absolute differences between each observation and the median. As noted by Iglewicz and Hoaglin (1993), observations whose modified z-score has an absolute value greater than 3.5 are considered outliers. On this criterion, a total of seven projects are removed.

6.4. Inflation Adjustment

The projects available for analysis span 2008 to 2016, so design fees need to be adjusted for inflation. In this study, all design fees are adjusted to 2015 dollars using the Consumer Price Index (CPI). Table 4 shows the CPI values obtained from the Bureau of Labor Statistics (2016).

Table 4. Consumer price indexes (Bureau of Labor Statistics 2016)
Year | CPI
2008 | 215.303
2009 | 214.537
2010 | 218.056
2011 | 224.939
2012 | 229.594
2013 | 232.957
2014 | 236.736
2015 | 237.017
2016 | 237.855
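The outlier screening of Section 6.3 and the inflation adjustment of Section 6.4 can be sketched as follows. The function names are illustrative. The report does not spell out the adjustment formula, so the sketch assumes the usual index-ratio method (fee multiplied by the base-year CPI and divided by the CPI of the fee's year), with the CPI values taken from Table 4.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-scores per equation (1), using the median and MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the modified z-score is undefined")
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """True where the absolute modified z-score exceeds the threshold."""
    return [abs(m) > threshold for m in modified_z_scores(values)]


# Consumer price indexes from Table 4.
CPI = {
    2008: 215.303, 2009: 214.537, 2010: 218.056,
    2011: 224.939, 2012: 229.594, 2013: 232.957,
    2014: 236.736, 2015: 237.017, 2016: 237.855,
}


def to_base_year_dollars(fee, year, base_year=2015, cpi=CPI):
    """Adjust a fee to base-year dollars by the CPI ratio (assumed method)."""
    return fee * cpi[base_year] / cpi[year]


# Hypothetical values: only the last observation is far from the median.
flags = flag_outliers([10, 12, 11, 13, 12, 100])
fee_2015 = to_base_year_dollars(100000.0, 2008)
```

Under the ratio assumption, a fee from 2008 is scaled up by roughly 10% to express it in 2015 dollars, while a 2016 fee is scaled down slightly.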
7. DESIGN EFFORT ESTIMATION MODELS

7.1. Multivariate Linear Regression

Linear regression models the relationship between a dependent variable and an independent variable. Multiple regression models the relationship between several independent variables and a dependent variable. In design effort estimation, the dependent variables to be predicted are design fees or workhours, and the independent variables are design parameters and bridge characteristics. A regression model is represented as shown in equation (2):

$$\text{Estimated Design Fee/workhours} = I + V_1 C_1 + V_2 C_2 + V_3 C_3 + \cdots + V_n C_n \qquad (2)$$

where $I$ = intercept; $V_i$ = ith input variable; $C_i$ = coefficient of the ith input variable; and $n$ = number of input variables.

If the coefficient of an input variable is positive, the design fee/workhours increase as the value of that variable increases. Similarly, if the coefficient is negative, the design fee/workhours decrease as the value of the variable increases. In estimating bridge design fee/workhours, the input variables must be in numerical form, because multiple regression cannot use string-valued categorical variables directly. The conversion of categorical data into numerical data is described in the previous sections of the report. In total, seven models are developed for four different types of design fees and three different types of workhours, as shown in Table 5.

Table 5. Multivariate Linear Regression Models
Design Fee Estimation Models:
- Design Fee Proposed by the Consultant
- Design Fee Proposed by Iowa DOT
- Contracted Design Fee
- Actual Design Fee Paid
Workhour Estimation Models:
- Workhours Proposed by the Consultant
- Workhours Proposed by Iowa DOT
- Contract Workhours
Various data mining tools are available for developing regression models. In this study, the Microsoft data mining client for Excel is used. Once a preliminary model is developed, it is optimized by discarding the input (independent) variables one by one, starting with the variable with the highest p-value. The final model consists of statistically significant independent variables with p-values less than 0.05. A p-value below 0.05 indicates that the variable's effect on the dependent variable is statistically significant at the 95% confidence level. The adjusted R-squared value is used to measure the performance of a regression model. It typically ranges from 0% to 100%, and a value closer to 100% indicates that the model explains the data better. The adjusted R-squared value increases only if the addition of a new input variable improves the performance of the existing model.

7.1.1. Design Fee Proposed by Consultant

The regression model developed for estimating the consultant's proposed fee is provided in Equation (3). The significant variables are shown in Table 6.

Y = 22,429.43 − 11,811.15(A) + 30,612.75(B) + 42,072.21(C) + 81,520.82(D) + 37,384.97(E) + 35,238.38(F) (3)

Table 6. Significant attributes for Consultant's proposed design fee
Symbol | Significant Attribute
A | Number of different spans
B | Number of Piers
C | Bridge Width Variability
D | Horizontal Curve requirement
E | Sidewalk requirement
F | Beam type
The adjusted R-squared value of this model is 63.28%, which indicates that the independent variables in the model explain 63.28% of the variability in the dependent variable.

7.1.2. Workhours Proposed by Consultant

The regression model developed for estimating the workhours proposed by the consultant is presented in Equation (4), and Table 7 shows the statistically significant variables. A comparison of Table 6 and Table 7 shows that the significant variables of both estimation models are the same. The adjusted R-squared value of this model is 56.25%.

Y = 305.45 − 106.36(A) + 273.16(B) + 319.75(C) + 693.56(D) + 323.15(E) + 252.98(F) (4)

Table 7. Significant attributes for Consultant's proposed workhours
Symbol | Significant Attribute
A | Number of different spans
B | Number of Piers
C | Bridge Width Variability
D | Horizontal Curve requirement
E | Sidewalk requirement
F | Beam type
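As an illustration of how Equations (3) and (4) are applied, the following Python sketch evaluates both models for one bridge. The coefficients are copied from the equations. The input bridge is hypothetical, and the sketch assumes that the binary attributes (C, D, E, F) are coded 1 when the feature is present (or when the beam falls in the group coded 1) and 0 otherwise, following the coding described in Chapter 4.

```python
# Coefficients of Equation (3): consultant's proposed design fee.
CONSULTANT_FEE = {
    "intercept": 22429.43, "A": -11811.15, "B": 30612.75,
    "C": 42072.21, "D": 81520.82, "E": 37384.97, "F": 35238.38,
}

# Coefficients of Equation (4): consultant's proposed workhours.
CONSULTANT_HOURS = {
    "intercept": 305.45, "A": -106.36, "B": 273.16,
    "C": 319.75, "D": 693.56, "E": 323.15, "F": 252.98,
}


def predict(model, inputs):
    """Evaluate a linear model: intercept plus coefficient times input."""
    total = model["intercept"]
    for symbol, coefficient in model.items():
        if symbol != "intercept":
            total += coefficient * inputs[symbol]
    return total


# Hypothetical bridge: one distinct span length, two piers, constant
# width, no horizontal curve, no sidewalk, beam type coded 1.
bridge = {"A": 1, "B": 2, "C": 0, "D": 0, "E": 0, "F": 1}

estimated_fee = predict(CONSULTANT_FEE, bridge)
estimated_hours = predict(CONSULTANT_HOURS, bridge)
```

Because both models share the same six attributes, one input dictionary serves both. The remaining models in this chapter can be evaluated in the same way by supplying their own coefficient sets.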
7.1.3. Design Fee Proposed by Iowa DOT

The regression model developed for estimating the design fee proposed by Iowa DOT is given in Equation (5). The independent variables shown in Table 8 are found to be significant, and the adjusted R-squared value of this model is 57.94%.

Y = 15,915.82 + 18,780.38(B) + 36,909.27(C) + 43,173.86(D) + 28,097.69(F) (5)

Table 8. Significant attributes for estimating design fee proposed by Iowa DOT
Symbol | Significant Attribute
B | Number of Piers
C | Bridge Width Variability
D | Horizontal Curve requirement
F | Beam type

7.1.4. Workhours Proposed by Iowa DOT

The regression model developed for estimating the workhours proposed by Iowa DOT is given in Equation (6). The independent variables shown in Table 9 are found to be significant. In addition to the four variables in Table 8, two further variables, Aesthetic Items requirement and Pier Type, are significant. The adjusted R-squared value of this model is 61.74%.

Y = 146.44 + 163.85(B) + 222.72(C) + 403.93(D) + 191.17(F) − 146.82(G) + 160.87(H) (6)

Table 9. Significant attributes for Iowa DOT's proposed workhours
Symbol | Significant Attribute
B | Number of Piers
C | Bridge Width Variability
D | Horizontal Curve requirement
F | Beam type
G | Aesthetic Items requirement
H | Pier Type

7.1.5. Contracted Design Fee

The regression model developed for estimating the contracted design fee is provided in Equation (7), with its statistically significant variables in Table 10. The adjusted R-squared value of this model is 66.02%.
Y = 12,192.78 + 20,975.37(B) + 27,391.62(C) + 70,225.85(D) + 29,895.70(E) + 36,689.15(F) − 22,642.62(G) (7)

Table 10. Significant attributes for Contract fee
Symbol | Significant Attribute
B | Number of Piers
C | Bridge Width Variability
D | Horizontal Curve requirement
E | Sidewalk requirement
F | Beam type
G | Aesthetic Items requirement
7.1.6. Contracted Workhours

The regression model developed for estimating the contracted workhours is provided in Equation (8), with its statistically significant variables in Table 11. The adjusted R-squared value of this model is 68.56%. This is the best performing regression model of all, as its independent variables explain 68.56% of the variability in the dependent variable.

Y = 397.34 + 1.68(I) − 117.18(A) + 674.99(D) − 434.41(J) + 303.88(E) + 239.03(F) − 284.10(G) + 154.04(H) (8)

Table 11. Significant attributes for Contract workhours
Symbol | Significant Attribute
I | Bridge Length
A | Number of different spans
D | Horizontal Curve requirement
J | Beam Spacing Variable
E | Sidewalk requirement
F | Beam type
G | Aesthetic Items requirement
H | Pier Type
7.1.7. Actual Fee Paid

The regression model for estimating the actual design fee paid to the consultant is provided in Equation (9), with its significant variables listed in Table 12. The adjusted R-squared value of this model is 65.69%.

Y = 6,191.59 + 175.18(I) + 60,911.04(D) − 31,787.23(J) + 29,223.82(F) − 25,986.10(G) + 25,720.88(H) (9)

Table 12. Significant attributes for actual fee paid
Symbol | Significant Attribute
I | Bridge Length
D | Horizontal Curve requirement
J | Beam Spacing Variable
F | Beam type
G | Aesthetic Items requirement
H | Pier Type
7.1.8. Summary

Figure 10 summarizes the significant attributes of all regression models developed in this study. Horizontal curvature and beam type are the most common significant attributes, appearing in all seven models. Furthermore, the adjusted R-squared values of all models are between 55% and 70%, which indicates that the models leave 30% to 45% of the variability in the available data unexplained. This can be attributed to inaccuracy of the data and the low number of data points used in this study. In this study, only 69 projects are considered in model development.

No. | Significant Attribute | Consultant Fee | Consultant Work Hours | Iowa DOT Fee | Iowa DOT Work Hours | Contract Fee | Contract Work Hours | Actual Fee
1 | Number of different spans | X | X | - | - | - | X | -
2 | Number of piers | X | X | X | X | X | - | -
3 | Bridge width variability | X | X | X | X | X | - | -
4 | Horizontal curve requirement | X | X | X | X | X | X | X
5 | Sidewalk requirement | X | X | - | - | X | X | -
6 | Beam type | X | X | X | X | X | X | X
7 | Aesthetic items requirement | - | - | - | X | X | X | X
8 | Bridge length | - | - | - | - | - | X | X
9 | Beam spacing variability | - | - | - | - | - | X | X
10 | Pier type | - | - | - | X | - | X | X
Adj. R-squared value | | 63% | 56% | 58% | 62% | 66% | 69% | 66%

Figure 10. Significant attributes for Regression Models (X marks a statistically significant attribute)

7.2. Artificial Neural Network

An Artificial Neural Network (ANN) is defined as a heuristic learning technique that aims at finding non-linear patterns and relationships between inputs and outputs by using a training dataset (Hsu et al. 1995). The technique has been used in the construction industry for the past two decades. For example, Hegazy and Ayed (1998) as well as Adeli and Wu (1998) used ANN to estimate highway project costs.
Figure 11 shows the main components of an ANN. Typically, an ANN consists of an input layer, one or more hidden layers, and an output layer. Each layer consists of a number of neurons that are assigned different values during the training phase. As the number of neurons increases, the trained ANN is expected to perform better by reducing the prediction error. However, increasing the number of neurons may cause overfitting: even when the percentage error within the training dataset is very small, the percentage errors can be very large when new data are introduced to make predictions. It is therefore important to hold out a portion of the data as a testing dataset to measure the percentage error in prediction and to ensure that the trained ANN is not overfitted. In this research, 15% of the data points are used for testing while the other 85% are used for training.
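The 85%/15% holdout can be sketched as below. The report does not say how the testing records were selected, so the random shuffle and the fixed seed are assumptions of this sketch. With 60 records, the split yields 51 training and 9 testing records, which is consistent with the counts reported in Section 7.3.

```python
import random


def holdout_split(records, test_fraction=0.15, seed=0):
    """Shuffle the records and hold out a fraction for testing.

    Returns (training_records, testing_records). Random selection with a
    fixed seed is an assumption made for reproducibility.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Record indices stand in for the project records.
training, testing = holdout_split(range(60))
```

Keeping the testing records out of training entirely is what allows the prediction error on them to reveal overfitting.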
Figure 11. Typical neural network configuration (Gransberg et al. 2015)

Just like the MLR models, seven ANN models are developed to estimate the design fees and workhours. The following fifteen input variables are used for developing the ANN models:
- Bridge length
- Number of different spans
- Type of work
- Route type
- Skew
- Sidewalk requirement
- Abutment type
- Pier type
- Bridge width
- Number of piers
- Number of construction stages
- Bridge width variability
- Horizontal curvature
- Beam type
- Number of aesthetic items
It is worth noting that ANN can deal with any type of variable, and hence no data transformation is needed. The hidden node ratio, or the number of neurons in the hidden layer, is an important parameter that should be tuned to improve the performance of the neural network. In this study, ANN models are trained using a hidden node ratio of 10, since it achieved better performance than models with other ratios. When training a new ANN, users are recommended to try different hidden node ratios until the best performance is reached. Appendix A contains a series of illustrative figures that describe how to use a commercial software tool to train an ANN and make a prediction with it. The commercial software tool used is an Excel add-in developed by Microsoft. More information regarding the tool is available at <https://msdn.microsoft.com/en-us/library/dn282385.aspx>. Other tools that can be used to develop ANN models include Weka <http://www.cs.waikato.ac.nz/ml/weka/>, JMP, and the MATLAB neural network toolbox <http://www.mathworks.com/products/neural-network/>.

7.3. Performance Comparison

Fifteen percent of the available data (i.e., nine records) are kept for testing purposes. The performances of the MLR and ANN models are evaluated by calculating the mean absolute percentage error (MAPE), which is measured with equation (10):
$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \frac{\left| P_i - A_i \right|}{A_i} \qquad (10)$$

where $n$ is the number of testing data points, $P_i$ is the predicted design fee or engineering workhours, and $A_i$ is the actual design fee or engineering workhours. The prediction accuracy of the MLR and ANN models is compared on this basis, and the results are summarized in Table 13.

Table 13. MAPE for both MLR and ANN models
Model | MLR MAPE | ANN MAPE
Consultant's proposed fee | 30.97% | 40.38%
Consultant's proposed workhours | 28.89% | 35.02%
Iowa DOT's proposed fee | 31.00% | 30.89%
Iowa DOT's proposed workhours | 22.62% | 30.54%
Contracted design fee | 25.58% | 23.75%
Contracted workhours | 37.90% | 20.77%
Actual paid design fee | 28.75% | 37.87%
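Equation (10) translates directly into code. The sketch below is a minimal Python version; the function name and the sample values are illustrative, and the actual values are assumed to be positive, as design fees and workhours are.

```python
def mape(predicted, actual):
    """Mean absolute percentage error, in percent, per equation (10)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    return 100.0 / n * sum(abs(p - a) / a for p, a in zip(predicted, actual))


# Hypothetical predictions that are each 10% away from the actual value.
error = mape([110.0, 90.0], [100.0, 100.0])
```

Because each error is divided by the actual value, over- and under-predictions of the same relative size contribute equally, and projects of very different size are weighted alike.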
The MLR models perform better in estimating the consultant's proposed fee and workhours. In estimating the Iowa DOT's proposed fee, there is no significant difference between the two models. However, the MLR model performs better in estimating the Iowa DOT's proposed workhours. The performances of the MLR and ANN models are very similar in estimating the contracted design fees, while the ANN model outperforms the MLR model in estimating the contracted workhours. Finally, the MLR model performs better than the ANN in estimating the actual design fee paid.

It should be noted that the performance of the ANN model can be improved by increasing the number of data points used for training and/or changing the number of nodes (neurons). As the number of data points increases, the training algorithm is able to detect and fine-tune the relationships between the inputs and outputs more accurately. In this study, only 51 data points are used for training the ANN models. The small number of data points might be the main reason for the ANN's poorer-than-expected performance. The performance of the ANN is expected to continue to improve if the number of training data points increases in the future.
8. CASE-BASED REASONING

Case-based reasoning (CBR) is a four-step process that solves new problems by referencing similar historical cases. The four main tasks of CBR (see Figure 12) are retrieve, reuse, revise, and retain.

Figure 12. CBR main tasks

The retrieval task involves finding the cases most similar to the new problem by using a measure of similarity. Any measure that best captures the similarity between a new case and the historical cases can be used. The reuse task analyzes the retrieved historical cases to find a solution for the new problem, while the revise task evaluates the proposed solution. Finally, the revised solution is retained in the library or database of cases so that it can be used for future inference.

In this study, CBR is used to retrieve similar PPCB bridge design projects based on the similarity between the attributes of a new bridge and the attributes of past projects, as shown in Figure 13. The similarity score (SC) consists of two main components: a matching score and a weight of importance.

The SC is calculated according to attribute type; hence, attributes are grouped into two types. Table 14 shows the classification of bridge attributes. The first type contains continuous and discrete attributes such as bridge length and number of piers, while the second type includes binary attributes such as horizontal curvature (Yes (1) or No (0)) and bridge width variability (Yes (1) or No (0)).
Figure 13. CBR retrieval process

For the first type of attributes, similarity is measured by the absolute difference between the new bridge’s attribute and the same attribute of each historical bridge design project. For example, a new bridge with a length of 350 feet is compared with six previous bridge design projects with lengths of 150, 200, 375, 600, 750, and 900 feet, respectively. The absolute length differences between the new project and the historical projects are 200, 150, 25, 250, 400, and 550 feet, respectively. These absolute differences are then ranked in ascending order to determine similarity, which gives similarity rankings of 3, 2, 1, 4, 5, and 6, respectively. The project ranked first is given the full matching score, while the projects ranked second, third, and fourth receive partial matching scores. The remaining projects receive no matching score. Historical projects with the same absolute difference are assigned the same rank.

Table 14. Classification of variables and weights of importance

Group            Continuous or discrete            Binary
Group 1 (50%)    Number of piers                   Bridge width variability
                 Degree of horizontal curvature    Horizontal curvature
Group 2 (30%)    Bridge length                     Beam spacing variability
                 Bridge width                      Sidewalk requirement
                 Number of aesthetic items         Aesthetic items requirement
                                                   Pier type
Group 3 (20%)    Number of different spans         Type of work
                 Number of construction stages     Route type
                 Skew                              Beam type
                                                   Abutment type
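The rank-based matching described above for continuous and discrete attributes can be illustrated with the bridge length example. The following is a minimal sketch under stated assumptions: the report does not give the values of the partial matching scores, so the entries in `SCORE_BY_RANK` are hypothetical, and the function names and the dense handling of ties are likewise illustrative choices.

```python
from typing import Dict, List, Sequence

# ASSUMPTION: the partial score values are not stated in the report; these are
# hypothetical placeholders (rank 1 = full score, ranks 2-4 = partial scores).
SCORE_BY_RANK: Dict[int, float] = {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.25}


def similarity_ranks(new_value: float, past_values: Sequence[float]) -> List[int]:
    """Rank past projects by absolute difference from the new value (1 = closest).

    Projects with equal differences share a rank (dense ranking, an assumption).
    """
    diffs = [abs(v - new_value) for v in past_values]
    rank_of_diff = {d: r for r, d in enumerate(sorted(set(diffs)), start=1)}
    return [rank_of_diff[d] for d in diffs]


def matching_scores(new_value: float, past_values: Sequence[float]) -> List[float]:
    """Convert ranks to matching scores; ranks beyond the fourth receive zero."""
    return [SCORE_BY_RANK.get(r, 0.0) for r in similarity_ranks(new_value, past_values)]


past_lengths = [150, 200, 375, 600, 750, 900]  # feet, from the example in the text
print(similarity_ranks(350, past_lengths))     # [3, 2, 1, 4, 5, 6]
print(matching_scores(350, past_lengths))      # [0.5, 0.75, 1.0, 0.25, 0.0, 0.0]
```

The ranks reproduce the ordering given in the text; only the numeric score attached to each rank is assumed.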
On the other hand, the matching score for the second type of attributes is based on a match or no-match criterion. For example, if a new project has a variable bridge width, then all past projects with a variable bridge width receive one point (match), while the other projects receive a matching score of zero (no match).

The second component of the similarity score is the weight of importance. Attributes are classified into three groups according to their statistical significance and the questionnaire survey results. The first group is assigned a weight of 50% and includes attributes such as the number of piers, bridge width variability, and horizontal curvature. The second group is assigned a weight of 30% and includes bridge length, bridge width, beam spacing variability, sidewalk requirement, aesthetic items requirement, and pier type. Finally, the third group has a weight of 20% and includes number of different spans, type of work, number of construction stages, route type, skew, beam type, and abutment type. These weights can be adjusted if necessary to find better matches and to reflect the experience and knowledge of the user.

The most similar projects are retrieved after the similarity scores for all past projects are calculated. At this point, the user can compare the project at hand with the most similar projects and make inferences regarding the expected design fee and workhours. The user can also compare the design fee and workhours predicted by the MLR and ANN models with those of the most similar projects.

8.1. Case-Based Reasoning Spreadsheet Tool

A spreadsheet tool is developed to help Iowa DOT engineers find the most similar past projects by using CBR. Figure 14 shows the user’s input worksheet, which contains 18 different bridge attributes. Using the tool is simple: the user is only required to enter a value for each attribute. The weights of importance used to calculate the similarity score are preset to default values. However, the spreadsheet tool is flexible enough that users can change the weights of importance to reflect their own experience and judgment.
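One plausible way to combine the matching scores with the weights of importance is a weighted sum, sketched below for four of the 18 attributes using their default weights. This is an illustrative sketch, not the spreadsheet's actual implementation: the weighted-sum form, the partial score values in `SCORE_BY_RANK`, the treatment of any equal binary value as a match, the attribute keys, and the three past projects are all assumptions made for the example.

```python
from typing import Dict, List

# Default weights (percent) of four attributes, taken from the input worksheet.
WEIGHTS: Dict[str, float] = {
    "bridge_length": 5.0,
    "number_of_piers": 15.0,
    "width_variability": 15.0,
    "horizontal_curve": 15.0,
}
CONTINUOUS = {"bridge_length", "number_of_piers"}  # scored by rank
# ASSUMPTION: partial matching scores are hypothetical; the report gives no values.
SCORE_BY_RANK = {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.25}


def rank_scores(new_value: float, past_values: List[float]) -> List[float]:
    """Score past values by the rank of their absolute difference (1 = closest)."""
    diffs = [abs(v - new_value) for v in past_values]
    rank_of = {d: r for r, d in enumerate(sorted(set(diffs)), start=1)}
    return [SCORE_BY_RANK.get(rank_of[d], 0.0) for d in diffs]


def similarity_scores(new: Dict[str, float], past: List[Dict[str, float]],
                      weights: Dict[str, float] = WEIGHTS) -> List[float]:
    """Return the weighted sum of matching scores for each past project."""
    totals = [0.0] * len(past)
    for attr, weight in weights.items():
        values = [p[attr] for p in past]
        if attr in CONTINUOUS:
            scores = rank_scores(new[attr], values)
        else:
            # ASSUMPTION: equal binary values count as a match (1), otherwise 0.
            scores = [1.0 if v == new[attr] else 0.0 for v in values]
        for i, s in enumerate(scores):
            totals[i] += weight * s
    return totals


# Hypothetical projects (NOT records from the Iowa DOT database).
new_bridge = {"bridge_length": 350, "number_of_piers": 2,
              "width_variability": 1, "horizontal_curve": 0}
past_projects = [
    {"bridge_length": 375, "number_of_piers": 2, "width_variability": 1, "horizontal_curve": 0},
    {"bridge_length": 200, "number_of_piers": 3, "width_variability": 0, "horizontal_curve": 0},
    {"bridge_length": 900, "number_of_piers": 6, "width_variability": 1, "horizontal_curve": 1},
]
scores = similarity_scores(new_bridge, past_projects)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, "most similar: project", best + 1)
```

Passing a different `weights` dictionary mirrors the flexibility of the spreadsheet tool, where the user can override the default weights of importance.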
Attribute                          Input value    Weight of importance
Bridge length                                     5.0%
Bridge width                                      5.0%
Number of different spans                         2.9%
Number of piers                                   15.0%
Type of work                                      2.9%
Number of construction stages                     2.9%
Route type                                        2.9%
Bridge width variability                          15.0%
Skew                                              2.9%
Horizontal curve existence                        15.0%
Degree of horizontal curvature                    5.0%
Beam spacing variability                          5.0%
Left/right sidewalk                               5.0%
Beam type                                         2.9%
Abutment type                                     2.9%
Aesthetic items requirement                       2.5%
Number of aesthetic items                         2.5%
Pier type                                         5.0%
Total weights                                     100%

Figure 14. User input worksheet

Once the user enters input values for the required attributes, the most similar past projects are automatically listed (see Figure 15). To help the user quickly recognize differences, the spreadsheet highlights in light brown the attributes that differ between the project under study and the past projects. Additionally, the spreadsheet tool calculates the predicted design fees and workhours by using the regression models developed in this study. This information allows the user to compare the predicted design fee with the fees of the most similar projects.
[Worksheet layout: columns are Attribute, Predicted Fee/Hours, and Project #1 through Project #5. Rows are Project Number, Consultant Proposed Fee, Consultant Proposed Hours, Iowa DOT Proposed Fee, Iowa DOT Proposed Hours, Contract Price, Contract Hours, Actual Paid, the 18 bridge attributes, and Similarity Score.]

Figure 15. Most similar projects worksheet
9. SUMMARY AND CONCLUSION

This research project developed three different methods, a) multiple linear regression (MLR) models, b) artificial neural network (ANN) models, and c) a case-based reasoning (CBR) tool, to assist the Iowa DOT’s Office of Bridges and Structures in reasonably estimating PPCB bridge design efforts and costs. It first developed a master database by extracting and consolidating relevant data for estimating PPCB bridge design efforts and costs from three different databases available at the Iowa DOT: the BRIS database, agreement files, and Design Fee and Workhours data.

In total, seven MLR models were developed to estimate PPCB bridge design fees and workhours. They estimate a) the design fee proposed by the consultant, b) the workhours proposed by the consultant, c) the design fee proposed by the Iowa DOT, d) the workhours proposed by the Iowa DOT, e) the contracted design fee, f) the contracted workhours, and g) the actual design fee paid to the consultant. Horizontal curvature and beam type are found to be consistently significant attributes and are common among the seven models. Additionally, number of different spans, number of piers, bridge width variability, sidewalk requirement, aesthetic items requirement, bridge length, beam spacing variability, and pier type are statistically significant parameters for many models. The adjusted R² values range from 0.56 to 0.69, with the highest value obtained from the model that estimates contracted workhours. The performance of the MLR models is measured by the MAPE calculated for the 15% of the dataset that was held out for testing purposes. The MAPEs range from 22.62% to 37.90%, with the best performance obtained from the model that estimates the workhours proposed by the Iowa DOT.

Similarly, seven ANN models were developed by using a commercial Excel add-in tool. The ANN models used 15 bridge attributes as input variables. The performance of the ANN models was also measured by the MAPE, which ranged from 20.77% to 40.38%. The best performing ANN model is the model that estimates contracted workhours.

In this study, the performances of the MLR and ANN models are quite similar. Generally, ANN models perform better than MLR models when the relationship between the input variables and the output variable is difficult to define mathematically. This is because an ANN is a heuristic method whose performance can be improved by tuning the parameters of the learning algorithm or by adding more data points to the learning dataset. It should be noted that only 51 projects were used in developing the prediction models, which is a relatively small sample. As a result, the models showed relatively high percentage errors (roughly 20% to 40%). The performance of the models is expected to improve significantly as more project data are added to the database in the future.

A CBR spreadsheet tool was developed to retrieve the most similar bridge design projects and support logical reasoning about a new project’s design efforts and costs. The most similar projects are retrieved based on a similarity score calculated by using 18 bridge attributes. The spreadsheet tool gives Iowa DOT engineers the flexibility to change the weight associated with each bridge attribute in order to reflect their own experience and definition of bridge design similarity. The CBR tool can be used side by side with the other estimation models to compare the predicted design fees and workhours with those of the most similar projects identified. The performance of the CBR tool can also be improved by increasing the number of projects in the historical database.