International Journal of Soft Computing and Metaheuristics Vol. x, No x, March 2018, pp. xx-xx
ISSN: xxxx-xxxx
Big Data and Machine Learning in Construction: A Review
Kashif Hussain a,1, Mohd Najib Mohd Salleh a,2,*, Shabana Talpur a, Noreen Talpur a
a Universiti Tun Hussein Onn Malaysia, Batu Pahat, 86400, Malaysia.
[email protected]; 2 [email protected]*
* corresponding author
ACCEPTED: AUTHORS’COPY
ARTICLE INFO.
ABSTRACT
Article history: Received Revised Accepted
Today’s businesses, including construction firms, collect petabytes (10^15 bytes) of data every day. This data comes from sensor networks, the Internet of Things, building information modelling (BIM), enterprise resource planning (ERP) systems, and social media, and together it forms big data, which is expected to grow exponentially. Effective and efficient use of this data brings numerous opportunities for better data-driven decision making in construction projects. This research presents a review of big data and machine learning applications in the literature related to the construction industry. The survey shows that, even though big data and machine learning algorithms have been utilized in construction and related functions, their use is still in its infancy due to some major pitfalls. Advanced data analytics and computational intelligence have yet to benefit the area of construction to their full potential. This study highlights these shortcomings and the open research lines for future researchers to work on.
Keywords: Big Data; Machine Learning; Artificial Intelligence; Data Analytics; Construction
Copyright © 2018 xxxxxxxxxxxxx. All rights reserved.
I. Introduction
Data is steadily increasing in every organization (formal or informal), collected from Internet of Things devices, social media, and other sources. This produces an unprecedented amount of data, measured in petabytes (10^15 bytes), which is processed in four steps: ingest, store, process and analyse, and explore and visualize [1]. In the first step, raw data is pulled from sensors, scanners, mobile apps, and other devices. The pulled data is then stored in a suitable format so that it can be used in later stages. Once the raw data is stored properly, it is ready to be processed into actionable information. The last step is to extract meaningful information through data-mining tools and draw insightful analysis for data-driven decision making. This voluminous data, so-called Big Data, has four characteristics: volume, variety, velocity, and veracity [2]. Big data has high volume (e.g., a dataset with over 50,000 instances and hundreds of dimensions), it comes in a variety of formats (text, numbers, images, audio, and video, often with missing values), it is generated at high velocity, and its quality is not necessarily high, since poor data with missing values can be captured as well. Big data brings extraordinary opportunities for organizations (including construction firms) to devise out-of-the-box solutions. Construction firms collect and process significant data from the diverse sources involved in the construction lifecycle. These sources are internal (data entry points, sensors, RFIDs, construction designs, customer databases, etc.) and external (suppliers, logistics, contractors, government agencies, social media, etc.). Moreover, this massive data comes in many varieties (temperature data, employee data, financial data, schedules, graphs, 3-D models, CCTV videos, etc.) [3]. Construction firms may use this data with machine learning and data mining techniques to understand hidden patterns and trends, predict future events, and find correlations and other insights.
These insights will help in construction cost optimization, faster and better decision making, and offering new products and services [4]. Today’s machine learning techniques, including deep learning algorithms, are able to process big data. The most common tools that provide machine learning solutions (classification, clustering,
DOI:
W : http://ascee.org | E :
[email protected]
prediction, and regression) are Spark [5], Hadoop [6], Python [7], R [8], and Oryx [9]. As in other industries, the adoption of these techniques in the construction industry is no longer a luxury but a necessity. The rapid development of technology demands that the construction industry make use of the Internet of Things (IoT), big data analytics, and artificial intelligence to discover opportunities for enhanced productivity through intelligent solutions. This paper reviews the existing literature on big data and machine learning techniques implemented in construction. The remainder of this paper is organized as follows. Section II introduces big data analytics and machine learning techniques. Section III provides an overview of the literature on implementations of these techniques in different areas of construction. Section IV highlights research gaps and opportunities as potential future directions. Finally, Section V concludes this work.
II. Big Data Analytics and Machine Learning
Data has great economic and social value today. It is generated by everything from the smallest devices in our hands to sensors and satellites, forming diverse, distributed, and heterogeneous datasets. Moreover, advances in computing power, cloud computing, massive data storage, and data mining techniques have provided numerous opportunities for driving innovation, efficiency, and socio-economic growth [10].
A. Big Data Analytics
Big data analytics is the process of examining large datasets by analyzing, visualizing, and extracting information in order to understand hidden patterns, predict market trends, learn customer preferences, enhance productivity, optimize costs, and much more. Data scientists, statisticians, and decision makers use big data analytics for efficient data-driven decision making and competitive advantage [11]. Multidisciplinary stakeholders make data analytics possible. As Fig. 1 illustrates, the field is multidisciplinary: it involves data engineers for database design, data scientists for data mining, statisticians for statistical modeling, site engineers for data collection, business analysts for decision making, and others. In fact, all these data-related tasks were already being performed; what has changed in big data analytics is the scale [12]. The approach has been extended to deal with large, complex, and multidisciplinary datasets that are difficult to handle with traditional database management systems. Recent research trends towards enhancing the computational power and speed of analysis algorithms to deal with voluminous data. Some critical challenges also pertain to big data analytics: data is unstructured and complex, inconsistent and incomplete, and distributed among various resources [13]. The tools that professionals and researchers use for big data analytics can mostly be categorized into two groups: big data storage and analytics.
[Figure 1: diagram placing Big Data analytics among related fields: statistics, data mining, pattern recognition, computational neuroscience, machine learning, databases, knowledge discovery, and AI.]
Fig. 1. Multidisciplinary nature of Big Data analytics [12]
Hussain, K. et.al (BD and ML in Construction: A Review)
Apache Hadoop is among the top solutions for big data storage and processing. It is a platform that consists of multiple tools such as the Hadoop kernel, MapReduce, and the Hadoop Distributed File System (HDFS), along with Apache Hive, Apache HBase, etc. HDInsight from Microsoft is another big data processing tool, used to create clusters, analyze big data, and build solutions with other Hadoop-based tools. To handle unstructured data, NoSQL databases are efficient: they store data with no fixed schema, meaning that each row may contain a different number of columns. To query big data, Hive is a useful tool that runs on top of the Hadoop platform; its main purpose is to provide an easy-to-use platform for data mining tasks. Spark is another powerful open-source tool, which comes with Spark SQL for structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for handling data streams. There is a steady inflow of other useful tools for building fast, efficient, and easy solutions to big data problems.
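The Map/Reduce model mentioned above can be illustrated with a minimal sketch in plain Python. This is not the Hadoop API: the record layout, the site names, and the function names map_phase, shuffle, and reduce_phase are hypothetical and chosen only to show the three conceptual stages (map, group by key, reduce) that a framework such as Hadoop distributes across a cluster.

```python
from collections import defaultdict

# Hypothetical raw records: (site_id, sensor_type, reading)
records = [
    ("site-A", "temperature", 31.5),
    ("site-B", "temperature", 28.0),
    ("site-A", "temperature", 33.5),
    ("site-B", "vibration", 0.5),
    ("site-A", "vibration", 1.5),
]


def map_phase(record):
    """Emit one (key, value) pair per record, keyed by site and sensor."""
    site, sensor, reading = record
    yield (site, sensor), reading


def shuffle(pairs):
    """Group all values that share a key, as happens between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values of one key into a single summary (here, the mean)."""
    return key, sum(values) / len(values)


pairs = (pair for record in records for pair in map_phase(record))
averages = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Here averages maps each (site, sensor) pair to its mean reading. In a real deployment the map and reduce functions would run in parallel on different nodes, with the framework handling the grouping step.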
B. Machine Learning
Once big data has been collected and stored in suitable database systems, it is time to make sense of it and incorporate data-driven decisions into processes. Walmart, for example, benefits from machine learning techniques that explore patterns in a large volume of transactions to devise competitive pricing strategies and advertisement campaigns [13]. Machine learning techniques are used to learn and extract useful information from data; typical tasks include knowledge discovery, clustering, classification, prediction, and regression. In the context of big data, traditional machine learning techniques (supervised and unsupervised) are difficult to implement. The traditional techniques are therefore enhanced to deal with extensive processing of huge data. This has established a new frontier, called Deep Machine Learning. For example, Parallel Support Vector Machine (PSVM) [14] is an enhanced version of the traditional support vector machine for big data, and Deep Stacking Network (DSN) [15] is a neural network variant for big data. According to [16], Extreme Learning Machine (ELM) is another efficient machine learning technique that has been successfully implemented on big data problems in the domains of biomedical engineering, computer vision, system identification, and control and robotics. Generally, machine learning on big data involves a cloud environment, big data analytics, machine learning, users, and business intelligence (Fig. 2). Sensors, RFIDs, and other big data sources store data in the cloud environment in a relatively unstructured raw format. The next step is big data engineering and analytics, where the data is transformed into ready-to-use data. Here, statistical and visualization methods are applied to the big data to analyze feasibility. Once the big data has been analyzed and found useful, machine learning techniques are employed for clustering, prediction, and classification tasks.
The users who work on big data and machine learning algorithms are statisticians, data engineers, and data scientists who provide input to decision makers.
[Figure 2: diagram with the following components: Cloud Environment (unstructured big data); Big Data Engineering and Analytics (data cleaning, missing values and outliers, feature engineering and feature reduction, data visualization, statistical methods); Data Partitioning (training data, testing data); Machine Learning (neural networks, deep machine learning, metaheuristics, random forest, support vector machine, etc.); Users (statisticians, data engineers, data scientists); Decision Makers (business intelligence, experience, business goals).]
Fig. 2. Machine learning on big data
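The stages named in Fig. 2 (cleaning, partitioning, training, testing) can be sketched in a few lines of plain Python. Everything specific in this sketch is an assumption made for illustration: the sample records are invented, mean imputation stands in for the "missing values" step, and a nearest-centroid classifier stands in for the learning techniques listed in the figure. Feature scaling is skipped to keep the example short.

```python
import random
from statistics import mean

# Hypothetical sensor records: (vibration, temperature, label); None = missing.
raw = [
    (0.2, 30.0, "ok"), (0.3, None, "ok"), (0.25, 31.0, "ok"), (0.1, 29.0, "ok"),
    (0.9, 45.0, "fault"), (None, 47.0, "fault"), (0.8, 44.0, "fault"),
    (1.0, 46.0, "fault"),
]


def impute(rows):
    """Data cleaning: replace each missing feature with its column mean."""
    n = len(rows[0]) - 1  # number of feature columns (last item is the label)
    col_means = [mean(r[i] for r in rows if r[i] is not None) for i in range(n)]
    return [
        tuple(col_means[i] if r[i] is None else r[i] for i in range(n)) + (r[-1],)
        for r in rows
    ]


def split(rows, test_fraction=0.25, seed=42):
    """Data partitioning: shuffle, then hold out a fraction for testing."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


def fit_centroids(train):
    """Training: store the mean feature vector of each class."""
    by_label = {}
    for *features, label in train:
        by_label.setdefault(label, []).append(features)
    return {lab: [mean(col) for col in zip(*vecs)] for lab, vecs in by_label.items()}


def predict(centroids, features):
    """Prediction: pick the class whose centroid is closest."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


clean = impute(raw)
train, test = split(clean)
model = fit_centroids(train)
accuracy = sum(predict(model, row[:-1]) == row[-1] for row in test) / len(test)
```

The held-out accuracy is what would be reported to the users and decision makers at the end of the flow. With real construction data, each stage would be replaced by a more robust method and run on a distributed platform.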
The subsequent section explores applications of big data and machine learning techniques in different operations or functions of construction industry.
III. Applications in Construction Industry
Several efficient algorithms can be used to create machine learning models that learn from data. The literature shows that plenty of useful insights have been obtained from construction big data using machine learning algorithms. We categorize the literature by function within construction projects.
A. Structure Damage Analysis
Timely determination of structural faults or damage plays an important role in construction project success. Major functions in this area include predicting the occurrence of damage, diagnosing damage, locating damage within a structure, and determining the degree of damage. Machine learning techniques can be applied to previous construction damage data for classification and prediction of damage in structures. The Adaptive Neuro-Fuzzy Inference System (ANFIS), a machine learning technique based on fuzzy neural networks, was employed by [17] for damage detection using structural dynamic vibration data. This research proposed a technique that derived feature data by processing the ANFIS output. The study found that ANFIS not only produced accurate results but also helped determine damage characteristics. Another application of neural networks in the construction industry can be found in [18]. In this research, a hybrid of wavelet packet decomposition, multi-sensor feature fusion theory, and a neural network pattern classification method was proposed for diagnosing structural damage. The model used data collected from widely distributed sensors. The neural network was used for feature selection and classification of damage features. The researchers reported high accuracy in diagnosing structural damage. [19] proposed a support vector machine (SVM) integrated with a genetic algorithm (GA) for identifying damage in a bridge. The proposed GA-SVM model accurately assessed damage conditions in a bridge with five girders. Apart from these illustrative applications of machine learning to structure damage analysis, the literature contains many other studies that implemented big data and machine learning techniques in this area of the construction industry, such as [20], [21], and [22].
B. Construction Project Delay
A construction project involves multiple stakeholders who have invested in the project. Anything that delays project completion affects all the parties engaged. Predicting delay in advance is an efficient strategy for mitigating the chances of delay. Using data from previously completed projects and the Knowledge Discovery in Databases (KDD) technique, [23] determined the factors in an ongoing construction project that may contribute to expected project delays. [24] comprehensively applied machine learning and data mining tools to identify causes of delay in construction projects. The detailed step-by-step methodology explained in that paper is not only feasible for analyzing construction delays but can also be adopted to determine cost overruns and support quality assurance in various other construction projects. From a Qatari perspective, [25] applied machine learning techniques to data collected from a mega construction project in Qatar. In this research, the authors attempted to find the factors contributing to project delays and developed a predictive model using the WEKA software. Neural network applications for analyzing delay factors in the construction industry have been proposed in [26] and [27].
C. Construction Site Safety
Jobs on a construction site include constructing, repairing, and maintaining infrastructure such as buildings, roads, and tree forts, and they involve heights, noise, dust, electrical tools, and machines that may cause injuries or other accidents. However, regular enforcement of standard operating procedures may reduce the chances of such occurrences. Using data mining and machine learning techniques, [28] found specific patterns in occupational injuries in construction projects in Taiwan. The research used five years of injury data from construction projects. The researchers found rain to be a significant cause of on-site injuries, which can be mitigated by effective inspection strategies and accident prevention programs. Another similar application of a data mining and machine learning technique, CART (Classification and Regression Tree), can be found in [29]. This Taiwan-based research used ten years of on-site accident data and found patterns of falls and collapses in public and private construction projects. [30] developed an efficient site monitoring system based on a vision tracking technique that employs deep machine learning and big video data. The proposed system could efficiently detect moving objects, including workers, machines, and other moving objects, within a very short time. The system can be effectively utilized for site safety and accident prevention.
D. Construction Waste Management
Large construction projects demand an effective building waste management system that supports the minimization, elimination, and recycling of waste materials. Today, big data and machine learning techniques provide waste intelligence solutions for the construction industry. The research in [31] presented, for the first time, a so-called “big-data based architecture for construction waste analytics.” It collected big data comprising 200,000 waste disposal records from 900 construction projects. Compared with existing waste management software, the proposed data-driven approach provided intelligent solutions for waste analytics. [32] employed big data analytics on 2 million waste disposal records from 5,700 construction projects in Hong Kong. A novel gene expression programming-based model for forecasting construction waste was proposed in [33]. In this research, the authors used two decades of data for training and testing the machine learning model. The proposed model outperformed a neural network by generating more accurate waste forecasts.
E. Construction Project Documentation
The involvement of multiple parties in a large construction project results in a massive amount of documentation kept for future reference. Managing and efficiently using these documents demands extraordinary effort. Various machine learning techniques have been applied to document management in construction projects. The research in [34] applied the machine learning technique Support Vector Machine (SVM) and the natural language processing method Latent Semantic Analysis (LSA) to automatic document classification. Measured against a gold standard of human agreement, the proposed approach achieved 71%-91% accuracy. A semantic, machine learning-based text classification algorithm was proposed by [35] for classifying clauses and subclauses of general conditions to support automated compliance checking (ACC) in construction projects. The researchers employed different machine learning techniques, text processing methods, and feature selection approaches to design the solution effectively. The proposed classification model generated efficient document classification results. Another ACC solution was proposed by [36], in which a rule-based natural language processing technique was used. [37] also applied a natural language processing technique to automatically extract regulatory information from a large set of construction documents. The proposed model produced results with high precision and recall.
F. Scheduling and Assignment Problems in Construction Projects
Large construction projects involve large resources (men, machines, and materials) whose activities need to be carried out in technological order, or which need to be assigned efficiently, to reduce cost and project delays. This is part of construction project management planning. The literature shows that researchers have effectively used big data and machine learning algorithms to optimize resource planning. An integration of Building Information Model (BIM), Object Sequencing Matrix (OSM), and Genetic Algorithm (GA) was performed in [38] to model a system for the crew assignment problem. The constrained crew workforce assignment model provided the opportunity to optimize resource distribution for cost reduction. [39] used an evolutionary algorithm to optimize resource scheduling and assignment in order to minimize construction project duration. The research successfully illustrated the use of evolutionary algorithms for project planning with constraints. [40] used neural networks and support vector machines for predicting project scheduling success. The proposed model used early planning as model input and project success as output. The model generated satisfactory prediction results when validated on data from 92 construction projects. The particle swarm optimization (PSO) algorithm was used in [41] to solve the resource-constrained project scheduling problem (RCPSP) and minimize project duration.
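To make the resource-constrained scheduling idea concrete, the following is a minimal sketch and not the method of [38], [39], or [41]. It swaps in a very simple (1+1) evolutionary search over task priority lists, each decoded into a schedule by a serial schedule generation scheme. The task names, durations, crew sizes, and crew capacity are all hypothetical.

```python
import random

# Hypothetical toy project: task -> (duration in days, crew needed, predecessors)
TASKS = {
    "excavate":   (3, 2, []),
    "foundation": (4, 3, ["excavate"]),
    "framing":    (5, 3, ["foundation"]),
    "plumbing":   (3, 2, ["foundation"]),
    "electrical": (2, 2, ["foundation"]),
    "finishing":  (3, 3, ["framing", "plumbing", "electrical"]),
}
CREW_CAPACITY = 4  # assumed number of workers available on any day


def decode(priority):
    """Serial schedule generation: turn a priority list into start days."""
    start, finish, usage = {}, {}, {}  # usage: day -> crew already committed
    remaining = list(priority)
    while remaining:
        # Highest-priority task whose predecessors have all been scheduled.
        task = next(t for t in remaining if all(p in finish for p in TASKS[t][2]))
        duration, crew, preds = TASKS[task]
        day = max((finish[p] for p in preds), default=0)
        # Delay the start until the crew limit holds on every day of the task.
        while any(usage.get(d, 0) + crew > CREW_CAPACITY
                  for d in range(day, day + duration)):
            day += 1
        for d in range(day, day + duration):
            usage[d] = usage.get(d, 0) + crew
        start[task], finish[task] = day, day + duration
        remaining.remove(task)
    return start, max(finish.values())


def evolve(generations=200, seed=0):
    """(1+1) evolutionary search over priority lists, minimizing duration."""
    rng = random.Random(seed)
    best = list(TASKS)
    rng.shuffle(best)
    _, best_span = decode(best)
    for _ in range(generations):
        child = best[:]
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]  # swap mutation
        _, span = decode(child)
        if span <= best_span:  # keep the child if it is no worse
            best, best_span = child, span
    return best, best_span


best_order, best_duration = evolve()
schedule, _ = decode(best_order)
```

On this toy instance the shortest feasible duration is 18 days, whereas a priority list that places electrical before framing before plumbing decodes to 20 days, so the ordering is what the search optimizes. GA and PSO approaches explore the same kind of search space with populations of candidate solutions rather than a single one.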
G. Construction Project Budgeting
Regardless of the size of a construction project, there is always a budget constraint that needs to be handled carefully. This applies to all the stakeholders involved in a project, since the customer wants to procure a construction project within limited economic resources while the builder wants to maximize economic efficiency. Using big data and machine learning techniques, researchers have already found useful solutions for cost estimation and budgeting problems. Using big data and a case study of a metro station project, [42] evaluated the tender price of a construction project. The study evaluated cost data to propose a reasonable cost range as an evaluation criterion to support tender price control. [43] used neural network ensembles to propose final project cost estimation models. Cost-related information from 1,600 construction projects was used to train and test the machine learning models. The proposed models were used in practice by industry partners for reliable and efficient initial cost estimates. A multi-regression model was developed by [44] for conceptual cost estimation of conventional and sustainable college buildings in North America. The proposed model was able to predict the initial cost per square foot for both steel and concrete structures. The model was tested on real-time data and produced satisfactory predictions.
H. Transportation Problems in Construction Projects
Transportation of construction materials, machines, and equipment is crucial to the timely and successful delivery of large-scale construction projects. It requires efficient route planning to transport construction resources from the distribution site to the construction site through towns, highways, and sometimes difficult, hazardous routes. There is extensive literature presenting solutions to the transportation problem (or the vehicle routing problem in general) for various industries, including construction, with the help of machine learning techniques and big data. An efficient technique, the so-called GEneralized ROute Construction Algorithm (GEROCA), was proposed in [45] to solve the vehicle routing problem in a construction project. The proposed approach not only outperformed approaches previously published in the literature but also improved upon the conventional practices of the company taken as a case study. Two objectives were achieved: (a) minimization of total distribution cost and (b) minimization of the hazardous vehicles used in the project. To solve the material transportation problem of a large construction project in China, [46] proposed a PSO variant, Global-Local-Neighbor Particle Swarm Optimization (GLNPSO). GLNPSO was embedded with customer satisfaction and travelling time as constraints. The results provided a robust transportation solution with potential savings.
I. Other Applications
Apart from the applications of big data and machine learning in the construction industry mentioned above, the literature presents various other related successful endeavors. Among these are smart buildings using IoT and big data [47, 48, 49], big data with augmented reality for problem simulation and solution generation in huge buildings, towns, and neighborhoods [50, 51], concrete strength prediction [52], construction project success prediction using neural networks [53], big data analytics for automation in the civil engineering domain [54], SVM applications in civil engineering [55], and metaheuristics for construction project management [56].
IV. Research Gap and Future Directions
Big data and machine learning techniques offer numerous opportunities for success in the construction industry. However, unlike in other domains, research in construction lags when it comes to big data and computational intelligence. This is because of challenges inherent in the construction industry, which pertain to the availability of ready-to-use data and to expertise in data analytics and machine learning. Construction projects generate an enormous amount of data throughout the project lifecycle, but it needs to be systematically collected and stored in cloud environments. The following are a few potential open research lines:
• Densified urban cities demand hassle-free infrastructure that efficiently serves communities. With the advent of big data analytics, the concept of smart cities is prevailing. Research along this line will provide futuristic and efficient urban designs.
• Intelligent or smart buildings and infrastructure need to be designed that collect and store big data and use machine learning algorithms for energy efficiency and for saving maintenance costs by predicting possible future events.
• Using social media and the big data paradigm, efficient data-driven decision models can be designed that propose the most suitable building location and design.
• The construction industry needs standardization in data collection. There is a substantial need for uniformity in data, since data must be collected from various stakeholders and may vary in format and standards. There is a significant research gap where data collection issues remain to be addressed. Addressing it will raise the quality of data, which matters because future construction technologies will rely heavily on the available data.
• Research is needed on the current status of big data and machine learning applications in the construction industry. Such research should also focus on opportunities and conceptual frameworks that may lead to implementations in the future.
V. Conclusions
The construction industry is already generating enormous data through the stakeholders involved in the construction project lifecycle. However, because of the immaturity of the big data and machine learning paradigm in this field, the construction industry lags behind other fields. Although the literature in this area shows applications of big data and machine learning techniques for solving various problems, it is still limited. Based on the limited survey performed in this study, it can be inferred that big data and machine learning are creeping into the construction industry, but the latest technologies such as Hadoop, Spark, and other big data processing tools are still to be adopted by industry practitioners and researchers. There are plenty of opportunities, highlighted in this research, that are still to be explored in relation to big data, machine learning, and the construction industry.
Acknowledgment
The authors would like to thank Universiti Tun Hussein Onn Malaysia (UTHM), Malaysia for supporting this research under Postgraduate Incentive Research Grant, Vote No. U560. The research is partially supported by NWO Education Programme, Nishat Welfare Organization (NWO), Pakistan.
References
[1] Data Lifecycle on Google Cloud Platform. (2018, March 23). Retrieved from https://cloud.google.com/solutions/data-lifecycle-cloud-platform.
[2] De Mauro, A., Greco, M., & Grimaldi, M. (2016). A formal definition of Big Data based on its essential features. Library Review, 65(3), 122-135.
[3] Aouad, G., Kagioglou, M., Cooper, R., Hinks, J., & Sexton, M. (1999). Technology management of IT in construction: a driver or an enabler?. Logistics Information Management, 12(1/2), 130-137.
[4] Davenport, T. H., & Dyché, J. (2013). Big data in big companies. International Institute for Analytics, 3.
[5] Karau, H., Konwinski, A., Wendell, P., & Zaharia, M. (2015). Learning spark: lightning-fast big data analysis. O'Reilly Media, Inc.
[6] White, T. (2012). Hadoop: The definitive guide. O'Reilly Media, Inc.
[7] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of machine learning research, 12(Oct), 2825-2830.
[8] Ihaka, R., & Gentleman, R. (1996). R: a language for data analysis and graphics. Journal of computational and graphical statistics, 5(3), 299-314.
[9] Agneeswaran, V. S. (2014). Big data analytics beyond hadoop: real-time applications with storm, spark, and more hadoop alternatives. FT Press.
[10] Tene, O., & Polonetsky, J. (2012). Big data for all: Privacy and user control in the age of analytics. Nw. J. Tech. & Intell. Prop., 11, xxvii.
[11] Provost, F., & Fawcett, T. (2013). Data science and its relationship to big data and data-driven decision making. Big data, 1(1), 51-59.
[12] Bilal, M., Oyedele, L. O., Qadir, J., Munir, K., Ajayi, S. O., Akinade, O. O., ... & Pasha, M. (2016). Big Data in the construction industry: A review of present status, opportunities, and future trends. Advanced engineering informatics, 30(3), 500-521.
[13] Chen, C. P., & Zhang, C. Y. (2014). Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. Information Sciences, 275, 314-347.
[14] Graf, H. P., Cosatto, E., Bottou, L., Dourdanovic, I., & Vapnik, V. (2005). Parallel support vector machines: The cascade svm. In Advances in neural information processing systems (pp. 521-528).
[15] Deng, L., Yu, D., & Platt, J. (2012, March). Scalable stacking and learning for building deep architectures. In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on (pp. 2133-2136). IEEE.
[16] Huang, G., Huang, G. B., Song, S., & You, K. (2015). Trends in extreme learning machines: A review. Neural Networks, 61, 32-48.
[17] Zhu, F., & Wu, Y. (2014). A rapid structural damage detection method using integrated ANFIS and interval modeling technique. Applied Soft Computing, 25, 473-484.
[18] Liu, Y. Y., Ju, Y. F., Duan, C. D., & Zhao, X. F. (2011). Structure damage diagnosis using neural network and feature fusion. Engineering applications of artificial intelligence, 24(1), 87-92.
[19] Liu, H. B., & Jiao, Y. B. (2011). Application of genetic algorithm-support vector machine (GA-SVM) for damage identification of bridge. International Journal of Computational Intelligence and Applications, 10(04), 383-397.
[20] Jiang, X., & Mahadevan, S. (2008). Bayesian probabilistic inference for nonparametric damage detection of structures. Journal of engineering mechanics, 134(10), 820-831.
[21] Dehestani, D., Eftekhari, F., Guo, Y., Ling, S., Su, S., & Nguyen, H. (2011). Online support vector machine application for model based fault detection and isolation of HVAC system. International Journal of Machine Learning and Computing, 1(1), 66.
[22] Marwala, T., & Chakraverty, S. (2006). Fault classification in structures with incomplete measured data using autoassociative neural networks and genetic algorithm. Current Science, 542-548.
[23] Kim, H., Soibelman, L., & Grobler, F. (2008). Factor selection for delay analysis using knowledge discovery in databases. Automation in Construction, 17(5), 550-560.
[24] Soibelman, L., & Kim, H. (2002). Data preparation process for construction knowledge generation through knowledge discovery in databases. Journal of Computing in Civil Engineering, 16(1), 39-48.
[25] Asadi, A., Alsubaey, M., & Makatsoris, C. (2015). A machine learning approach for predicting delays in construction logistics. International Journal of Advanced Logistics, 4(2), 115-130.
[26] Kim, S. Y., Van Tuan, N., & Ogunlana, S. O. (2009). Quantifying schedule risk in construction projects using Bayesian belief networks. International Journal of Project Management, 27(1), 39-50.
[27] Chau, K. W. (2007). Application of a PSO-based neural network in analysis of outcomes of construction claims. Automation in construction, 16(5), 642-646.
[28] Liao, C. W., & Perng, Y. H. (2008). Data mining for occupational injuries in the Taiwan construction industry. Safety science, 46(7), 1091-1102.
[29] Cheng, C. W., Leu, S. S., Cheng, Y. M., Wu, T. C., & Lin, C. C. (2012). Applying data mining techniques to explore factors contributing to occupational injuries in Taiwan's construction industry. Accident Analysis & Prevention, 48, 214-222.
[30] Park, M. W., & Brilakis, I. (2012). Construction worker detection in video frames for initializing vision trackers. Automation in Construction, 28, 15-25.
[31] Bilal, M., Oyedele, L. O., Akinade, O. O., Ajayi, S. O., Alaka, H. A., Owolabi, H. A., ... & Bello, S. A. (2016). Big data architecture for construction waste analytics (CWA): A conceptual framework. Journal of Building Engineering, 6, 144-156.
[32] Lu, W., Chen, X., Ho, D. C., & Wang, H. (2016). Analysis of the construction waste management performance in Hong Kong: the public and private sectors compared using big data. Journal of Cleaner Production, 112, 521-531.
[33] Wu, Z., Fan, H., & Liu, G. (2013). Forecasting construction and demolition waste using gene expression programming. Journal of Computing in Civil Engineering, 29(5), 04014059.
[34] Mahfouz, T., Jones, J., & Kandil, A. (2010). A machine learning approach for automated document classification: A comparison between SVM and LSA performances. Int. J. Eng. Res. Innov, 2(2), 53-62.
[35] Salama, D. M., & El-Gohary, N. M. (2013). Semantic text classification for supporting automated compliance checking in construction. Journal of Computing in Civil Engineering, 30(1), 04014106.
[36] Zhang, J., & El-Gohary, N. M. (2013). Semantic NLP-based information extraction from construction regulatory documents for automated compliance checking. Journal of Computing in Civil Engineering, 30(2), 04015014.
[37] Zhang, J., & El-Gohary, N. (2012). Automated regulatory information extraction from building codes: Leveraging syntactic and semantic information. In Construction Research Congress 2012: Construction Challenges in a Flat World (pp. 622-632).
[38] Chen, Y. J., Feng, C. W., & Wu, H. M. (2011). Using BIM model and genetic algorithms to optimize the crew assignment for construction project planning. International Journal of Technology, 3, 179-187.
[39] Jaśkowski, P., & Sobotka, A. (2006). Scheduling construction projects using evolutionary algorithm. Journal of Construction Engineering and Management, 132(8), 861-870.
[40] Wang, Y. R., Yu, C. Y., & Chan, H. H. (2012). Predicting construction cost and schedule success using artificial neural networks ensemble and support vector machines classification models. International Journal of Project Management, 30(4), 470-478.
[41] Zhang, H., Li, H., & Tam, C. M. (2006). Particle swarm optimization for resource-constrained project scheduling. International Journal of Project Management, 24(1), 83-92.
[42] Zhang, Y., Luo, H., & He, Y. (2015). A system for tender price evaluation of construction project based on big data. Procedia Engineering, 123, 606-614.
[43] Ahiaga-Dagbui, D. D., & Smith, S. D. (2014). Dealing with construction cost overruns using data mining. Construction Management and Economics, 32(7-8), 682-694.
[44] Alshamrani, O. S. (2017). Construction cost prediction model for conventional and sustainable college buildings in North America. Journal of Taibah University for Science, 11(2), 315-323.
[45] Tarantilis, C. D., & Kiranoudis, C. T. (2007). A flexible adaptive memory-based algorithm for real-life transportation operations: Two case studies from dairy and construction sector. European Journal of Operational Research, 179(3), 806-822.
[46] Yan, F., Xu, J., & Han, B. T. (2015). Material transportation problems in construction projects under an uncertain environment. KSCE Journal of Civil Engineering, 19(7), 2240-2251.
[47] Elghamrawy, T., & Boukamp, F. (2010). Managing construction information using RFID-based semantic contexts. Automation in Construction, 19(8), 1056-1066.
[48] Meadati, P., Irizarry, J., & Akhnoukh, A. K. (2010). BIM and RFID integration: a pilot study. Advancing and Integrating Construction Education, Research and Practice, 570-78.
[49] Curry, E., O’Donnell, J., Corry, E., Hasan, S., Keane, M., & O’Riain, S. (2013). Linking building data in the cloud: Integrating cross-domain building data using linked data. Advanced Engineering Informatics, 27(2), 206-219.
[50] Williams, G., Gheisari, M., Chen, P. J., & Irizarry, J. (2014). BIM2MAR: an efficient BIM translation to mobile augmented reality applications. Journal of Management in Engineering, 31(1), A4014009.
[51] Jiao, Y., Zhang, S., Li, Y., Wang, Y., & Yang, B. (2013). Towards cloud augmented reality for construction application by BIM and SNS integration. Automation in Construction, 33, 37-47.
[52] Lee, S. C. (2003). Prediction of concrete strength using artificial neural networks. Engineering Structures, 25(7), 849-857.
[53] Chua, D. K. H., Loh, P. K., Kog, Y. C., & Jaselskis, E. J. (1997). Neural networks for construction project success. Expert Systems with Applications, 13(4), 317-328.
[54] Alavi, A. H., & Gandomi, A. H. (2017). Big data in civil engineering.
[55] Dibike, Y. B., Velickov, S., & Solomatine, D. (2000, March). Support vector machines: Review and applications in civil engineering. In Proceedings of the 2nd Joint Workshop on Application of AI in Civil Engineering (pp. 215-218).
[56] Liao, T. W., Egbelu, P. J., Sarker, B. R., & Leu, S. S. (2011). Metaheuristics for project and construction management–A state-of-the-art review. Automation in Construction, 20(5), 491-505.