Review article
Big Data in the construction industry: A review of present status, opportunities, and future trends

https://doi.org/10.1016/j.aei.2016.07.001

Highlights

  • Existing works for Big Data Analytics/Engineering in the construction industry are discussed.

  • It is highlighted that the adoption of Big Data is still at a nascent stage.

  • Opportunities to employ Big Data technologies in construction sub-domains are highlighted.

  • Future works for Big Data technologies are presented.

  • Pitfalls of Big Data technologies in the construction industry are also pointed out.

Abstract

The ability to process large amounts of data and to extract useful insights from it has revolutionised society. This phenomenon, dubbed Big Data, has applications in a wide assortment of industries, including the construction industry. The construction industry already deals with large volumes of heterogeneous data, which are expected to increase exponentially as technologies such as sensor networks and the Internet of Things are commoditised. In this paper, we present a detailed survey of the literature investigating the application of Big Data techniques in the construction industry. We reviewed related works published in the databases of the American Society of Civil Engineers (ASCE), the Institute of Electrical and Electronics Engineers (IEEE), the Association for Computing Machinery (ACM), and the Elsevier ScienceDirect digital library. While the application of data analytics in the construction industry is not new, the adoption of Big Data technologies in this industry remains at a nascent stage and lags the broad uptake of these technologies in other fields. To the best of our knowledge, there is currently no comprehensive survey of Big Data techniques in the context of the construction industry. This paper fills the void and presents a wide-ranging interdisciplinary review of the literature in fields such as statistics, data mining and warehousing, machine learning, and Big Data Analytics in the context of the construction industry. We discuss the current state of adoption of Big Data in the construction industry and the future potential of such technologies across its multiple domain-specific sub-areas. We also outline open issues and directions for future work, along with potential pitfalls associated with Big Data adoption in the industry.

Introduction

The world is currently inundated with data, and fast-advancing technology is steadily increasing its volume. Today, companies deal with petabytes (10^15 bytes) of data. Google processes more than 24 petabytes of data per day [1], while Facebook receives more than 10 million photos per hour [1]. In 2012, data was being generated at a rate of approximately 2.5 quintillion (2.5 × 10^18) bytes per day [2]. This data growth brings significant opportunities for scientists to identify useful insights and knowledge. Arguably, the accessibility of data can improve the status quo in various fields by strengthening existing statistical and algorithmic methods [3], or even by making them redundant [4].
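To put the quoted rates on a common scale, the following minimal sketch converts them with plain integer arithmetic. It assumes decimal (SI) prefixes, as in the text; the function and variable names are illustrative only and do not come from the paper.

```python
# Decimal (SI) unit sizes in bytes, matching the usage in the text.
PETABYTE = 10**15
GIGABYTE = 10**9
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(bytes_per_day: int) -> float:
    """Convert a daily data volume into an average per-second rate."""
    return bytes_per_day / SECONDS_PER_DAY


# Figures quoted in the text (illustrative variable names).
google_daily = 24 * PETABYTE      # "more than 24 petabytes per day" [1]
global_daily_2012 = 25 * 10**17   # 2.5 quintillion bytes per day [2]

google_rate_gb_s = bytes_per_second(google_daily) / GIGABYTE
global_daily_pb = global_daily_2012 // PETABYTE

print(f"Google: about {google_rate_gb_s:.0f} GB per second on average")
print(f"2012 global rate: {global_daily_pb} PB per day")
```

Under these assumptions, 24 petabytes per day corresponds to an average of roughly 278 GB per second, and 2.5 quintillion bytes per day to 2,500 petabytes per day.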

The construction industry is no exception to the pervasive digital revolution. The industry deals with significant volumes of data arising from diverse disciplines throughout the life cycle of a facility. Building Information Modelling (BIM) is envisioned to capture multi-dimensional CAD information systematically to support multidisciplinary collaboration among stakeholders [5]. BIM data is typically 3D geometry encoded, compute-intensive (graphics and Boolean computing), compressed, stored in diverse proprietary formats, and intertwined [6]. This diverse data is collated in federated BIM models, which are enriched gradually and persisted beyond the end-of-life of facilities. BIM files can quickly become voluminous, with the design data of a 3-storey building model easily reaching 50 GB in size [7]. Notably, this data, in whatever form, has intrinsic value to the performance of the industry. With the advent of embedded devices and sensors, facilities have even started to generate massive data during the operations and maintenance stage, eventually leading to richer sources of Big BIM Data. This vast accumulation of BIM data has pushed the construction industry into the Big Data era.

Big Data has three defining attributes (a.k.a. the 3Vs), namely (i) volume (terabytes, petabytes of data and beyond); (ii) variety (heterogeneous formats like text, sensors, audio, video, graphs and more); and (iii) velocity (continuous streams of data). The 3Vs of Big Data are clearly evident in construction data, which is typically large, heterogeneous, and dynamic [8]. Construction data is voluminous due to large volumes of design data, schedules, Enterprise Resource Planning (ERP) systems, financial data, etc. The diversity of construction data can be observed in the various formats supported by construction applications, including DWG (short for drawing), DXF (drawing exchange format), DGN (short for design), RVT (short for Revit), ifcXML (Industry Foundation Classes XML), ifcOWL (Industry Foundation Classes OWL), DOC/XLS/PPT (Microsoft formats), RM/MPG (video formats), and JPEG (image format). The dynamic nature of construction data follows from the streaming nature of data sources such as sensors, RFIDs, and BMS (Building Management Systems). Utilising this data to optimise construction operations is the next frontier of innovation in the industry.
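The variety attribute can be made concrete with a small sketch that groups project files by the formats listed above. This is only an illustration: the category labels, the function names, and the sample file names are assumptions, not something the paper defines. ifcOWL is left out because the text gives no file extension for it.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical grouping of the formats named in the text.
# The category labels ("cad", "bim", ...) are assumptions for illustration.
FORMAT_CATEGORIES = {
    ".dwg": "cad",
    ".dxf": "cad",
    ".dgn": "cad",
    ".rvt": "bim",
    ".ifcxml": "bim",
    ".doc": "office",
    ".xls": "office",
    ".ppt": "office",
    ".rm": "video",
    ".mpg": "video",
    ".jpeg": "image",
}


def categorise(filename: str) -> str:
    """Return the assumed category for a file, based on its extension."""
    suffix = PurePath(filename).suffix.lower()
    return FORMAT_CATEGORIES.get(suffix, "unknown")


def variety_profile(filenames: list[str]) -> Counter:
    """Count how many files fall into each category."""
    return Counter(categorise(name) for name in filenames)


# Made-up file names standing in for a project archive.
files = [
    "site_plan.DWG",
    "tower.rvt",
    "model.ifcxml",
    "progress.mpg",
    "minutes.doc",
    "sensor_dump.bin",
]
profile = variety_profile(files)
for category, count in sorted(profile.items()):
    print(f"{category}: {count}")
```

A profile like this gives a first, rough indication of how heterogeneous a project archive is before any storage or analytics platform is chosen.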

To understand the subtleties of Big Data, we need to distinguish between two of its complementary aspects: Big Data Engineering (BDE) and Big Data Analytics (BDA). The domain of BDE is primarily concerned with supporting the data storage and processing activities needed for analytics [9]. BDE encompasses technology stacks such as Hadoop and the Berkeley Data Analytics Stack (BDAS). Big Data Analytics (BDA), the second integral aspect, relates to the tasks responsible for extracting the knowledge that drives decision-making [9]. BDA is mostly concerned with the principles, processes, and techniques used to understand Big Data. The essence of BDA is to discover the latent patterns buried inside Big Data and derive useful insights from them [10]. These insights have the capability to transform the future of many industries through data-driven decision-making. The ability to identify, understand, and react promptly to latent trends is indeed a competitive edge in this hyper-competitive era.

Contributions of this paper: While some data-driven solutions have been proposed for the fields of the construction industry, there is currently no comprehensive survey of the literature, targeting the application of Big Data in the context of the construction industry. This paper fills the void and presents a wide-ranging interdisciplinary study of fields such as Statistics, Data Mining and Warehousing, Machine Learning, Big Data and their applications in the construction industry.

Organization of this paper: The discussion in this paper follows the review structure shown in Fig. 1. We start with a thorough review of the extant literature on BDE and BDA in the construction industry in Sections 2 (Big Data Engineering) and 3 (Big Data Analytics), respectively. Opportunities for Big Data in the construction industry sub-domains are then presented in Section 4. Open research issues and future work, and the pitfalls of Big Data in the construction industry, are discussed in Sections 5 and 6, respectively.

Section snippets

Big Data Engineering (BDE)

Big Data Engineering (BDE) provides the infrastructure to support Big Data Analytics (BDA). Some discussion of Big Data platforms is worthwhile in order to understand BDE adequately. Various Big Data platforms with varied characteristics have been developed so far, and they can be divided into two groups: (i) horizontal scaling platforms (HSPs), which distribute processing across multiple servers and scale out by adding new machines to the cluster; and (ii) vertical scaling platforms (VSPs),

Big Data Analytics

Big Data Analytics has a rich intellectual tradition and borrows from a wide variety of fields. There have been traditionally many related disciplines that have essentially the same core focus: finding useful patterns in data (but with a different emphasis). These related fields are Statistics (1830

Resource and waste optimization

Rapid urbanisation has escalated construction activities globally, causing the construction industry to consume the bulk of natural resources and produce massive amounts of construction and demolition (C&D) waste [93]. The adverse impact of construction activities on the environment has serious implications worldwide [94]. Existing waste management approaches are based on Waste Intelligence (WI), which suggests remedial measures to manage waste only after it has occurred [95]. These systems mostly answer

Open research issues and future work

There are many interesting open research issues within the construction industry for Big Data. Some of these include (but are not limited to) the following:

Pitfalls of Big Data in construction industry

Despite the opportunities and benefits accruable from Big Data in this industry, some challenging issues remain of concern. This section discusses some of these challenges and provides suggestions to deal with them for the successful implementation and dissemination of Big Data technologies across various domain applications of the construction industry.

Conclusions

Although the construction industry generates massive amounts of data throughout the life cycle of a building, the adoption of Big Data technology in this sector lags the progress made in other fields. With the commoditisation of the technology necessary for storing, computing, processing, analysing, and visualising Big Data, there is immense interest in leveraging such technologies for improving the efficiency of construction processes. In this exploratory study, we have analysed the extent to

References (173)

  • A. Aibinu et al.

    The effects of construction delays on project delivery in Nigerian construction industry

    Int. J. Project Manage.

    (2002)
  • M. Sambasivan et al.

    Causes and effects of delays in Malaysian construction industry

    Int. J. Project Manage.

    (2007)
  • K. Pietrzyk

    A systemic approach to moisture problems in buildings for mould safety modelling

    Build. Environ.

    (2015)
  • Q. Chen et al.

    Structural fault diagnosis and isolation using neural networks based on response-only data

    Comput. Struct.

    (2003)
  • X. Fang et al.

    Structural damage detection using neural network with learning rate improvement

    Comput. Struct.

    (2005)
  • C.H. Caldas et al.

    Automating hierarchical document classification for construction management information systems

    Autom. Construct.

    (2003)
  • N. Ur-Rahman et al.

    Textual data mining for industrial knowledge management and text classification: a business oriented approach

    Expert Syst. Appl.

    (2012)
  • S. Liu et al.

    A review of structured document retrieval (SDR) technology to improve information access performance in engineering document management

    Comput. Indus.

    (2008)
  • L. Soibelman et al.

    Management and analysis of unstructured construction data types

    Adv. Eng. Inform.

    (2008)
  • H. Fan et al.

    Retrieving similar cases for alternative dispute resolution in construction accidents using text mining techniques

    Autom. Construct.

    (2013)
  • M. Al Qady et al.

    Automatic clustering of construction project documents based on textual similarity

    Autom. Construct.

    (2014)
  • H.-T. Lin et al.

    A concept-based information retrieval approach for engineering domain-specific technical documents

    Adv. Eng. Inform.

    (2012)
  • Z. Wu et al.

    Quantifying construction and demolition waste: an analytical review

    Waste Manage.

    (2014)
  • W. Lu et al.

    An empirical investigation of construction and demolition waste generation rates in Shenzhen city, South China

    Waste Manage.

    (2011)
  • L.L. Ekanayake et al.

    Building waste assessment score: design-based tool

    Build. Environ.

    (2004)
  • V. Mayer-Schönberger et al.

    Big Data: A Revolution that Will Transform how We Live, Work, and Think

    (2013)
  • E. Siegel

    Predictive Analytics: The Power to Predict who Will Click, Buy, Lie, Or Die

    (2013)
  • Blog post by Anand Rajaraman on how ‘More Data Usually Beats Better Algorithms’, 2013....
  • C. Anderson

    The end of theory

    Wired Mag.

    (2008)
  • Y. Jiao et al.

    An augmented MapReduce framework for building information modeling applications

  • J.-R. Lin et al.

    A natural-language-based approach to intelligent data retrieval and representation for cloud BIM

    Comput.-Aid. Civil Infrastruct. Eng.

    (2015)
  • G. Aouad et al.

    Technology management of IT in construction: a driver or an enabler?

    Logist. Inform. Manage.

    (1999)
  • F. Provost et al.

    Data Science for Business: What you Need to Know about Data Mining and Data-Analytic Thinking

    (2013)
  • F.T. Matsunaga et al.

    Data mining techniques and tasks for multidisciplinary applications: a systematic review

    Revista Eletrôn. Argentina-Brasil Tecnol. Inform. Comun.

    (2015)
  • D. Singh et al.

    A survey on platforms for big data analytics

    J. Big Data

    (2015)
  • J. Dean et al.

    MapReduce: simplified data processing on large clusters

    Commun. ACM

    (2008)
  • V.S. Agneeswaran

    Big Data Analytics Beyond Hadoop: Real-Time Applications with Storm, Spark, and More Hadoop Alternatives

    (2014)
  • T. White

    Hadoop: The Definitive Guide

    (2012)
  • P. Helland

    If you have too much data, then ‘good enough’ is good enough

    Commun. ACM

    (2011)
  • M. Das et al.

    BIMCloud: a distributed cloud-based social BIM framework for project collaboration

  • S. Jeong et al.

    A data management infrastructure for bridge monitoring

  • J.C. Cheng et al.

    A cloud computing approach to partial exchange of BIM models

  • L. Goldman

    The origins of British social science: political economy, natural science and statistics, 1830–1835

    Historical J.

    (1983)
  • R.O. Duda et al.
    (1973)
  • L. Wasserman

    All of Statistics: A Concise Course in Statistical Inference

    (2013)
  • D.S. Moore et al.

    Introduction to the Practice of Statistics

    (1989)
  • P. Carrillo et al.

    Knowledge discovery from post-project reviews

    Construct. Manage. Econom.

    (2011)
  • T.S. Mahfouz

    Construction legal support for differing site conditions (DSC) through statistical modeling and machine learning (ML)

    (2009)
  • X. Jiang et al.

    Bayesian probabilistic inference for nonparametric damage detection of structures

    J. Eng. Mech.

    (2008)
  • Y. Huang et al.

    Novel sparse Bayesian learning for structural health monitoring using incomplete modal data
