Having been intensely paper-based for decades, the construction industry has developed sophisticated approaches to inter-company transactions, document management, and collaboration over the past 15 to 20 years.

Now, partly as a result of Building Information Modeling (BIM), the industry is also becoming more data-centric. However, even in leading BIM markets, the great majority of construction information exchanges still rely on processing large volumes of unstructured data.

There is a growing opportunity for project-oriented construction and property-related businesses to collate and drill into their data. By reaching beyond their structured data and interrogating related unstructured data (plus data from relevant external sources), organisations can identify previously hidden patterns, correlations, and anomalies. These can reveal inefficiencies and improvement opportunities, and maybe even unlock new value streams. Ultimately, the industry could be harnessing Big Data to help owners of capital projects get better, more cost-effective and more sustainably built assets.

The construction information challenge

Construction is synonymous with huge volumes of written and graphical information, amassed during the planning, design, and build processes, at handover, and during the life of the built asset through to its eventual decommissioning, dismantling, or demolition.

Much of this information will, of course, be about the built asset itself (briefs, specifications, drawings, models, operation and maintenance manuals, service records, and so on). However, there will also be numerous information exchanges relating to the process of creating that built asset. There will be various contracts and agreements, insurance documents, estimates, schedules, orders, invoices, and numerous forms (transmittals, RFIs, change orders, and so on) to request, capture, and process information, as well as to provide records of who did what and when. In addition to this documentation, there will often be large volumes of correspondence.

Traditionally, all of that information exchange was paper-based. Depending on the age of the built asset and the associated information, some information will still be in paper form and some will be on computers or related storage devices. Paper may be stored in archive boxes or filing cabinets, while some documentation may be captured on microfilm. For some older buildings and other assets, written records may have been lost or destroyed altogether, or there may only be partial records. Digitised information may be stored on computer disks, removable hard-drives, USB sticks, or tapes.

The information may also be spread across several different organisations involved in the delivery of the project. Within a single organisation, that information may be spread across several different locations or departments. Even within single departments, information may be stored in multiple devices, often with numerous duplicates.

Another feature of the traditional project delivery process is that every company involved would need to manage and maintain its own records of what it did and when. As a result, each business would retain information archives, often duplicating some of what was contained in other businesses’ systems (the tendency of some email users to share copies of information indiscriminately with every colleague also exacerbates this).

The information management challenge does not end once a built asset is handed over to the owner-operator. As the asset begins to be used for its intended purpose, further operational information will be created to manage maintenance and repairs, to facilitate periodic refurbishment or extension projects, and – importantly – to relate use of the built asset to the business operations of the owner-operator. This information may also be generated for a much longer period of time; for example, design and construction may have taken two years, but the owner-operator may then be operating the facility for decades, and business operations may also generate much larger volumes of data.

From email to extranet

The traditional reliance on paper-based information and the ‘silo mentality’ of many companies working on delivering construction projects have meant huge waste and inefficiency.

Some parts of the construction industry have made progress in the last couple of decades, and the push towards better integration was helped by the emergence of new communication technologies. The development of web-based tools, coupled with the implementation of reliable internet connectivity, helped to change how organisations shared information.

Email still dominates in many sectors, but some now routinely set up a construction collaboration ‘extranet’ that creates a single, secure, shared central repository which all authorised project team members can use to disseminate and access project-related communications.

These extranet platforms are typically provided on a Software-as-a-Service (SaaS) basis, with the vendor taking responsibility for managing the hosting hardware and software and ensuring the system remains secure, is backed up, is available around the clock, delivers information with minimal latency, and more. Such platforms provide ‘a single version of the truth’ for all members of a project team while building an audit trail of who did what and when, and they help compile information for the asset owner-operator’s future operation and maintenance requirements.

Repeated use of a collaboration platform to deliver a series of projects or to manage a concurrent programme of projects means vendors such as Viewpoint are well placed to help clients extract Business Intelligence (BI) from the information and processes that their systems manage. Vendors typically have long-term relationships with owner-operators, main contractors, project managers, and lead consultants, and so, in theory, could mine customers’ data to help identify trends in delivery (time and cost), common factors in the best or worst performing projects, supply chain partner process performance metrics, and more.

Structured and unstructured data

Much of this rich information captured in a collaboration vendor’s system will be highly structured.

Structured data tends to be defined by fixed fields and types of data, and is often held in relational databases, spreadsheets, software logs, and – increasingly important going forward in the construction industry – Building Information Models (BIMs). It is easily entered, stored, queried, and analysed. As projects progress, increasing volumes of structured data are accumulated, being created and then augmented and enriched at each stage.

Unstructured data is, by its very nature, more difficult to define by fixed fields or data types, and isn’t as easily accessed, queried, or analysed. Photographs, graphics, videos, audio (voicemails, for instance), web pages, PDFs, PowerPoints, email content, Word documents, and OCR-scanned documents are just a few examples of unstructured data. Metadata might define some characteristics of these files, but their content is generally not so simply analysed.
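
To make the contrast concrete, the following is a minimal Python sketch rather than a description of any real platform. The `Rfi` record, its field names, the sample email text, and the `RFI-` reference pattern are all hypothetical, invented purely for illustration. The structured records can be filtered directly on a typed field, whereas the email has to be searched with an assumed pattern that would miss any reference written differently.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional


# Hypothetical structured record: fixed fields with known types.
@dataclass
class Rfi:
    ref: str
    raised_by: str
    raised_on: date
    closed_on: Optional[date]  # None means the RFI is still open


rfis = [
    Rfi("RFI-0012", "Structural engineer", date(2015, 3, 2), date(2015, 3, 9)),
    Rfi("RFI-0013", "M&E subcontractor", date(2015, 3, 4), None),
    Rfi("RFI-0014", "Architect", date(2015, 3, 5), date(2015, 3, 6)),
]


def open_rfis(records):
    """Structured query: a simple filter on a typed field."""
    return [r.ref for r in records if r.closed_on is None]


# Hypothetical unstructured content: free text with no fixed fields.
email_body = """Hi all, following yesterday's site meeting we still need an
answer on rfi-0013 before the ductwork can go in. Also, RFI 0014 was closed
but the revised drawing has not been issued yet."""

# Assumed reference format; real correspondence is rarely this regular.
RFI_PATTERN = re.compile(r"\bRFI[-\s]?(\d{4})\b", re.IGNORECASE)


def rfi_mentions(text):
    """Pull RFI references out of free text and normalise their format."""
    return sorted({f"RFI-{number}" for number in RFI_PATTERN.findall(text)})


print(open_rfis(rfis))
print(rfi_mentions(email_body))
```

Even in this toy case, the email carries context (a closed RFI whose revised drawing is still outstanding) that the structured record alone does not show, which is the kind of signal that only emerges when both kinds of data are examined together.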

The importance of this unstructured data should not be underestimated. Notwithstanding the emergence of cloud-based, centralised project collaboration platforms, email is still at the core of most construction organisations’ communications, and huge volumes of information are routinely exchanged internally and externally through email and attachments. Moreover, construction remains an intensely contractual industry with most contract documents and schedules being hundreds of pages long. A complex megaproject may be broken down into many separate contracts and a typical project will involve multiple contracts between different tiers of suppliers of goods and services.

BIM and Big Data

While building information models will increasingly incorporate a wealth of data, they are not exactly Big Data. BIM files may be significantly larger than traditional CAD files, but even when information from multiple models is combined, it is still readily processed by various BIM authoring, analysis, checking, and collaboration “common data environment” applications.

However, while the use of BIM authoring and coordination tools will create more structured data, there will still be extensive reliance on unstructured data. Industry estimates suggest around 70 per cent of data in a BIM-enabled project will remain non-graphical, and the process of delivering a project will still involve an ecosystem of other standard IT tools to exchange proposals, correspondence, contracts, reports, schedules, and masses of commercial and process-related information.

Genuine Big Data is being created constantly and in ever-increasing volumes. Meteorology, complex physics, biological research, and financial services are just some of the prodigious generators of data. The data is being created by a growing array of devices, from mobile telephones and barcode scanners to cameras, laser scanners, and microphones; from RFID readers and wireless sensors to streaming instrumentation (“the Internet of things”). Population growth, expanding technology use, and increased literacy are accelerating data generation.

While BI is highly appropriate to analytics challenges focused on structured data, Big Data analytics involves the additional collation and processing of substantial volumes of unstructured data, often from multiple sources, and combining various data- and text-mining, data optimisation, and search techniques. More importantly, Big Data analytics are often conducted on massively parallel software running across multiple servers, allowing literally millions of files and hundreds of millions of pieces of data – structured, semi-structured, and unstructured – to be explored, patterns to be detected, and anomalies to be identified.
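
The split-then-combine pattern behind this kind of processing can be sketched in a few lines of Python. This is a toy stand-in under stated assumptions, not a Big Data system: the three sample documents, the file names, and the `TERMS` of interest are hypothetical, and a local thread pool takes the place of software running across multiple servers. Each document is analysed independently (the "map" step), and the partial results are then merged into one tally (the "reduce" step).

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mini-corpus standing in for millions of project files.
documents = {
    "email_001.txt": "Delay to steel delivery. Delay notice issued to client.",
    "rfi_014.txt": "Query on fire stopping detail at level 3.",
    "minutes_07.txt": "Steel delay discussed; defect in fire stopping noted.",
}

# Assumed terms of interest; a real analysis would use far richer text mining.
TERMS = ("delay", "defect", "fire")


def count_terms(text):
    """Map step: count the terms of interest in a single document."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word in TERMS)


def analyse(docs, workers=4):
    """Run the map step concurrently, then reduce to a single tally."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_terms, docs.values()):
            total.update(partial)  # reduce step: merge partial counts
    return total


totals = analyse(documents)
print(totals.most_common())
```

Because each document is processed without reference to the others, the same structure scales out naturally: the map step can be spread over as many workers or machines as are available, and only the small partial tallies need to be brought back together.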

The built environment BIM and Big Data opportunity

As described earlier, the architecture, engineering, and construction industry has traditionally operated in a heavily siloed way, with datasets being confined to individual organisations and/or construction phases. Growing adoption of BIM-based principles is helping to break down these barriers and achieve higher levels of data interoperability during the design and construction phases. This initial investment in BIM is also set to revolutionise the operational phase, which accounts for the majority of the lifetime cost of a built asset.

BIM and Big Data begin to converge when data about multiple built assets, their design, construction, and operation needs to be interrogated, perhaps alongside data from other sources (there is a growing array of open data sources, plus numerous commercial data services). Here are some example scenarios:

  • A main contractor may want to evaluate the design and construction efficiency of different consultants and specialist subcontractors across all projects undertaken during a defined period. Interrogating BIM and other structured data alongside other, more unstructured data can help the contractor identify which specialists deliver the best design, highest client satisfaction, lowest site waste, and so on.
  • A building materials supplier or product manufacturer may want to understand how certain components have been used across multiple buildings. By analysing both structured data from BIM and/or the ‘Internet of things’ and unstructured data from correspondence, voicemails and the like, the supplier might learn how the components were initially specified and selected, when and how they were installed, how reliably or efficiently they have performed, and how they are regarded by specifiers or facilities managers.
  • An owner-operator managing hundreds of built assets may need to know the exact quantity, all the locations, and details of the performance and operational efficiency of every instance of a particular system or equipment item. This information could be used to identify issues with reliability, to optimise maintenance programmes, to negotiate long-term supply contracts with manufacturers, and so on.
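
The owner-operator scenario can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Item` fields, the building names, the equipment models, and the fault and service figures are invented, and a real asset register would be drawn from BIM data and maintenance systems rather than typed in by hand.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """Hypothetical asset-register row for one installed equipment item."""
    building: str
    location: str
    model: str
    faults: int            # faults logged since installation
    years_in_service: int


# Invented sample data covering three buildings in a portfolio.
register = [
    Item("Depot A", "Plant room 1", "Pump P-200", 3, 5),
    Item("Depot A", "Roof", "AHU X-10", 0, 5),
    Item("Office B", "Basement", "Pump P-200", 1, 2),
    Item("Store C", "Plant room", "Pump P-200", 4, 4),
]


def summarise(items, model):
    """Return quantity, locations, and faults per unit-year for one model."""
    matches = [item for item in items if item.model == model]
    faults = sum(item.faults for item in matches)
    unit_years = sum(item.years_in_service for item in matches)
    return {
        "count": len(matches),
        "locations": [(item.building, item.location) for item in matches],
        "faults_per_unit_year": faults / unit_years if unit_years else 0.0,
    }


summary = summarise(register, "Pump P-200")
print(summary["count"], summary["locations"])
print(round(summary["faults_per_unit_year"], 2))
```

A normalised rate such as faults per unit-year is used here so that items installed at different times can be compared fairly; that choice is an assumption of the sketch, and other reliability measures could equally be substituted.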

Such insights will rarely, if ever, be delivered purely by business intelligence tools, and the quality of business insight may be significantly enhanced if a customer has further corporate information to shed additional light on the BI data.

For example, in addition to all the content objects (drawings, reports, forms, and so on) and associated metadata shared on a collaboration platform, a customer may well have information relating to other projects undertaken without using a collaboration system, or where further work was undertaken outside the system, or after a project had been archived.

Across the scenarios outlined previously, the needs of the construction industry’s clients (owner-operators) should be paramount in the thinking of all project participants. As awareness of the Big Data opportunity grows, major owner-operators of multiple built assets will want to capitalise on the immense volumes of data they generate from their operations. Only a fraction of this data, including BIM, will concern the design and construction of the physical asset; far more will concern the day-to-day operation and use of those assets (energy use, employee behaviours, customer transactions, and the like).

Owners, corporate occupiers, and investment managers may also want to aggregate building-related data across entire property portfolios. Currently, this is difficult considering that data is captured by different systems, from different sources, at different levels of granularity, quality, and frequency, and in different formats.

However, owner-operators that have been using collaboration platforms have a potential advantage: they already have a major data source at their fingertips that can be augmented by other information under their control. By bringing together their collaboration technology vendor and their key supply chain members, and by collating additional information captured in other corporate systems, they can use Big Data to get heightened visibility of how their supply chains and their business assets perform.