The Elephant in the Smart Building

There are three types of data commonly cited:

  • Critical data – data used in day-to-day operations for profit generation (14%)
  • Redundant data – data that is timed-out, superseded or no longer has value (32%)
  • Dark data – data that is unknown, untapped or unused (54%)

Dark data is useless. Stored and forgotten. Ignored and going rotten.

A survey by Splunk of 1,300 business and IT leaders found that 55% of organisations’ data is dark. Yet 90% of those leaders insisted that data is “very” or “extremely” valuable to business success.

In 2025, it was estimated that dark data represented 80 zettabytes (80 trillion gigabytes) of the 149 zettabytes stored globally. With the volume of stored data doubling every two to three years, this problem is growing exponentially.

Costs of dark data

Data storage costs – Storing data, even if it’s not actively used, requires physical or digital storage infrastructure. This can include servers, data centers, cloud storage solutions and backup systems. The more data in your ecosystem, the more storage capacity you need, which leads to increased infrastructure costs. Estimates for storing 1,000 TB (1 PB) on-premises are in the region of $1.3 million over a five-year period, with cloud providers ranging from $0.4 to $1.7 million over a similar period.
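To put those figures on a common footing, a quick sketch amortises them into cost per terabyte per year. The dollar amounts are the illustrative estimates quoted above, not vendor pricing:

```python
# Amortise the article's illustrative 5-year storage estimates for
# 1 PB (1,000 TB) into a cost per terabyte per year.
PB_TB = 1_000  # terabytes in a petabyte

on_prem_5yr = 1_300_000                              # ~$1.3M on-premises
cloud_5yr_low, cloud_5yr_high = 400_000, 1_700_000   # cloud range

def cost_per_tb_per_year(total_5yr_usd: float) -> float:
    """Amortised cost per terabyte per year over a 5-year horizon."""
    return total_5yr_usd / PB_TB / 5

print(f"On-prem: ${cost_per_tb_per_year(on_prem_5yr):.0f}/TB/yr")
print(f"Cloud:   ${cost_per_tb_per_year(cloud_5yr_low):.0f}-"
      f"${cost_per_tb_per_year(cloud_5yr_high):.0f}/TB/yr")
```

On these assumptions, on-premises works out at about $260 per TB per year, with cloud spanning roughly $80 to $340 — a reminder that every terabyte of dark data carries a recurring, not one-off, cost.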

Inefficiency costs – Managing large volumes of data, including dark data, can slow down data retrieval and analysis processes. Employees may spend more time searching for relevant information, leading to reduced productivity and increased labor costs. According to a McKinsey report, employees spend 1.8 hours every day — 9.3 hours per week, on average — searching for and gathering information. Put another way, a business hires five employees but only four show up to work; the fifth is off searching for answers, contributing no value.

Another report, by IDC in 2013, concluded that an enterprise employing 1,000 knowledge workers wastes $48,000 per week, or nearly $2.5 million per year, due to an inability to locate and retrieve information.

Bringing it all together, an article in HBR back in 2016, citing IBM research, asserted that the global yearly cost of poor data quality was a staggering $3.1 trillion. At the time, global GDP was estimated by The World Bank at $75.8 trillion, meaning poor data was costing about 4% of output. We know that poor data quality is one of the reasons for dark data and, as the article put it, “While most people who deal in data every day know that bad data is costly, this figure stuns.”

Risk costs – Dark data can pose risks in terms of insufficient cybersecurity, data breaches, compliance violations and data loss. These risks can result in reputational damage and financial consequences, with loss of private data potentially costing millions in fines.

Sustainability costs – Dark data is also a huge sustainability issue, consuming an estimated 5.8 million tonnes of CO2 annually. Each GB of data stored contributes annually:

  • 63g of CO2 emitted
  • 205 litres of water used
  • 0.7cm2 of land required
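Those per-GB figures make it easy to estimate the yearly footprint of any given store of dark data. A back-of-envelope sketch, using only the conversion constants listed above:

```python
# Back-of-envelope annual footprint of stored data, using the per-GB
# figures above: 63 g CO2, 205 L water, 0.7 cm^2 land per GB per year.
CO2_G_PER_GB = 63
WATER_L_PER_GB = 205
LAND_CM2_PER_GB = 0.7

def annual_footprint(gigabytes: float) -> dict:
    """Yearly CO2 (tonnes), water (litres) and land (m^2) for a data store."""
    return {
        "co2_tonnes": gigabytes * CO2_G_PER_GB / 1e6,      # g -> tonnes
        "water_litres": gigabytes * WATER_L_PER_GB,
        "land_m2": gigabytes * LAND_CM2_PER_GB / 1e4,      # cm^2 -> m^2
    }

# Example: 1 PB (1e6 GB) of dark data held for one year
fp = annual_footprint(1e6)
print(f"{fp['co2_tonnes']:.0f} t CO2, "
      f"{fp['water_litres']:.0f} L water, {fp['land_m2']:.0f} m^2 land")
```

A single petabyte of forgotten data thus accounts for roughly 63 tonnes of CO2 a year on these figures — emissions incurred without a single byte being read.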

Dark data in real estate

In real estate, the estimated level of dark data is even higher. For example, it is estimated that over 90% of IoT data generated by smart buildings goes unused.

Yet in real estate, dark data represents a huge opportunity to improve operational efficiency, reduce risk of failures and, importantly, address the immense carbon emissions that buildings generate.

It is widely reported that “smart” real estate could cut 30% or more from energy costs and carbon emissions if it used data more effectively. There is also compelling evidence that smarter buildings earn more for landlords in rent (8% higher) and capital values (11-15% higher). So, we have to question why there is so much dark data and what we can do about it.

Reasons for dark data

Buildings are uniquely designed, large systems with many moving parts, the culmination of years-long projects with a myriad of contributing actors. Construction is a complex, messy process subject to last-minute design changes, part substitution and cost engineering. Against this backdrop, it is not surprising that the data originating from these assets is not well documented, connected or even available. According to IBM, some of the reasons for dark data below will sound very familiar to people working in the real estate and construction industries:

  • Lack of awareness – The owners, users or managers of buildings simply don’t know what data is being collected and stored.
  • Stuck in silos – When different stakeholders, suppliers and managers working within a building collect and store data independently, it leads to data fragmentation, isolation and access issues.
  • Legacy systems – Data produced and stored in legacy systems goes dark if it can’t be integrated with modern analytics tools or working methods. This is especially true of machine data, which can lose its value quickly if not used.
  • Incomplete integration – Incomplete or ineffective data integration processes can result in data gaps and inconsistencies that limit the future use of that data. Competing standards in real estate, and the differing views of consultants and integrators as to which standards to adopt, amplify this.
  • Quality and standardisation – Poor data quality, such as inaccurate or incomplete records, can lead to data being discounted or ignored. Data perceived as unreliable or badly described is less likely to be utilised, effectively rendering it dark.

Real estate and construction lag in adopting advanced technologies despite their benefits, facing the general challenges of digital transformation: resistance to change, skills gaps and a cautious stance on access to data. Construction ranks second to last in digitalisation, with only agriculture being a poorer adopter of technology.

Yet the relatively slow rate of adoption brings one benefit: these sectors carry a lower technology debt than most others. In this light, new technologies can be adopted quickly without organisations having to write off earlier investments.

Leveraging dark data for better efficiency

Given the low rate of technology adoption and the opportunities that new technologies such as agentic AI present, it seems clear that the savings and efficiency levels real estate needs in order to meet ESG goals, improve stakeholder satisfaction and reduce waste must come from unused data. If data is the new oil, as is frequently said, then we need to improve our distillation techniques and extract more value from the precious commodity now being created in vast amounts.

To do this, as an industry, we need to equip ourselves with automation tools and imagine the services we want to implement with them. This requires insight at the crossroads of business and technology, demanding more than expertise in smart technologies: it requires knowledge of relationship management, contracting, resource management and security. If we are to truly grasp the nettle of AI-based automation, the new mantra for real estate shouldn’t be location, location, location – it should be data, data, data.
