Data may be the new gold – but, if so, many businesses are having a Midas moment, realising that too much of a good thing can be very, very bad
King Midas is infamous for being rewarded by the Greek god Bacchus with the ability to turn everything he touched to gold. Initially delighted with the reward, Midas set about turning rags to riches before discovering that his gift was, in fact, a curse. Unable even to eat or drink, the king was miserable until Bacchus freed him from his burden.

What’s this got to do with data? Well, for many businesses, the narrative is the same. Starved of data on which to base their decisions, businesses towards the end of the last century were hungry for actionable insight, placing a high value on any information that could give them a business advantage.
This kickstarted a boom in data collection, which drove a raft of benefits. Shops that could previously only work out from their till receipts that ‘someone’ had bought ‘something’ for 99p introduced data capture solutions that first told them exactly what had been bought, through barcoding; then told them who bought it, through loyalty cards; and then, by linking customer data to online activity, let them build customer profiles that showed not only what people bought but what they looked at too. And for how long. And what else they considered.
This heralded a golden age of data insight, where the possibilities seemed endless. Retailers could better predict purchasing habits, helping to get the right stock on the shelves at the right time and reducing food waste. Doctors got more detailed medical records on which to make diagnoses. SatNavs could route us round traffic incidents and get us to work on time. Entertainment companies could predict our likes and recommend music or films for us to enjoy. It was the peak rags-to-riches moment.
Too much of a good thing
But, just as with Midas, many organisations are now finding that getting all they wanted – and more – can be a double-edged sword. Data is only powerful if it’s accurate. So, if data is missing, corrupted or unavailable, the whole system falls down. And the more data there is, the harder it is to manage. It’s like the puzzle games you might play on your phone: easy to begin with, but as the tiles start falling faster and in greater numbers, you ultimately get overwhelmed.
In a world where it’s almost impossible to touch anything without creating data, many enterprises now have more data coming into their networks than any human team can monitor.
Drowning in information, businesses find that data-driven decision making becomes problematic. Can you make a good healthcare diagnosis when the results of some tests are missing? Can you quickly and safely route traffic without the data for some key roads?
Often the easiest option for businesses is to focus on the data that they know to be high-value and to ‘park’ everything else in a cloud storage vault, to assess and deal with later. This is one reason why Veritas research revealed that just 16% of business data is ‘actionable’ and being used, while the rest is either ‘ROT’ (Redundant, Obsolete or Trivial) or ‘dark’ (the team storing it doesn’t know what it is).
Storing all of this unused data comes at a cost – and not just a financial one. The servers that store this data, on a global scale, require huge amounts of electricity, which creates enormous volumes of carbon pollution. Veritas calculated that, in 2020 alone, the storage of dark data contributed 5.8 million tonnes of CO2 to the Earth’s atmosphere. That’s the same carbon footprint as 80 of the world’s countries put together.
So how do we change this? Well, the plan of coming back to assess this ‘parked’ data only works if there’s a material change to circumstances. Either the company needs to stem the incoming flow of data, or it needs more resources to deal with it. According to IDC, data volumes are far from decreasing. In fact, the analyst house predicts continued data growth at a CAGR of 23%. Recent Veritas research also highlighted that businesses lack the IT specialists to cope with even the most critical actions. The average enterprise stated that they would need to hire an additional 22 members of staff to bring their data protection up to speed, let alone sort out their wider data management issues.
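To put IDC’s 23% figure in perspective, a quick sketch of the standard compound-growth arithmetic (the 23% CAGR is from the article; the doubling-time formula is ordinary maths, not an IDC calculation) shows that data volumes roughly double every three and a half years at that rate:

```python
import math

cagr = 0.23  # IDC's projected compound annual growth rate for data volumes

# Years for data volumes to double at a constant compound growth rate:
# (1 + cagr) ** t = 2  ->  t = ln(2) / ln(1 + cagr)
doubling_time = math.log(2) / math.log(1 + cagr)
print(f"Data volumes double roughly every {doubling_time:.1f} years")

# Cumulative growth factor over a decade at the same rate
decade_factor = (1 + cagr) ** 10
print(f"Ten years at 23% CAGR multiplies data volumes by {decade_factor:.1f}x")
```

In other words, a business that does nothing can expect to be managing roughly eight times as much data in ten years’ time.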
Teaming with technology
So, what of this dark data piling up so fast that you’d have to be superhuman to ever get through it? Well, the answer is, perhaps, less about one person with superpowers and more about a team with augmented skills. While people are great at creativity and decision making, technology is great at processing large volumes of information at speed. Harnessing Artificial Intelligence (AI) and Machine Learning (ML), and using them to augment the skills of the existing IT team, is the route not just to retaining good data-driven decision making, but also to reducing the environmental impact of data storage.
This is called Autonomous Data Management and it relies on technology platforms learning data-management practices and independently applying them to new data sets. Applying these policies is historically a manual task. Someone has to tell a system where data is to be stored, how it’s used and when, ultimately, it needs to be deleted. Doing this on a micro basis, item by item, is time consuming so, often, organisations take a more blanket approach to data management, implementing a policy ‘for all data created in Europe’, for example. This is how you get a build-up of unused – and probably unusable – data that sits forever on unaccessed servers that slowly, and unnecessarily, consume electricity.
But, when Autonomous Data Management takes over, AI can enable proactive decision making and policy application at a much more granular level. It can learn the idiosyncrasies of different data types and apply whatever storage, protection or deletion policies make sense. So, when new data is created, it will automatically be protected, stored securely, have its access limited, and be deleted at the right time.
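The difference between a blanket rule and granular, per-data-type policies can be sketched in a few lines. This is a purely hypothetical toy rule engine, not a Veritas product API; the data types, retention periods and storage tiers are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical illustration: per-data-type policies applied automatically,
# rather than one blanket rule "for all data created in Europe".

@dataclass
class DataItem:
    name: str
    data_type: str   # e.g. "invoice", "log", "medical_record" (illustrative)
    created: datetime

# Illustrative policies: retention period and storage tier per data type.
POLICIES = {
    "invoice":        {"retention_days": 7 * 365,  "tier": "archive"},
    "log":            {"retention_days": 90,       "tier": "cold"},
    "medical_record": {"retention_days": 30 * 365, "tier": "secure"},
}

def apply_policy(item: DataItem, now: datetime) -> str:
    # Unknown types fall back to a default policy flagged for human review.
    policy = POLICIES.get(item.data_type,
                          {"retention_days": 365, "tier": "review"})
    age = now - item.created
    if age > timedelta(days=policy["retention_days"]):
        return "delete"  # past retention: free the storage
    return f"store:{policy['tier']}"

now = datetime(2022, 1, 1, tzinfo=timezone.utc)
old_log = DataItem("server.log", "log", now - timedelta(days=200))
print(apply_policy(old_log, now))  # a 200-day-old log is past its 90 days
```

The point of the sketch is the granularity: each item gets the decision its type deserves, instead of every item inheriting one coarse, region-wide rule.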
Reducing the data load
From a sustainability perspective, this can help to radically reduce the volume of data stored and the pollution associated with it. Not only can businesses delete the data that they know for sure is not needed, they’re also able to reduce the storage space that they need by optimising the way that data is held.
For example, lots of information that businesses hold is duplicated multiple times. If I have a contract and I email my colleague a copy, then not only do I have the original document, but I now have a copy in my sent items folder in my email. And my colleague now has one in their inbox. If I “CC” someone in Legal, someone in Finance and the three members of my team who work on that account, then that’s now eight copies of the same file that are all being stored, probably for years, on our company servers.
In a dark data environment, each of those files needs to be kept separately because no one knows that they’re the same document. It’s like having eight sealed envelopes: until you look inside, you can’t know if the letters they contain are the same or different. With Autonomous Data Management, technology is used to monitor files across the whole enterprise, indexing which data is the same, storing only unique data, and replacing duplicates with links to the original versions.
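The mechanics can be sketched with content hashing: files with identical bytes produce the same digest, so only the first copy’s bytes need to be kept, and every later copy becomes a lightweight reference. This is an illustrative sketch of the general technique, not a Veritas implementation:

```python
import hashlib

# Illustrative content-addressed deduplication: identical bytes hash to the
# same digest, so only one physical copy is stored.

store = {}       # digest -> the unique content actually kept on disk
references = {}  # filename -> digest (a lightweight "link" to the original)

def ingest(filename: str, data: bytes) -> None:
    digest = hashlib.sha256(data).hexdigest()
    if digest not in store:
        store[digest] = data       # first copy: store the bytes once
    references[filename] = digest  # every copy: store only a reference

# The emailed-contract scenario: the same file lands in several mailboxes.
contract = b"...contract terms..."
for name in ["original.pdf", "sent-items.pdf",
             "inbox-legal.pdf", "inbox-finance.pdf"]:
    ingest(name, contract)

print(len(references), "files tracked,", len(store), "unique copy stored")
```

Four (or eight) apparent files collapse to one stored copy plus a handful of tiny references, which is where the storage and energy savings come from.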
This “deduplication” is especially useful for backup data, where solutions driven by Autonomous Data Management are sometimes able to reduce the amount of power required to store the data, and the associated CO2 emissions, by around 95%.
From a business perspective, it means that data risks can be minimised – or even eliminated. The deluge of data that businesses have not been able to address is vulnerable. Veritas research shows that organisations implementing digital transformation projects during the pandemic expected a two-year lag between deploying new applications and having protection in place to secure them.
That’s two years of being vulnerable to ransomware. Two years of potential compliance breaches. Two years of risk that could be quickly banished with Autonomous Data Management.
Autonomous Data Management has the potential to restore the power of big-data decision making and to put businesses back in control, heralding a new “golden age” of data.
About the Author
Mark Nutt is Senior Vice President at Veritas Technologies LLC. Veritas Technologies is a leader in multi-cloud data management. Over 80,000 customers—including 87% of the Fortune Global 500—rely on us to help ensure the protection, recoverability and compliance of their data. Veritas has a reputation for reliability at scale, which delivers the resilience its customers need against the disruptions threatened by cyberattacks, like ransomware. No other vendor is able to match Veritas’ ability to execute, with support for 800+ data sources, 100+ operating systems, 1,400+ storage targets and 60+ clouds through a single, unified approach.
Featured image: ©Chartphoto