Data may be the new gold – but, if it is, many companies are having a Midas moment, when they realize that too much of a good thing can be very, very bad.
King Midas is famed for being rewarded by the god Dionysus with the ability to turn anything he touched into gold. Initially delighted with the reward, Midas set about turning rags into riches before discovering that his gift was, in fact, a curse. Unable even to eat or drink, the king was miserable until he was freed from his burden. What does this have to do with data? Well, for many companies the narrative is the same. Deprived of data on which to base their decisions, companies at the end of the last century were hungry for actionable insight, placing a high value on any information that could give them a commercial advantage.
This sparked a boom in data collection, which brought many benefits. Stores that previously could only determine from their till receipts that ‘someone’ had bought ‘something’ for 99 pence introduced data capture solutions that first told them exactly what had been bought, via barcodes, and then who had bought it, by means of loyalty cards. Then, by linking customer data to online activity, retailers were able to create customer profiles that showed not only what people were buying, but also what they were looking at. And for how long. And what else they considered.
It heralded a golden age of data literacy, where the possibilities seemed endless. Retailers could better predict shopping habits, help get the right stock on shelves at the right time, and reduce food waste. Doctors obtained more detailed medical records on which to make diagnoses. SatNavs could guide us around traffic incidents and get us to work on time. Entertainment companies could predict our tastes and recommend music or movies for us to enjoy. It was a genuine rags-to-riches story.
Too much of a good thing
But, just like with Midas, many organizations are now finding that getting everything they wanted – and more – can be a double-edged sword. Data is only powerful if it is accurate. So, if any data is missing, corrupted, or unavailable, the decisions built on it break down. And the more data there is, the more difficult it is to manage. It’s like the puzzle games you might play on your phone: easy at first, but as the tiles start falling faster and in greater numbers, you end up getting overwhelmed.
In a world where it’s nearly impossible to touch anything without creating data, many companies now have more data entering their networks than any human team can monitor.
Drowning in information, organizations find that data-driven decision-making becomes problematic. Can you make a correct medical diagnosis when certain test results are missing? Can you route traffic quickly and safely without key route data?
Often the easiest option for a business is to focus on the data it knows to be of high value, then “park” everything else in a cloud storage vault, planning to assess its importance and deal with it later. This is one of the reasons why Veritas research found that only 16% of enterprise data is “actionable” and used, while the rest is either “ROT” (redundant, obsolete, or trivial) or “dark” (the organization storing it doesn’t know what it contains).
Storing all this unused data has a cost, and not just a financial one. The servers that store this data, globally, require huge amounts of electricity, which creates huge amounts of carbon pollution. Veritas calculated that in 2020 alone, dark data storage pumped 5.8 million tonnes of CO2 into the Earth’s atmosphere. That’s a carbon footprint equal to that of 80 of the world’s countries combined.
So how can we change that? Well, the plan of coming back to assess that “parked” data only works if there is a significant change in circumstances: either the business needs to stem the inflow of data, or it needs more resources to deal with it. According to IDC, data volumes are far from decreasing. In fact, the analyst house predicts continued data growth at a CAGR of 23%. Recent research from Veritas has also highlighted that companies lack the IT specialists needed even for the most critical tasks. The average business said it would need to hire 22 more people just to bring its data protection up to scratch, let alone address its broader data management issues.
Teaming up with technology
So what about that dark data that’s piling up so fast you’d have to be superhuman to sift through it? Well, maybe the answer is less about a person with superpowers and more about a team with augmented skills. Where people are good at creativity and decision-making, technology is good at processing large amounts of information quickly. Harnessing artificial intelligence (AI) and machine learning (ML) and using them to augment the skills of the existing IT team is the path to not only retaining good data-driven decision-making, but also reducing the environmental impact of data storage.
This is called autonomous data management, and it relies on technology platforms learning data management practices and applying them independently to new datasets. Enforcing these policies has historically been a manual task: someone has to tell a system where data should be stored, how it can be used, and when, ultimately, it should be deleted. Doing this at the micro level, item by item, takes time. As a result, organizations often take a broad-brush approach to data management instead, applying a single blanket policy to, say, all data created in Europe. This is how you end up with an accumulation of unused – and probably unusable – data that sits indefinitely on rarely touched servers, slowly and unnecessarily consuming electricity.
But, when autonomous data management takes over, AI can enable proactive decision-making and policy enforcement at a much more granular level. It can learn the idiosyncrasies of different data types and apply the storage, protection, or deletion policies that make sense. So when new data is created, it will be automatically protected, securely stored, have limited access, and be deleted at the right time.
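To make the idea concrete, here is a minimal sketch of per-category policy enforcement of the kind described above. The categories, retention periods, and field names are all illustrative assumptions, not the behaviour of any real product; in a real system the category would be inferred by a trained classifier rather than supplied by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical per-data-type lifecycle policies (illustrative values only).
POLICIES = {
    "contract":  {"encrypt": True,  "retain_days": 365 * 7, "restricted": True},
    "log":       {"encrypt": False, "retain_days": 90,      "restricted": False},
    "marketing": {"encrypt": False, "retain_days": 365,     "restricted": False},
}

@dataclass
class DataItem:
    name: str
    category: str        # in practice, inferred automatically by an ML classifier
    created: datetime

def apply_policy(item: DataItem) -> dict:
    """Attach storage, access, and deletion rules the moment data is created."""
    # Unknown categories fall back to a cautious default: encrypt,
    # restrict access, and schedule early review/deletion.
    policy = POLICIES.get(
        item.category,
        {"encrypt": True, "retain_days": 30, "restricted": True},
    )
    return {
        "name": item.name,
        "encrypt": policy["encrypt"],
        "restricted": policy["restricted"],
        "delete_after": item.created + timedelta(days=policy["retain_days"]),
    }

plan = apply_policy(DataItem("msa.pdf", "contract", datetime(2022, 1, 1)))
```

The point of the sketch is the shape of the decision, not the rules themselves: every new item gets protection, access limits, and a deletion date at creation time, rather than waiting for a manual review that may never happen.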
Reduced data load
From a sustainability perspective, this can help drastically reduce the volume of data stored and the pollution associated with it. Not only can businesses delete data they know for sure is not needed, but they can also reduce the storage space they need by optimizing how data is retained.
For example, much of the information held by companies is duplicated several times over. If I have a contract and email a copy to my colleague, not only do I have the original document, but I now have a copy in my email’s Sent folder, and my colleague has one in their inbox. If I “Cc” someone from the legal department, someone from finance, and the three members of my team who work on this account, that makes eight copies of the same file, all stored, probably for years, on our company’s servers.
In a dark data environment, each of these files must be stored separately, because no one knows they are the same document. It’s like having eight sealed envelopes: until you look inside, you can’t tell whether the letters inside are the same or different. With autonomous data management, the technology monitors files across the enterprise, indexing identical data, storing only unique data, and replacing duplicates with links to the original versions.
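The mechanism behind this is content-addressed deduplication: hash each file’s bytes, keep one stored copy per unique hash, and let every duplicate path point at that copy. A minimal sketch (file paths and contents are invented for illustration):

```python
import hashlib

def dedupe(files: dict) -> tuple:
    """Store each unique payload once; map every path to its stored copy.

    files maps path -> raw bytes. Returns (store, links), where store maps
    a SHA-256 digest to the single retained copy of that content, and
    links maps each original path to a digest (a pointer, not a copy).
    """
    store = {}   # digest -> unique content ("the opened envelope")
    links = {}   # path   -> digest (a link replacing the duplicate)
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep the first copy only
        links[path] = digest
    return store, links

contract = b"Master services agreement, v3 ..."
store, links = dedupe({
    "me/Documents/contract.pdf":   contract,
    "me/Sent/contract.pdf":        contract,
    "colleague/Inbox/contract.pdf": contract,
    "legal/Inbox/contract.pdf":    contract,
})
# Four paths, one stored copy: storage shrinks while every
# "owner" can still reach the document through their link.
```

Identical bytes always hash to the same digest, which is how the system can “look inside the envelopes” at scale without a human comparing documents.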
This “deduplication” is particularly effective for backup data, where autonomous-data-management-driven solutions are sometimes able to reduce the amount of energy needed to store the data, and the associated CO2 emissions, by around 95%.
From a business perspective, this means that data risk can be minimized – or even eliminated – across the network. The deluge of data that companies have been unable to process is vulnerable. Veritas research shows that organizations implementing digital transformation projects during the pandemic expected a two-year lag between rolling out new apps and putting protection in place to secure them.
That’s two years of vulnerability to ransomware. Two years of potential compliance failures. Two years of risk that could be quickly banished with autonomous data management.
Autonomous data management has the potential to restore decision-making powers over big data and put businesses back in control, heralding a new “golden age” of data.
About the Author
Mark Nutt is Senior Vice President at Veritas Technologies LLC. Veritas Technologies is a leader in multi-cloud data management. More than 80,000 customers, including 87% of the Fortune Global 500, rely on Veritas to ensure their data is protected, recoverable and compliant. Veritas has a reputation for large-scale reliability, providing the resilience its customers need against disruptions threatened by cyberattacks such as ransomware. No other vendor can match the execution capability of Veritas, with support for over 800 data sources, over 100 operating systems, over 1,400 storage targets and 60+ clouds through a single, unified approach.
Featured Image: ©Chartphoto