Data Hoarding: a ticking time bomb?

The amount of information created by organisations is growing at an ever-increasing rate and shows no sign of abating, not least as more organisations have accelerated their digital transformation strategies in light of the Covid crisis.

Neil Maude, Director of Technology for Arena, looks at some of the trends, issues and what you can do to mitigate your business risk.

Documents and Data are on the rise 

“Once upon a time, records meant paper documents.  They lived in filing cabinets…”.  So starts the introduction to an Association for Information and Image Management (AIIM) report on document retention practices published over ten years ago.  This report highlighted that we are keeping more and more information when we shouldn’t be. And things have only accelerated since then, not least over the last few months of remote working.

HMRC rules state that we must keep financial records for the current tax year and the preceding 6 years.  For many organisations, this leads to an annual clear-out in April. As one year is closed off, the oldest goes in the shredder (I hope it doesn’t just go in the bin!).  This is a reasonably quick process, as you know that the box marked, for instance, “2009-10” can be destroyed.
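
The April clear-out can be expressed as a simple date calculation. The sketch below is illustrative only: it assumes the tax year runs from 6 April to 5 April and that each box is labelled by the year in which its tax year started. The function names are made up for this example, and your own accounting period may differ.

```python
from datetime import date

# Number of preceding tax years to keep, per the rule described above.
RETENTION_YEARS = 6


def tax_year_start(day: date) -> int:
    """Return the calendar year in which the tax year containing `day` began.

    Assumption: the tax year runs from 6 April to 5 April.
    """
    return day.year if (day.month, day.day) >= (4, 6) else day.year - 1


def box_label(start_year: int) -> str:
    """Format a tax year as a box label, e.g. 2009 becomes '2009-10'."""
    return f"{start_year}-{(start_year + 1) % 100:02d}"


def can_destroy(box_start_year: int, today: date) -> bool:
    """True if the box is older than the current tax year plus the preceding six."""
    oldest_kept = tax_year_start(today) - RETENTION_YEARS
    return box_start_year < oldest_kept


if __name__ == "__main__":
    today = date(2016, 4, 6)  # first day of the 2016-17 tax year
    for year in (2009, 2010):
        print(box_label(year), "destroy" if can_destroy(year, today) else "keep")
```

Under these assumptions, the box marked “2009-10” becomes eligible for the shredder on 6 April 2016, and not a day earlier.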

However, what about documents that have different retention periods within the same file?  HR records are the classic example – you may have to keep pension records for a very long time, whilst you are supposed to remove disciplinary records quite quickly (retain only for 1 year).  This is a much more onerous task – weeding out particular documents from some files and retaining others.  It is easy to foresee that this painful task might be put off for a “rainy day”.
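
To show why mixed files are harder, here is a minimal sketch of weeding a single HR file where each document type carries its own retention period. The 1-year figure for disciplinary records is taken from the text above; everything else (the `Document` class, the `weed_file` function, and the use of `None` to mean "no automatic disposal" for pension records) is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Retention in whole years per document type. The 1-year figure comes from
# the text; None is a placeholder meaning "no automatic disposal".
RETENTION_YEARS: dict[str, Optional[int]] = {
    "disciplinary": 1,
    "pension": None,
}


@dataclass(frozen=True)
class Document:
    name: str
    doc_type: str
    filed_on: date


def is_expired(doc: Document, today: date) -> bool:
    """True once the retention period for this document type has passed."""
    years = RETENTION_YEARS.get(doc.doc_type)
    if years is None:
        # Unknown or open-ended types are kept: the safe default.
        return False
    # Compare (year, month, day) tuples to avoid invalid dates such as 29 Feb.
    expiry = (doc.filed_on.year + years, doc.filed_on.month, doc.filed_on.day)
    return expiry <= (today.year, today.month, today.day)


def weed_file(docs: list[Document], today: date) -> tuple[list[Document], list[Document]]:
    """Split one file into (keep, dispose) lists."""
    keep = [d for d in docs if not is_expired(d, today)]
    dispose = [d for d in docs if is_expired(d, today)]
    return keep, dispose


if __name__ == "__main__":
    hr_file = [
        Document("warning letter", "disciplinary", date(2019, 3, 1)),
        Document("pension enrolment", "pension", date(2005, 1, 1)),
    ]
    keep, dispose = weed_file(hr_file, date(2020, 9, 1))
    print("keep:", [d.name for d in keep])
    print("dispose:", [d.name for d in dispose])
```

The point of the sketch is that the decision has to be made document by document, not box by box, which is exactly what makes the manual version so tedious.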

Digital data: out of sight, out of mind

Be warned! The convenience of digitisation can make the problem worse, rather than better. We used to be able to see the problem – simply running out of office space could be a compelling reason to do something about it.  But digital information is somewhat out of sight and even when the disks get full, it’s cheap to add more. 

Clearly, email and transactional records can be added to this digital data store, on top of the digitisation of paper records.

59 zettabytes (ZB) of data will be created, captured, copied, and consumed in 2020 - 2021

But it seems likely that this situation will get worse. IDC predicts that the amount of data created over the next three years will be more than the data created over the past 30 years, and that the world will create more than three times as much data over the next five years as it did in the previous five (i).

Organisations that have no process for information governance are not only failing to control the growth of data storage - they’re not classifying it or controlling security either. 

The tightening grip of the law

This deluge of data coincides with increasing levels of legislation and regulation.  Increasingly, poor data governance and data “leakage” are being punished – particularly under the EU's General Data Protection Regulation (GDPR) and the corresponding legislation in the Data Protection Act 2018, which is enforced in the UK by the Information Commissioner’s Office.  Many industries also carry their own specific codes of practice and regulations, such as those laid down by the Financial Conduct Authority (FCA). Each of these carries the potential for fines or even removal of the perpetrator’s ability to continue trading.

In particular, the Data Protection Act contains some clear guidance as to how long information should be held.  One of the central points is that data about an individual should not be kept longer than the minimum time required for the processing of that data (i.e. the purpose for which it is held in the first place).  In my opinion, this is an excellent starting place for information governance – if you don’t need it, don’t keep it.

Risky business

Besides the cost and the regulation aspect, why wouldn’t you keep some information? It might come in useful someday, mightn’t it?  And why does it need to be “governed”?

Well, there are some risks around keeping everything forever:

If you don’t have a decent structure for your data, you will waste time looking for what you think you might have. Locating a specific document can be painful if you don’t have a good policy to govern where you should be able to find it. Add to that some doubt as to whether the document is there at all (i.e. whether it actually exists) and it may be a total waste of time.  As your data stores become bigger, this will get worse – so if you’re not organising your data, you might as well not have it.

You might be legally compelled to find some information and that might be costly. For example, if you are defending a court case, you may be instructed to provide all the information related to a particular transaction or individual.  If you have a “digital landfill” (an AIIM term for an unmanaged bucket of data) this could be very expensive to do – it may include searching all of your hard drives and even backup tapes in order to find every relevant document or to prove the non-existence of data.  This might require external experts, who don’t work cheaply.

Steps towards risk mitigation 

  • Implement a robust information governance policy and processes. If your processes are compliant with relevant legislation, then you know where your information is and you cannot be required to provide information which your procedures state you no longer have (i.e. you can evidence the non-existence of the data).

  • Invest in technology. Implementing an electronic document management solution (EDMS) should allow you to classify information and to apply retention rules for different types of information.  This can automate your processes for disposal of digital documents.  Since AIIM research found that more than half of business documents were “born digital” (i.e. created on computer and then printed), this can be combined with a reduction in print costs (no more printing copies to file).

  • Consider e-mail. Tools are available to capture e-mails and subject them to automated classification and retention. This can be an entirely machine-driven process, taking place behind the scenes and independent of the storage of e-mails against the relevant business transactions – providing peace of mind without additional labour costs.

  • Look at what remains. Once the issues of documents and e-mails are resolved, then what remains is likely to be business transactional data (e.g. sales history in your accounts system).  This data is likely to be quite minimal and also less likely to contain the most highly regulated “personally identifiable” information.  So it’s less of a problem.  Depending on your industry you may also have specific data types to manage – such as phone recordings.  But some of the main challenges will have been tamed by the actions above.

  • Get started now. If you are not already actively managing your information, chances are that you will have a large data set to tackle and, as we’ve seen, it’s getting bigger all the time. Unfortunately, retrospective classification of data is difficult and labour-intensive. Automation (e.g. pattern-based classification) can get you part of the way, but chances are that you’ll end up holding some of your existing data in a part of your system called “archive” – at some point you may be able to get rid of it, but only when you’re sure that all of the different types of data stored there are no longer required.  So the sooner you start, the less painful this problem becomes.
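
As a rough illustration of the pattern-based classification mentioned in the last step, the sketch below assigns a document type by matching text against a short list of regular expressions and sends anything unmatched to “archive”. The patterns and labels are hypothetical and deliberately simplistic; this is not how any particular EDMS product works, and a real rule set would need to be built and tested against your own documents.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration only, checked in order.
PATTERNS = [
    ("invoice", re.compile(r"\binvoice\s+(no|number)\b", re.IGNORECASE)),
    ("disciplinary", re.compile(r"\bdisciplinary\b", re.IGNORECASE)),
    ("pension", re.compile(r"\bpension\b", re.IGNORECASE)),
]

# Anything that matches no pattern lands here for later manual review.
FALLBACK = "archive"


def classify(text: str) -> str:
    """Return the label of the first matching pattern, else the fallback."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    return FALLBACK


def summarise(texts: list[str]) -> Counter:
    """Count how many documents fall under each label."""
    return Counter(classify(t) for t in texts)


if __name__ == "__main__":
    sample = [
        "Invoice No. 1234 for consultancy services",
        "Notes from disciplinary hearing",
        "Pension scheme enrolment form",
        "Untitled scan",
    ]
    for label, count in summarise(sample).items():
        print(f"{label}: {count}")
```

Once a document has a type, a retention rule can be attached to it. The size of the “archive” count is a useful measure of how much manual classification work is still left.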


My late grandfather, an engineer by profession who lived through the “make do and mend” years following the great wars, had a clear rule regarding the items he kept in his workshop.  “Keep everything for 7 years.  If it hasn’t come in useful, just keep it for another 7…”.  This made him a very useful man to know if you needed a 3/8” bolt for your lawnmower – if you could wait long enough for him to find just the right bolt.  But it was his workshop and his bolts.  Hoarding didn’t create any problems, except an ever-decreasing space to actually do any work.

But hoarding information in this way isn’t viable for a modern business.  Even though the problem is out of sight, it’s subject to laws and brings unexpected costs.  Fortunately there are tools to solve these problems and the sooner you get started, the less it will cost. 

And you never know, being able to locate information quickly may pay back far more than the investment you make to keep it organised.  After all, this is why we keep information in the first place.

About Neil Maude

Director of Technology | Tel: 0344 863 8000

Neil joined the Arena Group in 2006 and has over 25 years of experience in the electronic document management industry, working with both private and public sector customers.  Neil sits on Arena’s board of directors and manages the delivery operations of Arena’s EDM business.  His team spend their time developing software, implementing solutions for customers and providing after-sales software support services, both in the UK and internationally.