Defining Grey Data: What Is It and What’s the Potential Value?

Bill Tolson


Although the information governance (IG) profession has long recognized the existence of dark data, it has yet to address the issue of grey data. Over 80% of an organization’s information is unstructured data, most of which is controlled directly by individual employees on their own workstations and file shares. As a result, this data is mostly unmanaged.

In fact, a 2012 Compliance, Governance & Oversight Council (CGOC) survey revealed that in the average enterprise data store, 1% of data was subject to litigation hold, 5% was subject to regulatory retention, and 25% had some business value and was worth retaining, leaving 69% potentially valueless.

This remaining 69%, the grey data, has been suggested as a prime target of defensible disposition. However, having worked in the IG consulting industry for many years, I know that much of this so-called “valueless” data still has value to an organization, and that disposing of it could create issues, or at least hurt end-user productivity down the line.

One example is departed employee data. Many legal departments now capture and quarantine departing employee data for a period of time corresponding to the local statute of limitations for wrongful termination lawsuits. In these cases, the legal department wants access to all of the ex-employee’s files and emails to refer back to in the event of a lawsuit. Data value is in the eye of the beholder.

The point is this: a percentage of this “valueless” data can still retain value (and affect productivity) for individual employees. As an example, Gartner Research used to track how many times per year the average employee searched for a previously saved file for reference or reuse, how much time they spent looking for it, and, if they couldn’t find it, how much time they spent recreating the data. These calculations produced an annual “wasted” productivity number that could be assigned a dollar amount.
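To make that kind of calculation concrete, here is a minimal sketch in Python. It is not Gartner’s actual method or data: the formula, the function name annual_wasted_cost and every input value below are hypothetical placeholders, chosen only to show how search time and recreation time might be combined into a yearly dollar figure per employee.

```python
def annual_wasted_cost(searches_per_year, minutes_per_search,
                       failed_searches_per_year, minutes_to_recreate,
                       hourly_cost):
    """Estimate the yearly cost, per employee, of hunting for saved files.

    All parameters are illustrative assumptions, not published figures.
    """
    # Time spent looking for previously saved files.
    search_minutes = searches_per_year * minutes_per_search
    # Time spent recreating files that were never found.
    recreate_minutes = failed_searches_per_year * minutes_to_recreate
    # Convert total minutes to hours, then to a dollar amount.
    wasted_hours = (search_minutes + recreate_minutes) / 60
    return wasted_hours * hourly_cost


# Hypothetical inputs: 100 searches a year at 6 minutes each,
# 10 of them unsuccessful, 60 minutes to recreate each lost file,
# and a loaded labor cost of $50 per hour.
cost = annual_wasted_cost(100, 6, 10, 60, 50.0)
print(f"{cost:.2f}")  # prints 1000.00
```

Multiplying a per-employee figure like this by headcount gives a rough sense of the organization-wide cost, though the result is only as good as the assumed inputs.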

A major issue with huge, unmanaged volumes of corporate grey data is finding particular information when you need it. Locating content that has not been managed or indexed is costly for companies and greatly reduces end-user productivity. A former CEO of mine, T. M. Ravi of Mimosa Systems, once observed that “[i]t costs up to 500 times more to find and utilize information once, than to store it untouched for 20 years.” The key is not to let your company’s unstructured data become abandoned and unmanaged.



