Four Steps to Taming the Data Explosion with Information Management

Drowning in the Data Deluge

The amount of data in the world is predicted to rise to 44 zettabytes by 2020, according to Northeastern University. To put that in perspective, one zettabyte is equivalent to one trillion gigabytes, so 44 zettabytes is 44 trillion gigabytes. Meanwhile, 2.5 exabytes are produced every day, equivalent to 250,000 Libraries of Congress.
Enterprises are a key driver of this wave, with individual companies now producing several terabytes of electronic information each year. Email is a prime culprit. In 2012, total email traffic worldwide was a whopping 144 billion messages per day.
The net result is that corporations across the world are seriously struggling to store, understand and manage their information. As key industries including financial services, property, and pharmaceuticals become increasingly regulated, data not only needs to be properly stored and easily accessible, but it also needs to meet stringent records management and discovery obligations.
By getting a firm grip on their data, businesses are in a better position to meet legal requirements, and will benefit from the resulting advantages associated with organizing their data, such as reduced storage costs and increased efficiency.

Four Steps to Information Management

Start with Policy

As a first step to taming the data explosion, corporations need to apply data management policies. Focusing an organization’s attention on managing unstructured data can be a daunting task, but putting a governance structure around the problem allows organizations to develop a process-based, technology-enabled, and targeted approach to keeping what they need and disposing of the rest.
Existing policies and processes are the best place to start to guide the strategic direction of a data management project. A company’s approach will vary based on its organizational priorities and structure, and each organization’s problem areas will differ: one may be managing high-risk or high-volume data repositories, another masses of data slated for migration to the cloud. Organizations can avoid completely reinventing the wheel by seeking out best-practice models to serve as a frame of reference.

Tame with Technology

Policies on their own won’t bring all your unstructured data into line. It’s certainly important to know what information you own and understand its value to the organization, but there’s typically too much information to examine piece by piece and make individual determinations. Enterprises that have successfully tamed their data have typically enforced policy-backed goals with a documented process and technology.
“Crawling” technology can assist legal or IT teams in locating information and documenting how the data and repositories are organized. Data visualization tools can provide an idea of the type of content you have. Technology-assisted review tools can be trained to “learn” and “assess” whether documents are of value or not.
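To make the idea of crawling concrete, here is a minimal sketch in Python of the simplest possible crawler: it walks a folder tree and tallies how many files of each type it finds and how much space they take up. The function name inventory and the sample files are assumptions made for illustration; commercial crawling tools do far more (permissions, metadata, duplicates, content analysis).

```python
import os
import tempfile
from collections import Counter
from pathlib import Path


def inventory(root):
    """Walk a directory tree; tally file count and total bytes per extension."""
    counts, sizes = Counter(), Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # Normalize case so ".PDF" and ".pdf" land in the same bucket.
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return counts, sizes


if __name__ == "__main__":
    # Build a tiny throwaway "repository" so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as root:
        Path(root, "contracts").mkdir()
        Path(root, "contracts", "lease.PDF").write_bytes(b"x" * 10)
        Path(root, "notes.txt").write_text("hello")
        Path(root, "README").write_text("hi")
        counts, sizes = inventory(root)
        for ext, n in counts.most_common():
            print(f"{ext}: {n} file(s), {sizes[ext]} bytes")
```

Even a rough tally like this can show where the volume sits before any decision is made about what to keep.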
In addition to helping inform governance decisions, these tools can apply compliance functions to unstructured information. Whether you’re dealing with personally identifiable information, protected health information or data subject to cross-border scrutiny, characterization and predictive coding tools can help determine whether certain data needs special attention.
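As a rough illustration of how a tool might flag data that needs special attention, the Python sketch below checks text against two simple patterns. The pattern names, the patterns themselves and the flag_sensitive function are hypothetical and deliberately naive; real characterization and predictive coding tools rely on much richer detection than a pair of regular expressions.

```python
import re

# Hypothetical, deliberately simple patterns chosen for illustration only.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def flag_sensitive(text):
    """Return the sorted names of the patterns that match anywhere in text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


if __name__ == "__main__":
    sample = "Contact jane@example.com regarding claim 123-45-6789."
    print(flag_sensitive(sample))
    print(flag_sensitive("Quarterly facilities report"))
```

A document that comes back with one or more flags would be routed for closer handling; one that comes back empty is not proven safe, only not caught by these particular patterns.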

Characterize, Classify, Act

Given the amount of information typically at hand, organizations must be willing to sacrifice exact classification in favor of characterization. Characterization aims to sort information within a margin of risk tolerance that the organization defines and tailors to the information at issue. Characterization allows you to associate information with general retention policies and procedures, and then consider taking action on it.

After characterization or classification, corporations need to act on that information. All actions should be defined, documented and communicated to any information stakeholders through a communication plan, audience-appropriate training and on-demand support. Information stakeholders include everyone involved in both the data governance initiative and in the daily creation and use of unstructured information.
Taking action on information assets without the adequate involvement of the information stakeholders is sure to raise issues. The same tools that can help to identify, classify, and characterize the information also may be able to act upon it. Depending on the tool and the desired action, many tools can move or remove selected data at the push of a button. But action should be taken only according to your overarching plan.
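One way to keep push-button actions tied to the plan is to rehearse them first. The Python sketch below is a minimal illustration, not a description of any particular product: the apply_plan function and the (action, source, destination) plan format are assumptions. By default it only logs what it would do, so stakeholders can review the log before the same plan is run for real.

```python
import shutil
from pathlib import Path


def apply_plan(plan, dry_run=True):
    """Apply (action, source, destination) steps; return a log of each step.

    With dry_run=True nothing on disk changes, so the log can be reviewed
    and signed off before the same plan is run for real.
    """
    log = []
    for action, source, destination in plan:
        source = Path(source)
        if action == "move":
            log.append(f"move {source} -> {destination}")
            if not dry_run:
                Path(destination).parent.mkdir(parents=True, exist_ok=True)
                shutil.move(str(source), str(destination))
        elif action == "delete":
            log.append(f"delete {source}")
            if not dry_run:
                source.unlink()
        else:
            # Refuse anything the documented plan does not define.
            raise ValueError(f"unknown action: {action}")
    return log
```

Calling apply_plan(plan) returns the log without touching any files; calling apply_plan(plan, dry_run=False) carries out the same steps.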

Importance of a Documented Process

Ultimately, you will need to decide what stays and what goes as a result of your analysis. Each category should be the result of a deliberate process — one that is well-defined and documented, and one that closely mirrors existing records retention approaches. Decisions should be driven by legal, business and risk considerations, and tempered by any technical limitations known to the organization.
Accountability should also be defined, documented and well-known. Information stakeholders should be aware of the cost, risk, and value inherent in creating and storing data. That information belongs to the organization, and whether it remains an asset or turns into a potential liability should be everyone’s responsibility.

Benefits of Information Management 

So what benefits can corporations expect from conquering their data? Taking action can reduce the cost of information storage, backup, and disaster recovery by reducing the amount of data that must be stored. Right-sizing an organization’s information repositories can also reduce the cost of complying with regulatory or legal requests, because well-organized and appropriately sized data are easier to navigate and cheaper to collect, process, and review.
A successfully implemented data management plan also reduces risk by reducing exposure. For example, by defensibly disposing of information that is not subject to a retention period or legal hold, rather than storing it indefinitely, organizations could reduce their exposure to future litigation or costly discovery requests. A well-run data management program can also increase business agility by making it easier to locate critical business information and improving reaction time. How many “final,” “really final” and “really, really final” revised documents do you need to sift through before you find the one you need?
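The disposal rule described above can be written down as a simple check. The Python sketch below is illustrative only: the Record fields and the eligible_for_disposal function are assumptions, and a real retention schedule would come from the organization’s own policy and legal advice. A record qualifies for disposal only when its retention period has run out and no legal hold applies.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    name: str
    created: date
    retention_days: int       # hypothetical: retention period set by policy
    legal_hold: bool = False  # hypothetical: set when litigation is anticipated


def eligible_for_disposal(record, today):
    """True only if the retention period has expired and no legal hold applies."""
    if record.legal_hold:
        return False
    return today > record.created + timedelta(days=record.retention_days)


if __name__ == "__main__":
    today = date(2020, 1, 1)
    records = [
        Record("old-invoice", date(2010, 1, 1), retention_days=7 * 365),
        Record("held-email", date(2010, 1, 1), retention_days=365, legal_hold=True),
        Record("new-contract", date(2019, 6, 1), retention_days=7 * 365),
    ]
    for record in records:
        print(record.name, eligible_for_disposal(record, today))
```

Writing the rule out this way also gives you something to document: the same check that drives disposal can be shown to a regulator or a court as evidence of a consistent process.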
In today’s information-fueled society, organizations cannot afford to be in the dark about their data. Creating written policies and well-understood workflows around data management is a must. Well-crafted and diligently implemented data management plans will help organizations avoid potential sanctions and administrative headaches, while reducing costs. Taking proactive steps with data management also pays big dividends for IT infrastructures and future workloads.
