infoTECH Feature

April 20, 2020

Data Always Wins - How to Beat the Data Economy One Petabyte at a Time

Data is now, without a doubt, the world’s most valuable resource. But data is not “the new oil.” Unlike oil, which is a depleting natural resource, data grows exponentially and is far from organic. The more data the world creates, the more machines we need to process it. The result is compounding growth, with petabytes or even exabytes of data generated, and there are no signs of things slowing down.

The problem with massive amounts of data is how to keep it organized. This is not clean or ordered data. Rather, the unstructured data that drives our data economy is messy and difficult to manage.

So Just How Much Data is Out There?

In a recent survey commissioned by Igneous, 60% of respondents reported managing more than one billion files. The top 10% of this group said they manage at least 150 billion files, amounting to more than 83 petabytes. If you’ve been around long enough to remember a 1.44-megabyte floppy disk, it’s truly amazing to fathom how a single organization can amass that much data.
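To put those survey figures in perspective, here is a minimal back-of-the-envelope sketch. It assumes decimal units (1 petabyte = 10**15 bytes) and the usual formatted capacity of a "1.44 MB" floppy (1,474,560 bytes), and it treats the survey's lower bounds as exact values; the function names are illustrative only.

```python
# Back-of-the-envelope arithmetic on the survey figures quoted above.
# Assumptions (not from the survey): decimal petabytes, and a formatted
# floppy capacity of 1,474,560 bytes for the "1.44 MB" disk.

FILE_COUNT = 150_000_000_000       # "at least 150 billion files"
TOTAL_BYTES = 83 * 10**15          # "more than 83 petabytes"
FLOPPY_BYTES = 1_474_560           # 1440 KiB


def average_file_size_bytes(total_bytes: int, file_count: int) -> float:
    """Mean bytes per file implied by a total size and a file count."""
    return total_bytes / file_count


def floppies_needed(total_bytes: int, floppy_bytes: int = FLOPPY_BYTES) -> int:
    """Number of floppy disks required to hold total_bytes (rounded up)."""
    return -(-total_bytes // floppy_bytes)  # integer ceiling division


if __name__ == "__main__":
    avg = average_file_size_bytes(TOTAL_BYTES, FILE_COUNT)
    print(f"Average file size: {avg / 1000:.0f} KB")
    print(f"Floppy disks needed: {floppies_needed(TOTAL_BYTES):,}")
```

Under these assumptions the average file works out to roughly 553 KB, and the full 83 petabytes would fill on the order of 56 billion floppy disks.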

              Shaun Kehrberg 

An example from the healthcare industry helps explain how we reached this tipping point. While traditional cancer treatments focused on surgery, chemotherapy, and radiation, the adoption of personalized medicine has changed everything. Now, with the use of ground-breaking techniques like DNA sequencing, doctors are tailoring treatment plans to a patient’s specific genetics. Not surprisingly, genetic sequencing creates massive amounts of unstructured data. A single machine can generate up to 2 terabytes of data per run, leading researchers to predict total healthcare data storage requirements will soon exceed 200 exabytes.

The Untapped Value of Unstructured Data

Healthcare professionals are not the only ones dealing with massive data growth. Unstructured data is the fuel powering emerging technologies like Artificial Intelligence, Machine Learning, and more. In fact, people responding to the Igneous survey indicated that two-thirds of their data is moderately to extremely valuable. Moreover, 40% of the respondents said they manage business-critical data so valuable that it defines the value of their organization. Clearly these large, valuable amounts of data must be well-managed and protected. Easier said than done (for most).

There are two main issues that hinder a business from effectively managing its unstructured data. First, most organizations have a fundamental lack of visibility into this type of data. Second, unstructured data is very difficult to move. And data that cannot be easily moved is difficult to back up or archive.

To remain competitive in the modern data economy, organizations must prioritize and solve these fundamental issues of unstructured data visibility and movement.

Gaining Visibility

Timely and accurate information is the key for IT professionals to effectively manage unstructured data. Modern organizations need a data management solution that provides at-scale visibility across petabytes or even exabytes of data.

Consider the following parameters when evaluating a data visibility solution:

  • Scale - many traditional solutions fail long before reaching a billion files. Find a solution that can scan billions of files in hours (versus weeks or months).
  • Scope - most modern organizations have a mix of storage systems: on-prem, cloud, different file systems, NAS—everywhere! Settle for nothing less than full visibility into ALL data, regardless of where it resides.
  • As-a-Service - we all have enough things to manage. Don’t add data visibility to that list.

Moving Data

With better data visibility enabled, the next step to properly managing data is moving it quickly and efficiently. The key is to find a tool that can handle this at scale.

When it comes to moving data, be sure to evaluate the following:

  • Scale - again, look for a solution that can work with billions of files and petabytes of data. It will need to transfer data very close to the theoretical limits of your network bandwidth—something never achieved by traditional solutions. The solution also needs to scale out horizontally in an efficient manner to handle peak transfer loads.
  • Latency Awareness - scale is important, but not at the cost of bringing your network to its knees. A modern solution will intelligently monitor overall network latency in real time and back off when data movement is affecting user experience.
  • Cloud Savvy - storing data in the cloud is extremely cost-effective, but moving data to and from the cloud is tricky. Move it inefficiently, and the transfer charges can easily dwarf the cost of storing your data in the cloud. Select a solution that is purpose-built to minimize data movement charges in the cloud.
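The latency-aware behavior described in the list above can be sketched in a few lines. This is a hypothetical illustration, not how any particular product works: it assumes an additive-increase, multiplicative-decrease policy (a common congestion-control pattern), and every name and default value here is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class LatencyAwareThrottle:
    """Hypothetical transfer-rate controller that backs off under latency."""

    baseline_ms: float            # latency measured when the network is quiet
    max_rate_mbps: float          # ceiling, e.g. the link's usable bandwidth
    min_rate_mbps: float = 10.0   # never throttle below this floor
    tolerance: float = 1.5        # back off above 1.5x the baseline latency
    backoff_factor: float = 0.5   # halve the rate when users are affected
    step_mbps: float = 50.0       # ramp up gradually when the network is healthy
    rate_mbps: float = 0.0        # current rate; starts at the floor

    def __post_init__(self) -> None:
        if self.rate_mbps <= 0:
            self.rate_mbps = self.min_rate_mbps

    def update(self, observed_latency_ms: float) -> float:
        """Adjust and return the transfer rate for the next interval."""
        if observed_latency_ms > self.baseline_ms * self.tolerance:
            # Latency is elevated: cut the rate sharply, but not below the floor.
            self.rate_mbps = max(self.min_rate_mbps,
                                 self.rate_mbps * self.backoff_factor)
        else:
            # Latency is normal: increase the rate by a fixed step, up to the cap.
            self.rate_mbps = min(self.max_rate_mbps,
                                 self.rate_mbps + self.step_mbps)
        return self.rate_mbps


if __name__ == "__main__":
    throttle = LatencyAwareThrottle(baseline_ms=2.0, max_rate_mbps=1000.0)
    for latency in (2.0, 2.1, 2.4, 9.0, 2.2):
        print(f"latency={latency} ms -> rate={throttle.update(latency)} Mbps")
```

In practice the latency samples would come from real network measurements, and the thresholds would be tuned to the environment; the point is only that the mover speeds up gradually while the network is healthy and retreats quickly when users would notice.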

The data economy is not going anywhere. It is here to stay and is increasing in size and complexity every day. Organizations that work constantly to tap the value of their unstructured data will be the winners. Are you ready?

About the Author: Shaun is Director of Product Marketing, responsible for bringing Igneous’ SaaS-based data management platform to market. Prior to Igneous, Shaun was Director of Customer and Product Marketing at Marchex, and Head of Marketing at Scale Model, a startup launched out of the Betaworks studio. He received a bachelor’s degree in Marketing and Finance from Western Washington University. Shaun is a big Seattle sports fan, part-time soccer coach, and spends as much time as possible camping with his wife and two sons.

Edited by Maurice Nagle
