infoTECH Feature

January 17, 2017

Data Corruption: The Silent Data Thief

By Special Guest
Gary Watson, CTO, Nexsan

IT professionals are focused on the outside threats that might steal, encrypt or destroy their organization’s data. But what about the serious threats that come from neither external adversaries nor rogue employees, and that strike without you even knowing?

Silent data corruption is real and must be taken seriously. This threat is not an abstract “theoretical possibility,” but rather a real-world risk that researchers have been reporting for years. As early as 2007, the CERN research organization tested 3,000 servers attached to RAID subsystems; in three weeks, it found 500 instances of corrupted files in 17 percent of the RAID arrays. In short, the equivalent of one in every 1,500 files had become corrupt. That’s bad, but even worse is how easily silent data corruption can take place without any notification.

With active files (i.e., ones that are constantly being accessed and opened), any data corruption or missing files will quickly be noticed. But it’s a completely different scenario with your archive files, which are rarely opened, usually only when they are critically needed. It could be months or even years until you discover one of your files is damaged or gone. Corrupted or missing files are obviously a huge problem for healthcare, financial services and governmental institutions because they’re subject to rigorous regulatory requirements. But this problem really threatens any organization that archives high-value data. A great many companies are affected, and odds are you’re at risk, too. What’s the solution? End-to-end integrity checking is the only way that silent data corruption can be detected and corrected. Simply put, any archive solution that lacks this integrity checking cannot credibly claim to offer secure archiving.

Conventional archive solutions can’t monitor the availability and health of every file, and manually verifying the existence and integrity of those files (by opening millions, perhaps billions of them) would be a nightmare. If you want true end-to-end integrity checking, you must have a secure archive solution that’s been specifically designed to maximize data security, integrity and privacy from the moment a file is ingested into the archive.

Maintaining a second, redundant copy of every original file is key because it enables your archive storage to perform crucial comparative analyses of your files using these two powerful data protection technologies: file serialization and file fingerprinting.

File serialization means that every file ingested is assigned a unique serial number, which is shared by both copies of the file (the original and its redundant copy). This serial number enables the archive solution to periodically verify the existence and location of every file in the archive, both at the archive’s primary site and at its secondary site. If a missing file is detected, the archive can notify your administrator and automatically replace the file using its serialized redundant copy.
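The existence check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor’s implementation: the `SerializedArchive` class, its in-memory catalog, the serial-prefixed file names and the two local directories standing in for the primary and secondary sites are all hypothetical.

```python
import itertools
import shutil
from pathlib import Path


class SerializedArchive:
    """Toy two-site archive (hypothetical): one serial number per ingested
    file, shared by the original and its redundant copy."""

    def __init__(self, primary: Path, secondary: Path):
        self.sites = [primary, secondary]
        for site in self.sites:
            site.mkdir(parents=True, exist_ok=True)
        self._next_serial = itertools.count(1)
        self.catalog = {}  # serial number -> stored file name

    def ingest(self, source: Path) -> int:
        """Assign the next serial number and store a copy at each site."""
        serial = next(self._next_serial)
        stored_name = f"{serial:08d}_{source.name}"
        for site in self.sites:
            shutil.copy2(source, site / stored_name)
        self.catalog[serial] = stored_name
        return serial

    def verify_existence(self) -> list:
        """Check every serial number at every site.

        A copy missing at one site is restored from the other site.
        Returns (serial, site name, outcome) tuples for anything that was
        missing, which a real system would send to the administrator.
        """
        events = []
        for serial, name in self.catalog.items():
            present = [s for s in self.sites if (s / name).exists()]
            missing = [s for s in self.sites if not (s / name).exists()]
            for site in missing:
                if present:
                    shutil.copy2(present[0] / name, site / name)
                    events.append((serial, site.name, "restored"))
                else:
                    events.append((serial, site.name, "lost"))
        return events
```

Running `verify_existence()` on a schedule, rather than waiting for a user to request a file, is what turns a missing file from a surprise into a logged and repaired event. Note that this check only confirms a file is present; it says nothing about whether its contents are intact.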

File fingerprinting guarantees file-level integrity within the archive by generating a unique gold-standard “fingerprint” of each file when it is ingested and when it is copied. Subsequent copies of the original file (for example, stored in a remote location) can be validated as correct by comparing each copy’s fingerprint to the original’s fingerprint. These fingerprints enable periodic audits of the integrity of each file against its original fingerprint in order to confirm the data has not been changed (due to silent data corruption, disk error, virus, tampering or replication error). Should this process reveal that one of your archived files has been altered, the audit reports the corruption and then automatically replaces the corrupted file with its undamaged copy.
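A fingerprint audit of this kind might look like the following Python sketch. The article does not say which fingerprint algorithm is used, so SHA-256 from the standard `hashlib` module is an assumption here, and the function names `fingerprint`, `ingest` and `audit` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks.
    (SHA-256 is an assumed choice, not one named by the article.)"""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(source: Path, primary: Path, replica: Path) -> str:
    """Store two copies of source and return the fingerprint taken at ingest.
    Each copy is validated against that fingerprint right after copying."""
    recorded = fingerprint(source)
    for target in (primary, replica):
        shutil.copy2(source, target)
        if fingerprint(target) != recorded:
            raise IOError(f"copy to {target} does not match the ingest fingerprint")
    return recorded


def audit(primary: Path, replica: Path, recorded: str) -> str:
    """Compare both copies with the recorded fingerprint and repair a bad
    or missing copy from the good one. Returns a short status string."""
    primary_ok = primary.exists() and fingerprint(primary) == recorded
    replica_ok = replica.exists() and fingerprint(replica) == recorded
    if primary_ok and replica_ok:
        return "ok"
    if primary_ok:
        shutil.copy2(primary, replica)
        return "repaired replica"
    if replica_ok:
        shutil.copy2(replica, primary)
        return "repaired primary"
    return "unrecoverable"
```

The recorded fingerprint is the reference point: each copy is judged against it rather than against the other copy, so the audit can tell which of the two is the damaged one. Repair is only possible while at least one copy still matches, which is why the audit has to run periodically instead of waiting for someone to open the file.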

Some secure archive solution vendors also utilize fingerprints to monitor file integrity, but they only check the fingerprint against its file when you access the file. That’s a big problem, because you’ll find out you’ve been hit by data corruption only when you try to open the file, and by then all hope of replacing that damaged data may be lost. The only way you can combat this is to purchase expensive backup/dedupe appliances and tape backup. So in the end, your investment in these “secure” archive solutions doesn’t really guarantee you anything except the need for a massive increase in your IT spending.

Silent data corruption is an unfortunate reality in the IT landscape; its occurrence is not a question of “if” but “when.” Just hoping this phenomenon won’t eventually strike conventional archive solutions is hardly an effective strategy. The answer is to proactively protect your archived files by deploying a truly secure archive solution, purpose-built to maximize your data’s integrity and security.

Edited by Alicia Young
