infoTECH Feature

October 14, 2014

The Cost of (NOT) Monitoring

By TMCnet Special Guest
Leon Adato, Head Geek, SolarWinds

What does a wireless thermometer have in common with ping? Both can keep a business from losing cash.

One of the ways businesses stay in business is by keeping a tight rein on costs. So, it should come as no surprise that convincing executives to allocate budget money toward IT monitoring software can be a challenge. To the average executive (and, let’s be honest, to the mid-level manager listening to the technical ramblings of an excited but fiscally vague IT pro), monitoring seems like a pure sunk cost with no possibility of return.

However, IT pros know this couldn’t be further from the truth. All that’s required to help others understand why is to answer a single question: how much will not monitoring cost?

Case in point: recently, a 300-bed hospital considered implementing a $5,000 automated temperature monitoring system for the coolers and freezers where the hospital’s supply of food was stored. The system would have saved staff time by measuring the current temperature in each of the coolers and freezers, and sending notifications if the temperature was out of acceptable range.

Hospital administration declined, deeming the solution too expensive just to know that a freezer was five degrees too cold. Needless to say, one of the staff members eventually left the door to the main cooler open, which caused the compressor to run all evening until it failed completely. The next morning, staff arrived only to find all the food in that cooler had spoiled. Recovering from this failure required emergency food orders, extra staff, repair services and a lot of overtime.

The total cost of the outage came to a cool $1 million—200 times more than the cost of the monitoring system deemed to be “too expensive.” This kind of scenario, where a small upfront investment could have prevented costly problems down the road, should sound hauntingly familiar to IT pros.

With this example in mind, it behooves us as IT professionals to be able to explain, in clear terms that non-technical staff can understand, what is intuitively obvious to those of us in the trenches: the cost of not monitoring is often far greater than the cost of the tools that could help us avoid failures in the first place.

Convincing non-IT staff of the need for monitoring tools after a critical system failure is probably a little easier, as outages tend to remain fresh in people’s minds for a long time. But just how can IT pros make the case for monitoring without first experiencing an actual IT resource failure? Or, if an organization has experienced a failure with a particular system, how can IT pros make the case for purchasing monitoring tools to protect other mission-critical systems?

It really comes down to identifying the potential costs of a failure. Every management team feels differently; what leadership at one organization feels is catastrophic, others might simply consider the cost of doing business. Therefore, IT pros need to highlight costs that are eminently avoidable. Some things to consider are:

The ultimate end result of a problem if it goes undetected
The amount of time a particular failure could go unreported
The amount of time it would take to fix the system after a failure
Regular hourly staff cost for the system in question
Emergency and overtime staff cost for the system in question
Planned vendor maintenance costs versus emergency vendor repair costs
Lost sales or other income per hour if the system in question is unavailable
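
As a rough sketch of how these factors might combine into a single number, the Python below models one failure scenario. The class name, field names and the way the costs are added together are all assumptions made for illustration, not a formula from any particular monitoring product, and the lost-income figure in the usage example is a placeholder rather than real data.

```python
from dataclasses import dataclass


@dataclass
class FailureScenario:
    """Hypothetical container for the cost factors listed above."""
    hours_undetected: float      # how long the failure goes unreported
    hours_to_fix: float          # hands-on repair time once work starts
    staff_rate: float            # regular or overtime hourly staff cost
    vendor_cost: float           # planned or emergency vendor charge (flat)
    lost_income_per_hour: float  # sales or other income lost while down

    def downtime_hours(self) -> float:
        # The system is down from the moment it fails until it is fixed.
        return self.hours_undetected + self.hours_to_fix

    def total_cost(self) -> float:
        # Assumed model: labor plus vendor charge plus income lost
        # for every hour of downtime.
        labor = self.staff_rate * self.hours_to_fix
        lost_income = self.lost_income_per_hour * self.downtime_hours()
        return labor + self.vendor_cost + lost_income


# Placeholder numbers, chosen only to show the arithmetic.
scenario = FailureScenario(
    hours_undetected=0.5,
    hours_to_fix=2,
    staff_rate=53,
    vendor_cost=0,
    lost_income_per_hour=1000,
)
print(f"Downtime: {scenario.downtime_hours()} hours")
print(f"Estimated cost: ${scenario.total_cost():,.0f}")
```

Running the same scenario twice, once with regular staff and planned vendor rates and once with overtime and emergency rates, gives the low and high ends of a cost range to put in front of management.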

To understand how all this fits together, consider the simple example of a hard drive failure on a primary email server.

To begin with, no self-respecting IT pro would be caught dead without some form of fault tolerance for a critical system such as email. So, in this example, let’s say a mirrored drive was in place, but it failed a couple of days prior to the second drive’s failure. Since there was no monitoring solution in place, nobody noticed, effectively making it a single-drive system.

The end result is that the system would crash. You would think an email system crash would be immediately noticeable, but email clients like Outlook do a great job of offline caching, so it can actually take a while before anyone notices. In this example, let’s say it takes 30 minutes.

Recovering from a hard drive failure takes time unless there are spare parts immediately on hand and some kind of instant recovery option. Let’s estimate that replacing the drive itself takes about an hour, and restoring from backup takes another hour. However, this is a vendor repair, which means waiting either four hours for standard service or one hour for emergency service before work can begin.

Now let’s look at the costs. Let’s say regular staff time is $53 per hour while overtime is $75 per hour. Standard vendor repair is free, but remember that four-hour lead time. Emergency vendor repair is $150 per hour with a two-hour minimum.

This means email will be offline for between three and a half and six and a half hours, with a cost of between $106 and $450 per failure. This may not seem like a big deal. However, that is the cost of just one drive failure. Consider a company that experiences 350 drive failures a year (something I have personally witnessed). Now we’re talking about between $37,000 and $157,000 per year, not counting company revenue lost while email is down and productivity plummets as a result.
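
For readers who want to check the arithmetic, here is a minimal Python sketch that reproduces these figures. The numbers are the ones from the example; the variable names and the way the costs are broken down (two hands-on hours of staff time, plus the vendor's two-hour minimum for emergency service) are assumptions that happen to match the totals quoted above.

```python
# Figures come from the email server example; names are illustrative.
DETECTION_HOURS = 0.5        # time before anyone notices the crash
REPLACE_HOURS = 1            # swapping the failed drive
RESTORE_HOURS = 1            # restoring from backup
STANDARD_LEAD_HOURS = 4      # wait for free standard vendor service
EMERGENCY_LEAD_HOURS = 1     # wait for emergency vendor service
REGULAR_RATE = 53            # staff cost per hour, regular time
OVERTIME_RATE = 75           # staff cost per hour, overtime
EMERGENCY_VENDOR_RATE = 150  # vendor cost per hour, emergency service
EMERGENCY_VENDOR_MINIMUM_HOURS = 2
FAILURES_PER_YEAR = 350


def outage_hours(vendor_lead_hours):
    """Total time email is offline for one drive failure."""
    return DETECTION_HOURS + vendor_lead_hours + REPLACE_HOURS + RESTORE_HOURS


def incident_cost(staff_rate, vendor_rate=0, vendor_minimum_hours=0):
    """Assumed breakdown: staff and vendor are billed for hands-on hours."""
    hands_on_hours = REPLACE_HOURS + RESTORE_HOURS
    vendor_hours = max(hands_on_hours, vendor_minimum_hours)
    return staff_rate * hands_on_hours + vendor_rate * vendor_hours


standard_cost = incident_cost(REGULAR_RATE)
emergency_cost = incident_cost(
    OVERTIME_RATE, EMERGENCY_VENDOR_RATE, EMERGENCY_VENDOR_MINIMUM_HOURS
)

print("Outage hours:", outage_hours(EMERGENCY_LEAD_HOURS),
      "to", outage_hours(STANDARD_LEAD_HOURS))
print("Cost per failure:", standard_cost, "to", emergency_cost)
print("Cost per year:", standard_cost * FAILURES_PER_YEAR,
      "to", emergency_cost * FAILURES_PER_YEAR)
```

The yearly totals work out to $37,100 and $157,500, which the figures above round down slightly.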

Now, of course, drives fail whether they are monitored or not. However, in the above example, catching the first drive failure, replacing it at a convenient time and avoiding both the outage and the time spent performing data recovery could save between $18,500 and almost $140,000 over the course of a year.
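
The savings range can be reproduced with a short Python sketch under one assumption that is not spelled out above: with monitoring in place, each failure costs roughly one hour of regular staff time ($53) for a planned drive swap, with no restore from backup and no emergency vendor call. The names are illustrative only.

```python
# Per-failure costs without monitoring, from the email server example.
UNMONITORED_COST_LOW = 106    # regular staff, free standard vendor repair
UNMONITORED_COST_HIGH = 450   # overtime staff plus emergency vendor repair

# Assumption: with monitoring, the first failed drive is swapped at a
# convenient time, costing one hour of regular staff time and nothing else.
REGULAR_RATE = 53
MONITORED_COST = REGULAR_RATE * 1

FAILURES_PER_YEAR = 350


def annual_savings(unmonitored_cost, monitored_cost=MONITORED_COST,
                   failures=FAILURES_PER_YEAR):
    """Yearly difference between reacting to outages and preventing them."""
    return (unmonitored_cost - monitored_cost) * failures


low = annual_savings(UNMONITORED_COST_LOW)
high = annual_savings(UNMONITORED_COST_HIGH)
print(f"Estimated annual savings: ${low:,} to ${high:,}")
```

Under that assumption the range comes out to $18,550 to $138,950 per year, in line with the rounded figures above.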

It’s important to go through a similar exercise for all mission-critical systems in the IT environment (including email, CRM and Web services), combined with different types of outages, such as disk failure, application crashes and network failure.

To avoid becoming overwhelmed, prioritize. Take a hard look at the IT environment and honestly assess which systems are rock-solid and which are a bit shakier. Also, lean on other team members where necessary by asking them how long it takes to notice that their systems are offline, and how long it takes to bring them back up.

This process may seem tedious, but all too often it’s what it takes to help non-IT executives and other decision makers understand that proper monitoring is crucial, and that the cost of not monitoring can far exceed that of doing so. Simply put: speak their language, which is the language of money.

Leon Adato is a Head Geek at SolarWinds, an IT management software provider based in Austin, Texas. Adato boasts more than 25 years of IT experience, including 14 years working with systems management, monitoring and automation solutions for servers, networks and the Web. Adato is also a Microsoft Certified Systems Engineer, Cisco Certified Network Associate and SolarWinds Certified Professional. Prior to his role at SolarWinds, Adato served as a Senior Monitoring Consultant for Cardinal Health.

Edited by Stefania Viscusi
