infoTECH Feature

August 03, 2015

Declare a Permanent Truce in the IT War Room

While the development of cloud and SaaS-based applications and services has done wonders to improve how knowledge workers collaborate and share information, it is also wreaking havoc on enterprise IT departments. Enterprises are increasingly adopting a hybrid IT infrastructure model, which keeps mission-critical applications and information stores in the data center and migrates others to the cloud. Simultaneously, IT organizations have become increasingly specialized, with different teams and tools focused on very specific parts of the IT infrastructure. Often, this prevents any one individual from understanding the end-to-end infrastructure, which makes solving a system problem more difficult. As a result, the so-called “war room,” where IT personnel gather to diagnose and fix a problem, devolves into a new episode of The Blame Game.

The Fog of War

These end-to-end complexities can quickly overwhelm human abilities and make the job of resolving problems and maintaining systems increasingly difficult and time-consuming. Furthermore, that often misunderstood and unpredictable thing called human nature gums up the process. This in turn degrades service quality and often strains relationships between teams when tense finger-pointing conversations take place.

IT grew up around domains such as the network, applications, servers and databases, and each domain team needed its own data to do its specialized job. That has driven a proliferation of domain-centric point tools, which help each domain group but also mean that, even for very simple transactions, each team sees only part of the transaction, such as packet data or metrics from an app server. Because of these incomplete data sets, variable degrees of fidelity and differing analytic approaches, domain teams looking at the same problem see different things.

This model ensures the collaborative diagnosis effort is slow, manually correlated and not real-time. Instead of using the war room to work together to fix a problem, IT personnel use it to battle each other, often with egos and opinions getting in the way of fact finding. Each silo uses its own specialized tools to evaluate the issue and often determines that the fault lies with another group, but does not know which one. So the problem, and the blame, gets passed from group to group. Meanwhile, the issue remains unsolved and the business suffers.

Gaining Situational Awareness

So how can a company take the war out of the war room? Enterprises should implement a new approach to managing systems and application performance that enables IT to become more proactive and more collaborative in addressing issues, even as they continue working day-to-day on their primary areas of responsibility. 

The foundation is to integrate real-time, high-fidelity performance metrics into one global portal that provides broad end-to-end system monitoring. The data can then be abstracted and analyzed in a way that ties together end-user experience, applications, operating systems, virtualization, hardware and network components. Providing this single source of truth will reconcile technology silos, improve incident management processes and build teamwork. Leveraging anomaly detection, automated baselines, minimum deviations, intelligent correlations and other advanced analytics will greatly streamline your ability to identify and resolve issues before they escalate into big problems. Pivoting from the global portal deep into raw forensic data, such as a right-click into packet analysis or transaction analysis, is critical to go from knowing about a problem to quickly troubleshooting and resolving it.
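To make the idea of automated baselines and anomaly detection concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular monitoring product works: the function name, the window size, the threshold and the sample response times are all assumptions chosen for the example. Each new measurement is compared against a rolling baseline built from the measurements just before it, and is flagged when it strays too far from that baseline.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return the indices of samples that deviate from a rolling baseline.

    The baseline for each sample is the mean of the preceding `window`
    samples. A sample is flagged when it lies more than `threshold`
    standard deviations away from that mean. Both parameters are
    illustrative defaults, not recommended settings.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # A perfectly flat history: any change at all is a deviation.
            if samples[i] != baseline:
                anomalies.append(i)
            continue
        if abs(samples[i] - baseline) > threshold * spread:
            anomalies.append(i)
    return anomalies


# Hypothetical application response times in milliseconds.
response_ms = [100, 102, 98, 101, 99, 100, 103, 250, 101, 99]
print(find_anomalies(response_ms))  # prints [7], the 250 ms spike
```

A real portal would apply the same comparison across many metrics at once and correlate the flagged points across domains, which is what lets every team look at the same evidence instead of its own partial view.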

Developing Allies

We will always have different, specialized groups within IT organizations charged with overseeing a multitude of specialized functions, such as end-user experience, application monitoring, database monitoring, transaction mapping and infrastructure monitoring. What we must do is work cohesively as a team to streamline performance monitoring and troubleshooting processes so we can reduce Mean Time to Repair (MTTR) while increasing business velocity. Integrated performance portals enable each team member to focus on the right balance of factual information and teamwork so they can achieve the company’s shared goal of high business velocity.




Edited by Dominick Sorrentino