At Cloud Expo in New York City this week, the buzz is obviously about cloud, but the other hot topic (as if you haven’t heard it yet) is Big Data: the vast amounts of information created every day that businesses are mining for critical insights that were previously out of reach.
At this point, business leaders are starting to realize they can use the cloud to glean key insights from the data they own, but figuring out how to master that information and actually use it for business intelligence is the next step. Organizations in every industry, regardless of size or geography, are embracing cloud computing as a way to reduce the complexity and costs associated with traditional IT approaches.
During the “Big Data and The Cloud” session on Tuesday afternoon at the Javits Center, IBM’s Vice President of Big Data and Information Management, Anjul Bhambhri, discussed the next phase in Big Data: determining how to monetize the data companies have.
According to Bhambhri, cloud-based, “real-time” analytics can help save more than 50 percent in business insight-related costs, and collaborative business process services can help increase employee productivity by 25 percent.
But first, a look at where this so-called Big Data is coming from…
By its nature, it is considered “noisy” data, with no structure, growing in very large volumes. The numbers are staggering, in fact: 12+ billion terabytes of tweet data are created every day; 25 billion terabytes of Facebook log data are created every day; there are 4.6 billion camera phones worldwide; there are 76 million smart meters worldwide; and there were over two billion people on the Web by the end of 2011.
“Analyzing this data doesn’t lend itself well to the structured world – the character of the data makes it difficult,” Bhambhri said. “There is this desire that the data needs to be leveraged, but something is preventing that from happening.”
Clearly there is an extremely large volume of data being generated, but how do companies analyze data that has no structure?
According to Bhambhri, this new era of computing requires information from everywhere, radical flexibility and extreme scalability. Most companies leverage only a subset of the data they hold in the enterprise and lack the capability to look at the whole pool. The challenge for organizations lies in the makeup of the data: it is often fragmented, so existing databases are poorly suited to analyzing it and putting it to business use.
With log data, for example, “the challenge is not just when you are done analyzing the infrastructure, but it has to be able to scale because there is more data coming in,” Bhambhri said.
Furthermore, exploring Big Data and using it for insights needs to go beyond IT. Other areas of the business are involved, including marketing, financial analysis and billing.
The data pouring into all areas of businesses is not new, but the volume is unprecedented.
Adding to the complexity is the rise of smartphones, which is increasing the volume of call data among businesses.
“It’s not that the data is new, there are business reasons why it’s important that this data be analyzed, ingested and that insights be gained from this data,” said Bhambhri. “The biggest line of business wants to have access to these insights.”
Getting this data into some kind of a cluster is the first challenge. This is where the cloud plays a role in a Big Data platform: organizations need this data in one central place, integrated with their existing siloes rather than isolated from them.
“Creating another silo amidst all of these other siloes that we already have is not going to solve the problem,” Bhambhri said. “But it’s very encouraging to see that there is technology now that we have the ability to build a platform that can handle this data. Applications can run on this data.”
However, most businesses today are taking a warehousing approach – and some valuable information can be lost in the transformation.
“If you have to build another application on top of that, that solution may not be desirable,” Bhambhri said.
Businesses need not only to ingest the data but also to maintain its fidelity, she added. “They couldn’t analyze this data before, couldn’t analyze it without the costs going through the roof. You can create a sandbox where you can explore this data, see what’s available and then go from there.”
Any Big Data platform in an enterprise has to be able to bring in data from a variety of sources and handle large volumes so that more mission-critical apps can run on it. A Big Data platform needs to be able to:
“Slicing and dicing of data alone is not sufficient,” concluded Bhambhri. “If [the data] cannot be easily visualized to see the patterns emerging from this data, it is not very useful. At a very high level, these are the characteristics that are a must as we are dealing with Big Data.”