Big Data Buyer’s Guide, Part Two: IBM, SAS, Pentaho and More
Updated · Feb 27, 2012
In Part One of this series on Big Data solutions, we compared Oracle Exalytics and SAP HANA. But what about the alternatives? SAP and Oracle aren’t the only games in town. IBM, SAS, Pentaho, Birst, Terracotta and others are moving forward with their own approaches to Big Data.
“Users face the challenges of analyzing zettabytes of information and scaling existing data warehouses and analytics to match new data types within Big Data,” said Wayne Kernochan, an analyst at Infostructure Associates.
A torrential downpour of data has businesses worried. A recent IBM/MIT Sloan Management Review survey of 3,000 executives found that 60 percent felt they had more data than they could effectively use. Further, 71 percent of marketing officers admitted their employers were unprepared to deal with the explosion of Big Data.
Here’s what a few companies are doing about it:
SAS was doing analytics long before companies like Oracle and SAP cottoned on to the term. Now it is taking its savvy into the Big Data space. Its most recent product in this arena is SAS High-Performance Analytics, in-memory analytic software that runs on a Teradata or Greenplum appliance.
“Big Data analytics is an extension of what we have been doing for a while,” said Keith Collins, chief technical officer at SAS. “Big Data in and of itself is not the biggest issue, but good information management practices are crucial to managing and finding value in Big Data.”
Other vendors tend to consider Big Data as part of a technology discussion related to Hadoop, NoSQL and other data processing methods, Collins said. SAS, on the other hand, focuses on the information management side by providing a strategy and supporting solutions that allow Big Data to be analyzed, regardless of the storage mechanism or technology employed.
“It’s not just Big Data, it’s what you do with the data to improve decision making that will result in business gain,” Collins said. “SAS High-Performance Analytics helps organizations find value in Big Data, solving difficult problems.”
Collins said SAP HANA uses an in-memory database to process high volumes of transactional data, queries and reports. Oracle Exalytics’ in-memory hardware and software system augments Oracle business intelligence software with data discovery capability. He characterized both as employing reactive query and reporting and providing descriptive statistics rather than proactive advanced analytics and optimization.
“It’s the difference between a rear view mirror and knowing what’s down the road,” he said. “Unlike other offerings, SAS HPA can perform analyses that range from descriptive statistics and data summarizations to model building and scoring new data at breakthrough speeds.”
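To make the distinction Collins draws concrete, here is a minimal Python sketch contrasting a descriptive summary with building a model and scoring new data. It is not SAS code and does not reflect how SAS HPA works internally. The sales figures and the helper names `describe`, `fit_line` and `score` are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data, not from any vendor).
months = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]


def describe(values):
    """Descriptive view: summarize what has already happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def fit_line(xs, ys):
    """Model building: ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def score(model, x):
    """Scoring: apply the fitted model to a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept


summary = describe(sales)        # the "rear view mirror"
model = fit_line(months, sales)  # learn the trend from past data
forecast = score(model, 7)       # estimate what is "down the road"
```

The summary only restates the past, while the fitted line produces an estimate for a month that has not happened yet. Real predictive analytics uses far richer models, but the build-then-score pattern is the same.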
IBM offers a broad range of Big Data analytics products, including InfoSphere Streams, Business Analytics: Cognos 10, Cognos Consumer Insight, Netezza MPP Data Warehousing, IBM Smart Analytics System and IBM SPSS. IBM considers its Big Data strategy more client-focused than SAP HANA, Oracle Exalytics and other competitors.
“IBM deals with all aspects of Big Data including data in motion as well as data at rest, structured and unstructured data,” said Leon Katsnelson, IBM director of Big Data Cloud Initiatives. “The IBM platform does not treat Big Data as an island and instead allows our clients to integrate it with their existing business processes and to augment their existing data warehousing solutions.”
In addition, IBM has a Hadoop-based analytics product known as IBM InfoSphere BigInsights. BigInsights software is the result of a four-year effort by more than 200 IBM Research scientists and provides a framework for large-scale parallel processing and scalable storage for terabyte- to petabyte-level data. It incorporates unstructured text analytics and indexing that allow users to analyze rapidly changing data formats and types on the fly.
BigInsights can be used with software called InfoSphere Streams that analyzes data entering an organization and monitors it for changes that may show new patterns or trends. In May IBM announced enhancements to Streams that make it possible to analyze Big Data such as Tweets, video frames, GPS, and sensor and stock market data up to 350 percent faster than before. BigInsights complements Streams by applying analytics to an organization’s historical data as well as data flowing through Streams.
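The pairing of data in motion with historical data can be illustrated with a small Python sketch. It is a generic rolling-window check and not the InfoSphere Streams or BigInsights API. The function name `detect_shifts`, its parameters and the sample readings are all assumptions made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_shifts(stream, historical, window=3, threshold=2.0):
    """Flag positions where recent streaming values drift from the historical baseline.

    historical: past readings, standing in for data at rest.
    stream: newly arriving readings, standing in for data in motion.
    """
    baseline = mean(historical)
    spread = pstdev(historical)
    recent = deque(maxlen=window)  # keeps only the latest `window` values
    alerts = []
    for i, value in enumerate(stream):
        recent.append(value)
        if len(recent) == window and spread > 0:
            # Compare the rolling average with the historical average.
            if abs(mean(recent) - baseline) > threshold * spread:
                alerts.append(i)
    return alerts


historical = [100, 102, 98, 101, 99]     # hypothetical stored readings
incoming = [100, 101, 99, 120, 125, 130]  # hypothetical live feed
alerts = detect_shifts(incoming, historical)
```

Here the baseline comes from the stored data, and each new reading is checked as it arrives. That is the division of labor the article describes, reduced to a few lines.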
Speaking of Hadoop, Pentaho was one of the early players in the Hadoop/analytics space. In a short time, Hadoop has become something of a Big Data darling, having gained fame as the open source implementation of the MapReduce approach that Google described for its own massive data processing systems. Everyone, except perhaps SAS, is jumping on the Hadoop bandwagon.
“Most of our conversations about Big Data are with organizations using technologies such as Hadoop and NoSQL databases that are alternatives to Oracle/SAP offerings,” said James Dixon, who holds the unconventional title of Lord of the 1s and 0s at Pentaho. “Oracle Exalytics and SAP HANA products seem to appeal to locked-and-loaded Oracle and SAP customers.”
Noting that SAP claims HANA is 3,600 times faster than regular SAP ERP queries, Dixon’s take is that HANA is therefore a fix for an SAP performance problem experienced by its customers. “I’m not saying that HANA is not good technology, just that it wasn’t necessarily invented to solve general Big Data problems,” he said.
As for Oracle, he said he had talked to various organizations that already have all the Oracle database capacity they can afford, yet can only store part of their data there and find it expensive to scale. “These companies go for Hadoop and the NoSQL databases because they provide scale-out solutions that are more flexible/granular,” he said.
Drew Robb is a writer who covers IT, engineering and other topics. Originally from Scotland, he currently resides in Florida. He is highly skilled in rapidly prototyping innovative and reliable systems. He has been a full-time editor and professional writer for more than 20 years. He works as a freelancer for Enterprise Apps Today, CIO Insight and other IT publications. He is also editor-in-chief of an international engineering journal. He enjoys solving data problems and learning abstractions that allow for better infrastructure.