Open Source R Language Could Revolutionize Business Intelligence

Herman Mehling

Updated · Aug 30, 2010

The R programming language could be coming to a workplace near you — if it hasn’t arrived already. The big deal about R is that it can analyze Big Data, those exploding data sets that have traditionally defied analysis.

R is the brainchild of Ross Ihaka and Robert Gentleman, academics at the Department of Statistics at the University of Auckland, New Zealand, whose shared first initial gave the language its name. Since Ihaka and Gentleman introduced R in the early 1990s, it has become the lingua franca of statistical analysis among students, scientists, programmers and data managers.

R is a GNU project similar to the S statistical programming language and environment, which is often the vehicle of choice for analytic statistics. R provides an open source route to S and adds some unique capabilities. One of R’s greatest strengths is the ease with which it can create well-designed publication-quality plots with mathematical symbols and formulae.

After long use in academia, R has only recently begun to appear in the business world. Among the vendors bringing R into the commercial realm are SAS, Netezza (NYSE: NZ), Revolution Analytics and IBM (NYSE: IBM), which acquired SPSS.

While IBM and SAS are the 800-pound gorillas in the business analytics market, they have been slow to evangelize R. Instead, the boosterism is flowing from Revolution Analytics, a startup founded by Norman Nie.

More than 30 years ago, Nie co-invented the Statistical Package for the Social Sciences (SPSS), which marked the beginning of analytic and predictive statistical software. Now Nie is championing R.

In just two years, Nie’s new company has won blue-chip customers such as Bank of America, Motorola and Pfizer.

Earlier this month, Revolution Analytics introduced ‘Big Data’ analysis to its Revolution R Enterprise software, taking R to what it claims are unprecedented levels of capacity and performance for analyzing very large data sets.

The company says R users will be able to process, visualize and model terabyte-class data sets in a fraction of the time of legacy products, without the need for expensive or specialized hardware.

This Big Data scalability will help R transition from a research and prototyping tool to a production-ready platform for enterprise applications such as quantitative finance and risk management, social media, bioinformatics and telecommunications data analysis, said Nie.

On its website, Revolution Analytics has published performance and scalability benchmarks for Revolution R Enterprise, based on a 13.2-gigabyte data set of commercial airline information containing more than 123 million rows and 29 columns.

The new version of Revolution R Enterprise introduces an add-on package called RevoScaleR that provides a new framework for fast and efficient multi-core processing of large data sets, a capability that Revolution Analytics says sets it apart from other vendors.

 
