EMC Addressing Dearth of Big Data Talent with Kaggle Partnership

Ann All

Updated · Oct 25, 2012

While it’s easy to find marketing hype about Big Data, it’s hard to find professionals who understand what it is and know what to do with it.

EMC, the enterprise storage company that has been staking a claim in the burgeoning Big Data market since its 2010 acquisition of Greenplum, is trying an unusual crowdsourced approach to address the dearth of Big Data pros.

Earlier this week, EMC/Greenplum announced a partnership with Kaggle, an online community that hosts data modeling competitions. Under the partnership, companies can use Greenplum’s Chorus technology to tap into the expertise of the 55,000-plus data scientists using Kaggle.

EMC/Greenplum integrated Kaggle into Greenplum Chorus, a collaboration platform based on the technology of Pivotal Labs, a company EMC acquired this past March. Josh Klahr, Greenplum’s vice president of Product Management, described Chorus as “a Facebook-like social collaboration tool for data science teams to iterate on the development of datasets and ensure that useful insights are delivered to the business quickly.”

Big Data Meetup

According to EMC, Chorus users can search, browse and drill into the profiles of Kaggle community members who have opted to receive consulting opportunities through the Greenplum Chorus platform. An integration of the Chorus and Kaggle APIs allows Chorus users to send messages to Kaggle members. Kaggle certifies Chorus as the source of the messages and forwards them to the intended recipients. Kaggle members then review the messages and respond directly to the Chorus users to discuss details.

Anthony Goldbloom, founder and CEO of Kaggle, said the partnership creates an opportunity for Kaggle’s community of data scientists to parlay their skills into contract work for companies seeking help with Big Data projects. “Teaming with EMC Greenplum opens up new and exciting opportunities to existing and future Kaggle community members. The partnership also helps to solve the acute shortage of elite data scientists, which prevents companies from taking full advantage of their data,” he said.

Goldbloom said Kaggle members are “able to tackle difficult problems, precisely the kind of problems that you expect Greenplum customers to be dealing with — unstructured text data, graph data, data sets missing values, for example.”

EMC/Greenplum expects the Chorus and Kaggle integration to be available in November.

EMC is also releasing the Greenplum Chorus source code under the Apache 2.0 open source license. The OpenChorus Project “will speed innovation and adoption of collaborative data science practices, helping organizations to drive greater business insight and economic value from Big Data,” according to EMC.

Ann All is the editor of Enterprise Apps Today. Follow Enterprise Apps Today on Twitter @EntApps2Day.


