3 Tips for Solving Big Data Skills Shortage
More schools are creating data analytics programs, but many companies can't wait for help with their Big Data projects. Here's how to deal with a Big Data skills shortage.
Skills shortages usually create two things: higher salaries for people who have the desired skills and creative approaches to increasing the supply of folks who possess them.
Professionals with Big Data analytics skills are certainly commanding high salaries. As Eric Lundquist notes, writing for eWEEK, graduates of an Advanced Analytics program at North Carolina State University typically receive multiple job offers paying six-figure salaries. Often these graduates are employed as data scientists, a role that a NewVantage Partners survey defined as "applying varying degrees of statistics, data visualizations, computer programming, data mining, machine learning, and database engineering to solve complex data problems."
Specialized academic programs like the one at North Carolina State are becoming more common. A New York Times article published earlier this year mentions Columbia, the University of San Francisco, Stanford, New York University, Northwestern, Syracuse, University of California at Irvine, George Mason and Indiana University as among schools offering data science programs.
In addition, vendors with skin in the Big Data game, including Teradata and IBM, are partnering with academic institutions to offer programs in Big Data analytics. The Teradata University Network is a Web portal that offers data analysis research, software and other training materials to participating professors and students. IBM provides access to software, data sets and data scientists to more than 1,000 universities through its academic partner program.
But many companies need data scientists now and cannot wait for schools to produce more graduates qualified for this kind of work. The good news is that academic programs are not the only way to address the shortage.
Build a Data Science Team
I interviewed the global head of R&D at Opera Solutions, a provider of data analytics software and consulting services that uses a team approach to fill its need for data scientists. Its data science teams include scientists responsible for creating algorithms; analytics managers who provide guidance to scientists, in a role similar to an academic advisor at a university; and project managers, who manage scheduling, costs and deliverables from a broader business perspective.
Opera Solutions, which earned press attention after a team of its data scientists nabbed the runner-up prize in a 2009 contest to create an improved recommendation algorithm for Netflix, is organized into business units that serve specific verticals, such as health care, insurance and supply chain/operations. Each unit is led by a business head, who handles business development and budgets, and a science head, who helps develop analytic technologies and works closely with analytic teams.
Train Your Own Data Scientists
I also interviewed the president and CEO of Utopia, a provider of enterprise data solutions that has created its own intensive three-month training program for data consultants. A key part of the program is pairing consultants-in-training with veteran consultants.
"Instead of simply learning about the technology, you get to see how it works and interacts with different systems in a production setting. You also get exposed to an entire team of people with different experiences to learn from," said Mike Nicely, who became a data migration consultant with Utopia after completing the program.
Use Technology to Fill Big Data Gaps
In addition to recruiting data scientists from academic programs and developing them through internal training programs, companies are turning to vendors whose technology can help fill the Big Data gaps.
A piece on Data Informed discusses development platforms that offer frameworks for building Big Data analytics applications. Such tools work by adding layers of abstraction to programming interfaces so developers can create Big Data apps without expertise in Hadoop, the popular open source software framework that many organizations use to process large data sets.
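To make the idea of an abstraction layer concrete, here is a minimal sketch in Python. It is an in-memory toy, not Hadoop and not any vendor's actual API; the names run_job and count_by are hypothetical and chosen only for illustration. The point is that the developer calling count_by describes what to count and never touches the map, group and reduce plumbing underneath.

```python
from collections import defaultdict

# Hypothetical names throughout: an in-memory toy, not Hadoop
# and not any vendor's API.

def run_job(records, mapper, reducer):
    """The 'plumbing': map each record, group values by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def count_by(records, key_of):
    """The abstraction layer: callers say what to count, not how to count it."""
    return run_job(
        records,
        mapper=lambda record: [(key_of(record), 1)],
        reducer=lambda key, values: sum(values),
    )


# Made-up sample data for illustration only.
page_views = [
    {"user": "ann", "page": "/home"},
    {"user": "bob", "page": "/pricing"},
    {"user": "ann", "page": "/pricing"},
]

views_per_page = count_by(page_views, key_of=lambda view: view["page"])
print(views_per_page)  # {'/home': 1, '/pricing': 2}
```

A real platform would also have to handle distribution, scheduling and failure recovery, which is exactly the expertise such tools aim to hide from the application developer.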
The CTO of Lotame, a provider of data management software for online marketing data, says his company has benefited from the use of a tool from Continuuity that helped it reduce its Big Data app development cycles and broaden its pool of qualified developer candidates.
"Finding engineers who can build apps on Hadoop effectively is a challenge. So one of the big benefits of Continuuity is you don’t need to know all that knowledge. The platform takes care of all that plumbing for you so that you can really concentrate on just building apps," says CTO Jeremy Pinkham.
Writing for Enterprise Apps Today, Infostructure Associates President Wayne Kernochan describes how many vendors provide "hooks" in their databases that allow SQL-like access to Hadoop stores in public clouds, which helps companies that want to access social media data about their customers, a popular Big Data use case. "Data virtualization tools do this for most databases, and can combine the results in a common format as well. In fact, a data virtualization tool does much of the cross-optimization between Hadoop stores and between public Hadoop and in-house Big Data as well," he writes.
He also suggests several alternatives to using Hadoop for data warehouse needs. For example, companies can use a data virtualization tool, either on its own or together with a master data management solution, to create a "virtual data warehouse" across existing databases, including those for "mixed" and OLTP (online transaction processing) workloads.
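As a rough illustration of the "virtual data warehouse" idea, the Python sketch below queries two separate databases on demand and combines the results in one common format, without copying anything into a central store. It is a simplified assumption of how such a layer might look, not a description of any commercial data virtualization product; the tables, the sample rows and the customer_spend function are all hypothetical.

```python
import sqlite3

# Two independent in-memory stores standing in for existing databases.
# All table names and rows are made up for illustration.
orders_db = sqlite3.connect(":memory:")  # stands in for an OLTP system
orders_db.execute("CREATE TABLE orders (customer_id INTEGER, amount_cents INTEGER)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 2500), (2, 3000), (1, 1500)],
)

crm_db = sqlite3.connect(":memory:")  # stands in for a separate CRM database
crm_db.execute("CREATE TABLE contacts (id INTEGER, full_name TEXT)")
crm_db.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)


def customer_spend():
    """Virtual view: query both sources at request time and return
    rows in one common format. Nothing is copied into a warehouse."""
    totals = dict(
        orders_db.execute(
            "SELECT customer_id, SUM(amount_cents) FROM orders GROUP BY customer_id"
        )
    )
    contacts = crm_db.execute("SELECT id, full_name FROM contacts ORDER BY id")
    return [
        {"customer": name, "total_spend": totals.get(contact_id, 0) / 100}
        for contact_id, name in contacts
    ]


for row in customer_spend():
    print(row)
```

Because the join happens at query time, the result always reflects the current contents of both sources; the trade-off is that every query pays the cost of reaching each underlying database.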
Ann All is the editor of Enterprise Apps Today and eSecurity Planet. She has covered business and technology for more than a decade, writing about everything from business intelligence to virtualization.