PMML Makes Predictive Analytics and Data Mining Easier
Updated · Nov 29, 2010
Predictive analytics — the art of mining and analyzing historical data patterns to predict the future — is not a common term among IT types, let alone consumers. Yet predictive analytics is practiced widely, and its use affects millions of consumers and businesses every day.
“Every time you swipe your credit card or use it online, a predictive analytic model checks the probability of that transaction being fraudulent,” said Alex Guazzelli, vice president of analytics at Zementis, a developer of predictive analytics software.
“If you rent DVDs online, chances are a predictive analytic model recommends a particular movie to you,” he said. “Predictive analysis is already an integral part of your life and its application is bound to assist you even more in the future.”
Predictive analytics is also used in sensors in bridges, buildings, industrial processes and machinery, generating data and making predictions to alert people about potential faults and problems before they occur. Its many uses also include healthcare, financial services and insurance.
PMML: The Predictive Analytics Standard
Predictive analytics leverages techniques from statistics, data mining and game theory to help individuals analyze current and historical facts so they can make predictions about future events.
The Predictive Model Markup Language (PMML) is the de facto standard for representing predictive analytic models, and most of the leading commercial and open source statistical tools currently support it.
Top business intelligence and analytics vendors such as IBM, SAS, MicroStrategy, Oracle and SAP support the language, and NASA and Visa also appear on the member list of the Data Mining Group, which develops the standard.
PMML enables the instant deployment of predictive solutions. Within a company, PMML can be used as the lingua franca not only between applications, but also between divisions, service providers and external vendors. In this scenario, it becomes the standard that defines a single clear process for the exchange of predictive solutions.
PMML represents a myriad of predictive modeling techniques, such as Association Rules, Cluster Models, Neural Networks and Decision Trees. These techniques empower users around the globe to extract hidden patterns from data and use them to forecast behavior.
The beauty of the XML-based language is that it allows people to easily share predictive analytic models between different applications, said Guazzelli.
“Therefore, you can train a model in one system, express it in PMML, and move it to another system, where you can use it to predict, for example, the likelihood of machine failure,” he said.
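To make that hand-off concrete, here is a minimal sketch of the consuming side, written in Python with only the standard library. The PMML fragment is hand-written and simplified for illustration, not exported from a real tool or validated against the PMML schema. The field names (temperature, vibration, failure_risk), the coefficients and the score function are all hypothetical. A production scoring engine handles many more model types and options than this.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified PMML 4.0 fragment (an assumption for illustration,
# not validated against the schema). Field names and coefficients are made up.
PMML_DOC = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative machine-failure risk model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="temperature" optype="continuous" dataType="double"/>
    <DataField name="vibration" optype="continuous" dataType="double"/>
    <DataField name="failure_risk" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="failure_risk_model" functionName="regression">
    <MiningSchema>
      <MiningField name="temperature"/>
      <MiningField name="vibration"/>
      <MiningField name="failure_risk" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="temperature" coefficient="0.25"/>
      <NumericPredictor name="vibration" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_0"}


def score(pmml_text, record):
    """Evaluate the first RegressionTable found: intercept + sum(coef * x).

    Simplification: the optional 'exponent' attribute and categorical
    predictors are ignored.
    """
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    result = float(table.get("intercept"))
    for predictor in table.findall("pmml:NumericPredictor", NS):
        coefficient = float(predictor.get("coefficient"))
        result += coefficient * record[predictor.get("name")]
    return result


# 0.5 + 0.25 * 4.0 + 2.0 * 0.5
print(score(PMML_DOC, {"temperature": 4.0, "vibration": 0.5}))  # → 2.5
```

The scoring code knows nothing about the tool that trained the model. Everything it needs, from the input fields to the coefficients, travels inside the XML document.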
PMML was developed by the Data Mining Group, a vendor-led committee composed of commercial and open source analytics companies and government and academic users. Because so many vendors take part, most of the leading data mining tools today can export or import PMML. A mature standard that has evolved over the last 10 years, PMML can represent not only the statistical techniques used to learn patterns from data, such as artificial neural networks and decision trees, but also the pre-processing of raw input data and the post-processing of model output.
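As a rough illustration of the pre-processing side, the sketch below reads a normalization step from a PMML fragment and applies it to a raw sensor reading in Python. The fragment is hand-written and simplified, and the field names, the 20-to-120 range and the normalize function are hypothetical. It assumes the LinearNorm points are listed in ascending order of orig and extrapolates linearly outside that range. Post-processing of model output is not covered here.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment (an assumption for illustration):
# rescale a raw temperature reading from the range 20..120 to 0..1.
PMML_PREPROCESSING = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="Illustrative pre-processing step"/>
  <DataDictionary numberOfFields="1">
    <DataField name="temperature" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary>
    <DerivedField name="temperature_norm" optype="continuous" dataType="double">
      <NormContinuous field="temperature">
        <LinearNorm orig="20" norm="0"/>
        <LinearNorm orig="120" norm="1"/>
      </NormContinuous>
    </DerivedField>
  </TransformationDictionary>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_0"}


def normalize(pmml_text, record):
    """Return the derived fields produced by each NormContinuous element."""
    root = ET.fromstring(pmml_text)
    derived = {}
    for field in root.findall(".//pmml:DerivedField", NS):
        norm = field.find("pmml:NormContinuous", NS)
        if norm is None:
            continue  # other transformation types are out of scope here
        points = [
            (float(p.get("orig")), float(p.get("norm")))
            for p in norm.findall("pmml:LinearNorm", NS)
        ]
        x = record[norm.get("field")]
        # Pick the last segment whose left end is <= x; fall back to the
        # first segment when x lies below the whole range.
        (x0, y0), (x1, y1) = points[0], points[1]
        for left, right in zip(points, points[1:]):
            if x >= left[0]:
                (x0, y0), (x1, y1) = left, right
        # Linear interpolation (or extrapolation) along the chosen segment.
        derived[field.get("name")] = y0 + (x - x0) * (y1 - y0) / (x1 - x0)
    return derived


print(normalize(PMML_PREPROCESSING, {"temperature": 70.0}))
# → {'temperature_norm': 0.5}
```

Because the transformation is stored alongside the model, the receiving system can apply the same scaling that was used during training instead of re-implementing it by hand.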