
PMML Makes Predictive Analytics and Data Mining Easier: Page 2

By Herman Mehling

Building Analytics Models That Can Be Shared

Sharing models between applications is key to the success of predictive analytics. But to be able to share a model, you first need to build it. Model building is composed of several phases, including an exhaustive data analysis phase.

"In this phase, you slice and dice raw data and select the most important pieces of information for model building," said Guazzelli.

Raw and derived fields are then used for model training. Typically, only a fraction of the data fields looked at during the analysis phase are used to build the final model.

Once the model is complete, the next task is to test its performance against a test data set. This phase may last several weeks, depending on the complexity of the problem being solved.
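The build-then-test cycle described above can be sketched in a few lines of Python. This is only a minimal illustration, not Guazzelli's workflow: the records, the field names (usage, spend, noise_field) and the helpers fit_line and mean_abs_error are all hypothetical, and a one-variable least-squares line stands in for whatever model a real project would train.

```python
import random


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_abs_error(model, points):
    """Average absolute gap between predictions and observed values."""
    intercept, slope = model
    return sum(abs(y - (intercept + slope * x)) for x, y in points) / len(points)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical raw records: several fields were examined during analysis,
# but only "usage" is selected as the input for predicting "spend".
records = [
    {
        "id": i,
        "usage": float(i),
        "noise_field": rng.random(),  # examined, then discarded
        "spend": 2.0 * i + 1.0 + rng.uniform(-0.5, 0.5),
    }
    for i in range(100)
]

rng.shuffle(records)
data = [(r["usage"], r["spend"]) for r in records]

# Train on 80% of the data, then test on the 20% the model has never seen.
train, test = data[:80], data[80:]
model = fit_line(train)
test_error = mean_abs_error(model, test)

print("intercept, slope:", model)
print("mean absolute error on test set:", test_error)
```

Holding back part of the data for testing is what lets the error figure say something about how the model would behave on new records rather than on the ones it was fitted to.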

"When you put a predictive analytic model to work, you usually expect it to do its job for months or years until it needs to be refreshed, most probably because of performance deterioration," said Guazzelli. Then another model is built and deployed in place of the older one.

Without a language such as PMML, deploying predictive solutions would be difficult and cumbersome, as different systems represent their computations in different ways.

"Every time you move a model from one system to another, you go through a lengthy translation process which is prone to errors and misrepresentations," said Guazzelli.

With PMML, the process is straightforward. From application A to B to C, PMML allows predictive solutions to be easily shared and put to work as soon as the model building phase is completed.

"For example, you might build a model in IBM SPSS Statistics and instantly benefit from cloud computing where you can deploy it in ADAPA, the Zementis predictive decisioning platform," said Guazzelli.

Alternatively, you can move the model to IBM InfoSphere, where it will reside close to the data warehouse, or to KNIME, an open-source tool from the University of Konstanz in Germany for building and visualizing data flows, said Guazzelli.
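To make the exchange concrete, here is a minimal Python sketch of one program writing a model as PMML and another scoring from that document alone. It rests on several assumptions: export_pmml and score_pmml are hypothetical helpers, the field names are invented, and the output is a simplified PMML 4.0 regression fragment that has not been validated against the official schema. Tools such as SPSS, ADAPA and KNIME ship their own PMML exporters and consumers, which cover far more of the standard.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_0"


def _tag(name):
    """Qualify an element name with the PMML namespace."""
    return f"{{{PMML_NS}}}{name}"


def export_pmml(intercept, coefficients, target):
    """'Application A': serialize a linear model as a simplified PMML string."""
    ET.register_namespace("", PMML_NS)
    root = ET.Element(_tag("PMML"), {"version": "4.0"})
    ET.SubElement(root, _tag("Header"), {"description": "illustrative model"})

    # The data dictionary declares every field the model refers to.
    dictionary = ET.SubElement(root, _tag("DataDictionary"))
    for name in list(coefficients) + [target]:
        ET.SubElement(dictionary, _tag("DataField"),
                      {"name": name, "optype": "continuous", "dataType": "double"})

    model = ET.SubElement(root, _tag("RegressionModel"),
                          {"functionName": "regression"})
    schema = ET.SubElement(model, _tag("MiningSchema"))
    for name in coefficients:
        ET.SubElement(schema, _tag("MiningField"), {"name": name})
    ET.SubElement(schema, _tag("MiningField"),
                  {"name": target, "usageType": "predicted"})

    table = ET.SubElement(model, _tag("RegressionTable"),
                          {"intercept": repr(intercept)})
    for name, coefficient in coefficients.items():
        ET.SubElement(table, _tag("NumericPredictor"),
                      {"name": name, "coefficient": repr(coefficient)})
    return ET.tostring(root, encoding="unicode")


def score_pmml(pmml_text, inputs):
    """'Application B': score a record using only the PMML document."""
    ns = {"p": PMML_NS}
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", ns)
    result = float(table.get("intercept"))
    for predictor in table.findall("p:NumericPredictor", ns):
        result += float(predictor.get("coefficient")) * inputs[predictor.get("name")]
    return result


# Hypothetical model: spend = 1.0 + 2.0 * usage
pmml_text = export_pmml(1.0, {"usage": 2.0}, target="spend")
prediction = score_pmml(pmml_text, {"usage": 3.0})
print(pmml_text)
print("prediction:", prediction)
```

The scoring function knows nothing about how the model was trained. Everything it needs travels in the XML, which is the property that lets a model move between applications without a manual translation step.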

This is the power of PMML: enabling true interoperability of models and solutions between applications. PMML also allows IT folks to shield end-users from the complexity associated with statistical tools and models.


This article was originally published on November 29, 2010