Predictive analytics - PMML
In an attempt to provide a standard
language for expressing predictive
models, the Predictive Model Markup
Language (PMML) has been proposed.
This XML-based language gives different
tools a common way to define predictive
models and to share them between
PMML-compliant applications.
PMML 4.0 was released in June 2009.
Data mining - Standards
For exchanging the extracted models, in
particular for use in predictive analytics,
the key standard is the Predictive Model
Markup Language (PMML), an XML-based
language developed by the Data Mining
Group (DMG) and supported as an exchange
format by many data mining applications.
Data mining - Commercial data-mining software and applications
IBM DB2 Intelligent Miner: in-database
data mining platform provided by IBM, with
modeling, scoring and visualization
services based on the SQL/MM - PMML
Oracle Data Mining - PMML
In Release 11gR2, ODM supports the
import of externally created PMML for
some of the data mining models. PMML is
an XML-based standard for representing
data mining models.
SQL Server Analysis Services - Data definition language (DDL)
For data mining model import and export, it
also supports the Predictive Model Markup
Language (PMML).
In-database processing - Translating Models into SQL Code
Many analytic model-building tools can
export their models in either SQL or
PMML (Predictive Model Markup
Language).
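As a rough illustration of what translating a model into SQL can involve, the sketch below renders a logistic regression as a SQL scoring query. The function name `logistic_to_sql`, the coefficients, the column names, and the table name are all hypothetical and do not come from any particular tool; real exporters handle many more model types and SQL dialects.

```python
def logistic_to_sql(intercept, coefficients, table):
    """Render a logistic regression as a SQL scoring query (illustrative only).

    `coefficients` maps column names to weights. Column and table names are
    inserted verbatim, so they must be trusted identifiers.
    """
    # Linear predictor: intercept + sum(weight * column)
    terms = [repr(float(intercept))]
    for column, weight in coefficients.items():
        terms.append(f"({float(weight)!r} * {column})")
    linear = " + ".join(terms)
    # Logistic link: 1 / (1 + exp(-z))
    return f"SELECT 1.0 / (1.0 + EXP(-({linear}))) AS score FROM {table}"


# Hypothetical model: two predictors scored against a "customers" table.
sql = logistic_to_sql(-1.5, {"age": 0.02, "income": 0.0001}, "customers")
print(sql)
```

Because the generated query contains only arithmetic and `EXP`, it can run inside the database, which is the point of in-database scoring.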
Predictive Model Markup Language
The Predictive Model Markup Language
(PMML) is an XML-based file format
developed by the Data Mining Group to
provide a way for applications to describe
and exchange models produced by data
mining and machine learning algorithms.
It supports common models such as
logistic regression and feedforward
neural networks.
Predictive Model Markup Language - PMML Components
A PMML file can be described by the following
components (A. Guazzelli, M. Zeller, W. Chen, and
G. Williams, "PMML: An Open Standard for Sharing
Models", The R Journal, Volume 1/1, May 2009;
A. Guazzelli, W. Lin, and T. Jena, "PMML in
Action (2nd Edition): Unleashing the Power of
Open Standards for Data Mining and Predictive
Analytics", CreateSpace, 2010):
Predictive Model Markup Language - PMML Components
* Header: contains general information
about the PMML document, such as
copyright information for the model, its
description, and information about the
application used to generate the model,
such as its name and version. It can also
carry a timestamp that records the date
of model creation.
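To make the Header description concrete, here is a minimal sketch that builds such an element with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`Header`, `copyright`, `description`, `Application`, `Timestamp`) follow the PMML schema, in which the timestamp is written as a child element of `Header`; the values themselves ("Example Corp", "ExampleTool", and so on) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Header with copyright and description attributes (values are placeholders).
header = ET.Element("Header", {
    "copyright": "Example Corp",
    "description": "Illustrative churn model",
})

# Application child: name and version of the tool that generated the model.
ET.SubElement(header, "Application", {"name": "ExampleTool", "version": "1.0"})

# Timestamp child: date of model creation.
timestamp = ET.SubElement(header, "Timestamp")
timestamp.text = "2014-02-28T00:00:00"

header_xml = ET.tostring(header, encoding="unicode")
print(header_xml)
```

This produces only the Header fragment; a complete PMML document would wrap it in a `PMML` root element together with the other components.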
Predictive Model Markup Language - PMML Components
* Data Transformations: transformations
allow for the mapping of user data into a
more desirable form to be used by the
mining model. PMML defines several kinds
of simple data transformations.
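Normalization is one example of such a simple transformation. The sketch below shows the underlying idea as a piecewise-linear mapping through (original, normalized) points, which is how PMML's `NormContinuous` element describes it. The function name `norm_continuous` and the sample points are assumptions made for illustration, and this version simply rejects values outside the defined range rather than modelling any outlier policy.

```python
def norm_continuous(value, points):
    """Map `value` by linear interpolation through (orig, norm) pairs.

    `points` needs at least two pairs with distinct orig values.
    """
    points = sorted(points)
    # Find the segment that contains the value and interpolate within it.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return y0 + (value - x0) * (y1 - y0) / (x1 - x0)
    raise ValueError("value outside the range covered by the points")


# Hypothetical field "age" scaled from [18, 90] onto [0, 1].
scaled = norm_continuous(45.0, [(18.0, 0.0), (90.0, 1.0)])
print(scaled)
```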
Predictive Model Markup Language - PMML Components
* Model: contains the definition of the
data mining model. For example, a
multi-layered feedforward neural network
is represented in PMML by a
NeuralNetwork element, which carries
attributes such as modelName,
functionName, activationFunction, and
numberOfLayers.
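For illustration, the following sketch parses a `NeuralNetwork` element and reads its attributes with Python's standard `xml.etree.ElementTree` module. The attribute names (`modelName`, `functionName`, `activationFunction`, `numberOfLayers`) are taken from the PMML schema, but the XML here is a deliberately incomplete fragment with made-up values: a valid model would also need child elements describing its inputs, layers, and outputs.

```python
import xml.etree.ElementTree as ET

# Incomplete fragment for illustration only; the values are hypothetical.
fragment = """
<NeuralNetwork modelName="ExampleNet"
               functionName="classification"
               activationFunction="logistic"
               numberOfLayers="2"/>
"""

network = ET.fromstring(fragment)

# Element attributes are exposed as a plain dict of strings.
attributes = dict(network.attrib)
layers = int(attributes["numberOfLayers"])

print(network.tag, attributes["functionName"], layers)
```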
Predictive Model Markup Language - PMML Components
Besides neural networks, PMML allows
for the representation of many other
types of models, including support
vector machines, association rules,
naive Bayes classifiers, clustering
models, text models, decision trees,
and different regression models.
Predictive Model Markup Language - PMML Components
** Outlier Treatment (attribute outliers):
defines the outlier treatment to be used. In
PMML, outliers can be treated as missing
values, as extreme values (based on the
definition of high and low values for a
particular field), or as is.
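The three treatments can be sketched in a few lines of Python. The option strings `asIs`, `asMissingValues`, and `asExtremeValues` are the values the PMML schema defines for the outliers attribute; the function name `treat_outlier`, its parameters, and the use of `None` to stand for a missing value are assumptions made for this example.

```python
def treat_outlier(value, low, high, outliers="asIs"):
    """Apply an outlier treatment to `value` given the field's low/high bounds."""
    if outliers not in ("asIs", "asMissingValues", "asExtremeValues"):
        raise ValueError(f"unknown outlier treatment: {outliers}")
    # Values inside the bounds are never changed; neither is anything under asIs.
    if outliers == "asIs" or low <= value <= high:
        return value
    if outliers == "asMissingValues":
        return None  # None stands in for a missing value here
    # asExtremeValues: clamp to the nearest bound.
    return low if value < low else high


print(treat_outlier(150, 0, 100, "asIs"))
print(treat_outlier(150, 0, 100, "asMissingValues"))
print(treat_outlier(150, 0, 100, "asExtremeValues"))
```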
Predictive Model Markup Language - PMML Components
In PMML 4.1, all the built-in and
custom functions that were originally
available only for pre-processing
became available for post-processing
as well.
Predictive Model Markup Language - PMML 4.0, 4.1 and 4.2
PMML 4.0 was released on June 16,
2009 (Data Mining Group website,
"PMML 4.0 - Changes from ..."; Zementis
website, "PMML 4.0 is here!"; R. Pechter,
"What's PMML and What's New in PMML
4.0?", The ACM SIGKDD Explorations).
Predictive Model Markup Language - PMML 4.0, 4.1 and 4.2
* Model Explanation: Saving of evaluation and
model performance measures to the PMML file
Predictive Model Markup Language - PMML 4.0, 4.1 and 4.2
PMML 4.1 was released on December 31,
2011 (Data Mining Group website,
"PMML 4.1 - Changes from PMML 4.0";
Predictive Analytics Info website,
"PMML 4.1 is here!").
Predictive Model Markup Language - PMML 4.0, 4.1 and 4.2
* Simplification of multiple models. In
PMML 4.1, the same element is used to
represent model segmentation, ensemble,
and chaining.
Predictive Model Markup Language - PMML 4.0, 4.1 and 4.2
PMML 4.2 was released on February 28,
2014 (Data Mining Group website,
"PMML 4.2 Changes from PMML 4.1";
Predictive Analytics Info website,
"PMML 4.2 is here!").
