Introduction to PMML
Sensors are now ubiquitous, from smart-home instrumentation to the monitoring of deepwater oil drilling equipment and structures. To make use of all the data these sensors collect, predictive analytics needs open standards that let systems communicate without being impeded by proprietary code and incompatibilities. PMML (Predictive Model Markup Language) is the standard used to represent predictive analytics and data mining models. With PMML, a predictive solution can be built in one system and then deployed in another, where it can be put to work immediately.
In the petroleum and chemical industries, predictive maintenance is an application in which data collected from sensors is preprocessed and then used to build predictive solutions that detect problems before a mechanical failure occurs. In the wake of the oil spill in the Gulf of Mexico, predictive analytics and open standards can provide one more tool for ensuring safety and process reliability.
As the de facto standard for representing predictive solutions, PMML allows models and data transformations to be expressed together in a single, concise form. When used to represent all the computations that make up a predictive solution, PMML becomes a bridge not only between data analysis, model building, and deployment systems, but also between all the people and teams in a company who are involved in the analytical process. This matters because it lets knowledge and best practices be shared while ensuring transparency.
Predictive Modeling Techniques
This section covers the predictive modeling techniques that have a dedicated PMML element. Although many new techniques appear every year, they must be examined and adopted by a broad portion of the data mining community before they can become part of the standard. PMML 4.0, released in 2009, defines specific elements for the following modeling and statistical techniques:
Association rules: AssociationModel element
Cluster models: ClusteringModel element
Decision trees: TreeModel element
Naïve Bayes classifiers: NaiveBayesModel element
Neural networks: NeuralNetwork element
Regression: RegressionModel and GeneralRegressionModel elements
Rule sets: RuleSetModel element
Sequences: SequenceModel element
Support vector machines: SupportVectorMachineModel element
Text models: TextModel element
Time series: TimeSeriesModel element
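Because each technique has its own element, a consumer can tell which kind of model a PMML file carries by looking at the children of the root PMML element. The following Python sketch illustrates this with the standard library's XML parser. The sample document is a hand-written, deliberately incomplete fragment (the field name, model name, and attribute values are made up for illustration and the fragment is not validated against the PMML schema), and the helper names local_name and find_model_elements are hypothetical.

```python
import xml.etree.ElementTree as ET

# Model element names from the PMML 4.0 list above.
MODEL_ELEMENTS = {
    "AssociationModel",
    "ClusteringModel",
    "TreeModel",
    "NaiveBayesModel",
    "NeuralNetwork",
    "RegressionModel",
    "GeneralRegressionModel",
    "RuleSetModel",
    "SequenceModel",
    "SupportVectorMachineModel",
    "TextModel",
    "TimeSeriesModel",
}

# Hypothetical, incomplete PMML fragment used only for illustration.
# A real TreeModel would also contain a MiningSchema and Node elements.
SAMPLE_PMML = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <Header description="illustrative sketch"/>
  <DataDictionary numberOfFields="1">
    <DataField name="temperature" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TreeModel modelName="example_tree" functionName="classification"/>
</PMML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_model_elements(pmml_text):
    """Return the names of the model elements directly under the root."""
    root = ET.fromstring(pmml_text)
    return [
        local_name(child.tag)
        for child in root
        if local_name(child.tag) in MODEL_ELEMENTS
    ]


print(find_model_elements(SAMPLE_PMML))  # prints ['TreeModel']
```

Matching on the local tag name rather than the fully qualified name keeps the sketch independent of the namespace URI, which differs between PMML versions.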