Data mining methodology and implementation steps

Source: Internet
Author: User

1. Industry Data Mining methodology

2. The guiding method we follow for data mining implementation in our work:

Eight-step application modeling: business understanding, indicator design, data extraction, data exploration, algorithm selection, model evaluation, model release, and model optimization.

Step One: Business understanding

Common misunderstanding: Many people think there is no need to identify the problem and the goals beforehand: simply apply data mining techniques to the data, then search through and interpret the mining results, and useful, previously unknown patterns and knowledge will emerge on their own.

Process: business research and problem definition, followed by business analysis.

Step Two: Indicator design

Based on the analysis of the business problem, find an appropriate analysis method or methodology to guide the design of the model indicators, so that the indicators are systematic and comprehensive.

Some common methods of analysis

Step Three: Data extraction

Data extraction ensures the integrity and availability of the modeling data.

Data extraction: Extracting the data needed for modeling

Data cleansing: handling missing values, extreme values, erroneous data, and redundant data

Data audit: auditing statistical errors, data source errors, and statistical definitions (the scope and rules by which each figure is counted)

Data integration: building the wide table used for data mining
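The cleansing tasks listed above can be sketched in a few lines. This is only a minimal illustration using the Python standard library: the field names (`customer_id`, `monthly_spend`), the median fill, and the clipping bounds are assumptions made for the example, not rules taken from the method itself.

```python
from statistics import median


def clean_records(records, field, lower, upper):
    """Deduplicate rows, fill missing values, and clip extreme values.

    Assumes at least one row has a non-missing value in `field`.
    """
    # Redundant data: drop exact duplicate rows, keeping first-seen order.
    seen = set()
    unique = []
    for row in records:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not modified

    # Missing values: fill with the median of the observed values
    # (one common choice; the right fill depends on the business meaning).
    observed = [r[field] for r in unique if r.get(field) is not None]
    fill = median(observed)

    for r in unique:
        value = r.get(field)
        if value is None:
            value = fill
        # Extreme values: clip into the assumed valid range [lower, upper].
        r[field] = min(max(value, lower), upper)
    return unique


# Hypothetical sample rows for illustration only.
records = [
    {"customer_id": 1, "monthly_spend": 120.0},
    {"customer_id": 1, "monthly_spend": 120.0},   # redundant row
    {"customer_id": 2, "monthly_spend": None},    # missing value
    {"customer_id": 3, "monthly_spend": 9999.0},  # extreme value
    {"customer_id": 4, "monthly_spend": 80.0},
]
clean = clean_records(records, "monthly_spend", lower=0.0, upper=1000.0)
```

In a real project the fill strategy and the valid range would come out of the business understanding and data audit work, not from fixed constants.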

Step Four: Data exploration

Data exploration mainly involves two tasks. First, inspect and analyze the data to verify that it matches the original intent of the indicator design and makes business sense. Second, standardize the data as the modeling requires, so that different indicators can be combined in mathematical operations on a common scale.
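As an illustration of the standardization step, the sketch below shows two widely used options, z-score scaling and min-max scaling. The source does not say which one to use, so both are shown; the sample indicators (`monthly_spend`, `logins`) are hypothetical.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant indicator carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Two hypothetical indicators measured in very different units.
monthly_spend = [100.0, 200.0, 300.0]
logins = [1, 2, 3]

spend_scaled = min_max(monthly_spend)   # [0.0, 0.5, 1.0]
logins_scaled = min_max(logins)         # [0.0, 0.5, 1.0]
spend_z = z_score(monthly_spend)
logins_z = z_score(logins)
```

After scaling, the two indicators are directly comparable even though one is measured in currency and the other in counts, which is what makes distance-based or weighted calculations across indicators meaningful.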

Step Five: Algorithm selection

Choose the algorithm according to the modeling scenario. For example, the description class includes classification rules and cluster analysis; the prediction class includes neural networks, decision trees, time series, regression analysis, association analysis, Bayesian networks, and deviation detection; the evaluation class includes factor analysis, principal component analysis, and mathematical formulas. Then take the data conditions into account (such as discrete values, continuous values, and data size) and select the appropriate algorithm.
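The grouping above can be kept as a simple lookup table, which is a convenient starting point for a shortlist. This is only a sketch: the dictionary mirrors the three classes as the text lists them, while the function name `candidate_algorithms` is made up for the example. How data conditions narrow the shortlist is not specified in the text, so it is left out here.

```python
# Candidate algorithms per modeling scenario, as grouped in the text.
CANDIDATES = {
    "description": ["classification rules", "cluster analysis"],
    "prediction": [
        "neural network",
        "decision tree",
        "time series",
        "regression analysis",
        "association analysis",
        "Bayesian network",
        "deviation detection",
    ],
    "evaluation": [
        "factor analysis",
        "principal component analysis",
        "mathematical formula",
    ],
}


def candidate_algorithms(scenario):
    """Return the shortlist for a scenario (hypothetical helper)."""
    try:
        return list(CANDIDATES[scenario])
    except KeyError:
        known = ", ".join(sorted(CANDIDATES))
        raise ValueError(f"unknown scenario {scenario!r}; expected one of: {known}") from None


shortlist = candidate_algorithms("evaluation")
```

The final choice within a shortlist still depends on the data conditions mentioned above, such as whether the values are discrete or continuous and how much data is available.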

Step Six: Model evaluation

Step Seven: Model release

Provide an end-to-end, thematic solution focused on the business problem, in order to improve the effectiveness and value of the data mining application. What is released is a complete, end-to-end data mining solution rather than bare data mining results.

Step Eight: Model optimization

Initial construction period: the model is verified.

Growth period: the model is optimized based on the validation results and business conditions.

Maturity period: the model reaches the required accuracy and, now stable and mature, guides business development.

Decline period: as the business develops, the model no longer fits the new business environment and is gradually retired.
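One way to notice the shift from maturity toward decline is to track the model's accuracy period by period and flag it for review when recent performance falls below the required level. The sketch below is only an assumption about how such a check might look: the function name `needs_review`, the three-period window, and the 0.85 target are all hypothetical.

```python
def needs_review(accuracy_history, target, window=3):
    """Return True when mean accuracy over the last `window` periods is below `target`."""
    if len(accuracy_history) < window:
        # Not enough history yet to judge a trend.
        return False
    recent = accuracy_history[-window:]
    return sum(recent) / window < target


# Hypothetical accuracy per reporting period, oldest first.
history = [0.91, 0.90, 0.89, 0.84, 0.80, 0.76]

early_flag = needs_review(history[:3], target=0.85)   # model still meets the target
late_flag = needs_review(history, target=0.85)        # recent periods fall short
```

A flag like this does not decide anything on its own; it only signals that the model should go back through optimization or be retired, depending on how the business environment has changed.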
