BI's Little Things: A Preliminary Study of Data Mining

Source: Internet
Author: User

What is data mining?

    • Data mining, also known as knowledge discovery, uses automated or semi-automated methods to find potentially valuable information and rules in data.
    • Data mining technology draws on databases, statistics, and artificial intelligence.

What data mining can do

    • Analyze the large volume of data an enterprise generates and uncover hidden patterns
    • Get a clearer picture of the current health of the business
    • Give decision-makers a scientific basis for setting future direction
    • Forecast sales
    • Target messages at specific customers
    • Identify products that could be bundled together
    • Find the sequence in which customers add products to a shopping cart
    • ......

Data mining algorithms
Data mining is the process of extracting knowledge from data in some form; its main tasks are to describe, classify, and predict. Prediction techniques commonly used in data mining include linear regression (typically fitted by least squares) and neural networks.
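
To make the prediction idea concrete, here is a minimal sketch of linear regression fitted by ordinary least squares and used for a one-step sales forecast. The function name fit_line and the monthly sales figures are made up for illustration; they do not come from any real dataset or from Microsoft's tools.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance sums (unnormalized; the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly sales (illustrative numbers only).
months = [1, 2, 3, 4, 5]
sales = [110.0, 120.0, 130.0, 140.0, 150.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # predicted sales for month 6
print(f"slope={slope}, intercept={intercept}, forecast={forecast}")
```

With real data the points would not lie exactly on a line, and the fitted line would be the one that minimizes the sum of squared errors.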

Another interesting part of Analysis Services is data mining. Within business intelligence, data mining sits at one of the highest levels. Big data, which is popular now, often ultimately relies on data mining to realize its value.

If BI is viewed as covering the data of yesterday, today, and tomorrow, then reports cover yesterday by telling you what happened in the business; multidimensional analysis and similar tools cover today by telling you why it happened; and data mining covers tomorrow by applying mining algorithms to the large body of existing historical data, so that you can anticipate what your business will look like in the future.

Microsoft's data mining tools include many algorithms, such as Naive Bayes, decision trees, association rules, and time series analysis.
Data mining analyzes sample data, discovers rules in it, and then applies those rules to predict unknown future data. It is often used for problems such as product recommendation on e-commerce sites, potential-customer analysis, and customer classification.
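
As an illustration of discovering a rule from sample data, the sketch below computes the support and confidence of a single association rule ("customers who buy bread also buy milk") over a handful of shopping baskets. The helper rule_stats and the baskets are hypothetical; this is not how the Microsoft Association Rules algorithm is implemented, only the basic statistics that such rules are usually judged by.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    n = len(transactions)
    both = sum(1 for t in transactions if antecedent in t and consequent in t)
    with_antecedent = sum(1 for t in transactions if antecedent in t)
    support = both / n  # share of all baskets containing both items
    # Share of baskets with the antecedent that also contain the consequent.
    confidence = both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Made-up shopping baskets (illustrative only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

support, confidence = rule_stats(baskets, "bread", "milk")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A recommendation feature could then suggest the consequent item to customers whose basket contains the antecedent, provided that support and confidence clear thresholds chosen for the business.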

| No. | Data mining technique | Description |
|-----|-----------------------|-------------|
| 1 | Microsoft Naive Bayes | A Bayesian model. The Microsoft Naive Bayes algorithm treats all input attributes as independent and calculates the probability of each pairing of an input attribute value with a predicted attribute value. It can be used for classification and prediction. |
| 2 | Microsoft Association Rules | The Microsoft Association algorithm analyzes data using correlation statistics between individual attribute values or transaction items. |
| 3 | Microsoft Clustering | The Microsoft Clustering algorithm finds natural groupings of data in a multidimensional representation of attribute values. It is useful when you need to find general groupings. |
| 4 | Microsoft Decision Trees | The Microsoft Decision Trees algorithm is a classification and regression algorithm suited to predictive modeling. It supports prediction of both discrete and continuous attributes. |
| 5 | Microsoft Logistic Regression | The Microsoft Logistic Regression algorithm is a regression algorithm suited to regression modeling. It is a variation of the Microsoft Neural Network algorithm, obtained by removing the hidden layer. It supports prediction of both discrete and continuous attributes. |
| 6 | Microsoft Neural Network | The Microsoft Neural Network algorithm. |
| 7 | Microsoft Time Series | The Microsoft Time Series algorithm analyzes time-related data to discover patterns, such as monthly sales patterns and annual profit patterns. |
| 8 | Microsoft Sequence Clustering | The Microsoft Sequence Clustering algorithm combines two other data mining techniques: sequence analysis and clustering. It analyzes sequence-related patterns and groups them into clusters. |
| 9 | Microsoft Linear Regression | The Microsoft Linear Regression algorithm is a regression algorithm suited to regression modeling. It is a variation of the Microsoft Decision Trees algorithm, obtained by disabling splits (the entire regression formula is placed in a single root node). It supports prediction of continuous attributes. |
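
The Naive Bayes idea described in the table, treating inputs as independent and counting how often each input value occurs with each predicted value, can be sketched in a few lines. This is a generic textbook version, not Microsoft's implementation; the function names, the add-one (Laplace) smoothing, and the tiny customer dataset are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count classes and (attribute index, value, label) co-occurrences."""
    class_counts = Counter(labels)
    value_counts = Counter()
    values_per_attr = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, value, label)] += 1
            values_per_attr[i].add(value)
    return class_counts, value_counts, values_per_attr


def predict(model, row):
    """Return the label with the highest (unnormalized) posterior score."""
    class_counts, value_counts, values_per_attr = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior probability of the class
        for i, value in enumerate(row):
            # Add-one smoothing (an assumption) avoids zero probabilities.
            numerator = value_counts[(i, value, label)] + 1
            denominator = count + len(values_per_attr[i])
            score *= numerator / denominator
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training data: (age group, income level) -> bought the product?
rows = [("young", "low"), ("young", "high"), ("old", "high"),
        ("old", "high"), ("old", "low")]
labels = ["no", "yes", "yes", "yes", "no"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("old", "high")))
print(predict(model, ("young", "low")))
```

The independence assumption is what keeps the model this simple: each attribute contributes one multiplicative factor, so training is just counting.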

Like other IT projects, a data mining project can be divided roughly into the following stages: first define the problem, then prepare and explore the data, then build and validate the model, and finally deploy and update the model.

This process is not necessarily completed in a single pass. For example, you may discover that the model needs data you did not prepare and have to prepare the data again, or you may find problems during model validation and need to redefine the model.
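
The build-and-validate step can be sketched as a hold-out test: train on part of the data, then measure accuracy on rows the model has not seen. Everything here is a simplified assumption for illustration, including the function names, the single-threshold "model", and the made-up (monthly spend, bought) records.

```python
def split(rows, train_fraction):
    """Split rows into a training part and a held-out validation part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def train_threshold(train):
    """Learn a spend threshold: the midpoint between the two class means."""
    buyers = [x for x, y in train if y == 1]
    others = [x for x, y in train if y == 0]
    return (sum(buyers) / len(buyers) + sum(others) / len(others)) / 2


def accuracy(threshold, rows):
    """Share of rows where 'spend >= threshold' matches the true label."""
    hits = sum(1 for x, y in rows if (x >= threshold) == (y == 1))
    return hits / len(rows)


# Made-up records: (monthly spend, 1 if the customer bought, else 0).
data = [(10, 0), (80, 1), (20, 0), (90, 1),
        (30, 0), (70, 1), (15, 0), (85, 1)]

train, test = split(data, 0.75)
threshold = train_threshold(train)
score = accuracy(threshold, test)
print(f"threshold={threshold}, validation accuracy={score}")
```

If the validation accuracy were poor, that would be the signal to loop back, either to prepare different data or to redefine the model.
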
The query language used for data mining is DMX (Data Mining Extensions), which can be used to create and process mining models and to run prediction queries.
