Algorithms in Clementine12

Last Update:2014-11-06 Source: Internet

Author: User


Recently, a friend asked me which algorithms are available in Clementine12. My impression is that Clementine12 offers many algorithms and covers the field fairly completely. They are grouped by business purpose (forecasting, classification, segmentation, association), so you only need to know what kind of problem your business problem is. Once you know that, you can use the model categories in Clementine12 to quickly find the model you want.

The following figure shows all of the data mining algorithms in Clementine12:

The following are the 10 data mining analysis methods described by Professor Shebangchang. They are meant to give a first understanding of the models, and they are also the algorithms most often encountered in day-to-day mining work. I hope they are useful to everyone! (Even for a data mining company, mastering just one of these algorithms can be enough to stand out.)

1. Memory-based reasoning (memory-based reasoning; MBR)

The main idea of memory-based reasoning is to use known cases to predict certain attributes of future cases, usually by finding the most similar known cases and comparing against them.

Memory-based reasoning has two main elements: the distance function and the combination function. The distance function finds the most similar cases; the combination function combines the attributes of those similar cases to make the prediction. One advantage of memory-based reasoning is that it accepts data of many types and does not require the data to satisfy particular assumptions. Another advantage is that it can learn: it acquires knowledge about new cases from old ones. Its main drawback is that it needs a large amount of historical data; only with enough historical data can it make good predictions. In addition, memory-based reasoning is time-consuming, and it is difficult to find the best distance function and combination function. It can be applied to fraud detection, customer response prediction, medical treatment, response classification, and so on.
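
The two elements named above can be sketched in a few lines of Python. This is only an illustration, not how Clementine12 implements anything: it assumes numeric features, Euclidean distance as the distance function, and a majority vote over the k nearest cases as the combination function. The names `mbr_predict`, `history`, and the sample data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Distance function: straight-line distance between two numeric cases."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def majority_vote(labels):
    """Combination function: the most common label among the neighbours."""
    return Counter(labels).most_common(1)[0][0]


def mbr_predict(known_cases, new_case, k=3, distance=euclidean, combine=majority_vote):
    """Predict the label of new_case from its k most similar known cases."""
    ranked = sorted(known_cases, key=lambda case: distance(case[0], new_case))
    return combine([label for _, label in ranked[:k]])


# Hypothetical historical cases: (features, observed response)
history = [
    ((1.0, 1.0), "responds"),
    ((1.2, 0.8), "responds"),
    ((0.9, 1.1), "responds"),
    ((5.0, 5.0), "ignores"),
    ((5.2, 4.9), "ignores"),
]

prediction = mbr_predict(history, (1.1, 1.0))
```

Both functions are passed in as parameters because, as noted above, choosing a good distance function and combination function is the hard part; swapping them should not require touching the search itself.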

2. Market basket analysis (market basket analysis)

The main purpose of market basket analysis is to find out which items should be placed together. In business, it uses customers' purchasing behavior to understand what kinds of customers they are and why they buy these products, and to find association rules. Enterprises profit and build competitive advantage by exploiting these rules. For example, a retail store can use this analysis to rearrange goods on its shelves or to design product bundles that attract customers.

The basic process of market basket analysis consists of the following three steps:

(1) Choose the right items: "right" here is relative to the enterprise, which must pick the truly useful items out of hundreds or thousands.

(2) Mine the association rules by examining the co-occurrence matrix.

(3) Overcome practical constraints: the more items are selected, the more computing resources and time are needed (the growth is exponential), so some technique must be used to reduce the resources and time consumed.
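
Step (2) can be sketched as follows. This is a minimal illustration, assuming rules between pairs of items only and using the usual support and confidence measures; the thresholds, the function names, and the sample baskets are all hypothetical. A minimum-support cutoff is one simple way to limit the work mentioned in step (3).

```python
from collections import Counter
from itertools import combinations


def co_occurrence(baskets):
    """Count how often each item, and each pair of items, appears in a basket."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    return item_counts, pair_counts


def pair_rules(baskets, min_support=0.4, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for every rule lhs -> rhs that passes both thresholds."""
    n = len(baskets)
    item_counts, pair_counts = co_occurrence(baskets)
    found = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pruning rare pairs keeps the search small
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                found.append((lhs, rhs, support, confidence))
    return found


# Hypothetical transactions
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs"],
]

rules = pair_rules(baskets)
```

With these baskets, the rule "butter -> bread" has a confidence of 1.0, because every basket that contains butter also contains bread.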

Market basket analysis can be applied to the following problems:

(1) For credit card purchases, it can predict what customers may buy in the future.

(2) For telecommunications and financial services, it can be used to design different service bundles to increase profit.

(3) For the insurance industry, it can detect and prevent potentially unusual combinations of insurance policies.

(4) For patients, it can be used to judge whether a particular combination of treatments will lead to complications.

3. Decision Tree (Decision Trees)

Decision trees are very good at classification and prediction. They express their results as rules, and these rules are expressed as a series of questions; the result is reached by answering the questions one after another. A typical decision tree has a root at the top and many leaves at the bottom. It splits the records into different subsets, and each subset may be described by a simple rule over its fields. Decision trees also come in different forms, such as binary trees, ternary trees, or mixed decision trees.
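
The idea that a tree is a series of questions, and that each root-to-leaf path is a rule, can be shown with a small hand-built binary tree. This sketch does not learn the tree from data; the fields `age` and `income`, the thresholds, and the labels are hypothetical.

```python
def classify(node, record):
    """Walk from the root to a leaf, answering one yes/no question per level."""
    while "label" not in node:
        field, threshold = node["question"]
        node = node["yes"] if record[field] <= threshold else node["no"]
    return node["label"]


def extract_rules(node, conditions=()):
    """List every root-to-leaf path as an IF ... THEN rule."""
    if "label" in node:
        return ["IF " + " AND ".join(conditions or ("TRUE",)) + " THEN " + node["label"]]
    field, threshold = node["question"]
    return (
        extract_rules(node["yes"], conditions + (f"{field} <= {threshold}",))
        + extract_rules(node["no"], conditions + (f"{field} > {threshold}",))
    )


# Hypothetical binary tree: each inner node asks "is field <= threshold?"
tree = {
    "question": ("age", 30),
    "yes": {"label": "low risk"},
    "no": {
        "question": ("income", 50000),
        "yes": {"label": "high risk"},
        "no": {"label": "low risk"},
    },
}

label = classify(tree, {"age": 40, "income": 30000})
rules = extract_rules(tree)
```

Here `rules` holds one rule per leaf, which is the rule form of the tree described above.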

4. Genetic algorithm (genetic algorithm)

The genetic algorithm imitates the process of cell evolution, in which cells produce better new cells through continual selection, replication, mating (crossover), and mutation. A genetic algorithm works similarly: a model must be established in advance, and then, through a series of operations similar to the process of producing new cells, a fitness function decides whether each offspring fits the model. In the end only the best-fitting results survive. The procedure runs until the function converges, ideally to the optimal solution. Genetic algorithms perform well on clustering problems and can be used to support memory-based reasoning and neural networks.
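
The loop of selection, replication, mating, and mutation can be sketched on a toy problem. This is an illustration under stated assumptions, not a production algorithm: individuals are bit strings, the fitness function simply counts the 1 bits, the fitter half of each generation become parents, and the single best individual is copied unchanged into the next generation. All names and parameter values are hypothetical.

```python
import random


def fitness(bits):
    """Toy fitness function: the number of 1 bits (higher is better)."""
    return sum(bits)


def evolve(pop_size=20, length=16, generations=40, mutation_rate=0.05, seed=0):
    """Run a small genetic algorithm and return (best individual, best fitness per generation)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = [max(map(fitness, population))]
    for _ in range(generations):
        # Selection: the fitter half of the population become parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Replication: carry the best individual over unchanged.
        children = [parents[0][:]]
        while len(children) < pop_size:
            # Mating: splice two parents at a random cut point.
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
        history.append(max(map(fitness, population)))
    return max(population, key=fitness), history


best, history = evolve()
```

Because the best individual is always carried over, the best fitness in `history` never decreases from one generation to the next; whether it reaches the true optimum depends on the problem and the parameters.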

5. Cluster detection (cluster detection)

This technique covers a wide range of methods, including genetic algorithms, neural networks, and cluster analysis from statistics. Its goal is to find previously unknown groups of similar records in the data. In many analyses, cluster detection is applied at the very start of the study.

6. Link Analysis

Link analysis is based on graph theory in mathematics. It develops a model from the relationships between records, so relationships are its main subject, and many applications have grown out of relationships between person and person, thing and thing, or person and thing. For example, the telecommunications industry can use link analysis to collect when and how often customers use the telephone, infer their usage preferences, and then propose plans that benefit the company. Beyond telecommunications, more and more marketers are using link analysis for research that benefits their business.
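
The telecommunications example can be sketched as a small weighted graph. This is only an illustration, assuming call records of the form (caller, callee, minutes) and treating a link as undirected; the record layout, the function names, and the sample data are hypothetical.

```python
from collections import defaultdict


def build_call_graph(call_records):
    """Build an undirected graph whose edge weights are total minutes between two customers."""
    graph = defaultdict(lambda: defaultdict(int))
    for caller, callee, minutes in call_records:
        graph[caller][callee] += minutes
        graph[callee][caller] += minutes
    return graph


def strongest_links(graph, customer, top=2):
    """Return the customers most strongly linked to the given one (ties broken by name)."""
    neighbours = graph[customer]
    return sorted(neighbours, key=lambda other: (-neighbours[other], other))[:top]


# Hypothetical call records: (caller, callee, minutes)
calls = [
    ("ann", "bob", 12),
    ("bob", "ann", 5),
    ("ann", "cat", 3),
    ("cat", "dan", 20),
    ("ann", "dan", 1),
]

graph = build_call_graph(calls)
top_contacts = strongest_links(graph, "ann")
```

The records are the nodes' raw material and the relationships are the edges; questions such as "who does this customer talk to most" then become simple graph queries.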

7. Online analytical processing (online analytical processing; OLAP)

Strictly speaking, online analytical processing is not a data mining technique in its own right, but OLAP tools let users understand the meaning hidden in the data more clearly. Like other visualization techniques, they present results as charts and graphs, which is friendlier to the average person. Such tools also help achieve the goal of turning data into information.

8. Artificial neural networks (neural networks)

An artificial neural network learns by repetition: it is given a series of examples and generalizes from them a pattern distinctive enough to tell cases apart. Faced with a new example, the network applies what it has learned to derive a new result, which makes it a form of machine learning. Data mining problems can be learned in this way, and when the learning is accurate the network can be used for prediction.

9. Discriminant analysis (discriminant analysis)

When the dependent variable is qualitative (categorical) and the independent variables (predictor variables) are quantitative (metric), discriminant analysis is a very appropriate technique; it is usually applied to classification problems. If the dependent variable consists of two groups, it is called two-group discriminant analysis; if it consists of more than two groups, it is called multiple discriminant analysis (MDA). Its purposes are:

(1) Find linear combinations of the predictor variables that maximize the ratio of between-group variation to within-group variation, with each linear combination uncorrelated with the previous ones.

(2) Test whether the group centroids differ.

(3) Find out which predictor variables have the greatest discriminating power.

(4) Assign new subjects to a group according to their values on the predictor variables.
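
Points (1) and (4) can be sketched for the two-group case with two predictor variables. This sketch assumes Fisher's linear discriminant as the technique, pools the within-group scatter of both groups, and assigns a new subject by comparing its score with the midpoint between the two group centroids (which implicitly treats the groups as equally likely). The function names and sample data are hypothetical.

```python
def mean(points):
    """Centroid of a group of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter(points, centre):
    """2x2 within-group scatter matrix around the group centroid."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(group_a, group_b):
    """Weights w = Sw^-1 (mean_a - mean_b), the direction that best separates the groups."""
    ma, mb = mean(group_a), mean(group_b)
    sa, sb = scatter(group_a, ma), scatter(group_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Multiply diff by the inverse of the 2x2 matrix sw.
    w = [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]
    return w, ma, mb


def assign(x, w, ma, mb):
    """Assign a new subject to group A or B by its discriminant score."""
    def score(p):
        return w[0] * p[0] + w[1] * p[1]

    midpoint = (score(ma) + score(mb)) / 2
    return "A" if score(x) > midpoint else "B"


# Hypothetical subjects measured on two predictor variables
group_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
group_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]

w, ma, mb = fisher_direction(group_a, group_b)
group_of_new_subject = assign((2.5, 2.0), w, ma, mb)
```

With more than two groups or more than two predictors the same idea applies, but the matrix algebra is better left to a numerical library.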

10. Logistic regression analysis (logistic regression)

Logistic regression analysis is a good alternative when the groups do not satisfy the normal distribution assumption of discriminant analysis. Logistic regression does not predict whether an event occurs; it predicts the probability that the event occurs. It assumes that the relationship between the independent variable and the dependent variable follows an S-shaped curve. When the independent variable is very small, the probability is close to zero. As the independent variable increases, the probability rises along the curve, and beyond a certain point the slope of the curve begins to decrease, so the probability always stays between 0 and 1.
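
The S-shaped curve described above is the logistic function, p = 1 / (1 + e^-(a + b*x)). The short sketch below only evaluates the curve and its slope; it does not fit the coefficients to data. The parameter names `intercept` and `slope` and their default values are assumptions chosen for illustration.

```python
import math


def logistic(x, intercept=0.0, slope=1.0):
    """Probability of the event for a given value of the independent variable."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def curve_slope(x, intercept=0.0, slope=1.0):
    """Derivative of the logistic curve at x: slope * p * (1 - p)."""
    p = logistic(x, intercept, slope)
    return slope * p * (1.0 - p)


probabilities = [logistic(x) for x in (-10, -2, 0, 2, 10)]
```

The probabilities rise monotonically from near 0 to near 1, and the curve is steepest where the probability is 0.5 and flattens on either side, which is why the predicted value can never leave the interval between 0 and 1.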

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
