This article continues from the previous one on the Microsoft Decision Trees algorithm. It uses another algorithm, Microsoft Clustering, to mine the target customer group, again with Microsoft's sample data, and gives a brief summary.
Application Scenario Introduction
In the previous article we used the Microsoft Decision Trees algorithm to analyze the customer attributes in orders that had already been placed, and it gave us some important information. Here is a summary:
1. The most important factor affecting whether someone buys a bicycle is whether the household owns a car, followed by age, and then region.
2. Reading down the branches of the tree, the customers most likely to buy a bike fall into two groups. The first has no car at home, age 45, does not live in North America, and has no children at home (the ordinary guy of America).
The second has a car at home, is aged between 37 and 53, commutes less than 10 miles, has fewer than 4 children at home, and has an annual income above $58,000 (the well-off of America).
In fact, the main use of the decision tree algorithm is to rank the factors that affect some behavior. It tells us that certain groups of people have a few prominent attributes, such as whether the family owns a car, or age. What it cannot do is describe the attributes that a specific group has in common. For that we need today's subject, the Microsoft Clustering algorithm.
Put simply, clustering is "things of a kind flock together": records are divided into groups of similar members. With it we want to find out what attributes the customers who are going to buy bicycles share. For example, walk into a public square in the evening and you will see the dancing aunties in one group, the children in another, the basketball players in a third, and a few couples in the dark woods at the edge of the square. Each group is different from the others, and if you want to sell children's toys, the children's group is naturally the one you want to get close to.
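To make "things of a kind flock together" concrete, here is a minimal sketch of the grouping idea in Python. It uses plain k-means as a stand-in and is not the implementation inside Analysis Services. The customer tuples (age, number of cars) and the starting centroids are made-up values for illustration only.

```python
from math import dist

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        for p in points
    ]

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate nearest-centroid assignment and mean update."""
    centroids = list(centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for i in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assign(points, centroids)

# Hypothetical customers: (age, number of cars owned)
customers = [(28, 0), (31, 0), (35, 1), (52, 2), (55, 2), (60, 3)]
centroids, labels = kmeans(customers, centroids=[(28, 0), (60, 3)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The younger customers with few cars end up in one group and the older customers with more cars in the other. In a real model the attributes would normally be scaled first so that age does not dominate the distance.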
Technical preparation
(1) We again use Microsoft's sample data warehouse (AdventureWorksDW2008R2) and two tables from it: one holds the historical records of bicycle purchases, and the other holds the people who may buy bicycles, which is the group we want to mine. See the previous article for details.
(2) Visual Studio, SQL Server, and Analysis Services need no introduction. Selecting all features when installing the database is enough.
Now to the main topic. We continue with the solution from last time and follow these steps:
(1) Open the solution and go to the Mining Models tab.
The decision tree model is already there, and we will add a second algorithm next to it.
(2) Right-click the Structure column, select New Mining Model, enter a name, and click OK.
The newly created clustering model is added to the mining models. It uses the same key as the decision tree model and the same predictable column. The input columns are also the same, and they can be changed if needed.
Next, deploy and process the mining model.
Results analysis
Again we use the Mining Model Viewer, this time choosing "Clustering" as the mining model. The viewer provides four tabs (Cluster Diagram, Cluster Profiles, Cluster Characteristics, and Cluster Discrimination), which we will go through in turn, starting with the Cluster Diagram:
On this tab we shade the clusters by the bicycle purchase behavior. The darkest cluster is the group most likely to buy a bike (marked with the arrow), and in the same way we can find the group least likely to buy, which is "Cluster 4". The strength of the lines between clusters indicates how closely they are related. To make the clusters easier to remember we can rename them: select a cluster, right-click, and choose Rename.
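The shading in the diagram reflects how concentrated the buyers are in each cluster. As a rough sketch of that idea (not the viewer's actual calculation), the share of buyers per cluster can be computed as below. The rows, the cluster names, and the function name `buyer_share_by_cluster` are hypothetical.

```python
from collections import defaultdict

def buyer_share_by_cluster(rows):
    """Return {cluster: fraction of its members who bought a bike}."""
    totals = defaultdict(int)
    buyers = defaultdict(int)
    for cluster, bought in rows:
        totals[cluster] += 1
        buyers[cluster] += bought
    return {cluster: buyers[cluster] / totals[cluster] for cluster in totals}

# Hypothetical rows: (cluster name, 1 if the customer bought a bike else 0)
rows = [
    ("Cluster 1", 1), ("Cluster 1", 1), ("Cluster 1", 0),
    ("Cluster 4", 0), ("Cluster 4", 0), ("Cluster 4", 1), ("Cluster 4", 0),
]
shares = buyer_share_by_cluster(rows)
most_likely = max(shares, key=shares.get)   # would be shaded darkest
least_likely = min(shares, key=shares.get)  # would be shaded lightest
print(most_likely, least_likely)
```

With these made-up rows, "Cluster 1" has the highest share of buyers and "Cluster 4" the lowest.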
What we have to do next is analyze the characteristics of these groups. We care most about the group that most wants to buy a bike, and the group that least wants to buy one is worth a look too. The rest, bystander groups A, B, and so on, are just passing through, and we will not analyze them.
We open the "Cluster Profiles" tab to take a look:
There... the characteristics of each group are laid out. After working with data for a long time you develop an eye for charts, and you should keep a nose for data as well. Tonight I will not analyze the characteristics of the people who most want to buy a bike. I will continue tomorrow, and in the meantime you are welcome to try a simple analysis yourself. As before, here are a few screenshots of the other tabs first.
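A cluster profile is essentially the distribution of each attribute's states within each cluster. The sketch below shows one way to compute such a profile by hand. The rows, the attribute name `cars`, and the function name `cluster_profile` are assumptions made for illustration and do not come from the sample database.

```python
from collections import Counter, defaultdict

def cluster_profile(rows, attribute):
    """Return {cluster: {state: share of cluster members with that state}}."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["cluster"]][row[attribute]] += 1
    return {
        cluster: {state: n / sum(states.values()) for state, n in states.items()}
        for cluster, states in counts.items()
    }

# Hypothetical customers already assigned to clusters
rows = [
    {"cluster": "Cluster 1", "cars": 0},
    {"cluster": "Cluster 1", "cars": 0},
    {"cluster": "Cluster 1", "cars": 0},
    {"cluster": "Cluster 1", "cars": 1},
    {"cluster": "Cluster 4", "cars": 2},
    {"cluster": "Cluster 4", "cars": 2},
]
profile = cluster_profile(rows, "cars")
print(profile["Cluster 1"])  # {0: 0.75, 1: 0.25}
```

Reading the profile row by row is the manual equivalent of scanning the bars on the Cluster Profiles tab: an attribute state that dominates one cluster but not the others is a candidate characteristic of that group.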
Tomorrow night I will analyze the results and compare the characteristics of the two algorithms. If you are interested in big data, don't forget to click "recommend".
The power of data mining: I knew you'd do it!
(To be continued ...)
(Original) The Big Data era: a summary of data mining knowledge points based on the Microsoft sample database (Microsoft Clustering algorithm)