Using KNIME for customer segmentation
Sinom
20150801
Http://blog.csdn.net/shuaihj
First, test data
If you need the test data, please leave your email address
Second, calculate each customer's purchase amount and number of purchases
1. Read in the data (sales data.csv)
Read the column headers
2. Convert the time format
Parse the order creation date column using the specified time format
3. Group and sum the amounts
Sum "Sales Amount" by customer number
4. Rename the field to make it more readable
Statistical results
5. Group and count the orders
Count the distinct "Sales Order Number" values by customer number
6. Rename the field to make it more readable
Statistical results
7. Join the two results to get each customer's purchase amount and number of purchases
Set the join type and the key columns
8. Statistical Results
9. Data Flow
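For readers who want to check the workflow outside KNIME, the steps above can be sketched in plain Python. This is only a rough equivalent under stated assumptions: the column names ("Customer Number", "Sales Order Number", "Order Creation Date", "Sales Amount"), the date format, and the sample rows are made up for illustration and are not taken from the real test data.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Made-up sample rows standing in for the sales CSV; column names are assumptions.
SAMPLE_CSV = """Customer Number,Sales Order Number,Order Creation Date,Sales Amount
C001,SO-1,2014-11-03,120.50
C001,SO-1,2014-11-03,30.00
C001,SO-2,2014-12-20,80.00
C002,SO-3,2014-10-15,200.00
"""


def summarize_sales(csv_text, date_format="%Y-%m-%d"):
    """Return {customer: {"amount": total amount, "orders": distinct order count}}."""
    amounts = defaultdict(float)
    orders = defaultdict(set)
    # Step 1: DictReader treats the first line as the column headers.
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Step 2: parse the date with the given format so bad values fail early.
        datetime.strptime(row["Order Creation Date"], date_format)
        customer = row["Customer Number"]
        # Steps 3-4: sum of "Sales Amount" per customer.
        amounts[customer] += float(row["Sales Amount"])
        # Steps 5-6: a set keeps each order number once, giving a distinct count.
        orders[customer].add(row["Sales Order Number"])
    # Step 7: join the two aggregates on the customer number.
    return {c: {"amount": amounts[c], "orders": len(orders[c])} for c in amounts}


summary = summarize_sales(SAMPLE_CSV)
```

Customer C001 has three rows but only two distinct order numbers, which is why the order count uses a set rather than a simple row count.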
Third, calculate how many days each customer has gone without a purchase
1. Last purchase time
Take the maximum order creation date by customer number
2. How many days without a purchase?
Calculate the number of days from the customer's most recent purchase to January 31 of the reference year; this is the "days without a purchase" value
3. Filter out unneeded fields
4. Statistical Results
5. Data Flow
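The "days without a purchase" calculation can be sketched in Python as follows. The reference date below is an assumption: the year is not stated in the text, so 2015 is only a placeholder, and the order rows are invented for illustration.

```python
from datetime import date, datetime

# Hypothetical reference date: the year is an assumption, not taken from the article.
REFERENCE_DATE = date(2015, 1, 31)

# Made-up (customer number, order creation date) pairs.
ORDERS = [
    ("C001", "2014-11-03"),
    ("C001", "2014-12-20"),
    ("C002", "2014-10-15"),
]


def days_without_purchase(orders, reference_date, date_format="%Y-%m-%d"):
    """Return {customer: days between the last purchase and reference_date}."""
    last_purchase = {}
    for customer, raw_date in orders:
        day = datetime.strptime(raw_date, date_format).date()
        # Step 1: keep the maximum order creation date per customer.
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
    # Step 2: subtracting two dates gives a timedelta; .days is the day count.
    return {c: (reference_date - d).days for c, d in last_purchase.items()}


recency = days_without_purchase(ORDERS, REFERENCE_DATE)
```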
Fourth, run hierarchical clustering on the customers based on the sales data
1. Join to get each customer's purchase information
Set the join type and the key columns
Query results
2. Normalize the data before clustering
Set the columns to normalize and the normalization method
Normalized results
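Normalization matters here because the purchase amount, the number of purchases, and the days without a purchase are on very different scales, and a distance-based clustering would otherwise be dominated by the largest column. The text does not say which method was chosen, so the sketch below assumes min-max scaling to [0, 1]; the column names and rows are made up.

```python
def min_max_normalize(rows, columns):
    """Rescale the named columns of a list of dicts to [0, 1]; other keys are kept."""
    bounds = {c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in columns}
    normalized = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for c in columns:
            low, high = bounds[c]
            # A constant column has no spread; map it to 0.0 instead of dividing by zero.
            new_row[c] = 0.0 if high == low else (row[c] - low) / (high - low)
        normalized.append(new_row)
    return normalized


# Made-up customer rows: total amount, number of orders, days without a purchase.
customers = [
    {"customer": "C001", "amount": 230.5, "orders": 2, "days": 42},
    {"customer": "C002", "amount": 200.0, "orders": 1, "days": 108},
    {"customer": "C003", "amount": 50.0, "orders": 5, "days": 9},
]
scaled = min_max_normalize(customers, ["amount", "orders", "days"])
```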
3. Compute the hierarchical clustering
Specify the distance function, the linkage type, and the columns that take part in the clustering
Hierarchical clustering results
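To show what the distance function and linkage type control, here is a minimal agglomerative clustering sketch in Python. It assumes Euclidean distance and single linkage, which may differ from the settings actually used, and it is a naive implementation meant for small inputs only. The points are invented.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns a list of clusters, each a sorted list of point indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]  # every point starts alone
    while len(clusters) > n_clusters:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the closest pair of members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]


# Made-up normalized points; the last one sits far away from the rest.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 5.0)]
groups = single_linkage(points, 3)
```

The isolated last point ends up in a cluster of its own, which is the kind of point that the next step treats as noise.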
4. Remove noise data (globally)
Zoom in on the hierarchical clustering diagram
Select the noise points and mark them as noise
Filter out the noise data globally
View the data that was filtered out
5. Data Flow
Fifth, run K-means clustering on the customers based on the sales data
1. Compute the K-means clustering
Specify the clustering parameters and the columns that take part in the clustering
View the clustering results
2. Assign data based on the clustering results
(i.e. apply the trained model to real data)
View the clustering results
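The two steps above, training the clusters and then assigning other rows to them, can be sketched with a small K-means implementation in Python. This is an illustrative version of Lloyd's algorithm with caller-supplied starting centroids, not a description of how the KNIME nodes are implemented; the points and starting centroids are made up.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))


def k_means(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            # New centroid is the per-column mean; an empty cluster keeps its old one.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else old)
        if updated == centroids:
            break  # centroids stopped moving
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]


# Step 1: train on made-up points that form two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centroids, labels = k_means(points, initial_centroids=[points[0], points[3]])

# Step 2: assign a row that was not part of training to its nearest cluster.
new_label = nearest((8.0, 8.0), centroids)
```

Assignment only needs the trained centroids, so the same `nearest` helper serves both training and the later assignment of unseen rows.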
3. Train a decision tree
Set the decision tree parameters
View the training results
4. Data Flow
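Training a decision tree on the cluster labels turns the clusters into readable rules over the original columns. The sketch below is a small CART-style tree using Gini impurity, written under assumptions: the splitting criterion, the depth limit, the feature order (normalized amount, normalized days without a purchase), and the labels are all illustrative and may not match the settings used in the workflow.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels):
    """Return (weighted impurity, feature index, threshold), or None if no split exists."""
    best = None
    for f in range(len(rows[0])):
        # The largest value is skipped because it would leave the right side empty.
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [l for r, l in zip(rows, labels) if r[f] <= threshold]
            right = [l for r, l in zip(rows, labels) if r[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def train_tree(rows, labels, max_depth=3):
    """Build a nested dict tree; leaves are {"label": majority class}."""
    split = best_split(rows, labels) if max_depth > 0 and len(set(labels)) > 1 else None
    if split is None:
        return {"label": Counter(labels).most_common(1)[0][0]}
    _, f, threshold = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= threshold]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "left": train_tree([r for r, _ in left], [l for _, l in left], max_depth - 1),
        "right": train_tree([r for r, _ in right], [l for _, l in right], max_depth - 1),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while "label" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["label"]


# Made-up rows: (normalized amount, normalized days without a purchase).
rows = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.9), (0.1, 0.8), (0.15, 0.1), (0.1, 0.2)]
labels = ["cluster_0", "cluster_0", "cluster_1", "cluster_1", "cluster_2", "cluster_2"]
tree = train_tree(rows, labels)
```

The resulting tree can be read as rules such as "amount above a threshold means cluster_0", which is easier to explain to a business audience than raw centroids.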
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
KNIME Data Mining Modeling and Analysis Series _003_ Using KNIME for customer segmentation