Level division of e-commerce merchants based on the K-means clustering algorithm (including Octave simulation)

Last Update: 2017-07-05 Source: Internet

Author: User

In e-commerce channel operations, at every key time node (a big promotion, the end of a quarter, and so on) we have to do one thing: rate the brand pool and update the level of every shop. For example, merchants are divided into 4 levels: SKA, KA, ordinary shop, and new shop. Merchants at different levels receive different degrees of traffic support or different advertising strategies. Generally speaking, the dimensions evaluated over a given period can include UV, booking amount, positive review rate, return amount, ad-slot CTR, conversion rate, PC traffic, mobile traffic, average order value, and so on, for n dimensions in total. How can we find an algorithm that divides our brands into 4 levels across these n dimensions? Today we discuss the K-means clustering algorithm and use it to divide the brand pool, based on one week of real sales data for 296 brands from one e-commerce channel.

First, the K-means clustering algorithm can be described in the following steps:

1. Randomly select K centroids.

2. Compute the distance from each data point to each of the K centroids, and assign the data point to the group of the centroid with the smallest distance. For example, if a data point is closest to the i-th centroid, it belongs to the i-th group.

3. Update the centroid coordinates: average the coordinates of the data points in each group to obtain the new centroid position.

4. Repeat steps 2 and 3 n times.

Here K and n are specified in advance.
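As a minimal sketch of steps 2 and 3, the following Octave snippet runs one assignment and one update on a toy dataset. The six points and the two starting centroids are made-up values for illustration, not the article's data, and the distance computation is vectorized per centroid instead of looping over every point.

```octave
% Toy data: 6 points in 2 dimensions (made-up values for illustration)
X = [1 1; 1 2; 2 1; 8 8; 9 8; 8 9];
% Two hypothetical starting centroids (K = 2)
centroids = [0 0; 10 10];
K = size(centroids, 1);

% Step 2: squared Euclidean distance from every point to every centroid
D = zeros(size(X, 1), K);
for j = 1:K
  delta = X - centroids(j, :);     % broadcasting: subtract centroid j from each row
  D(:, j) = sum(delta .^ 2, 2);
end
[~, idx] = min(D, [], 2);          % idx(i) = group of point i

% Step 3: move each centroid to the mean of its group
for j = 1:K
  centroids(j, :) = mean(X(idx == j, :), 1);
end

disp(idx');
disp(centroids);
```

With these values the first three points fall into group 1 and the last three into group 2, and the centroids move to roughly (1.33, 1.33) and (8.33, 8.33).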

To visualize how K-means runs, we use only 2 dimensions of the 296 brands: UV and booking amount. The main control code is as follows:

%% ================= Part 1: Load data ====================
fprintf('Load parameters.\n\n');
pkg load io;
tmp = xlsread('data.xlsx');
id = tmp(:, 1);
X = tmp(:, 2:3);

%% =================== Part 2: Set parameters ======================
K = 4;
max_iters = 10;

%% =================== Part 3: K-means clustering ======================
fprintf('\nRunning K-means clustering on example dataset.\n\n');
initial_centroids = kMeansInitCentroids(X, K);

% Run K-means algorithm. The 'true' at the end tells our function to plot
% the progress of K-means
[centroids, idx] = runkMeans(X, initial_centroids, max_iters, true);
fprintf('\nK-means done.\n\n');

Core code of the K-means clustering algorithm:

function [centroids, idx] = runkMeans(X, initial_centroids, ...
                                      max_iters, plot_progress)
% plot_progress is accepted but unused in this listing (no plotting code is shown)
[m n] = size(X);
K = size(initial_centroids, 1);
centroids = initial_centroids;
previous_centroids = centroids;
idx = zeros(m, 1);

% Run K-means
for i = 1:max_iters
    % Output progress
    fprintf('K-means iteration %d/%d...\n', i, max_iters);
    if exist('OCTAVE_VERSION')
        fflush(stdout);
    end

    % For each example in X, assign it to the closest centroid
    idx = findClosestCentroids(X, centroids);

    % Given the memberships, compute new centroids
    centroids = computeCentroids(X, idx, K);
end
end

Algorithm for selecting the nearest centroid:

function idx = findClosestCentroids(X, centroids)
K = size(centroids, 1);
idx = zeros(size(X, 1), 1);
m = size(X, 1);
for (i = 1:m)
  distance = -1;   % -1 marks "no centroid examined yet" (squared distances are >= 0)
  index = -1;
  for (j = 1:K)
    e = X(i,:) - centroids(j,:);
    d_tmp = e*e';  % squared Euclidean distance to centroid j
    if (distance == -1)
      distance = d_tmp;
      index = j;
    else
      if (d_tmp < distance)
        distance = d_tmp;
        index = j;
      endif
    endif
  endfor
  idx(i) = index;
endfor
end

Algorithms for recomputing the centroids and for initializing them:

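function centroids = computeCentroids(X, idx, K)
[m n] = size(X);
centroids = zeros(K, n);
num = zeros(K, 1);
for (i = 1:m)
  c = idx(i,:);
  centroids(c,:) += X(i,:);   % accumulate the coordinates of group c
  num(c,:)++;                 % count the members of group c
endfor
% Average per group; a group with no members yields NaN here (0/0)
centroids = centroids ./ num;
end

function centroids = kMeansInitCentroids(X, K)
centroids = zeros(K, size(X, 2));
% Randomly reorder the indices of the examples
randidx = randperm(size(X, 1));
% Take the first K examples as the initial centroids
centroids = X(randidx(1:K), :);
end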
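The group numbers in idx are arbitrary, so they still have to be mapped to the four levels. One possible approach, sketched here as an assumption and not taken from the original workflow, is to rank the groups by the booking-amount coordinate of their centroid. The centroid values and the idx vector below are made up, and treating the lowest-ranked group as "new shop" is only illustrative.

```octave
% Hypothetical centroids as returned by runkMeans: columns are [UV, booking amount]
centroids = [ 500   2000;
             9000  80000;
             3000  20000;
              100    300];
% Hypothetical group assignment for 6 brands
idx = [2; 4; 1; 3; 3; 2];

% Level names ordered from highest to lowest (this ordering is an assumption)
levels = {'SKA', 'KA', 'ordinary shop', 'new shop'};

% Sort groups by booking amount (column 2), largest first
[~, order] = sort(centroids(:, 2), 'descend');

% rank_of_group(g) = position of group g in that ordering
rank_of_group = zeros(size(order));
rank_of_group(order) = (1:numel(order))';

% Level name for every brand
brand_levels = levels(rank_of_group(idx));
for i = 1:numel(idx)
  fprintf('brand %d -> group %d -> %s\n', i, idx(i), brand_levels{i});
end
```

Ranking on a single column is the simplest choice. With more dimensions, a weighted score over several centroid coordinates might be more appropriate.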

After 10 iterations, the results of the grouping are as follows:

My local raw data table has about 20 dimensions that measure each store's operations. The K-means clustering algorithm classifies the stores just as easily in that case. The result can no longer be visualized, but the principle is identical to the two-dimensional K-means.
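With many dimensions the raw scales differ widely (a conversion rate lies between 0 and 1, while UV can run into the thousands), so a squared Euclidean distance tends to be dominated by the largest-valued columns. A common remedy, which the original article does not describe, is to standardize each column before clustering. A minimal sketch with made-up numbers:

```octave
% Made-up sample: rows are stores, columns are [UV, booking amount, conversion rate]
X = [1200  35000 0.021;
      300   4000 0.015;
     8000 260000 0.034;
      950  18000 0.027];

mu = mean(X, 1);             % column means
sigma = std(X, 0, 1);        % column standard deviations
sigma(sigma == 0) = 1;       % guard: leave constant columns unscaled

X_norm = (X - mu) ./ sigma;  % z-score via broadcasting

disp(mean(X_norm, 1));       % close to 0 for every column
disp(std(X_norm, 0, 1));     % 1 for every non-constant column
```

X_norm could then be passed to the clustering functions in place of X. The resulting centroids are in standardized units and can be converted back with centroids .* sigma + mu.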

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
