BSAS sequential clustering algorithm and MATLAB code implementation

Source: Internet
Author: User

Sequential algorithms are a family of very simple clustering algorithms. Most of them pass over all feature vectors once or a few times, and the final result depends on the order in which the vectors are presented to the algorithm. These algorithms generally do not require the number of clusters k to be known in advance, but an upper bound Q on the number of clusters can be specified. This article mainly introduces the Basic Sequential Algorithmic Scheme (BSAS) and several variants, and gives a code implementation.

First, consider BSAS. It requires three user-defined inputs: the dissimilarity threshold θ, the maximum allowed number of clusters Q, and the order in which the vectors are presented. The basic idea of the algorithm: each new vector is considered in turn and, depending on its distance to the existing clusters, it is either assigned to an existing cluster or used to start a new cluster.
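The idea just described can be sketched in a few lines of Python. This is only a minimal illustration, not the MATLAB code the article links to. The function name bsas, the use of the running mean as each cluster's representative, and the demo points are all assumptions made for the example; the rule itself (start a new cluster when the nearest cluster is farther than θ and fewer than Q clusters exist) follows the description above.

```python
import math

def bsas(points, theta, q_max):
    """Hypothetical BSAS sketch: returns clusters as lists of point indices."""
    clusters = [[0]]                 # the first vector starts the first cluster
    centers = [tuple(points[0])]     # assumed representative: running mean
    for i in range(1, len(points)):
        x = points[i]
        dists = [math.dist(x, c) for c in centers]
        k = min(range(len(centers)), key=dists.__getitem__)  # nearest cluster
        if dists[k] > theta and len(clusters) < q_max:
            # too far from every cluster and room left: start a new cluster
            clusters.append([i])
            centers.append(tuple(x))
        else:
            # join the nearest cluster and update its mean incrementally
            clusters[k].append(i)
            n = len(clusters[k])
            centers[k] = tuple(c + (xi - c) / n for c, xi in zip(centers[k], x))
    return clusters

# Made-up demo data, not taken from the article.
demo = [(0, 0), (1, 0), (10, 10), (11, 10), (0, 1)]
print(bsas(demo, theta=3.0, q_max=3))
```

With these demo points the call yields two clusters, [0, 1, 4] and [2, 3]. Presenting the same points in a different order can change the outcome, which is the order dependence mentioned above.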

Algorithm Example:

There are 10 pattern samples: {x1 (0, 0), x2 (3, 8), x3 (2, 2), x4 (1, 1), x5 (5, 3), x6 (4, 8), x7 (6, 3), x8 (5, 4), x9 (6, 4), x10 (7, 5)}


Step 1: Select any pattern sample as the first cluster center, for example z1 = x1.

Step 2: Select the sample farthest from z1 as the second cluster center.

By calculation, ||x6 - z1|| is the largest distance, so z2 = x6.
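The first two steps can be checked numerically. The following Python sketch uses the ten sample points listed above and assumes Euclidean distance, which the article does not state explicitly; the variable names are illustrative.

```python
import math

# Sample points x1..x10 from the example (index 0 corresponds to x1).
points = [(0, 0), (3, 8), (2, 2), (1, 1), (5, 3),
          (4, 8), (6, 3), (5, 4), (6, 4), (7, 5)]

z1 = points[0]                                    # step 1: z1 = x1
dist_to_z1 = [math.dist(p, z1) for p in points]   # Euclidean distance (assumed)
far = max(range(len(points)), key=dist_to_z1.__getitem__)
z2 = points[far]                                  # step 2: farthest sample from z1

print(f"z2 = x{far + 1} {z2}, distance {dist_to_z1[far]:.3f}")
```

This selects x6 at distance √80 ≈ 8.944, in agreement with the text.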

Step 3: Calculate the distance between each pattern sample {xi, i = 1, ..., N} and the centers {z1, z2}, i.e.

di1 = ||xi - z1||

di2 = ||xi - z2||

and select the minimum distance min(di1, di2), i = 1, ..., N

Step 4: Select the maximum of these minimum distances over all pattern samples. If that maximum exceeds a fixed fraction θ of ||z1 - z2||, the corresponding sample is taken as the third cluster center z3, i.e.

if max{min(di1, di2), i = 1, ..., N} > θ ||z1 - z2||, then z3 = xi

Otherwise, no sample qualifies as a new cluster center and the search for cluster centers ends.

Here, θ can be chosen heuristically as a fixed fraction, such as 1/2.

In this example, the condition is met for i = 7: min(d71, d72) = √29 ≈ 5.39, which exceeds θ ||z1 - z2|| = √80 / 2 ≈ 4.47 for θ = 1/2, so z3 = x7.
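Steps 3 and 4 can be reproduced with a short Python sketch. It assumes Euclidean distance and θ = 1/2, and takes z1 = x1 and z2 = x6 from the earlier steps; the variable names are illustrative.

```python
import math

points = [(0, 0), (3, 8), (2, 2), (1, 1), (5, 3),
          (4, 8), (6, 3), (5, 4), (6, 4), (7, 5)]
z1, z2 = points[0], points[5]        # z1 = x1, z2 = x6
theta = 0.5                          # heuristic fraction from the text

# Step 3: distance from each sample to its nearest existing center.
min_dist = [min(math.dist(p, z1), math.dist(p, z2)) for p in points]

# Step 4: the sample with the largest such distance is the candidate center.
cand = max(range(len(points)), key=min_dist.__getitem__)
threshold = theta * math.dist(z1, z2)
is_new_center = min_dist[cand] > threshold

print(f"candidate x{cand + 1}: {min_dist[cand]:.3f} vs threshold {threshold:.3f}")
```

The candidate is x7 with a minimum distance of about 5.385 against a threshold of about 4.472, so x7 becomes z3.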

Step 5: If z3 exists, calculate max{min(di1, di2, di3), i = 1, ..., N}. If this value exceeds the same fraction θ of ||z1 - z2||, a fourth center z4 exists; otherwise the search for cluster centers ends.

In this example, no z4 satisfies the condition: the largest remaining minimum distance is √8 ≈ 2.83 (at x3), which is below the threshold of about 4.47.
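Steps 1 to 5 amount to a loop that keeps adding centers until no sample is far enough from all existing ones. The sketch below is one possible way to write that loop in Python; the function name find_centers and its parameters are hypothetical, Euclidean distance is assumed, and the data set is assumed to contain at least two distinct points.

```python
import math

def find_centers(points, theta=0.5, first=0):
    """Hypothetical helper: returns the indices of the selected cluster centers."""
    d_first = [math.dist(p, points[first]) for p in points]
    second = max(range(len(points)), key=d_first.__getitem__)
    centers = [first, second]
    threshold = theta * d_first[second]      # theta * ||z1 - z2||
    while True:
        # distance from every sample to its nearest current center
        min_dist = [min(math.dist(p, points[c]) for c in centers) for p in points]
        cand = max(range(len(points)), key=min_dist.__getitem__)
        if min_dist[cand] <= threshold:      # no sample is far enough: stop
            return centers
        centers.append(cand)

points = [(0, 0), (3, 8), (2, 2), (1, 1), (5, 3),
          (4, 8), (6, 3), (5, 4), (6, 4), (7, 5)]
print([f"x{i + 1}" for i in find_centers(points)])
```

For the example data with θ = 1/2 the loop stops after three centers: x1, x6 and x7. A smaller θ lowers the threshold and can therefore produce more centers.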

Step 6: Assign each pattern sample {xi, i = 1, ..., N} to its nearest cluster center:

z1 = x1: {x1, x3, x4} is the first class

z2 = x6: {x2, x6} is the second class

z3 = x7: {x5, x7, x8, x9, x10} is the third class
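The assignment in step 6 is a nearest-center lookup. The Python sketch below reproduces the three classes, assuming Euclidean distance and the centers x1, x6 and x7 found in the previous steps; the variable names are illustrative.

```python
import math

points = [(0, 0), (3, 8), (2, 2), (1, 1), (5, 3),
          (4, 8), (6, 3), (5, 4), (6, 4), (7, 5)]
centers = [points[0], points[5], points[6]]      # z1 = x1, z2 = x6, z3 = x7

# For each sample, the index of the nearest center.
labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
          for p in points]

# Group the sample names by class.
classes = [[f"x{i + 1}" for i, lab in enumerate(labels) if lab == k]
           for k in range(len(centers))]
print(classes)
```

The printed grouping matches the three classes listed above.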

Finally, the mean of the samples in each class can be computed to obtain a more representative cluster center.
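As a small illustration of this last refinement, the class means for the example can be computed as follows. The helper name mean_point and the dictionary layout are assumptions made for the sketch.

```python
# Classes from step 6, written out with their coordinates.
classes = {
    "z1": [(0, 0), (2, 2), (1, 1)],                    # x1, x3, x4
    "z2": [(3, 8), (4, 8)],                            # x2, x6
    "z3": [(5, 3), (6, 3), (5, 4), (6, 4), (7, 5)],    # x5, x7, x8, x9, x10
}

def mean_point(pts):
    """Coordinate-wise mean of a list of points (hypothetical helper)."""
    return tuple(sum(coord) / len(pts) for coord in zip(*pts))

refined = {name: mean_point(pts) for name, pts in classes.items()}
print(refined)
```

The refined centers come out as (1, 1), (3.5, 8) and (5.8, 3.8), each lying inside its class rather than on one of its extreme samples.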

A MATLAB implementation of the algorithm, with detailed comments, is available for download: Download link

