DBSCAN: a density-based clustering algorithm

Source: Internet
Author: User

Many clustering algorithms have been proposed to suit the characteristics of different industries. They fall into several categories: hierarchical, partitioning, density-based, graph-theoretic, grid-based and model-based.

Among the density-based clustering algorithms, DBSCAN is the most representative.

Consider a set of data generated by the following R code:

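    library(ggplot2)                          # provides qplot()
    n <- 100                                  # sample size is elided in the source; 100 is an assumed value
    x1 <- seq(0, pi, length.out = n)
    y1 <- sin(x1) + 0.1 * rnorm(n)
    x2 <- 1.5 + seq(0, pi, length.out = n)
    y2 <- cos(x2) + 0.1 * rnorm(n)
    data <- data.frame(c(x1, x2), c(y1, y2))
    names(data) <- c('x', 'y')
    qplot(data$x, data$y)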

Clustering this data with the density-based DBSCAN method gives the following result:

    library('fpc')                            # provides dbscan()
    p <- ggplot(data, aes(x, y))              # the name p is assumed; it is elided in the source
    model2 <- dbscan(data, eps = 0.6, MinPts = 4)
    p + geom_point(size = 2.5, aes(colour = factor(model2$cluster))) + theme(legend.position = 'top')

For comparison, here is the clustering result of K-means on the same data:

    model1 <- kmeans(data, centers = 2, nstart = 10)   # nstart is elided in the source; 10 is an assumed value
    ggplot(data, aes(x, y)) + geom_point(size = 2.5, aes(colour = factor(model1$cluster))) + theme(legend.position = 'top')

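K-means assigns each point to its nearest centroid, so it tends to cut across elongated, curved groups such as these two, whereas DBSCAN follows regions of connected density. Different data sets and scenarios therefore call for different clustering algorithms.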
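One way to examine the K-means result without a plot is to cross-tabulate its cluster labels against the curve that generated each point. The following is only a sketch using base R; the seed, the sample size n = 100, the value nstart = 10 and the names pts, truth and confusion are all assumptions made for illustration, not values from the original article.

```r
# Cross-tabulate K-means labels against the generating curve (base R only).
set.seed(1)                                   # assumed seed, for reproducibility only
n <- 100                                      # assumed sample size
x1 <- seq(0, pi, length.out = n)
y1 <- sin(x1) + 0.1 * rnorm(n)
x2 <- 1.5 + seq(0, pi, length.out = n)
y2 <- cos(x2) + 0.1 * rnorm(n)
pts <- data.frame(x = c(x1, x2), y = c(y1, y2))

# Which curve each point was generated from
truth <- rep(c("sin curve", "cos curve"), each = n)

km <- kmeans(pts, centers = 2, nstart = 10)   # nstart = 10 is an assumed value

# Rows: generating curve; columns: K-means cluster id
confusion <- table(truth = truth, kmeans_cluster = km$cluster)
print(confusion)
```

If each row of the table has nearly all of its points in a single column, K-means recovered the two curves; counts spread over both columns in a row indicate that a curve was split between clusters.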

The following describes how the algorithm works.

Note that the DBSCAN method is sensitive to its two parameters, eps and MinPts.
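To get a feel for this sensitivity, one can count how many objects qualify as core objects for several eps values at a fixed MinPts. The sketch below uses only base R; the seed, the sample size n = 100, the eps grid and the helper name count_core are assumptions chosen for illustration. An object is counted as its own neighbor here, which is one common convention.

```r
# Count core objects for several eps values at a fixed MinPts (base R only).
set.seed(1)                                   # assumed seed, for reproducibility only
n <- 100                                      # assumed sample size
x1 <- seq(0, pi, length.out = n)
y1 <- sin(x1) + 0.1 * rnorm(n)
x2 <- 1.5 + seq(0, pi, length.out = n)
y2 <- cos(x2) + 0.1 * rnorm(n)
pts <- data.frame(x = c(x1, x2), y = c(y1, y2))

dist_mat <- as.matrix(dist(pts))              # pairwise Euclidean distances

# Number of objects whose eps-neighborhood holds at least min_pts objects
count_core <- function(dist_mat, eps, min_pts) {
  sum(rowSums(dist_mat <= eps) >= min_pts)
}

# 0.6 is the eps used in the DBSCAN call earlier; the other values are arbitrary
eps_grid <- c(0.02, 0.1, 0.6, 2)
core_counts <- sapply(eps_grid, function(e) count_core(dist_mat, e, min_pts = 4))
print(data.frame(eps = eps_grid, core_objects = core_counts))
```

A very small eps leaves few or no core objects, so most points end up as noise, while a very large eps makes every object a core object and merges everything into one cluster.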

In this algorithm framework, N_eps(x, D) denotes all objects of the data set D that lie within the eps-neighborhood of object x. card(N) denotes the cardinality of the set N, that is, the number of elements it contains. During cluster expansion, a stack holds all the neighbor objects of the current object x; each stack member is then checked in turn against the core-object condition, which decides whether the cluster is expanded further.
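The stack-based expansion described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the author's Java implementation: the function names region_query and dbscan_sketch, the toy data, and the convention that an object counts as its own neighbor are all choices made here.

```r
# Minimal DBSCAN sketch in base R. Labels: 0 = noise, 1..k = cluster ids.

# N_eps(x, D): indices of all objects within the eps-neighborhood of object i
region_query <- function(dist_mat, i, eps) {
  unname(which(dist_mat[i, ] <= eps))
}

dbscan_sketch <- function(D, eps, min_pts) {
  dist_mat <- as.matrix(dist(D))        # pairwise Euclidean distances
  n <- nrow(dist_mat)
  cluster <- integer(n)                 # 0 means unassigned or noise
  visited <- logical(n)
  k <- 0L                               # current cluster id

  for (i in seq_len(n)) {
    if (visited[i]) next
    visited[i] <- TRUE
    neighbors <- region_query(dist_mat, i, eps)
    if (length(neighbors) < min_pts) next     # card(N) < MinPts: not a core object

    k <- k + 1L                         # i is a core object: start a new cluster
    cluster[i] <- k
    stack <- neighbors                  # push all neighbors of the current object
    while (length(stack) > 0) {
      j <- stack[length(stack)]         # pop the top of the stack
      stack <- stack[-length(stack)]
      if (cluster[j] == 0L) cluster[j] <- k   # claim unassigned or former noise points
      if (visited[j]) next
      visited[j] <- TRUE
      nb_j <- region_query(dist_mat, j, eps)
      if (length(nb_j) >= min_pts) {    # j is also a core object: expand further
        stack <- c(stack, nb_j[cluster[nb_j] == 0L])
      }
    }
  }
  cluster
}

# Toy data (hypothetical): two tight groups of four points and one outlier
toy <- rbind(
  cbind(c(0, 0.1, 0.2, 0.1), c(0, 0.1, 0, -0.1)),
  cbind(c(5, 5.1, 5.2, 5.1), c(5, 5.1, 5, 4.9)),
  c(10, -10)
)
labels <- dbscan_sketch(toy, eps = 0.5, min_pts = 3)
print(labels)   # 1 1 1 1 2 2 2 2 0
```

The explicit stack replaces recursion: an object is expanded at most once, and only objects that pass the core-object test push their neighbors, so border points join a cluster without extending it.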

Postscript:

1. For a general introduction to the algorithm, see the Baidu Encyclopedia entry: http://baike.baidu.com/link?url=cnLtGJsF_a4CzmVbAev3nFH75nZUMgwClKv_kk2ZsXuXrP1gvY8eMvY75UDL29AMJFJ2n60xB680PMkjitrG4a

2. Following the algorithm flow above, the author has written a Java implementation and uploaded it to Baidu cloud disk, together with the test data used above. Interested readers are welcome to download it: http://pan.baidu.com/s/1i3J7Adf

3. Reference: "Research on the DBSCAN Clustering Algorithm for Heterogeneous Datasets", Chen Yootian, master's thesis, Chongqing University, April 2013. http://pan.baidu.com/s/1mgvKR7U

Finish
