DBSCAN is a density-based clustering algorithm. It takes two parameters, ξ and MinP. ξ is a radius: for each sample, the algorithm looks for the samples that lie within distance ξ of it. MinP is the threshold on the number N of samples found within that radius: if N >= MinP, the sample is a core point. For a detailed description of the algorithm, see reference 1. A Java implementation of the algorithm follows.
First, define a Point class that represents the sample points:
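Before the full listing, the core-point test described above can be sketched in isolation. This is a minimal illustration and not the implementation used in the rest of the post: the class name CorePointSketch, the helper names neighbours and isCore, the use of int[] pairs for points, and the sample coordinates are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class CorePointSketch {
    /** Collects every point of {@code all} within {@code radius} of (px, py). */
    static List<int[]> neighbours(List<int[]> all, int px, int py, double radius) {
        List<int[]> found = new ArrayList<>();
        for (int[] q : all) {
            double dx = px - q[0];
            double dy = py - q[1];
            if (Math.sqrt(dx * dx + dy * dy) <= radius) {
                found.add(q);
            }
        }
        return found;
    }

    /** A point is a core point when at least minP points lie within the radius. */
    static boolean isCore(List<int[]> all, int px, int py, double radius, int minP) {
        return neighbours(all, px, py, radius).size() >= minP;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: three points close together and one far away.
        List<int[]> pts = new ArrayList<>();
        pts.add(new int[] {0, 0});
        pts.add(new int[] {0, 1});
        pts.add(new int[] {1, 0});
        pts.add(new int[] {10, 10});
        System.out.println(isCore(pts, 0, 0, 2.0, 3));   // prints true: three points within radius 2
        System.out.println(isCore(pts, 10, 10, 2.0, 3)); // prints false: only the point itself
    }
}
```

Note that the query point counts toward N when it is itself in the list, which matches how the listing below counts neighbours.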
package com.sunzhenxing;

public class Point {
    private int x;
    private int y;
    private boolean isKey;      // true if this point is a core point
    private boolean isClassed;  // true if this point has been assigned to a cluster

    public boolean isKey() {
        return isKey;
    }
    // Marking a point as a core point also marks it as classified.
    public void setKey(boolean isKey) {
        this.isKey = isKey;
        this.isClassed = true;
    }
    public boolean isClassed() {
        return isClassed;
    }
    public void setClassed(boolean isClassed) {
        this.isClassed = isClassed;
    }
    public int getX() {
        return x;
    }
    public void setX(int x) {
        this.x = x;
    }
    public int getY() {
        return y;
    }
    public void setY(int y) {
        this.y = y;
    }
    public Point() {
        x = 0;
        y = 0;
    }
    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
    // Parses a point from a string of the form "x,y".
    public Point(String str) {
        String[] p = str.split(",");
        this.x = Integer.parseInt(p[0]);
        this.y = Integer.parseInt(p[1]);
    }
    public String print() {
        return "<" + this.x + "," + this.y + ">";
    }
}
Then define a utility class that supports the implementation of the algorithm:
package com.sunzhenxing;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;

public class Utility {
    /**
     * Computes the Euclidean distance between two points.
     * @param p the first point
     * @param q the second point
     * @return the distance between the two points
     */
    public static double getDistance(Point p, Point q) {
        int dx = p.getX() - q.getX();
        int dy = p.getY() - q.getY();
        double distance = Math.sqrt(dx * dx + dy * dy);
        return distance;
    }
    /**
     * Checks whether the given point is a core point.
     * @param lst the list of all points
     * @param p the point to be tested
     * @param e the radius
     * @param minp the density threshold
     * @return the points within radius e of p if p is a core point, otherwise null
     */
    public static List<Point> isKeyPoint(List<Point> lst, Point p, int e, int minp) {
        int count = 0;
        List<Point> tmpLst = new ArrayList<Point>();
        for (Iterator<Point> it = lst.iterator(); it.hasNext();) {
            Point q = it.next();
            if (getDistance(p, q) <= e) {
                ++count;
                if (!tmpLst.contains(q)) {
                    tmpLst.add(q);
                }
            }
        }
        if (count >= minp) {
            p.setKey(true);
            return tmpLst;
        }
        return null;
    }
    // Marks every point in the list as classified.
    public static void setListClassed(List<Point> lst) {
        for (Iterator<Point> it = lst.iterator(); it.hasNext();) {
            Point p = it.next();
            if (!p.isClassed()) {
                p.setClassed(true);
            }
        }
    }
    /**
     * If b shares at least one element with a, merges the elements of b into a.
     * @param a the list that receives the merged elements
     * @param b the list to merge into a
     * @return true if the two lists were merged, otherwise false
     */
    public static boolean mergeList(List<Point> a, List<Point> b) {
        boolean merge = false;
        for (int index = 0; index < b.size(); ++index) {
            if (a.contains(b.get(index))) {
                merge = true;
                break;
            }
        }
        if (merge) {
            for (int index = 0; index < b.size(); ++index) {
                if (!a.contains(b.get(index))) {
                    a.add(b.get(index));
                }
            }
        }
        return merge;
    }
    /**
     * Reads the points stored in the text file, one "x,y" pair per line.
     * @return the list of points read from the file
     * @throws IOException if the file cannot be read
     */
    public static List<Point> getPointsList() throws IOException {
        List<Point> lst = new ArrayList<Point>();
        // Backslashes must be escaped inside a Java string literal.
        String txtPath = "src\\com\\sunzhenxing\\points.txt";
        BufferedReader br = new BufferedReader(new FileReader(txtPath));
        String str = "";
        // Stop at end of file or at the first empty line.
        while ((str = br.readLine()) != null && !str.isEmpty()) {
            lst.add(new Point(str));
        }
        br.close();
        return lst;
    }
}
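As a possible variation on getPointsList, the reading loop can be written with try-with-resources so that the reader is closed even if parsing fails. The sketch below is an assumption-laden illustration, not part of the original program: the class name PointParsingSketch and the method readPoints are invented here, points are held as int[] pairs, the sample text is made up and is not the content of points.txt, and, unlike the listing above, blank lines are skipped rather than ending the read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class PointParsingSketch {
    /** Reads "x,y" lines from the source, skipping blank lines; closes the source when done. */
    static List<int[]> readPoints(Reader source) {
        List<int[]> points = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // assumption: blank lines are ignored instead of ending the input
                }
                String[] parts = line.split(",");
                points.add(new int[] {
                    Integer.parseInt(parts[0].trim()),
                    Integer.parseInt(parts[1].trim())
                });
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return points;
    }

    public static void main(String[] args) {
        // Hypothetical input; a FileReader could be passed instead of a StringReader.
        String sample = "0,0\n0,1\n\n3,4\n";
        List<int[]> points = readPoints(new StringReader(sample));
        System.out.println(points.size()); // prints 3
    }
}
```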
Finally, the algorithm is implemented in the main program, as follows:
package com.sunzhenxing;

import java.io.*;
import java.util.*;

public class Dbscan {
    private static List<Point> pointsList = new ArrayList<Point>(); // all points
    private static List<List<Point>> resultList = new ArrayList<List<Point>>(); // clusters produced by the DBSCAN algorithm
    private static int e = 2; // radius
    private static int minp = 3; // density threshold
    /**
     * Prints every non-empty cluster stored in resultList.
     */
    private static void display() {
        int index = 1;
        for (Iterator<List<Point>> it = resultList.iterator(); it.hasNext();) {
            List<Point> lst = it.next();
            if (lst.isEmpty()) {
                continue;
            }
            System.out.println("-----Cluster " + index + "-----");
            for (Iterator<Point> it1 = lst.iterator(); it1.hasNext();) {
                Point p = it1.next();
                System.out.println(p.print());
            }
            index++;
        }
    }
    // Find all clusters that are directly reachable from a core point
    private static void applyDbscan() {
        try {
            pointsList = Utility.getPointsList();
            for (Iterator<Point> it = pointsList.iterator(); it.hasNext();) {
                Point p = it.next();
                if (!p.isClassed()) {
                    List<Point> tmpLst = Utility.isKeyPoint(pointsList, p, e, minp);
                    if (tmpLst != null) {
                        // Mark all points that have been assigned to a cluster
                        Utility.setListClassed(tmpLst);
                        resultList.add(tmpLst);
                    }
                }
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    // Merge the directly reachable clusters that share points, so that
    // indirectly reachable points end up in the same cluster.
    private static List<List<Point>> getResult() {
        applyDbscan(); // find all directly reachable clusters
        int length = resultList.size();
        for (int i = 0; i < length; ++i) {
            for (int j = i + 1; j < length; ++j) {
                if (Utility.mergeList(resultList.get(i), resultList.get(j))) {
                    resultList.get(j).clear();
                }
            }
        }
        return resultList;
    }
    /**
     * Program entry point.
     * @param args command-line arguments (unused)
     */
    public static void main(String[] args) {
        getResult();
        display();
        System.out.println(Utility.getDistance(new Point(0, 0), new Point(0, 2)));
    }
}
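One thing to be aware of in the merge step of getResult: it makes a single pass over the cluster pairs, so a cluster that only becomes connected to cluster i after a later merge can be left unmerged. A hedged sketch of one way around this is to repeat the pass until nothing changes. This is a swapped-in variation, not the method used in the listing above; the class name MergeUntilStableSketch, the method mergeOverlapping, and the integer sample groups are all made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MergeUntilStableSketch {
    /** Repeatedly merges groups that share an element until no two groups overlap. */
    static <T> List<List<T>> mergeOverlapping(List<List<T>> groups) {
        // Work on copies so the caller's lists are left untouched.
        List<List<T>> result = new ArrayList<>();
        for (List<T> g : groups) {
            result.add(new ArrayList<>(g));
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i < result.size(); i++) {
                for (int j = i + 1; j < result.size(); j++) {
                    if (!Collections.disjoint(result.get(i), result.get(j))) {
                        for (T item : result.get(j)) {
                            if (!result.get(i).contains(item)) {
                                result.get(i).add(item);
                            }
                        }
                        result.remove(j); // drop the absorbed group
                        j--;              // re-check the group that moved into slot j
                        changed = true;
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical groups: the second and third only link to the first through element 3.
        List<List<Integer>> groups = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(3, 4), Arrays.asList(2, 3));
        System.out.println(mergeOverlapping(groups)); // prints [[1, 2, 3, 4]]
    }
}
```

With the same three groups, a single pass in the style of getResult would merge the first and third groups but leave the second one separate, because the first and second groups are compared before they share an element.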
Below is a small test that uses the contents of the src\com\sunzhenxing\points.txt file. The content of points.txt is: