K-means: some ideas and an implementation

Last Update:2016-01-24 Source: Internet

Author: User

As a test, I used two-dimensional plane coordinates. We pick an arbitrary image, read the file into an OpenCV Mat object, and then use the rand function to generate K random two-dimensional coordinates as the initial cluster centres (K is supplied as input). We then traverse every pixel position in the image and compute its distance to each centre. The squared Euclidean distance (the square of the L2 norm) is used because it avoids the square root, which is cheaper and does not change which centre is nearest. Each pixel is assigned to the cluster of its nearest centre. Once the whole image has been traversed we have a preliminary clustering, but it is not a good one. The main reason is that the initial centres are arbitrary: the method is unsupervised, so we know nothing in advance about the data set or its points. Each cluster therefore has to be refined afterwards.
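The assignment step described above can be sketched without OpenCV. This is only a minimal illustration, not the code from this article: the names Pt, squaredDistance and nearestCentre are hypothetical, and the sketch assumes there is at least one centre.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this sketch.
struct Pt { int x; int y; };

// Squared Euclidean distance: comparing distances needs no square root.
long squaredDistance(const Pt& a, const Pt& b)
{
    long dx = a.x - b.x;
    long dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Index of the centre closest to p. Assumes centres is not empty.
// On a tie the centre with the lower index wins.
std::size_t nearestCentre(const Pt& p, const std::vector<Pt>& centres)
{
    std::size_t best = 0;
    long bestDist = squaredDistance(p, centres[0]);
    for (std::size_t k = 1; k < centres.size(); ++k)
    {
        long d = squaredDistance(p, centres[k]);
        if (d < bestDist) { bestDist = d; best = k; }
    }
    return best;
}
```

For example, with centres (0, 0) and (10, 10), the pixel (2, 3) falls into cluster 0 and the pixel (8, 9) into cluster 1.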

Figure 1: image before clustering. Figure 2: image after the initial clustering.

Figure 3: final image after clustering (647*580 resolution).

Figure 4: distribution of the random points.

These figures illustrate some of the causes and problems of the K-means clustering method.

When looking at them, ignore the size of each image and focus on its internal proportions. Clustering is done by distance, so the absolute size does not matter.

When implementing K-means, I drew each cluster in a different color to make the result visible. Each colored region contains one marked point, which is the position of that cluster's centre.

Figure 2 shows that the clusters after the initial pass are irregular. Figure 3, the final clustering, differs considerably from it. Figure 4 shows that the initial random points are not arranged as evenly as the centres in Figure 3. After correction the clusters are fairly regular, but the four regions are still not exactly equal. One of the main reasons is the influence of the randomly chosen starting points. For the object clustered here the result may be acceptable, but for other distance-based data (large data sets with an uneven distribution) it may be worse.
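The correction step and its dependence on the starting points can be sketched on plain 2D points. This is a simplified illustration under assumptions, not this article's code: P2, refineOnce, refine and the maxPasses cap are hypothetical names, an empty cluster simply keeps its old centre, and the loop stops when no centre moves.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this sketch.
struct P2 { double x; double y; };

// One pass: assign every point to its nearest centre, then move each
// centre to the mean of its points. Returns true if any centre moved.
bool refineOnce(const std::vector<P2>& pts, std::vector<P2>& centres)
{
    std::vector<double> sumX(centres.size(), 0.0), sumY(centres.size(), 0.0);
    std::vector<std::size_t> count(centres.size(), 0);
    for (const P2& p : pts)
    {
        std::size_t best = 0;
        double bestDist = -1.0;  // negative means "no candidate yet"
        for (std::size_t k = 0; k < centres.size(); ++k)
        {
            double dx = p.x - centres[k].x;
            double dy = p.y - centres[k].y;
            double d = dx * dx + dy * dy;
            if (bestDist < 0.0 || d < bestDist) { bestDist = d; best = k; }
        }
        sumX[best] += p.x;
        sumY[best] += p.y;
        ++count[best];
    }
    bool moved = false;
    for (std::size_t k = 0; k < centres.size(); ++k)
    {
        if (count[k] == 0) continue;  // an empty cluster keeps its old centre
        P2 next{ sumX[k] / count[k], sumY[k] / count[k] };
        if (next.x != centres[k].x || next.y != centres[k].y) moved = true;
        centres[k] = next;
    }
    return moved;
}

// Repeat until no centre moves. Returns the number of passes that moved a centre.
int refine(const std::vector<P2>& pts, std::vector<P2>& centres, int maxPasses = 100)
{
    int passes = 0;
    while (passes < maxPasses && refineOnce(pts, centres)) ++passes;
    return passes;
}
```

Take the four points (0, 0), (0, 2), (10, 0) and (10, 2). Starting from the centres (1, 1) and (9, 1), the centres settle at (0, 1) and (10, 1), a left/right split. Starting from (5, 0) and (5, 2), nothing moves and the result is a top/bottom split whose clusters are much wider. The same data therefore ends in a different clustering depending only on the starting points.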

#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

using cv::Mat;
using cv::imread;
using cv::imshow;
using cv::imwrite;
using cv::waitKey;
using cv::cvtColor;
using std::vector;
using std::cin;
using std::cout;
using std::endl;

// A pixel position plus the values attached to it during clustering.
// x is the row index and y the column index, matching mat.at<uchar>(x, y).
struct Point
{
    int x;
    int y;
    unsigned int gray;      // gray-level gap to the cluster centre
    unsigned int distance;  // squared distance to the cluster centre
    double weight;
    Point() : x(0), y(0), gray(0), distance(0), weight(0.0) {}
    Point(int _x, int _y, unsigned int _gray = 0, unsigned int _distance = 0, double _weight = 0.0)
        : x(_x), y(_y), gray(_gray), distance(_distance), weight(_weight) {}
};

// Sort the points by their distance to the centre, ascending.
void dsort(vector<Point>& vec)
{
    std::sort(vec.begin(), vec.end(),
              [](const Point& a, const Point& b) { return a.distance < b.distance; });
}

// Gaussian density with mean 0 and standard deviation sigma.
double gaussian(double x, double sigma)
{
    return std::exp(-(x * x) / (2 * sigma * sigma)) / (std::sqrt(2 * 3.14159) * sigma);
}

// Combined distance + color weight. beta and theta are the parameters of the
// two Gaussians; alpha sets the ratio between the distance and color terms.
void weight(vector<Point>& vec, double alpha, double beta, double theta)
{
    if (vec.empty()) { cout << "Container is empty" << endl; return; }
    dsort(vec);
    for (size_t i = 0; i != vec.size(); ++i)
    {
        // The position in the sorted order is the argument of both Gaussians.
        double rank = static_cast<double>(i);
        double dscore = gaussian(rank, beta) * vec[i].distance;
        double cscore = gaussian(rank, theta) * vec[i].gray;
        vec[i].weight = alpha * dscore + (1 - alpha) * cscore;
    }
}

int squaredistance(const Point& p1, const Point& p2)
{
    return (p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y);
}

int colorgap(const Mat& mat, const Point& p1, const Point& p2)
{
    return std::abs(mat.at<uchar>(p1.x, p1.y) - mat.at<uchar>(p2.x, p2.y));
}

// Mean position of the pixels in a cluster. vec[0] is the current centre and
// is not counted. The cluster is emptied afterwards. Returned by value: the
// original returned a reference to a local object.
Point findcentre(vector<Point>& vec)
{
    if (vec.size() < 2)
    {
        Point old = vec.empty() ? Point() : vec.front();
        vec.clear();
        return old;  // no pixel was assigned: keep the old centre
    }
    long xall = 0;
    long yall = 0;
    for (size_t i = 1; i != vec.size(); ++i) { xall += vec[i].x; yall += vec[i].y; }
    long count = static_cast<long>(vec.size()) - 1;
    Point t(static_cast<int>(xall / count), static_cast<int>(yall / count));
    cout << t.x << '\t' << t.y << endl;
    vec.clear();
    return t;
}

// True when every centre is where it was after the previous pass.
bool stopcondition(const vector<Point>& start, const vector<vector<Point> >& vec)
{
    for (size_t i = 0; i != vec.size(); ++i)
        if (start[i].x != vec[i].front().x || start[i].y != vec[i].front().y)
            return false;
    return true;
}

// Append every pixel to the cluster whose centre (vec[k].front()) is nearest.
void getpoint(const Mat& mat, vector<vector<Point> >& vec)
{
    for (int i = 0; i != mat.rows; ++i)
    {
        for (int j = 0; j != mat.cols; ++j)
        {
            int min = mat.rows * mat.rows + mat.cols * mat.cols;  // larger than any distance in the image
            size_t pos = 0;
            for (size_t k = 0; k != vec.size(); ++k)
            {
                int d = squaredistance(Point(i, j), vec[k].front());
                if (d < min) { min = d; pos = k; }
            }
            Point tmp(i, j, colorgap(mat, Point(i, j), vec[pos].front()), min, 0);
            vec[pos].push_back(tmp);
        }
    }
}

// Cluster the pixels of a grayscale image into vec.size() clusters and paint the result.
void k_means(Mat& mat, vector<vector<Point> >& vec)
{
    if (mat.empty()) { cout << "please input the mat" << endl; return; }
    const int num = static_cast<int>(vec.size());
    vector<Point> start(num, Point(0, 0));
    // Pick a random pixel as the initial centre of every cluster.
    for (int i = 0; i != num; ++i)
    {
        vec[i].clear();
        vec[i].push_back(Point(rand() % mat.rows, rand() % mat.cols));
        cout << vec[i].front().x << "," << vec[i].front().y << endl;
    }
    // Repeat until no centre moves between two passes.
    while (!stopcondition(start, vec))
    {
        for (int i = 0; i != num; ++i)
        {
            cout << "start " << start[i].x << '\t' << start[i].y << endl;
            cout << "vec " << vec[i].front().x << '\t' << vec[i].front().y << endl;
            start[i] = vec[i].front();
        }
        getpoint(mat, vec);
        cout << "Get the weight and centre" << endl;
        for (int i = 0; i != num; ++i)
        {
            // weight(vec[i], 0.5, 0.4, 0.4);
            Point t = findcentre(vec[i]);
            vec[i].push_back(t);  // the new centre becomes vec[i].front()
        }
    }
    getpoint(mat, vec);
    cout << "Paint" << endl;
    for (int i = 0; i != num; ++i)
    {
        // Gray level i * 30 for cluster i, clamped to 255.
        for (size_t p = 0; p != vec[i].size(); ++p)
            mat.at<uchar>(vec[i][p].x, vec[i][p].y) = cv::saturate_cast<uchar>(i * 30);
        // Mark the centre of the cluster.
        mat.at<uchar>(vec[i].front().x, vec[i].front().y) = 0;
    }
}

int main()
{
    int number = 0;
    cin >> number;
    if (number <= 0) { cout << "the number of clusters must be positive" << endl; return 1; }
    Mat mat = imread("2.jpg");
    if (mat.empty()) { cout << "please input the mat" << endl; return 1; }
    cvtColor(mat, mat, cv::COLOR_BGR2GRAY);
    vector<vector<Point> > vec(number);
    k_means(mat, vec);
    imshow("1", mat);
    // imwrite("2.jpg", mat);
    waitKey(0);
    return 0;
}

The code is still rough, since these are my own notes. I will leave it as it is for now and tidy it up once the K-means algorithm with the joint color factor is completely written.

In this code, the combined color + distance weight is calculated in the weight function. Two Gaussian functions with different parameters are used to scale the two terms. The data is sorted by distance in ascending order, so the greater the distance, the farther the point lies from the central axis of the Gaussian function and the smaller its final weight. A single parameter controls the ratio in which the two factors, color and distance, contribute to the weight, which I think works better.
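One possible reading of this weighting can be sketched as follows. It is an assumption-laden illustration, not a definitive version: the names Sample, gaussian and combinedWeights are hypothetical, the position in the sorted order is taken as the argument of both Gaussians, beta and theta are their standard deviations, and alpha blends the two terms linearly.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Gaussian density with mean 0 and standard deviation sigma (sigma > 0).
double gaussian(double x, double sigma)
{
    const double pi = 3.14159265358979323846;
    return std::exp(-(x * x) / (2.0 * sigma * sigma)) / (std::sqrt(2.0 * pi) * sigma);
}

// Hypothetical sample: distance to the centre and gray-level gap to the centre.
struct Sample { double distance; double gray; };

// Weights for samples that are already sorted by ascending distance.
// alpha = 1 uses only the distance term, alpha = 0 only the color term.
std::vector<double> combinedWeights(const std::vector<Sample>& sorted,
                                    double alpha, double beta, double theta)
{
    std::vector<double> w(sorted.size());
    for (std::size_t i = 0; i < sorted.size(); ++i)
    {
        double rank = static_cast<double>(i);  // assumed Gaussian argument
        double d = gaussian(rank, beta) * sorted[i].distance;
        double c = gaussian(rank, theta) * sorted[i].gray;
        w[i] = alpha * d + (1.0 - alpha) * c;
    }
    return w;
}
```

With equal distance and gray values, the weights fall as the rank grows, because both Gaussians decay away from their central axis. Changing alpha shifts the balance between the distance term and the color term without touching beta or theta.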

This article is an English version of an article which is originally in the Chinese language on aliyun.com.
