Python KMeans Clustering Analysis

Source: Internet
Author: User
Tags: plotly

Today I used Python to implement a simple cluster analysis. Along the way I got more familiar with some NumPy array operations and plotting techniques, so I am recording them here.

    from pylab import *
    from sklearn.cluster import KMeans

    # numpy.append() merges multi-dimensional arrays the way MATLAB does.
    # axis=0 merges along the y direction (rows are stacked), like MATLAB [A; B];
    # axis=1 merges along the x direction (columns are stacked), like MATLAB [A, B].

    # Create five random datasets
    x1 = append(randn(500, 1) + 5, randn(500, 1) + 5, axis=1)
    x2 = append(randn(500, 1) + 5, randn(500, 1) - 5, axis=1)
    x3 = append(randn(500, 1) - 5, randn(500, 1) + 5, axis=1)
    x4 = append(randn(500, 1) - 5, randn(500, 1) - 5, axis=1)
    x5 = append(randn(500, 1), randn(500, 1), axis=1)

    # Merge the five datasets, the clumsy way, into one array `data` of shape (2500, 2)
    data = append(x1, x2, axis=0)
    data = append(data, x3, axis=0)
    data = append(data, x4, axis=0)
    data = append(data, x5, axis=0)

    plot(x1[:, 0], x1[:, 1], 'oc', markersize=0.8)
    plot(x2[:, 0], x2[:, 1], 'og', markersize=0.8)
    plot(x3[:, 0], x3[:, 1], 'ob', markersize=0.8)
    plot(x4[:, 0], x4[:, 1], 'om', markersize=0.8)
    plot(x5[:, 0], x5[:, 1], 'oy', markersize=0.8)

    k = KMeans(n_clusters=5, random_state=0).fit(data)
    t = k.cluster_centers_  # obtain the cluster centers

    plot(t[:, 0], t[:, 1], 'r*', markersize=16)  # show the five centers as red stars

    title('KMeans Clustering')
    box(False)
    xticks([])  # remove the axis tick marks
    yticks([])
    show()
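The axis behaviour described in the comments can be checked on two small arrays. The sketch below is only an illustration: the arrays `A` and `B`, the seeded generator `rng`, and the use of `np.concatenate` to merge all five datasets in one call are assumptions added here, not part of the original program.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# axis=0 stacks rows, comparable to MATLAB [A; B]
rows = np.append(A, B, axis=0)
# axis=1 stacks columns, comparable to MATLAB [A, B]
cols = np.append(A, B, axis=1)
print(rows.shape, cols.shape)

# One possible alternative to repeated append(): build the five blobs
# in a list and merge them with a single concatenate call.
rng = np.random.default_rng(0)
offsets = [(5, 5), (5, -5), (-5, 5), (-5, -5), (0, 0)]
blobs = [rng.standard_normal((500, 2)) + np.array(c) for c in offsets]
data = np.concatenate(blobs, axis=0)
print(data.shape)
```

A single `np.concatenate` call copies the data once, whereas every `append` call allocates a new array, which is why the repeated-append approach is the clumsier one.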

The result is as follows:
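Because the data are random, the fit can also be checked numerically rather than only by eye. The following is a minimal sketch under stated assumptions: the seeded generator, the `expected` array of blob offsets, and the explicit `n_init=10` are choices made for this illustration and do not appear in the original code.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Offsets of the five blobs, matching the +5 / -5 / 0 shifts used above.
expected = np.array([[5, 5], [5, -5], [-5, 5], [-5, -5], [0, 0]], dtype=float)
data = np.concatenate([rng.standard_normal((500, 2)) + c for c in expected])

k = KMeans(n_clusters=5, random_state=0, n_init=10).fit(data)

# For every expected center, the distance to the nearest fitted center.
dist = np.linalg.norm(
    expected[:, None, :] - k.cluster_centers_[None, :, :], axis=2
)
nearest = dist.min(axis=1)
print("largest offset from an expected center:", nearest.max())

# Number of points assigned to each cluster label.
counts = np.bincount(k.labels_, minlength=5)
print("points per cluster:", counts)
```

The label numbers themselves are arbitrary, so the comparison is made by distance to the nearest fitted center instead of by label.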

Update

Today, when I re-ran the program, it failed with an error saying that importing NUMPY_MKL had failed. The cause was that I had earlier upgraded numpy manually with pip install -U numpy, whereas numpy had originally been installed from the numpy-1.11.2+mkl-cp27-cp27m-win_amd64.whl file downloaded from http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy. Reinstalling that wheel fixed the problem.
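Before and after reinstalling, it can help to confirm which numpy build Python is actually importing. This is a small sketch assuming Python 3 syntax; it only reports what is installed and does not change anything.

```python
import numpy as np

# Version string and install location of the numpy that is being imported.
print("numpy version:", np.__version__)
print("loaded from:", np.__file__)

# Prints the build configuration, including the BLAS/LAPACK libraries
# numpy was compiled against.
np.show_config()
```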

Update

Python also has a plotting package named plotly, which can be installed with pip install plotly (or pip3 install plotly for Python 3.x) and can produce polished figures. The official website has many examples, and plotly also supports MATLAB, R, and other languages. Personally, I find plotly's plotting syntax more cumbersome than matplotlib's. Still, if you just want better-looking data visualizations, you can take an example from the official website and adapt it. Below is a sample from the official website:

    # The original sample also imported plotly.plotly as py. It is unused in
    # offline mode and moved to the separate chart-studio package in plotly 4,
    # so it is left out here.
    import plotly.offline
    import plotly.graph_objs as go
    import numpy as np

    # Generate three sets of Gaussian-distributed points
    x0 = np.random.normal(2, 0.45, 300)
    y0 = np.random.normal(2, 0.45, 300)
    x1 = np.random.normal(6, 0.8, 200)
    y1 = np.random.normal(6, 0.8, 200)
    x2 = np.random.normal(4, 0.3, 200)
    y2 = np.random.normal(4, 0.3, 200)

    # Create the graph objects
    trace0 = go.Scatter(x=x0, y=y0, mode='markers')
    trace1 = go.Scatter(x=x1, y=y1, mode='markers')
    trace2 = go.Scatter(x=x2, y=y2, mode='markers')
    trace3 = go.Scatter(x=x1, y=y0, mode='markers')

    # The layout is a dictionary; the keys used here are 'shapes' and 'showlegend'
    layout = {
        'shapes': [
            {
                'type': 'circle',
                'xref': 'x',
                'yref': 'y',
                'x0': min(x0),
                'y0': min(y0),
                'x1': max(x0),
                'y1': max(y0),
                'opacity': 0.2,
                'fillcolor': 'blue',
                'line': {'color': 'blue'},
            },
            {
                'type': 'circle',
                'xref': 'x',
                'yref': 'y',
                'x0': min(x1),
                'y0': min(y1),
                'x1': max(x1),
                'y1': max(y1),
                'opacity': 0.2,
                'fillcolor': 'orange',
                'line': {'color': 'orange'},
            },
            {
                'type': 'circle',
                'xref': 'x',
                'yref': 'y',
                'x0': min(x2),
                'y0': min(y2),
                'x1': max(x2),
                'y1': max(y2),
                'opacity': 0.2,
                'fillcolor': 'green',
                'line': {'color': 'green'},
            },
            {
                'type': 'circle',
                'xref': 'x',
                'yref': 'y',
                'x0': min(x1),
                'y0': min(y0),
                'x1': max(x1),
                'y1': max(y0),
                'opacity': 0.2,
                'fillcolor': 'red',
                'line': {'color': 'red'},
            },
        ],
        'showlegend': False,
    }

    data = [trace0, trace1, trace2, trace3]

    # The figure consists of the data part and the layout part
    fig = {'data': data, 'layout': layout}

    # Draw offline: I have not registered on the official website, and the
    # website is not easy to use, so the offline mode is used instead.
    plotly.offline.plot(fig, filename='clusters.html')

As a result, the figure is opened in the browser and is also saved locally.
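The four shape entries in the layout differ only in the point set they enclose and in their color, so they can be generated instead of written out by hand. The sketch below is one possible way to do that; the helper name `bounding_circle` and the seeded generator are hypothetical additions for illustration and are not part of the official sample.

```python
import numpy as np

def bounding_circle(xs, ys, color, opacity=0.2):
    """Hypothetical helper: shape dict for an ellipse spanning the bounding box of (xs, ys)."""
    return {
        'type': 'circle',
        'xref': 'x',
        'yref': 'y',
        'x0': float(np.min(xs)),
        'y0': float(np.min(ys)),
        'x1': float(np.max(xs)),
        'y1': float(np.max(ys)),
        'opacity': opacity,
        'fillcolor': color,
        'line': {'color': color},
    }

rng = np.random.default_rng(0)
x0, y0 = rng.normal(2, 0.45, 300), rng.normal(2, 0.45, 300)
x1, y1 = rng.normal(6, 0.8, 200), rng.normal(6, 0.8, 200)
x2, y2 = rng.normal(4, 0.3, 200), rng.normal(4, 0.3, 200)

# Same pairings and colors as the four shapes in the sample above.
groups = [(x0, y0, 'blue'), (x1, y1, 'orange'), (x2, y2, 'green'), (x1, y0, 'red')]
layout = {
    'shapes': [bounding_circle(xs, ys, color) for xs, ys, color in groups],
    'showlegend': False,
}
print(len(layout['shapes']), [s['fillcolor'] for s in layout['shapes']])
```

The resulting `layout` dictionary has the same structure as the hand-written one, so it could be passed to the figure in the same way.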

Summary: Although plotly's syntax is cumbersome, it is worth using when the presentation of the data matters a great deal. matplotlib is more convenient for everyday plotting; in IPython, running from pylab import * gives a working environment similar to MATLAB.

That is all for this article. I hope it is helpful for your learning.
