Recently, while working on images, I had an idea that needed clustering. The algorithm itself is very simple, but I was short on time, so I used scipy's cluster module directly.
It is actually very simple to use, but there are very few articles about it in Chinese, so here is a brief introduction. You can also take a look at the English documentation: http://docs.scipy.org/doc/scipy/reference/cluster.html.
This is a demo I found on Stack Overflow. If all you need is the clustering itself, this is enough.
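import numpy as np
import scipy.cluster.hierarchy as sch
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist

X = np.random.randn(100, 2)            # 100 random two-dimensional observations
d = pdist(X)                           # condensed vector of pairwise distances
Z = sch.linkage(d, method='complete')  # complete-linkage hierarchical clustering
P = sch.dendrogram(Z)                  # draw the dendrogram
plt.savefig('plot_dendrogram.png')

# Cut the tree at half the largest pairwise distance to get flat cluster labels
T = sch.fcluster(Z, 0.5 * d.max(), 'distance')
# Example output (the values change with the random data):
# array([4, 5, 3, 2, 2, 3, 5, 2, 2, 5, 2, 2, 2, 3, 2, 3, 2, 5, 4, 5, 2, 5, 2,
#        3, 3, 3, 1, 3, 4, 2, 2, 4, 2, 4, 3, 3, 2, 5, 5, 5, 3, 2, 2, 2, 5, 4,
#        2, 4, 2, 2, 5, 5, 1, 2, 3, 2, 2, 5, 4, 2, 5, 4, 3, 5, 4, 4, 2, 2, 2,
#        4, 2, 5, 2, 2, 3, 3, 2, 4, 5, 3, 4, 4, 2, 1, 5, 4, 2, 2, 5, 5, 2, 2,
#        5, 5, 5, 4, 3, 3, 2, 4], dtype=int32)

# Root node of each flat cluster in the linkage tree
sch.leaders(Z, T)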
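If you would rather ask for a fixed number of clusters than pick a distance threshold, fcluster also accepts criterion='maxclust'. Below is a minimal sketch of that variant. The helper name cluster_points and the two synthetic blobs are my own assumptions for illustration and are not part of the Stack Overflow demo.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, leaders
from scipy.spatial.distance import pdist


def cluster_points(points, n_clusters):
    """Hypothetical helper: complete-linkage clustering cut into
    at most n_clusters flat clusters."""
    d = pdist(points)                      # condensed pairwise distances
    Z = linkage(d, method='complete')      # build the hierarchy
    labels = fcluster(Z, n_clusters, criterion='maxclust')
    return Z, labels


# Two tight, well-separated synthetic blobs (assumed test data)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(20, 2))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(20, 2))
X = np.vstack([blob_a, blob_b])

Z, labels = cluster_points(X, 2)

# leaders returns the root node of each flat cluster and the matching cluster id
leader_nodes, cluster_ids = leaders(Z, labels)
print(labels)
print(leader_nodes, cluster_ids)
```

With blobs this far apart, every point in the first blob should get one label and every point in the second blob the other. On real image features the right number of clusters is usually less obvious, so the distance threshold used in the demo above may be the easier knob to tune.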
For more information, see http://nbviewer.ipython.org/github/herrfz/dataanalysis/blob/master/week4/clustering_example.ipynb.
That notebook is actually very good, although I have not managed to download the file.