Machine Learning Path: K-means Clustering of Handwritten Digits in Python

Source: Internet
Author: User

Learning to use the API with Python 3

The dataset is available online; I downloaded it to a local directory.

You can also download the datasets from my git: https://github.com/linyi0604/MachineLearning

Code:

    import numpy as np
    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn import metrics

    '''
    K-means algorithm:
    1 Randomly select K samples as the centers of the K clusters.
    2 Assign each sample to the cluster whose center is nearest to it,
      until all samples are assigned.
    3 Recompute the center of each of the K clusters.
    4 Repeat from the K new centers:
      if the assignments are the same as in the previous round, stop;
      otherwise repeat steps 2, 3, 4.
    '''

    '''
    The dataset comes from https://archive.ics.uci.edu/ml/machine-learning-databases/optdigits/
    I downloaded it locally.
    Training samples: 3,823; test samples: 1,797.
    Each image is an 8*8 pixel matrix, giving 64 feature dimensions,
    plus 1 target dimension for the digit class.
    '''

    # 1 Prepare the data
    digits_train = pd.read_csv("./data/optdigits/optdigits.tra", header=None)
    digits_test = pd.read_csv("./data/optdigits/optdigits.tes", header=None)
    # Extract the 64-dimensional pixel features and the 1-dimensional target from each sample
    x_train = digits_train[np.arange(64)]
    y_train = digits_train[64]
    x_test = digits_test[np.arange(64)]
    y_test = digits_test[64]

    # 2 Build the model
    # Initialize the KMeans clustering model with 10 cluster centers
    kmeans = KMeans(n_clusters=10)
    # Cluster the training samples
    kmeans.fit(x_train)
    # Predict which cluster center each test image belongs to
    y_predict = kmeans.predict(x_test)

    # 3 Evaluate the model
    # When true class labels are available, evaluate the clustering with ARI (adjusted Rand index)
    print("K-means clustering ARI value:", metrics.adjusted_rand_score(y_test, y_predict))
    '''
    K-means clustering ARI value: 0.6673881543921809
    '''
    # If no class labels are available, evaluate with the silhouette coefficient
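The four steps listed in the code comments can also be written out by hand. Below is a minimal NumPy sketch of that loop, only to show the idea; it is not how scikit-learn implements KMeans. The function name kmeans_sketch, its parameters, and the toy data are all made up for this illustration.

```python
import numpy as np


def kmeans_sketch(points, k, max_iter=100, seed=0):
    """Illustrative K-means loop (hypothetical helper, not scikit-learn's code)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct samples as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every sample to its nearest center.
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Step 4: stop once the assignment no longer changes.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: move each center to the mean of its assigned samples.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels, centers


# Two well-separated toy blobs (made-up data, not the digits dataset).
toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
labels, centers = kmeans_sketch(toy, k=2)
print(labels)
print(centers)
```

The cluster numbers themselves are arbitrary: which blob ends up as cluster 0 depends on the random initial centers, so only the grouping is meaningful.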
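A note on why ARI is used instead of plain accuracy: the cluster numbers returned by KMeans have nothing to do with the digit values, so cluster 3 does not have to contain the digit 3. ARI only looks at whether pairs of samples are grouped together, so renaming the clusters does not change the score. The small label lists below are made up to show this.

```python
from sklearn.metrics import adjusted_rand_score

# Made-up labels for illustration only.
true_labels = [0, 0, 1, 1, 2, 2]
# Same grouping as true_labels, but with the cluster ids renamed.
renamed = [2, 2, 0, 0, 1, 1]
# A grouping that splits every true class across different clusters.
mixed = [0, 1, 2, 0, 1, 2]

ari_renamed = adjusted_rand_score(true_labels, renamed)
ari_mixed = adjusted_rand_score(true_labels, mixed)

print("ARI, same grouping with renamed ids:", ari_renamed)
print("ARI, mixed grouping:", ari_mixed)
```

The first score comes out at the maximum of 1 even though no label value matches, while the mixed grouping scores far lower.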
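The last comment mentions the silhouette coefficient for the case where no class labels are available. A possible sketch with scikit-learn's silhouette_score follows. To keep it runnable without the local CSV files, it assumes load_digits (the copy of the optdigits test set bundled with scikit-learn) as a stand-in for the files used above; the candidate cluster counts 5, 10, 15 and random_state=0 are arbitrary choices of mine.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import silhouette_score

# Assumption: load_digits replaces the local optdigits files used in the article.
x = load_digits().data  # 1797 samples, 64 pixel features each

# Try a few candidate cluster counts (an arbitrary choice for this sketch).
scores = {}
for k in (5, 10, 15):
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    cluster_ids = model.fit_predict(x)
    # Silhouette values lie in [-1, 1]; higher means tighter,
    # better-separated clusters. No true labels are needed.
    scores[k] = silhouette_score(x, cluster_ids, metric="euclidean")

for k, score in scores.items():
    print(f"k={k}: silhouette coefficient = {score:.4f}")
```

Comparing the scores across several values of k is one common way to choose the number of clusters when the true categories are unknown.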
