Python vs. machine learning-clustering and EM algorithms

Source: Internet
Author: User

The idea of clustering: dividing a DataSet into several subsets (called a cluster cluster) that you don't want to cross, each potentially corresponding to a concept. But the practical significance of each cluster is determined by the users themselves, and the clustering algorithm will only be divided.

The role of Clustering:

1) can be used as a separate process for finding a distribution pattern of data

2) as a preprocessing process for classification. First, classify data is clustered and then the classification process is performed on each cluster of cluster results.

Performance Metrics for Clustering:

1) External indicator: This indicator is obtained by comparing the result of a cluster with a reference model

Jaccard coefficient: It depicts the probability jc=a/(a+b+c) of all samples belonging to the same class that are subordinate to the same category in both C and c*.

FM Index: It depicts a sample pair belonging to the same class in C, the proportion of the sample pair belonging to the c* is P1, the sample pair belonging to the same class in c*, and the ratio of the sample pairs belonging to C is p2,fmi the geometric average p1 of P2 and Fmi=sqrt ((A/(A+B)) * (A/(A+C)))

2) Internal indicator: This indicator is obtained directly from the study of clustering results, and does not utilize any reference model

Python vs. machine learning-clustering and EM algorithms

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.