Canopy-kmeans efficient algorithm based on Hadoop platform

Source: Internet
Author: User
Keywords: Hadoop

Zhaoqing

This paper introduces the MapReduce programming model on the Hadoop platform, analyzes the advantages and disadvantages of the traditional k-means and Canopy clustering algorithms, and proposes an improved k-means algorithm based on Canopy. To address the randomness of canopy selection in the Canopy-kmeans algorithm, the algorithm is improved using the minimum-maximum principle, which avoids blind canopy selection. The algorithm is implemented with the MapReduce parallel programming method and evaluated on the clustering of massive news data. Experimental results show that this method achieves higher accuracy and stability than the traditional k-means and Canopy algorithms.
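
The abstract does not spell out how the minimum-maximum principle selects canopies, so the following is only a minimal single-machine sketch under one common reading: each new center is the point whose minimum distance to the already chosen centers is the largest, and the resulting centers seed an ordinary k-means loop. The function names (`select_centers_min_max`, `kmeans`), the choice of the first point as the starting center, and the fixed number of centers `k` are assumptions made for illustration; they are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_centers_min_max(points, k):
    """Pick k initial centers so that each new center is the point whose
    minimum distance to the centers chosen so far is the largest."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [points[0]]  # assumption: start from the first point
    # min_dist[i] = distance from points[i] to its nearest chosen center
    min_dist = [euclidean(p, centers[0]) for p in points]
    while len(centers) < k:
        idx = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(points[idx])
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], euclidean(p, points[idx]))
    return centers


def kmeans(points, centers, iterations=10):
    """Plain k-means seeded with the given centers."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: euclidean(p, centers[c]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else centers[j]
            for j, cluster in enumerate(clusters)
        ]
    return centers, clusters


points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
seeds = select_centers_min_max(points, k=3)
final_centers, final_clusters = kmeans(points, seeds)
```

Because the seeds are spread as far apart as possible, the result no longer depends on a random draw of initial centers, which is the kind of stability the abstract claims. In a MapReduce setting the assignment step would typically run in the map phase and the center update in the reduce phase, but that mapping is also an assumption here rather than a detail given in the text.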

