Research and implementation of visual data clustering based on Hadoop platform
University Lin
Hadoop is a distributed framework for storing and analyzing large data sets. Clustering algorithms can build a feature representation of visual data by generating a codebook from the clustered features. How to carry out clustering within a distributed model is therefore an important problem in both research and production. To address large-scale visual data clustering, this paper designs and implements visual data clustering algorithms on the Hadoop model, improving the efficiency of visual data clustering. First, the paper introduces visual features and analyzes the curse of dimensionality that arises when a codebook is generated by clustering visual information. It then examines the Hadoop distributed model in detail, designs and implements K-means and GMM (Gaussian mixture model) clustering methods for visual data on Hadoop, and addresses the curse of dimensionality in codebook generation. Distributed clustering of the visual data is realized with MapReduce, which substantially improves the efficiency of visual data processing. Finally, experiments on data sets of different sizes, run on clusters of different scales, lead to the conclusion that clustering large-scale data with the Hadoop framework is more efficient, faster, and more scalable.
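The abstract does not give the paper's implementation, so the following is only a minimal sketch of how one clustering iteration might be split into a map step and a reduce step. It assumes K-means with squared Euclidean distance and runs in plain Python without Hadoop; the function names (`map_assign`, `reduce_update`, `kmeans_iteration`) are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_assign(point: Vector, centroids: List[Vector]) -> Tuple[int, Tuple[Vector, int]]:
    """Map step (assumed): emit (index of nearest centroid, (point, 1))."""
    nearest = min(range(len(centroids)),
                  key=lambda i: squared_distance(point, centroids[i]))
    return nearest, (point, 1)


def reduce_update(pairs: List[Tuple[int, Tuple[Vector, int]]]) -> Dict[int, Vector]:
    """Reduce step (assumed): sum vectors and counts per key, then average."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = defaultdict(int)
    for idx, (point, count) in pairs:
        if idx not in sums:
            sums[idx] = [0.0] * len(point)
        sums[idx] = [s + p for s, p in zip(sums[idx], point)]
        counts[idx] += count
    return {idx: [s / counts[idx] for s in vec] for idx, vec in sums.items()}


def kmeans_iteration(points: List[Vector], centroids: List[Vector]) -> List[Vector]:
    """One K-means iteration expressed as map followed by reduce."""
    pairs = [map_assign(p, centroids) for p in points]
    updated = reduce_update(pairs)
    # A centroid that received no points keeps its previous position.
    return [updated.get(i, c) for i, c in enumerate(centroids)]
```

In a real Hadoop job the map calls would run in parallel over input splits and the framework would group the emitted pairs by key before the reduce step. Here both steps run in a single process purely to show the data flow.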