Research on TFIDF algorithm based on MapReduce programming model
Zhao Weiyan, Wang Jingyu
With the rapid development of the Internet and related technologies, information processing has become an indispensable means of obtaining useful information, and retrieving that information efficiently from massive data is increasingly important. Existing text classification algorithms hit bottlenecks in both time complexity and space complexity and can no longer meet users' needs. This paper therefore proposes a TFIDF algorithm based on the Hadoop distributed platform, gives the specific flow of the algorithm, and implements it with MapReduce programming. The algorithm is compared with the traditional serial algorithm in experiments run in single-machine and cluster modes. The experiments show that the TFIDF text classification algorithm can classify massive data efficiently.
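The abstract does not reproduce the algorithm's flow, so the following is only a minimal sketch of how TFIDF weights are commonly split into map and reduce steps. It is not the authors' implementation. The three-job decomposition, the function names (`map_term_counts`, `reduce_sum`, `tfidf`), the whitespace tokenizer, the length-normalized term frequency, and the natural-log IDF are all assumptions made for illustration, and the jobs run in memory in plain Python rather than on Hadoop.

```python
import math
from collections import defaultdict


def map_term_counts(doc_id, text):
    """Map step (hypothetical): emit ((term, doc_id), 1) for every token."""
    for term in text.lower().split():
        yield (term, doc_id), 1


def reduce_sum(pairs):
    """Reduce step (hypothetical): sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def tfidf(docs):
    """Compute TFIDF weights for a dict mapping doc_id -> text."""
    # Job 1: raw count of each term within each document.
    counts = reduce_sum(
        pair
        for doc_id, text in docs.items()
        for pair in map_term_counts(doc_id, text)
    )
    # Job 2: total number of tokens in each document.
    doc_len = reduce_sum((doc_id, n) for (_, doc_id), n in counts.items())
    # Job 3: number of documents that contain each term.
    doc_freq = reduce_sum((term, 1) for term, _ in counts)
    n_docs = len(docs)
    # Final step: tf = count / document length, idf = ln(N / df) (assumed forms).
    return {
        (term, doc_id): (n / doc_len[doc_id]) * math.log(n_docs / doc_freq[term])
        for (term, doc_id), n in counts.items()
    }


if __name__ == "__main__":
    sample = {"d1": "hadoop mapreduce hadoop", "d2": "mapreduce text"}
    for key, weight in sorted(tfidf(sample).items()):
        print(key, round(weight, 4))
```

In a real Hadoop job each of these steps would be a separate mapper and reducer pair reading the previous job's output, which is where a cluster can share the work that a serial implementation does on one machine.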