Research and Optimization of Batch File Storage Performance for HDFS

Keywords: Redis, batch file, system architecture, Hadoop Distributed File System

Nanjing Normal University Suyisu

The main work and contributions of this paper are as follows:

1. Based on a study of the HDFS source code, the paper describes the typical HDFS operation flows and background management tasks, analyzes the metadata architecture and communication mechanism of HDFS, and discusses the problems and shortcomings HDFS faces when processing files in batches.

2. To address the batch file storage problem, the storage mechanism and the read and write processes for batch files are redesigned. When writing a batch of user files, the client merges them into a group file, creates metadata that maps user files to data fragments, group files, and data blocks, and then stores the group file and the related metadata in HDFS. When reading, the client first obtains the metadata of the requested user files, classifies the data fragments by storage location, sends the read requests to the DataNodes class by class to retrieve all the fragments, and finally assembles the fragments into the files the user asked for.

3. Building on the batch file storage optimization, the paper proposes migrating the easily separable metadata stored on the NameNode to a Redis server node. This achieves "distributed metadata, distributed access" and further reduces the memory consumption and access load of the NameNode.

4. The optimization scheme is implemented in the open-source HDFS system and tested, and the experimental results verify the effectiveness of the optimization strategy.
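The write and read paths for batch files described above can be illustrated with a small in-memory sketch. It is not the paper's implementation: the names `Fragment`, `merge_batch`, and `read_batch`, the tiny `BLOCK_SIZE`, and the dictionaries standing in for DataNodes and block locations are all assumptions made for illustration, and no real HDFS or Redis API is called.

```python
from collections import defaultdict
from dataclasses import dataclass

BLOCK_SIZE = 16  # bytes; deliberately tiny so the example spans several blocks


@dataclass(frozen=True)
class Fragment:
    """One piece of a user file, located inside a data block of a group file."""
    group_id: str
    block_index: int   # which data block of the group file holds the piece
    block_offset: int  # byte offset inside that block
    length: int
    file_offset: int   # byte offset inside the original user file


def merge_batch(group_id, files):
    """Merge user files into one group file and build the mapping metadata.

    Returns (blocks, metadata): the group file cut into BLOCK_SIZE blocks,
    and a dict mapping each user file name to its list of fragments.
    """
    group = bytearray()
    metadata = {}
    for name, data in files.items():
        start = len(group)
        end = start + len(data)
        group += data
        fragments, pos = [], start
        while pos < end:
            # A fragment never crosses a block boundary.
            block_index, block_offset = divmod(pos, BLOCK_SIZE)
            length = min(BLOCK_SIZE - block_offset, end - pos)
            fragments.append(
                Fragment(group_id, block_index, block_offset, length, pos - start))
            pos += length
        metadata[name] = fragments
    blocks = [bytes(group[i:i + BLOCK_SIZE])
              for i in range(0, len(group), BLOCK_SIZE)]
    return blocks, metadata


def read_batch(names, metadata, block_location, datanodes):
    """Read user files by classifying fragment requests per storage node."""
    # Step 1: classify the fragments by the node that stores their block.
    by_node = defaultdict(list)
    for name in names:
        for frag in metadata[name]:
            node = block_location[(frag.group_id, frag.block_index)]
            by_node[node].append((name, frag))
    # Step 2: visit each node once and fetch every fragment it holds.
    pieces = defaultdict(dict)
    for node, wanted in by_node.items():
        store = datanodes[node]
        for name, frag in wanted:
            block = store[(frag.group_id, frag.block_index)]
            end = frag.block_offset + frag.length
            pieces[name][frag.file_offset] = block[frag.block_offset:end]
    # Step 3: assemble the fragments of each file in file order.
    return {name: b"".join(pieces[name][off] for off in sorted(pieces[name]))
            for name in names}


# Usage: three small files, blocks placed alternately on two stand-in nodes.
files = {
    "a.txt": b"hello batch storage",
    "b.txt": b"HDFS",
    "c.txt": b"small files add up",
}
blocks, metadata = merge_batch("group-0", files)
datanodes = {"dn1": {}, "dn2": {}}
block_location = {}
for i, block in enumerate(blocks):
    node = "dn1" if i % 2 == 0 else "dn2"
    datanodes[node][("group-0", i)] = block
    block_location[("group-0", i)] = node

restored = read_batch(["a.txt", "c.txt"], metadata, block_location, datanodes)
```

In this sketch the three user files occupy a single group file, so the block-level bookkeeping grows with the number of group files rather than the number of user files. The per-file fragment lists are the kind of separable metadata that the paper proposes to keep outside the NameNode.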