Hadoop Learning: HDFS

Source: Internet
Author: User

The block is the basic storage unit of HDFS, with a default size of 64 MB (the Hadoop 1.x default; later releases default to 128 MB). An HDFS block is much larger than a disk block in order to reduce addressing (seek) overhead. If the block size is 100 MB, the seek time is 10 ms, and the transfer rate is 100 MB/s, then transferring one block takes 1 s and the seek time is only 1% of the transfer time.
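
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `seek_overhead` and its parameters are made up for illustration, and the figures are the ones quoted in the text, not measurements.

```python
def seek_overhead(block_mb: float, seek_ms: float, rate_mb_per_s: float) -> float:
    """Return the seek time as a fraction of the time needed to transfer one block."""
    transfer_s = block_mb / rate_mb_per_s   # time to stream the whole block
    seek_s = seek_ms / 1000.0               # seek time converted to seconds
    return seek_s / transfer_s


# Figures from the text: 100 MB block, 10 ms seek, 100 MB/s transfer rate.
print(f"{seek_overhead(100, 10, 100):.2%}")   # prints 1.00%
# The same disk with a 64 MB block pays a slightly higher relative cost.
print(f"{seek_overhead(64, 10, 100):.2%}")    # prints 1.56%
```

The larger the block, the smaller the share of time spent seeking, which is the reason given for HDFS blocks being so much larger than disk blocks.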

HDFS has three important roles: the client, the DataNode, and the NameNode.

The NameNode acts as the manager of HDFS and manages the namespace of the file system. It maintains the file system tree together with all the files and directories in that tree, and it keeps the file system's metadata in memory.

The DataNode acts as a worker in HDFS and is the basic unit of file storage. It periodically reports to the NameNode the list of blocks it stores.
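
The division of labor between the two roles can be sketched with a toy in-memory model. This is not the Hadoop API: the class names mirror the roles, but every method and attribute here (`receive_block_report`, `send_block_report`, `block_locations`) is a hypothetical name chosen for illustration.

```python
from collections import defaultdict


class NameNode:
    """Toy manager: remembers which DataNodes hold which blocks."""

    def __init__(self):
        # block id -> set of DataNode ids known to hold a copy
        self.block_locations = defaultdict(set)

    def receive_block_report(self, datanode_id, block_ids):
        # Forget what this DataNode reported earlier, then record the fresh list.
        for holders in self.block_locations.values():
            holders.discard(datanode_id)
        for block_id in block_ids:
            self.block_locations[block_id].add(datanode_id)


class DataNode:
    """Toy worker: stores block contents and reports the block list."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.blocks = {}  # block id -> bytes

    def send_block_report(self, namenode):
        # In a real cluster this happens periodically; here it is called by hand.
        namenode.receive_block_report(self.node_id, list(self.blocks))


namenode = NameNode()
worker = DataNode("dn1")
worker.blocks = {"blk0": b"abc", "blk1": b"def"}
worker.send_block_report(namenode)
print(dict(namenode.block_locations))
```

Note that the NameNode in this sketch only holds metadata (who stores what), while the block contents stay on the DataNode.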

The client is the application that accesses HDFS files. It reaches the entire file system by interacting with the NameNode and the DataNodes. The client provides a file system interface similar to POSIX (Portable Operating System Interface), so users do not need to know about the NameNode, the DataNodes, or their functions when programming.

(1) File write

    • The client sends a request to the NameNode to write a file
    • Based on the file size and the block configuration, the NameNode returns to the client information about some of the DataNodes it manages
    • The client splits the file into blocks and writes them, in order, to the DataNodes at the addresses it received
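
The write steps above can be sketched as a small simulation. It is an assumption-laden toy, not the HDFS protocol: `NameNode.allocate`, `write_file`, the round-robin placement, and the tiny 4-byte block size are all invented for illustration, and replication is left out.

```python
import math


class DataNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.blocks = {}  # block id -> bytes


class NameNode:
    def __init__(self, datanodes, block_size):
        self.datanodes = datanodes
        self.block_size = block_size
        self.files = {}  # path -> list of (block id, DataNode id)

    def allocate(self, path, file_size):
        """Step 2: pick a DataNode for every block the file will need."""
        n_blocks = math.ceil(file_size / self.block_size)
        plan = []
        for i in range(n_blocks):
            # Round-robin placement is an assumption of this sketch.
            target = self.datanodes[i % len(self.datanodes)]
            plan.append((f"{path}#blk{i}", target.node_id))
        self.files[path] = plan
        return plan


def write_file(namenode, path, data):
    """Steps 1 and 3: ask the NameNode, then split and write block by block."""
    plan = namenode.allocate(path, len(data))
    nodes = {dn.node_id: dn for dn in namenode.datanodes}
    size = namenode.block_size
    for i, (block_id, node_id) in enumerate(plan):
        nodes[node_id].blocks[block_id] = data[i * size:(i + 1) * size]
    return plan


dn1, dn2 = DataNode("dn1"), DataNode("dn2")
namenode = NameNode([dn1, dn2], block_size=4)  # 4 bytes, tiny on purpose
print(write_file(namenode, "/demo.txt", b"hello world"))
```

With an 11-byte file and a 4-byte block size, the sketch produces three blocks, the last one shorter than the others, which mirrors how the final block of a file need not be full.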

(2) File read

    • The client sends a request to the NameNode to read a file
    • The NameNode returns information about the DataNodes that store the file's blocks
    • The client reads the file's blocks from those DataNodes
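
The read steps above can be sketched in the same toy style. The tables `file_table` and `datanode_storage` and the three function names are hypothetical stand-ins for the NameNode's metadata, the DataNodes' storage, and the calls between them; none of them is a real Hadoop API.

```python
# NameNode metadata (toy): path -> ordered list of (block id, DataNode id)
file_table = {
    "/demo.txt": [("blk0", "dn1"), ("blk1", "dn2"), ("blk2", "dn1")],
}

# DataNode storage (toy): DataNode id -> {block id -> bytes}
datanode_storage = {
    "dn1": {"blk0": b"hell", "blk2": b"rld"},
    "dn2": {"blk1": b"o wo"},
}


def get_block_locations(path):
    """NameNode side: return where each block of the file lives."""
    return file_table[path]


def read_block(node_id, block_id):
    """DataNode side: return the bytes of one stored block."""
    return datanode_storage[node_id][block_id]


def read_file(path):
    """Client side: ask the NameNode once, then fetch the blocks in order."""
    locations = get_block_locations(path)
    return b"".join(read_block(node_id, block_id) for block_id, node_id in locations)


print(read_file("/demo.txt"))  # prints b'hello world'
```

The point of the sketch is that the NameNode only answers the location query; the file data itself travels between the client and the DataNodes.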

(3) Block replication

    1. The NameNode finds that the blocks of some files do not meet the minimum number of copies, or that some DataNodes have failed
    2. It notifies the DataNodes to copy the affected blocks to one another
    3. The DataNodes start copying the blocks to each other
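
The replication steps above can be sketched as a planning function. This is a simplified illustration, not HDFS's actual placement policy: the function names, the `min_replicas` parameter, and the rule of picking the alphabetically first source and destination are all assumptions of the sketch.

```python
def find_under_replicated(block_locations, live_nodes, min_replicas):
    """Step 1: return {block id: live holders} for blocks with too few live copies."""
    result = {}
    for block_id, holders in block_locations.items():
        live = holders & live_nodes  # ignore copies on failed DataNodes
        if len(live) < min_replicas:
            result[block_id] = live
    return result


def plan_replication(block_locations, live_nodes, min_replicas):
    """Step 2: build (block id, source, destination) copy instructions."""
    plan = []
    under = find_under_replicated(block_locations, live_nodes, min_replicas)
    for block_id, live in under.items():
        if not live:
            continue  # no surviving copy, nothing to replicate from
        source = sorted(live)[0]
        candidates = sorted(live_nodes - live)  # live nodes without this block
        for dest in candidates[: min_replicas - len(live)]:
            plan.append((block_id, source, dest))
    return plan


block_locations = {"blk0": {"dn1", "dn2"}, "blk1": {"dn2", "dn3"}}
live_nodes = {"dn1", "dn3"}  # dn2 has failed
for block_id, source, dest in plan_replication(block_locations, live_nodes, 2):
    # Step 3 would be the DataNode-to-DataNode copy; here it is only printed.
    print(f"copy {block_id}: {source} -> {dest}")
```

In this example the failure of dn2 leaves both blocks with a single live copy, so the plan asks each surviving holder to copy its block to the other live node.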
