The design of Dream------Hadoop--hdfs

Source: Internet
Author: User

HDFs is a file system designed for storing large files in streaming data access mode.   Streaming data Access HDFs is built on the thought that one-write, multiple-read mode is the most efficient. A dataset is typically generated or copied by a data source, then a variety of analysis is carried out on this basis. At a minimum, each analysis involves most of the data in the dataset (set all), so reading the entire the time of the dataset is more important than the delay in reading the first record.   Commercial Hardware Hadoop does not need to run on expensive and highly reliable hardware. It is designed to run on commercial hardware (common hardware that can be purchased at various retail stores) cluster, at least for large clusters, the probability of node failure is still relatively high. In the face of this failure, HDFS is designed to continue to run without noticeable disruption to the user.   at the same time, applications that are not suitable for HDFs are worth studying. Currently, HDFs is not very suitable for some areas, but it may be improved in the future.   Low Latency Data access applications that require low-latency access to data in the millisecond range are not suitable for HDFS. HDFs is optimized for high data throughput, which may be at the expense of latency. Currently, HBase is a better choice for low-latency access a large number of small files The namenode node stores the file system's metadata, so the limit on the number of files is determined by the amount of memory in the Namenode node. Based on experience, each file, index directory, and block account for approximately 150 bytes. So, for example, if you have 1 million files, each one block, you need at least 300MB of memory. While it is possible to store millions of files, 1 billion or more files are beyond the current hardware capabilities.   Multi-user write, Arbitrary file modification the file in HDFs has only one writer, and the write operation is always at the end of the file. It does not support multiple writers, or is modified anywhere in the file. (these may be supported in the future, but they are relatively less efficient)

The design of Dream------Hadoop--hdfs

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.