Hadoop Configuration Items Organized (hdfs-site.xml)


Continuing from the previous chapter, this post organizes the HDFS-related configuration items.
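
All of the items below are set in hdfs-site.xml as ordinary <property> entries. As a reminder of the file format, here is a minimal sketch; the two values shown are just examples taken from the table further down, not required settings:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- hdfs-site.xml: each item is a <property> with a <name> and a <value>. -->
<configuration>

  <property>
    <name>dfs.replication</name>
    <value>3</value>           <!-- number of replicas per block -->
  </property>

  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>   <!-- 128 MB = 128 * 1024 * 1024 -->
  </property>

</configuration>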

Name | Value | Description
dfs.default.chunk.view.size | 32768 | The amount of file content displayed per file on the NameNode's HTTP browsing page; usually left unset.
dfs.datanode.du.reserved | 1073741824 | Space reserved on each disk, mainly for non-HDFS files. This needs to be set; by default nothing is reserved (0 bytes).
dfs.name.dir | /opt/data1/hdfs/name, /opt/data2/hdfs/name, /nfs/data/hdfs/name | Where the NN keeps its metadata. It is generally recommended to keep one copy on NFS as a Hadoop 1.0 HA fallback, or to spread copies across several drives of one server.
dfs.web.ugi | nobody,nobody | The user and group used by the web tracker page servers of the NN, JT, etc.
dfs.permissions | true / false | Whether DFS permission checking is enabled. I usually set this to false and have people operate through in-house tools, which avoids mistakes; with true you will sometimes find data unreachable purely because of permissions.
dfs.permissions.supergroup | supergroup | The HDFS super-privilege group, supergroup by default; the user who started Hadoop is typically the superuser.
dfs.data.dir | /opt/data1/hdfs/data, /opt/data2/hdfs/data, /opt/data3/hdfs/data, ... | Where the DataNode stores its data blocks; multiple disks can be listed, comma-separated (see the sketch after the table).
dfs.datanode.data.dir.perm | 755 | Permissions of the local directories used by the DataNode; default 755.
dfs.replication | 3 | The number of replicas per HDFS block, 3 by default. In theory more replicas means faster reads, but they cost more storage; if you can afford it, raise it to 5 or 6.
dfs.replication.max | 512 | The maximum replica count. A temporary DN failure and recovery can leave blocks with more than the configured number of replicas. This is rarely useful and is normally not written into the config file.
dfs.replication.min | 1 | The minimum replica count; same purpose as above.
dfs.block.size | 134217728 | The size of each file block. We use 128 MB; the default is 64 MB. Note the value is 128*1024*1024; I have seen people simply write 128000000, which is not actually 128 MB.
dfs.df.interval | 60000 | Refresh interval for disk usage statistics, in milliseconds.
dfs.client.block.write.retries | 3 | How many times a block write is retried before the failure is reported.
dfs.heartbeat.interval | 3 | The DN heartbeat interval, in seconds.
dfs.namenode.handler.count | 10 | The number of handler threads the NN spawns after startup.
dfs.balance.bandwidthPerSec | 1048576 | Maximum bandwidth per second used by the balancer, in bytes, not bits.
dfs.hosts | /opt/hadoop/conf/hosts.allow | A file listing the host names allowed to connect to the NN; must be an absolute path. An empty file means every host is allowed.
dfs.hosts.exclude | /opt/hadoop/conf/hosts.deny | Same idea, but a list of host names forbidden to connect to the NN. Useful for decommissioning DNs from the cluster (see the decommission sketch after the table).
dfs.max.objects | 0 | The maximum number of DFS objects; every file, directory, and block in HDFS counts as one object. 0 means no limit.
dfs.replication.interval | 3 | The interval at which the NN computes block replication work; usually not written into the config file, the default is fine.
dfs.support.append | true / false | Whether appending to files is allowed. Newer Hadoop versions support the append operation, but the default is false because append still has bugs.
dfs.datanode.failed.volumes.tolerated | 0 | The number of failed disks a DN tolerates before shutting itself down; the default 0 means the DN shuts down as soon as one disk fails.
dfs.secondary.http.address | 0.0.0.0:50090 | The SNN tracker page listening address and port.
dfs.datanode.address | 0.0.0.0:50010 | The DN service listening port; port 0 means listen on a random port, which is reported to the NN via heartbeat.
dfs.datanode.http.address | 0.0.0.0:50075 | The DN tracker page listening address and port.
dfs.datanode.ipc.address | 0.0.0.0:50020 | The DN IPC listening port; write 0 to listen on a random port, reported to the NN via heartbeat.
dfs.datanode.handler.count | 3 | The number of service threads started by the DN.
dfs.http.address | 0.0.0.0:50070 | The NN tracker page listening address and port.
dfs.https.enable | true / false | Whether the NN tracker page also listens over HTTPS; default false.
dfs.datanode.https.address | 0.0.0.0:50475 | The DN HTTPS tracker page listening address and port.
dfs.https.address | 0.0.0.0:50470 | The NN HTTPS tracker page listening address and port.
dfs.datanode.max.xcievers | 2048 | Roughly the DN's equivalent of the Linux maximum-open-files limit; the parameter is not in the documentation. Increase it when DataXceiver errors appear. Default 256.
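
To make the storage and tuning items above concrete, here is a hedged sketch of how the dfs.name.dir, dfs.data.dir, dfs.datanode.du.reserved and dfs.block.size entries might look. The directory paths are only the illustrative ones from the table, not defaults; adjust them to your own disks. These <property> elements go inside the <configuration> element shown earlier.

  <!-- Sketch of the storage and replication items discussed above. -->
  <!-- Paths are the illustrative ones from the table, not defaults. -->
  <property>
    <name>dfs.name.dir</name>
    <value>/opt/data1/hdfs/name,/opt/data2/hdfs/name,/nfs/data/hdfs/name</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>/opt/data1/hdfs/data,/opt/data2/hdfs/data,/opt/data3/hdfs/data</value>
  </property>

  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>1073741824</value>  <!-- reserve 1 GB per disk for non-HDFS use -->
  </property>

  <property>
    <name>dfs.block.size</name>
    <value>134217728</value>   <!-- 128 MB = 128 * 1024 * 1024, not 128000000 -->
  </property>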
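
Similarly, a sketch of the allow/deny lists used for decommissioning; the file names are just the example paths from the table:

  <property>
    <name>dfs.hosts</name>
    <value>/opt/hadoop/conf/hosts.allow</value>  <!-- empty file means all hosts allowed -->
  </property>

  <property>
    <name>dfs.hosts.exclude</name>
    <value>/opt/hadoop/conf/hosts.deny</value>   <!-- hosts listed here are refused and decommissioned -->
  </property>

After adding a DN's host name to the exclude file, running hadoop dfsadmin -refreshNodes makes the NN re-read both lists and begin decommissioning that node.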

Those are the main configuration items you are likely to use. There are also some HTTPS certificate-file settings and a few internal timing options, but they are rarely needed, so I will not list them here.
