I. Parameter meanings
dfs.replication: the number of replicas kept for each data block;
dfs.replication.min: the minimum number of replicas required for a data block;
dfs.safemode.threshold.pct: the proportion of data blocks that must meet the minimum replica count.
(1) While the ratio is below this value, the system stays in safe mode and waits for more block replicas to be reported;
(2) Once the ratio reaches this value, the system leaves safe mode, indicating that enough block replicas are available to serve external requests.
(3) If the value is less than or equal to 0, the system does not wait in safe mode. If the value is greater than 1, the system stays in safe mode permanently.
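The three rules above can be sketched as a small decision function. This is only a toy model of the rules as stated here, not Hadoop's actual implementation; the function name and its arguments are hypothetical, and the threshold is passed as a string so that it can be compared exactly.

```python
from fractions import Fraction


def may_leave_safe_mode(safe_blocks: int, total_blocks: int, threshold_pct: str) -> bool:
    """Toy model of the dfs.safemode.threshold.pct rules (not Hadoop code).

    safe_blocks   -- blocks that already meet the minimum replica count
    total_blocks  -- blocks recorded in the NameNode metadata
    threshold_pct -- the configured value, as a decimal string such as "0.999"
    """
    threshold = Fraction(threshold_pct)  # exact arithmetic, no float rounding
    if threshold <= 0:
        return True   # rule (3): no waiting in safe mode
    if threshold > 1:
        return False  # rule (3): safe mode is permanent
    if total_blocks == 0:
        return True   # an empty file system has nothing to wait for
    # rules (1) and (2): leave once the ratio reaches the threshold
    return Fraction(safe_blocks, total_blocks) >= threshold
```

With 100000 blocks and a threshold of 0.999, this model allows safe mode to end once 99900 blocks meet the minimum replica count.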
II. Significance of dfs.replication.min
The target number of replicas is set by dfs.replication. If a failed node reduces the number of replicas of a data block, the system copies the block to another node to restore the count. dfs.replication.min is the lower bound: a block with fewer replicas than this does not count as safely replicated, and this is the replica count that the safe mode threshold is measured against. Conversely, if a data block has been copied to so many nodes that its replica count exceeds dfs.replication.max, replication of that block stops.
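How the three replication settings bound a block's replica count can be illustrated with a small classifier. This is a sketch for illustration only: the function and the state labels are hypothetical and do not correspond to any HDFS API, and the values 3, 1 and 512 in the usage note are just example settings.

```python
def classify_block(live_replicas: int, target: int, minimum: int, maximum: int) -> str:
    """Label a block by comparing its live replica count with the settings.

    target  -- stands in for dfs.replication
    minimum -- stands in for dfs.replication.min
    maximum -- stands in for dfs.replication.max
    """
    if not (minimum <= target <= maximum):
        raise ValueError("expected minimum <= target <= maximum")
    if live_replicas < minimum:
        return "below-minimum"      # not yet counted as safely replicated
    if live_replicas < target:
        return "under-replicated"   # usable, but fewer copies than requested
    if live_replicas == target:
        return "at-target"
    if live_replicas <= maximum:
        return "over-target"        # more copies than requested
    return "above-maximum"          # exceeds the configured upper bound
```

For example, classify_block(2, 3, 1, 512) yields "under-replicated", while classify_block(0, 3, 1, 512) yields "below-minimum".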
III. Hadoop safe mode
Hadoop's safe mode is a read-only state. It indicates that too few block replicas are currently available in the system. In this phase the NameNode waits for the DataNodes to report their blocks, and you are not allowed to modify or delete data. When the NameNode starts, it first enters safe mode. If the proportion of blocks not reported by the DataNodes exceeds 1 - dfs.safemode.threshold.pct, the system remains in safe mode, that is, read-only. dfs.safemode.threshold.pct (default value: 0.999f) means that when HDFS starts, safe mode can be left once the number of blocks reported by the DataNodes reaches 0.999 times the number of blocks recorded in the metadata; until then, HDFS stays read-only. If it is set to a value greater than 1, HDFS is always in safe mode.
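The startup behaviour can be illustrated with a rough simulation in which block reports arrive one after another until the threshold is reached. It is a simplified sketch, not NameNode code: the function name and the report sizes are made up for illustration, and only thresholds in the ordinary range (greater than 0, at most 1) are modelled.

```python
from fractions import Fraction
from typing import Iterable, Optional


def reports_until_safe_mode_exit(
    total_blocks: int,
    report_sizes: Iterable[int],
    threshold_pct: str = "0.999",
) -> Optional[int]:
    """Return how many block reports arrive before safe mode can be left.

    total_blocks  -- blocks recorded in the NameNode metadata
    report_sizes  -- number of new blocks contributed by each successive report
    threshold_pct -- dfs.safemode.threshold.pct as a decimal string
    Returns None if the threshold is never reached (HDFS stays read-only).
    """
    threshold = Fraction(threshold_pct)
    if not (0 < threshold <= 1):
        raise ValueError("this sketch only models 0 < threshold <= 1")
    required = threshold * total_blocks  # exact, e.g. 999 for 1000 blocks
    reported = 0
    for count, newly_reported in enumerate(report_sizes, start=1):
        reported += newly_reported
        if reported >= required:
            return count
    return None
```

With 1000 blocks in the metadata and reports of 400, 400, 199 and 1 blocks, the threshold of 999 blocks is reached after the third report. If the third report carried only 198 blocks and no further report arrived, the function would return None, which corresponds to HDFS staying in safe mode.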
Reference: http://www.dataguru.cn/forum.php?mod=viewthread&tid=72589