1. Location of file storage
View an example:
./bin/hadoop fsck /data/bb/bb.txt -files -blocks -racks -locations
In the output, blk_1076386829_2649976 corresponds to the name of the block's meta file (blk_1076386829_2649976.meta on disk). The output also shows that the file is stored on two machines, 117 and 229. To locate the meta file, log on to one of them, for example the 117 machine, and use the find command.
First, go to the path configured in dfs.datanode.data.dir (if you have forgotten it, look it up in $HADOOP_HOME/etc/hadoop/hdfs-site.xml).
My machine has 3 directories configured there. Run the find command in each of the 3 directories, as in this example command:
find /data1/hdfs1/data/current/BP-236683338-10.207.0.217-1403487328282/current -name blk_1076386829_2649976.meta
The meta file is eventually found.
The same search also leads you to the block file itself, blk_1076386829, which sits next to the meta file. You can view its contents with cat blk_1076386829.
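The search across several data directories can also be scripted. The following is a minimal sketch, not part of the original procedure: the function name find_block_files is hypothetical, and the demo runs against a throwaway directory tree (with a made-up block pool name BP-example) instead of a real DataNode disk.

```python
import os
import tempfile


def find_block_files(data_dirs, block_id):
    """Return the paths of the block file (blk_<id>) and its meta file
    (blk_<id>_<genstamp>.meta) found under the given data directories."""
    prefix = f"blk_{block_id}"
    matches = []
    for data_dir in data_dirs:
        for root, _dirs, files in os.walk(data_dir):
            for name in files:
                is_block = name == prefix
                is_meta = name.startswith(prefix + "_") and name.endswith(".meta")
                if is_block or is_meta:
                    matches.append(os.path.join(root, name))
    return sorted(matches)


# Demo on a temporary tree that only imitates the DataNode layout.
with tempfile.TemporaryDirectory() as tmp:
    block_dir = os.path.join(tmp, "current", "BP-example", "current", "finalized")
    os.makedirs(block_dir)
    for name in ("blk_1076386829", "blk_1076386829_2649976.meta", "blk_42"):
        open(os.path.join(block_dir, name), "w").close()
    found = [os.path.basename(p) for p in find_block_files([tmp], 1076386829)]
    print(found)
```

On a real node you would pass the directories listed in dfs.datanode.data.dir instead of the temporary directory.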
Next, a simple simulation of corrupting one of the data blocks. After the block is corrupted, the DataNode does not notice the damage until it runs its directory scan (the interval is set by dfs.datanode.directoryscan.interval). The block is also not recovered until the block information is reported to the NameNode (the interval is set by dfs.blockreport.intervalMsec). Only once the NameNode receives the block information does it take recovery measures.
The real situation is certainly more complicated, but this simple process shows what the two parameters mentioned above do.
Parameter configuration
The two main parameters are configured in hdfs-site.xml as follows:
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>master:9001</value>
</property>
<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>600000</value>
  <description>Determines block reporting interval in milliseconds.</description>
</property>
<property>
  <name>dfs.datanode.directoryscan.interval</name>
  <value>600</value>
</property>
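Both are set to 10 minutes: dfs.blockreport.intervalMsec is given in milliseconds (600000 ms) and dfs.datanode.directoryscan.interval in seconds (600 s).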
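Because the two parameters use different units, it is easy to misread them. As a rough check, the sketch below parses a configuration fragment and converts both values to minutes. It assumes the properties are wrapped in the usual <configuration> root element and that the directory scan value is a plain number of seconds; the helper name read_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

# Fragment of hdfs-site.xml with the two interval settings from above.
CONFIG = """<configuration>
  <property>
    <name>dfs.blockreport.intervalMsec</name>
    <value>600000</value>
  </property>
  <property>
    <name>dfs.datanode.directoryscan.interval</name>
    <value>600</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Map each property name to its value, both stripped of whitespace."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name").strip(): prop.findtext("value").strip()
        for prop in root.iter("property")
    }


props = read_properties(CONFIG)
# Block report interval is in milliseconds, directory scan interval in seconds.
block_report_min = int(props["dfs.blockreport.intervalMsec"]) / 1000 / 60
dir_scan_min = int(props["dfs.datanode.directoryscan.interval"]) / 60
print(block_report_min, dir_scan_min)
```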
Log details
2016-06-14 21:48:51,083 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-660628275-192.168.1.100-1464787466998 total blocks:1, missing metadata files:1, missing block files:1, missing blocks in memory:0, mismatched blocks:0
2016-06-14 21:48:51,084 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removed block 1073741825 from memory with missing block file on the disk
2016-06-14 21:49:17,168 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Blockreport of 1 blocks took 0 msec to generate and 1 msecs for RPC and NN processing
2016-06-14 21:49:17,169 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: sent block report, processed command:[email protected]
2016-06-14 21:49:20,977 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-660628275-192.168.1.100-1464787466998:blk_1073741825_1001 src: /192.168.1.101:53718 dest: /192.168.1.102:50010
2016-06-14 21:49:20,984 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received BP-660628275-192.168.1.100-1464787466998:blk_1073741825_1001 src: /192.168.1.101:53718 dest: /192.168.1.102:50010 of size 1366
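To see how long the recovery took in this run, the timestamps in the log lines above can be compared. The sketch below is only an illustration: the timestamps are copied from the log, while the short event labels are my own paraphrases, not log text.

```python
from datetime import datetime

# (timestamp copied from the log above, paraphrased event label)
EVENTS = [
    ("2016-06-14 21:48:51,083", "directory scan reports missing block file"),
    ("2016-06-14 21:48:51,084", "block removed from memory"),
    ("2016-06-14 21:49:17,168", "block report generated"),
    ("2016-06-14 21:49:20,977", "receiving replacement replica"),
    ("2016-06-14 21:49:20,984", "replacement replica received"),
]


def parse_ts(text):
    """Parse the 'date time,millis' timestamp format used in the log."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


start = parse_ts(EVENTS[0][0])
# Seconds elapsed since the directory scan noticed the missing block.
offsets = [(parse_ts(ts) - start).total_seconds() for ts, _label in EVENTS]

for (_ts, label), offset in zip(EVENTS, offsets):
    print(f"+{offset:7.3f}s  {label}")
```

In this run the replica was restored roughly half a minute after the scan noticed the problem, because the block report happened to follow the scan closely. With both intervals at 10 minutes the wait can be considerably longer.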
Hadoop HDFS Data Block exploration