Lecture 130: Hadoop Cluster Management Tool DataBlockScanner in Practice (Detailed Learning Notes)

Source: Internet
Author: User

Description: Detailed learning notes on the Hadoop cluster management tool DataBlockScanner in practice.

DataBlockScanner is a block scanner that runs on each DataNode and periodically verifies all of the blocks stored on that DataNode, so that problematic blocks can be detected and repaired before a client reads them.

It maintains a list of all the blocks on the DataNode and scans that list sequentially, checking each block for checksum errors. It also has a throttling mechanism.

What is the throttling mechanism? Scanning consumes disk bandwidth, and if the scanner used too much of it the DataNode would suffer performance problems, so the scanner is limited to consuming only a fraction of the disk bandwidth.

By default, each block is scanned once every 504 hours (three weeks); the interval can be changed via the dfs.datanode.scan.period.hours property. If a corrupted block is found, it is reported to the NameNode so that it can be repaired.
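As a small sketch of my own (not from the course), the snippet below reads this property with Hadoop's standard Configuration API and falls back to the 504-hour default; in a real cluster the value is set in hdfs-site.xml on each DataNode, as shown in the comment.

```java
// Minimal sketch, assuming the standard org.apache.hadoop.conf.Configuration API.
// In practice the property is set in each DataNode's hdfs-site.xml, e.g.:
//   <property>
//     <name>dfs.datanode.scan.period.hours</name>
//     <value>504</value>   <!-- 3 weeks -->
//   </property>
import org.apache.hadoop.conf.Configuration;

public class ScanPeriodCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();   // loads *-site.xml from the classpath
        // Fall back to the documented default of 504 hours if the property is unset.
        int hours = conf.getInt("dfs.datanode.scan.period.hours", 504);
        System.out.println("DataBlockScanner scan period: " + hours + " hours");
    }
}
```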

To view the DataBlockScanner information for a particular DataNode, visit port 50075 on that DataNode, for example http://worker4:50075/blockScannerReport. The report is very easy to read.
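For convenience, here is a minimal sketch (my own, not from the course) that fetches the same report programmatically; it assumes the DataNode web UI is reachable at worker4:50075 as in the example above.

```java
// Minimal sketch: download the block scanner report from a DataNode's web UI.
// Host and port (worker4:50075) are taken from the example above.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class BlockScannerReportFetcher {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://worker4:50075/blockScannerReport");
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);   // summary statistics of the scanner
            }
        }
    }
}
```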

HDFS provides two kinds of data verification:

1. Checksums. The system computes a checksum when the data is first written; during transmission through the pipeline, if a newly computed checksum does not match the original one, the data is considered corrupted.

HDFS transparently checksums all data written to it and, by default, verifies checksums when data is read. A separate checksum is created for each chunk of data; the default chunk size is 512 bytes, and the corresponding CRC-32 checksum is 4 bytes. A DataNode verifies the data it receives before storing it, and if an error is detected the client receives a ChecksumException.
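To make the chunking concrete, here is a small sketch (illustration only, not HDFS source code): it computes one CRC-32 value per 512-byte chunk and re-verifies the data afterwards, which is the same idea a DataNode applies when it stores or serves a block.

```java
// Illustration only (not HDFS code): one CRC-32 checksum per 512-byte chunk,
// mirroring the per-chunk checksumming described above.
import java.util.zip.CRC32;

public class ChunkChecksumExample {
    static final int BYTES_PER_CHECKSUM = 512;   // HDFS default chunk size

    // Compute one CRC-32 value for every 512-byte chunk of the data.
    static long[] checksums(byte[] data) {
        int chunks = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            crc.reset();
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Recompute checksums on the received data and compare with the originals.
    // A mismatch means corruption; HDFS surfaces this as a ChecksumException.
    static boolean verify(byte[] received, long[] expected) {
        long[] actual = checksums(received);
        if (actual.length != expected.length) return false;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = new byte[2048];
        long[] sums = checksums(data);
        data[100] ^= 1;                        // flip one bit to simulate corruption
        System.out.println(verify(data, sums)); // prints false
    }
}
```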

When a client reads data from a DataNode it also verifies checksums, comparing them with the checksums stored on the DataNode. Each DataNode maintains a persistent log of checksum verifications that records when each block was last verified; after a client successfully verifies a block, it tells the DataNode, which then updates the log.

2. DataBlockScanner. A background thread on each DataNode periodically verifies the blocks stored on that DataNode, protecting against data loss caused by corruption of the physical storage media.

The verification runs in a separate thread and is performed periodically. When a DFS client reads a block, DataBlockScanner is notified of the result of the client's check. DataBlockScanner's maximum scan rate is 8 MB/s (configurable) and its minimum scan rate is 1 MB/s; the default scan cycle is three weeks (504 hours). Because the cycle is long, a full scan takes a long time, and the DataNode may be restarted while a scan is in progress. To handle this, DataBlockScanner uses a logger to persist the time each block was last scanned, so that after a DataNode reboot the last verification time of each block can be restored from the log.
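The sketch below is my own simplification (not the HDFS DataBlockScanner source) of the two ideas in this paragraph: scanning local blocks in a background thread, and throttling the scan so it stays within a disk-bandwidth budget.

```java
// Simplified illustration (not HDFS source): a background thread that reads
// each local block file, leaves room for checksum verification, and throttles
// itself so the average read rate stays at or below a configured limit
// (HDFS throttles its scanner to roughly 1-8 MB/s).
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ThrottledBlockScanner implements Runnable {
    private final List<Path> blockFiles;      // block files stored on this node
    private final long maxBytesPerSecond;     // e.g. 8L * 1024 * 1024

    public ThrottledBlockScanner(List<Path> blockFiles, long maxBytesPerSecond) {
        this.blockFiles = blockFiles;
        this.maxBytesPerSecond = maxBytesPerSecond;
    }

    @Override
    public void run() {
        byte[] buf = new byte[64 * 1024];
        for (Path block : blockFiles) {
            try (InputStream in = Files.newInputStream(block)) {
                int n;
                while ((n = in.read(buf)) > 0) {
                    // ...verify the checksums of this chunk here...
                    // Throttle: sleep long enough that the average rate does
                    // not exceed maxBytesPerSecond.
                    Thread.sleep(n * 1000L / maxBytesPerSecond);
                }
                // On success, record this block's "last scanned" time
                // (HDFS persists it so a DataNode restart can resume correctly).
            } catch (IOException e) {
                // A failed read or checksum would be reported to the NameNode.
                System.err.println("Suspect block: " + block + " (" + e + ")");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```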

To conserve system resources, block verification does not rely only on DataBlockScanner's background thread: a DataNode must also verify a block when it transmits it to a client, and the block's last-scan time is updated at that point as well. The logger does not write this scan information to the log immediately, because frequent disk I/O would degrade performance. Exactly when a block's last scan time is written to the log depends on several conditions; interested readers can consult the source code.

The above content is my study notes from Lecture 130 of teacher Liaoliang's DT Big Data Hadoop first-class combat classic course.
Liaoliang: Chinese evangelist for Spark, Flink, Docker, and Android technologies; Dean and chief expert of the Spark Asia Pacific Research Institute; source-code-level expert on Android software/hardware integration; English pronunciation magician; fitness enthusiast.

Public account: DT_Spark

Contact email: [email protected]

Tel: 18610086859

QQ: 1740415547

Number: 18610086859

Sina Weibo: ilovepains

Liaoliang's first Chinese Dream: to train thousands of outstanding big data practitioners for society, free of charge!

You can donate to teacher Liaoliang by sending a red envelope to the number 18610086859. His free videos released so far are listed below:

1, "Big Data sleepless Night:Spark kernel decryption (total )":http://pan.baidu.com/s/1eQsHZAq

2, "Hadoop in- depth Combat classic" Http://pan.baidu.com/s/1mgpfRPu

3 spark Pure combat Public Welfare Forum "   http://pan.baidu.com/s/1jGpNGwu 
4 Span style= "font-family: the song Body;" >, " scala The classic of the practical,"   http://pan.baidu.com/s/1sjDWG25 
5 docker   http ://pan.baidu.com/s/1ktpl8uf 
6 spark Asia Pacific Research Institute spark   http://pan.baidu.com/s/1i30Ewsd 

7,Spark Combat Master Road All six stages video:http://edu.51cto.com/pack/view/id-144.html

8, "Big Data Spark Enterprise-level combat" purchase http://item.jd.com/11622851.html

The video of the first lecture is available on 51CTO: Http://edu.51cto.com/lesson/id-78344.html
