No live nodes contain current block. Will get new block locations from namenode and retry...

Source: Internet
Author: User

1. When multiple users operate on HDFS and HBase concurrently, the following exception occurs: the client cannot connect to the datanode and cannot fetch data.

INFO hdfs.DFSClient: Could not obtain block blk_-3181406624357578636_19200 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
13/07/23 09:06:39 WARN hdfs.DFSClient: Failed to connect to /192.168.3.4:50010, add to deadNodes and continue
java.net.SocketException:

2. Run hadoop fsck / to check the HDFS files. The result is HEALTHY, indicating the block data is intact and that the namenode and datanode metadata are consistent.

3. View the datanode log.

The DataXceiver thread count turns out to be the problem: it exceeded 4096, so the datanode could no longer serve reads and writes. This limit had previously been raised to 4096, which now proves too small.

In the configuration file, change:

<property>
    <name>dfs.datanode.max.xcievers</name>
    <value>12288</value>
</property>
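Note that dfs.datanode.max.xcievers (misspelled in Hadoop itself) is the old 1.x name; on Hadoop 2.x and later it is deprecated in favor of dfs.datanode.max.transfer.threads, so on newer clusters the equivalent change would be:

```xml
<property>
    <name>dfs.datanode.max.transfer.threads</name>
    <value>12288</value>
</property>
```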

4. The problem persists, and the datanode log now reports the following error:

2012-06-18 17:47:13 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(x.x.x.x:50010, storageID=DS-196671195-10.10.120.67-50010-1334328338972, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected

According to https://issues.apache.org/jira/browse/HDFS-3555, this is a client-side problem: the datanode cannot push data to a slow or stalled client. Re-checking our code showed that, because the files are large, the input stream was never closed after each read.
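A minimal sketch of the fix: always close the stream after a read, even when an exception is thrown. A local java.io stream is used here so the example is self-contained; the same try-with-resources idiom applies to the FSDataInputStream returned by FileSystem.open() in the HDFS client API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadOnce {
    // Read a file and always close the stream, even on exceptions.
    // With HDFS, a leaked stream keeps a DataXceiver thread busy on the
    // datanode until the 480-second write timeout seen in the log fires.
    static byte[] readAll(Path p) throws IOException {
        try (InputStream in = Files.newInputStream(p)) { // auto-closed
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "hello".getBytes());
        System.out.println(new String(readAll(tmp)));
        Files.delete(tmp);
    }
}
```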

5. After restarting the cluster, the .META. table fails to load when HBase starts.

org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: .META.,,1

http://www.zihou.me/html/2013/06/27/8673.html
Following this post solved the problem; after the restart, HBase came up cleanly.

6. When reading data from multiple threads, the region still stops serving after a while. This again pointed to the datanode, but dfs.datanode.max.xcievers had already been raised, so we kept auditing the code: a new FileSystem instance was being obtained on every read and never closed. After changing the code to reuse a single instance, the problem was solved.


Conclusion: this problem actually had nothing to do with the cluster configuration; dfs.datanode.max.xcievers set to 4096 should have been enough.

There were two problems in the client code.

First, the file stream was not closed after each read.

Second, the FileSystem instance was obtained multiple times instead of being reused.
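A minimal sketch of the second fix: cache one client object for the whole process instead of creating one per read. A placeholder Client class stands in for org.apache.hadoop.fs.FileSystem so the example is self-contained; in real code you would call FileSystem.get(conf) once and share the result.

```java
public class ClientHolder {
    // Placeholder for an expensive client such as org.apache.hadoop.fs.FileSystem.
    static class Client {}

    // One shared instance for the whole process, created lazily and
    // thread-safely by the class loader (initialization-on-demand holder idiom).
    private static class Holder {
        static final Client INSTANCE = new Client();
    }

    public static Client get() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Every caller gets the same instance; nothing to open or close per read.
        System.out.println(get() == get());
    }
}
```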
