Broken pipe problem when using happybase to access HBase --- two "big" bugs


Background
A broken pipe error occurs when reading and writing data to HBase with happybase through the Thrift interface. Troubleshooting steps:

1. View HBase logs:
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
17/05/12 18:08:41 INFO util.VersionInfo: HBase 1.2.0-cdh5.10.1
17/05/12 18:08:41 INFO util.VersionInfo: Source code repository file:///data/jenkins/workspace/generic-package-centos64-7-0/topdir/BUILD/hbase-1.2.0-cdh5.10.1 revision=Unknown
17/05/12 18:08:41 INFO util.VersionInfo: Compiled by jenkins on Mon Mar 02:46:09 PDT 2017
17/05/12 18:08:41 INFO util.VersionInfo: From source with checksum c6d9864e1358df7e7f39d39a40338b4e
17/05/12 18:08:41 INFO thrift.ThriftServerRunner: Using default thrift server type
17/05/12 18:08:41 INFO thrift.ThriftServerRunner: Using thrift server type threadpool
17/05/12 18:08:42 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties, hadoop-metrics2.properties
17/05/12 18:08:42 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
17/05/12 18:08:42 INFO impl.MetricsSystemImpl: HBase metrics system started
17/05/12 18:08:42 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
17/05/12 18:08:42 INFO http.HttpRequestLog: Http request log for http.requests.thrift is not defined
17/05/12 18:08:42 INFO http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
17/05/12 18:08:42 INFO http.HttpServer: Added global filter 'clickjackingprevention' (class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter)
17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context thrift
17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
17/05/12 18:08:42 INFO http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
17/05/12 18:08:42 INFO http.HttpServer: Jetty bound to port 9095
17/05/12 18:08:42 INFO mortbay.log: jetty-6.1.26.cloudera.4
17/05/12 18:08:42 WARN mortbay.log: Can't reuse /tmp/jetty_0_0_0_0_9095_thrift____.vqpz9l, using /tmp/jetty_0_0_0_0_9095_thrift____.vqpz9l_5120175032480185058
17/05/12 18:08:43 INFO mortbay.log: Started [EMAIL PROTECTED]:9095
17/05/12 18:08:43 INFO thrift.ThriftServerRunner: starting TBoundedThreadPoolServer on /0.0.0.0:9090 with readTimeout 300000ms; min worker threads=128, max worker threads=1000, max queued requests=1000
...
17/05/08 15:05:51 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x645132bf connecting to ZooKeeper ensemble=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181
17/05/08 15:05:51 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181 sessionTimeout=60000 watcher=hconnection-0x645132bf0x0, quorum=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181, baseZNode=/hbase
17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh-slave3/192.168.10.219:2181. Will not attempt to authenticate using SASL (unknown error)
17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.10.23:43170, server: cdh-slave3/192.168.10.219:2181
17/05/08 15:05:51 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh-slave3/192.168.10.219:2181, sessionid = 0x35bd74a77802148, negotiated timeout = 60000
[[email protected] example]$
17/05/08 15:32:50 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x35bd74a77802148
17/05/08 15:32:51 INFO zookeeper.ZooKeeper: Session: 0x35bd74a77802148 closed
17/05/08 15:32:51 INFO zookeeper.ClientCnxn: EventThread shut down
17/05/08 15:38:53 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0xb876351 connecting to ZooKeeper ensemble=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181
17/05/08 15:38:53 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181 sessionTimeout=60000 watcher=hconnection-0xb8763510x0, quorum=cdh-master-slave1:2181,cdh-slave2:2181,cdh-slave3:2181, baseZNode=/hbase
17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh-master-slave1/192.168.10.23:2181. Will not attempt to authenticate using SASL (unknown error)
17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.10.23:35526, server: cdh-master-slave1/192.168.10.23:2181
17/05/08 15:38:53 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh-master-slave1/192.168.10.23:2181, sessionid = 0x15ba3ddc6cc90d4, negotiated timeout = 60000

Initial inference: HBase enforces some timeout on the server side, which causes the connection to break.
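For reference, the client-side access pattern looks roughly like the sketch below (a minimal illustration only; the host, table, and column-family names are placeholders, not the real ones). The first write after connecting succeeds, and a later write on the now-idle connection fails with a broken pipe:

# Minimal happybase sketch of the failure mode; host/table/column names are placeholders.
import time
import happybase

connection = happybase.Connection('cdh-master-slave1', port=9090)  # HBase Thrift server
table = connection.table('test_table')

table.put(b'row-1', {b'cf:c1': b'v1'})   # works right after the connection is opened

time.sleep(600)                          # connection sits idle past the server-side read timeout

# By now the Thrift server has closed its end of the socket, so the next write
# fails on the client with a "Broken pipe" error.
table.put(b'row-2', {b'cf:c1': b'v2'})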

2. Check the official documentation, but find no relevant timeout parameter.

3. Google for similar issues.

A similar issue turns up:

HBASE-14926: Hung ThriftServer; no timeout on read from client; if client crashes, worker thread gets stuck reading

Type: Bug    Status: RESOLVED    Priority: Major    Resolution: Fixed
Affects Version/s: 2.0.0, 1.2.0, 1.1.2, 1.3.0, 1.0.3, 0.98.16
Fix Version/s: 2.0.0, 1.2.0, 1.3.0, 0.98.17
Component/s: Thrift
Hadoop Flags: Reviewed
Release Note: Adds a timeout to server read from clients. Adds new config hbase.thrift.server.socket.read.timeout for setting read timeout on server sockets in milliseconds. Default is 60000.

Description: Thrift server is hung. All worker threads are doing this:

"thrift-worker-0" daemon prio=10 tid=0x00007f0bb95c2800 nid=0xf6a7 runnable [0x00007f0b956e0000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked <0x000000066d859490> (a java.io.BufferedInputStream)
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
        at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:601)
        at org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:470)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
        at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
        at org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:64)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

They never recover. I don't have client side logs. We've been here before: HBASE-4967 "Connected client thrift sockets should have a server side read timeout", but that patch only got applied to the FB branch (and thrift has changed since then).

PS: Source https://issues.apache.org/jira/browse/HBASE-14926
4, Google "Hbase.thrift.server.socket.read.timeout"

One page has the following content (translated from Chinese):

Problem background: the test environment is a three-node Hadoop cluster. Versions: hadoop-2.7.3, hbase-1.2.4, zookeeper-3.4.9. Data is written to HBase through the Thrift C++ interface; every run starts out writing normally and then begins to fail after a while. The previously used hbase-0.94.27, with the same configuration, never hit this problem.

Fix for the Thrift interface error: a packet capture shows that the HBase server responds with an RST packet, which breaks the connection. When the Thrift server is started with bin/hbase thrift start -threadpool, readTimeout is set to 60 s. Verification confirms the problem is indeed related to this setting; it was never configured, and reading the code shows 60 s is the default used when nothing is configured. So just add the configuration to conf/hbase-site.xml:

<property>
    <name>hbase.thrift.server.socket.read.timeout</name>
    <value>6000000</value>
    <description>eg: millisecond</description>
</property>

PS: Source http://blog.csdn.net/wwlhz/article/details/56012053

After adding this parameter and restarting the HBase Thrift server, the problem appears to be solved.
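On the client side, happybase's Connection also accepts a timeout argument (a socket timeout in milliseconds), which is worth setting explicitly so that a stalled call surfaces as a clear timeout on the Python side instead of hanging; note that this does not by itself stop the server from closing idle connections. A minimal sketch, assuming a placeholder host name:

# Hedged sketch: explicit client-side socket timeout on the happybase connection.
import happybase

connection = happybase.Connection(
    'cdh-master-slave1',      # placeholder host
    port=9090,
    timeout=60000,            # client socket timeout, in milliseconds
)
table = connection.table('test_table')
print(table.row(b'row-1'))    # a stalled call now raises a socket timeout instead of blocking forever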

5. Looking at the source code confirms this:
// https://github.com/apache/hbase/blob/master/hbase-thrift/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java
...
  public static final String THRIFT_SERVER_SOCKET_READ_TIMEOUT_KEY =
      "hbase.thrift.server.socket.read.timeout";
  public static final int THRIFT_SERVER_SOCKET_READ_TIMEOUT_DEFAULT = 60000;
...
      int readTimeout = conf.getInt(THRIFT_SERVER_SOCKET_READ_TIMEOUT_KEY,
          THRIFT_SERVER_SOCKET_READ_TIMEOUT_DEFAULT);
      TServerTransport serverTransport = new TServerSocket(
          new TServerSocket.ServerSocketTransportArgs().
              bindAddr(new InetSocketAddress(listenAddress, listenPort)).
              backlog(backlog).
              clientTimeout(readTimeout));

Problem solved ~~~

6. But has the problem been solved?

Actually, no. Some time later I found that after roughly twenty-odd minutes of continuous scanning, the connection was disconnected again. After another painful search, this turned out to be a quirk of this HBase version: the Thrift server eventually treats all of its cached connections (whether they are in use or not) as idle, governed by the hbase.thrift.connection.max-idletime setting. So I configured it as 31104000 (one year). On CDH this should be set through the admin page (Cloudera Manager).
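Instead of (or in addition to) raising the server-side idle limit, the client can simply reopen its happybase connection when a call fails because the Thrift server has dropped it. The sketch below is one way to do that, not the fix described above; the exception classes and names are assumptions, since a dropped connection may surface as a Thrift transport error, a broken pipe, or a plain socket error depending on the happybase/thrift versions in use:

# Hedged sketch: reopen the happybase connection when the Thrift server has dropped it.
import happybase

try:
    from thriftpy2.transport import TTransportException   # newer happybase releases
except ImportError:
    from thriftpy.transport import TTransportException    # older happybase releases


class RetryingTable:
    """Thin wrapper around a happybase table that reconnects once on a dropped connection."""

    def __init__(self, host, table_name, port=9090, timeout=60000):
        self._host, self._port, self._timeout = host, port, timeout
        self._table_name = table_name
        self._connect()

    def _connect(self):
        self._conn = happybase.Connection(self._host, port=self._port, timeout=self._timeout)
        self._table = self._conn.table(self._table_name)

    def put(self, row, data):
        try:
            self._table.put(row, data)
        except (TTransportException, BrokenPipeError, OSError):
            # The server closed the idle connection (read timeout / max-idletime);
            # reopen it and retry the write once.
            try:
                self._conn.close()
            except Exception:
                pass
            self._connect()
            self._table.put(row, data)


table = RetryingTable('cdh-master-slave1', 'test_table')   # placeholder names
table.put(b'row-1', {b'cf:c1': b'v1'})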

General steps when running into a problem:

If the goal is technical progress:
1. Check the logs and locate where the error occurs, to get an initial fix on the problem.
2. Check the official documentation.
3. Google for similar problems, or read the source code to pin down the cause.

If the goal is a quick fix:
1. Check the logs and locate where the error occurs, to get an initial fix on the problem.
2. Google for similar issues.
3. Check the official documentation, or read the source code.

