"Gandalf" HBase Random outage event handling & JVM GC Review

Source: Internet
Author: User

First, Introduction
This article documents how we resolved the random HBase outages that plagued the team for two weeks, and reviews the basics of JVM GC tuning for reference. Reprints are welcome; please cite the source: http://blog.csdn.net/u010967382/article/details/42394031
Second, the experimental environment
16 virtual machines, each with 4 GB RAM, a 1-core CPU, and a 400 GB HDD, running Ubuntu 14.04 LTS (GNU/Linux 3.13.0-29-generic x86_64)
CDH 5.2.0 suite (including the corresponding versions of Hadoop, Hive, HBase, Mahout, Sqoop, ZooKeeper, etc.)
Java 1.7.0_60 64-bit Server
Third, the abnormal scenarios
We ran computational tasks in the environment above. The tasks involved Hive, Mahout, HBase bulk load, and MapReduce, with the workflow driven by shell scripts. The whole run processed 1.6 million basic behavior records and 400,000 business records.
While running the tasks repeatedly, the following kinds of anomalies occurred at random. They are described in text only; the actual error output is not reproduced here:
1. HBase RegionServer processes died at random (this happened on almost every run; only the RegionServer node that died differed)
2. The HMaster process died at random
3. The active and standby NameNode nodes died at random
4. ZooKeeper nodes died at random
5. ZooKeeper connection timeouts
6. JVM GC pauses that were too long
7. DataNode write timeouts
and so on.
Through research, analysis, and debugging, we found that the problem had to be tackled from the following angles:
1. HBase ZK connection timeout parameters: the default ZK timeout is too short, so once a full GC occurs the ZK connection can very easily time out.
2. HBase JVM GC parameters: GC tuning yields better GC performance, reducing both the duration of a single GC and the frequency of full GCs.
3. ZK server tuning: this is tuning on the ZK server side. The ZK timeout set by a ZK client (such as the HBase client) must fall within the timeout range allowed by the server; otherwise the client's timeout setting does not take effect.
4. HDFS read/write parameters need to be tuned.
5. YARN per-node resource allocation: YARN must allocate resources according to the real node configuration; the previous YARN configuration was far larger than the hardware resources of each virtual machine.
6. Cluster planning: in the earlier plan, to make full use of the virtual machines, NameNode, NodeManager, DataNode, and RegionServer were mixed on the same nodes. This put too much communication and memory pressure on the key hub nodes, making them prone to failure under computational load. The correct approach is to separate the hub nodes (NameNode, ResourceManager, HMaster) from the data + compute nodes.
Fourth, the configuration and cluster adjustments made to solve the problem
HBase hbase-site.xml
<property><name>zookeeper.session.timeout</name><value>300000</value></property>
<property><name>hbase.zookeeper.property.tickTime</name><value>60000</value></property>
<property><name>hbase.hregion.memstore.mslab.enabled</name><value>true</value></property>
<property><name>hbase.zookeeper.property.maxClientCnxns</name><value>10000</value></property>
<property><name>hbase.client.scanner.timeout.period</name><value>240000</value></property>
<property><name>hbase.rpc.timeout</name><value>280000</value></property>
<property><name>hbase.hregion.max.filesize</name><value>107374182400</value></property>
<property><name>hbase.regionserver.handler.count</name><value>100</value></property>
<property><name>dfs.client.socket-timeout</name><value>300000</value><description>down the DFS timeout from ten to seconds.</description></property>
hbase-env.sh
export HBASE_HEAPSIZE=2048m
export HBASE_HOME=/home/fulong/hbase/hbase-0.98.6-cdh5.2.0
export HBASE_LOG_DIR=${HBASE_HOME}/logs
export HBASE_OPTS="-server -Xms1g -Xmx1g -XX:NewRatio=2 -XX:PermSize=128m -XX:MaxPermSize=128m -verbose:gc -Xloggc:$HBASE_HOME/logs/hbasegc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$HBASE_HOME/logs"
ZooKeeper zoo.cfg
syncLimit=10
# New in 3.3.0: the maximum session timeout in milliseconds that the server will allow the client to negotiate. Defaults to 20 times the tickTime.
maxSessionTimeout=300000
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/fulong/zookeeper/cdh/zookdata
# the port at which the clients will connect
clientPort=2181
Modify the following two files to capture the ZK log, since ZK's default log output is inconvenient to inspect.
log4j.properties
zookeeper.root.logger=INFO,CONSOLE,ROLLINGFILE
zookeeper.console.threshold=INFO
zookeeper.log.dir=/home/fulong/zookeeper/cdh/zooklogs
zookeeper.log.file=zookeeper.log
zookeeper.log.threshold=DEBUG
zookeeper.tracelog.dir=/home/fulong/zookeeper/cdh/zooklogs
zookeeper.tracelog.file=zookeeper_trace.log
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=${zookeeper.log.threshold}
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file}
# Max log file size (the stock comment says 10MB; set to 50MB here)
log4j.appender.ROLLINGFILE.MaxFileSize=50MB
zkEnv.sh
if [ "x${ZOO_LOG4J_PROP}" = "x" ]
then
    ZOO_LOG4J_PROP="INFO,CONSOLE,ROLLINGFILE"
fi
Note: after modifying the two files above, we still did not see the ZK log4j log file; the cause needs further investigation.
HDFS hdfs-site.xml
<property><name>dfs.datanode.socket.write.timeout</name><value>600000</value></property>
<property><name>dfs.client.socket-timeout</name><value>300000</value></property>
<property><name>dfs.datanode.max.xcievers</name><value>4096</value></property>
YARN yarn-site.xml
<property><name>yarn.scheduler.minimum-allocation-mb</name><value>512</value></property>
<property><name>yarn.scheduler.fair.user-as-default-queue</name><value>false</value></property>
<property><name>yarn.resourcemanager.zk-timeout-ms</name><value>120000</value></property>
<property><name>yarn.nodemanager.resource.memory-mb</name><value>3072</value></property>
<property><name>yarn.scheduler.minimum-allocation-mb</name><value>128</value></property>
<property><name>yarn.scheduler.maximum-allocation-mb</name><value>3072</value></property>
<property><name>yarn.nodemanager.resource.cpu-vcores</name><value>1</value></property>
<property><name>yarn.scheduler.maximum-allocation-vcores</name><value>1</value></property>
<property><name>yarn.nodemanager.container-monitor.interval-ms</name><value>300000</value></property>
Cluster tuning
The active NN, standby NN, active RM, and standby RM nodes do not run DN, NM, or RS. DN, NM, and RS are co-located, one of each per worker node. The layout after adjustment:
[Figure: cluster layout after adjustment]

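Point 3 of the analysis (the client's ZK timeout only takes effect inside the range the server allows) can be illustrated with a small sketch. It assumes the documented ZooKeeper defaults of minSessionTimeout = 2 x tickTime and maxSessionTimeout = 20 x tickTime; the function name and the 2000 ms tickTime used in the demo are illustrative assumptions, not values taken from this cluster.

```python
def negotiated_session_timeout(requested_ms, tick_time_ms,
                               min_session_timeout_ms=None,
                               max_session_timeout_ms=None):
    """Return the session timeout a ZooKeeper server would grant (sketch)."""
    # Documented defaults: min = 2 * tickTime, max = 20 * tickTime.
    lo = 2 * tick_time_ms if min_session_timeout_ms is None else min_session_timeout_ms
    hi = 20 * tick_time_ms if max_session_timeout_ms is None else max_session_timeout_ms
    # The server clamps the client's request into [lo, hi].
    return max(lo, min(requested_ms, hi))


# Assumed tickTime of 2000 ms and no maxSessionTimeout override:
# the 300000 ms requested by the client is cut down by the server.
print(negotiated_session_timeout(300000, 2000))
# With maxSessionTimeout=300000 in zoo.cfg the full request is honoured.
print(negotiated_session_timeout(300000, 2000, max_session_timeout_ms=300000))
```

This is why raising zookeeper.session.timeout in hbase-site.xml alone may not be enough: the server-side maxSessionTimeout in zoo.cfg has to be raised as well.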
Note: if you encounter a similar problem, you can focus on the configuration items above, but the specific values should be chosen according to your own situation.
Fifth, supplementary review: JVM GC
GC problems were very obvious during the tuning process. Before tuning, full GCs frequently took 3 to 6 minutes; after tuning, GC time could be kept within 20 s. GC tuning matters for Hadoop clusters and is basic knowledge that must be mastered, so it is briefly recorded here.
For a complete treatment, please refer to: http://www.cubrid.org/blog/dev-platform/understanding-java-garbage-collection/
Translation: http://www.importnew.com/1993.html
And the documentation on the Oracle website:
http://www.oracle.com/technetwork/java/javase/tech/exactoptions-jsp-141536.html
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html
The following covers only the most important basics:
The main memory area of each JVM is divided into two parts: Permanent space and Heap space.
Permanent is the permanent generation (Permanent Generation). It mainly holds Java class definition information and has little to do with the Java objects that the garbage collector collects.
Heap = {Old + New = {Eden, Survivor 0, Survivor 1}}. Old is the old generation (Old Generation) and New is the young generation (Young Generation).
The young generation holds newly created objects. It is further divided into three spaces: one Eden space and two survivor spaces. The division between the old and young generations has a great impact on garbage collection.
The basic order of execution is as follows:
    1. The vast majority of newly created objects are placed in the Eden space.
    2. After the first GC in the Eden space, the surviving objects are moved to one of the survivor spaces.
    3. After each subsequent GC in the Eden space, the surviving objects accumulate in that same survivor space.
    4. When that survivor space becomes full, the surviving objects are moved to the other survivor space, and the full survivor space is then emptied.
    5. Objects that are still alive after the steps above have been repeated several times are moved to the old generation.
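The flow above can be sketched as a toy model. This is a simplification under stated assumptions, not HotSpot's actual algorithm: objects are plain dicts, every minor GC copies live objects from Eden and the occupied survivor space into the empty one, and an object is promoted once its age reaches a hypothetical threshold of 3. The function and field names are illustrative only.

```python
def minor_gc(eden, survivor_from, old, tenuring_threshold=3):
    """Run one toy minor GC and return the new (eden, survivor, old)."""
    survivor_to = []
    promoted = list(old)
    for obj in eden + survivor_from:
        if not obj["alive"]:
            continue  # dead objects are simply dropped
        aged = {"alive": True, "age": obj["age"] + 1}
        if aged["age"] >= tenuring_threshold:
            promoted.append(aged)     # survived often enough: old generation
        else:
            survivor_to.append(aged)  # copied into the empty survivor space
    # Eden and the old survivor space are empty after the collection.
    return [], survivor_to, promoted


eden, survivor, old = [], [], []
for round_no in range(1, 5):
    # Each round allocates one short-lived and one long-lived object in Eden.
    eden = [{"alive": False, "age": 0}, {"alive": True, "age": 0}]
    eden, survivor, old = minor_gc(eden, survivor, old)
    print(round_no, "survivor:", len(survivor), "old:", len(old))
```

Short-lived objects never leave the young generation, while long-lived ones reach the old generation after a few collections, which is why the sizing of the two generations affects how often full GCs happen.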
As of the current version, Java offers five configurable garbage collectors:
    1. Serial GC
    2. Parallel GC
    3. Parallel Old GC (Parallel Compacting GC)
    4. Concurrent Mark & Sweep GC (or "CMS")
    5. Garbage First (G1) GC
Of these, CMS is the more mature and more widely used.
We can monitor JVM GC with various tools; the simplest and most intuitive is jstat. For example, to monitor the NameNode's GC, first find the process ID with jps, then view the GC status with jstat:
[Screenshot: jps output followed by jstat -gcutil output]

The argument after jstat -gcutil is the JVM process ID, and 1s is the refresh interval. The columns of the output are, in order: Survivor 0 occupancy (S0), Survivor 1 occupancy (S1), Eden occupancy (E), old generation occupancy (O), permanent generation occupancy (P), the number of young generation (S0 + S1 + E) GCs (YGC), the total young generation GC time in seconds (YGCT), the number of full GCs (FGC), the total full GC time in seconds (FGCT), and the total GC time in seconds (GCT).
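As a rough illustration of how these columns can be read programmatically, here is a minimal sketch that pairs the header with one sample row and derives the average full GC duration. The helper names are hypothetical and the sample values are made up for the example; they are not measurements from this cluster. It assumes the Java 7 column layout described above.

```python
def parse_gcutil(header_line, value_line):
    """Pair jstat -gcutil column names with the values of one sample row."""
    names = header_line.split()
    values = [float(v) for v in value_line.split()]
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return dict(zip(names, values))


def avg_full_gc_seconds(sample):
    """Average duration of one full GC, or 0.0 if none has happened yet."""
    return sample["FGCT"] / sample["FGC"] if sample["FGC"] else 0.0


# Made-up sample in the Java 7 column layout (not taken from the article).
header = "S0 S1 E O P YGC YGCT FGC FGCT GCT"
row = "0.00 12.50 40.00 60.00 30.00 100 2.500 4 1.000 3.500"
sample = parse_gcutil(header, row)
print("old gen %:", sample["O"], "avg full GC s:", avg_full_gc_seconds(sample))
```

Watching FGC and FGCT over time is the quickest way to see whether full GCs are as frequent and as long as the multi-minute pauses described earlier.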
Finally, here are the JVM parameters set for HBase in this tuning:
// Turn on Java server mode
-server
// Minimum and maximum heap memory
-Xms1g -Xmx1g
// Old generation space : young generation space = 2
-XX:NewRatio=2
// Initial and maximum permanent generation space; this could probably be reduced further, since observed permanent generation usage stays under 30%
-XX:PermSize=128m -XX:MaxPermSize=128m
// Turn on GC logging for easier debugging, with minimal performance impact
-Xloggc:$HBASE_HOME/logs/hbasegc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
// Turn on the CMS garbage collector
-XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
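To make the effect of these values concrete, the following sketch works out the approximate generation sizes for a 1 GB heap with NewRatio=2 and the old generation occupancy at which CMS would start a cycle with an occupancy fraction of 75. The function names are illustrative, and the numbers are approximations: the JVM aligns region sizes, and without -XX:+UseCMSInitiatingOccupancyOnly the fraction is only a starting hint for the collector.

```python
def generation_sizes_mb(heap_mb, new_ratio):
    """Split the heap into (young, old) using old : young = new_ratio : 1."""
    young = heap_mb / (new_ratio + 1)
    return young, heap_mb - young


def cms_trigger_mb(old_mb, occupancy_fraction):
    """Old generation occupancy (MB) at which a CMS cycle would start."""
    return old_mb * occupancy_fraction / 100.0


young, old = generation_sizes_mb(1024, 2)
print("young MB:", round(young, 1))
print("old MB:", round(old, 1))
print("CMS starts near MB:", round(cms_trigger_mb(old, 75), 1))
```

With these settings roughly one third of the heap goes to the young generation and two thirds to the old generation, and CMS begins collecting well before the old generation is completely full.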