Remote debugging of Hadoop components

Source: Internet
Author: User

Remote debugging is very useful in application development. Typical cases include developing programs for low-end machines that cannot host the development platform, and debugging programs on dedicated machines such as web servers whose service cannot be interrupted. Other scenarios include Java applications running on devices with little memory or low CPU performance (such as mobile devices), and developers who want to keep the application separate from the development environment.

Remote debugging requires Java Virtual Machine (JVM) 5.0 or later.

JPDA Overview

Sun Microsystems' Java Platform Debugger Architecture (JPDA) is a multi-tier architecture that lets you debug Java applications in a variety of environments. JPDA consists of two interfaces (the JVM Tool Interface, JVM TI, and the Java Debug Interface, JDI), one protocol (the Java Debug Wire Protocol, JDWP), and two software components (a back end and a front end) that tie them together. It is designed for use by debuggers in any environment.

For more details, see "Use Eclipse to remotely debug Java applications" in the references.

JDWP settings

The JVM itself supports remote debugging, and Eclipse supports JDWP as well. You only need to add the following parameters when starting the JVM of each module:

-Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y

Meanings of the parameters:

  • -Xdebug enables the debugging features.
  • -Xrunjdwp enables the JDWP implementation. It takes several sub-options:
    • transport=dt_socket: the transport used between the JPDA front end and back end. dt_socket means that socket transport is used.
    • address=8000: the JVM listens for debugger connections on port 8000. Set this to a port that does not conflict with one already in use.
    • server=y: the started JVM is the one being debugged and listens for a debugger to attach. With server=n, the JVM instead connects out to a debugger that is already listening at the given address.
    • suspend=y: the started JVM pauses and waits until a debugger connects. With suspend=n, the JVM starts without waiting.
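
The sub-options above can be assembled by a small shell helper, which may be convenient when several components each need their own port. This is only a sketch: `jdwp_opts` is a hypothetical function name, not part of Hadoop or the JDK, and it assumes the JVM should always act as the debug server (server=y).

```shell
# Hypothetical helper: print the JDWP options for a given port.
# Usage: jdwp_opts PORT [SUSPEND]   (SUSPEND is y or n, default n)
jdwp_opts() {
  port="$1"
  suspend="${2:-n}"   # default: do not block JVM startup waiting for a debugger
  printf '%s\n' "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=${suspend},address=${port}"
}
```

For example, `jdwp_opts 8000 y` prints the same option string as shown earlier, with the sub-options in a different order.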

Configure HBase remote debugging

Open /etc/hbase/conf/hbase-env.sh and find the following content:

# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers
# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"

To debug the hbase-master process remotely, uncomment the HBASE_MASTER_OPTS line; the other processes work the same way. Note that I am using the HBase shipped with CDH 4.3.0.
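
If you prefer not to edit the file by hand, the uncommenting step could be scripted roughly as follows. This is a sketch under assumptions: `uncomment_jdwp` is a hypothetical helper, and it assumes the commented lines look exactly like the ones in hbase-env.sh (a leading `#`, then `export VAR=...` containing `-Xrunjdwp`).

```shell
# Hypothetical helper: uncomment the JDWP export line for one variable.
# Usage: uncomment_jdwp FILE VAR
uncomment_jdwp() {
  file="$1"
  var="$2"
  tmp="$(mktemp)" || return 1
  # Strip the leading "# " only from the export line of VAR that carries -Xrunjdwp.
  sed "s/^#[[:space:]]*\(export ${var}=.*-Xrunjdwp.*\)\$/\1/" "$file" > "$tmp" || return 1
  # Write back in place so the original file's permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example (path taken from the text above):
# uncomment_jdwp /etc/hbase/conf/hbase-env.sh HBASE_MASTER_OPTS
```

The process has to be restarted afterwards for the new options to take effect.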

Configure Hive remote debugging

Stop the hive-server2 process, then run the following command to start it in debug mode:

hive --service hiveserver --debug

The process listens on port 8000 and waits for a debugger to connect. To change the listening port, edit the configuration file ${HIVE_HOME}/bin/ext/debug.sh.

With Hadoop versions later than 0.23, starting the CLI in debug mode reports an error:

ERROR: Cannot load this JVM TI agent twice, check your java command line for duplicate jdwp options.

Open ${HADOOP_HOME}/bin/hadoop and comment out the following code:

# Always respect HADOOP_OPTS and HADOOP_CLIENT_OPTS
HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
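
The error message points at duplicate jdwp options on the java command line. One rough way to check an option string for duplicates before starting the JVM is sketched below. `count_jdwp` is a hypothetical helper, and it assumes options are separated by single spaces and that the agent is requested with either `-Xrunjdwp` or the newer `-agentlib:jdwp` form.

```shell
# Hypothetical helper: count JDWP agent options in a JVM option string.
# Usage: count_jdwp "OPTION STRING"
count_jdwp() {
  # Put each option on its own line, then count the ones that load the JDWP agent.
  printf '%s\n' "$1" | tr ' ' '\n' | grep -c -e '^-Xrunjdwp' -e '^-agentlib:jdwp' || true
}

# Example: a result greater than 1 means the agent would be loaded twice.
# count_jdwp "$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
```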

Configure YARN for remote debugging

Add debugging parameters in the following code:

if [ "$COMMAND" = "classpath" ] ; then
  if $cygwin; then
    CLASSPATH=`cygpath -p -w "$CLASSPATH"`
  fi
  echo $CLASSPATH
  exit
elif [ "$COMMAND" = "rmadmin" ] ; then
  CLASS='org.apache.hadoop.yarn.client.RMAdmin'
  YARN_OPTS="$YARN_OPTS $YARN_CLIENT_OPTS"
elif [ "$COMMAND" = "application" ] ; then
  CLASS=org.apache.hadoop.yarn.client.cli.ApplicationCLI
  YARN_OPTS="$YARN_OPTS $YARN_CLIENT_OPTS"
elif [ "$COMMAND" = "node" ] ; then
  CLASS=org.apache.hadoop.yarn.client.cli.NodeCLI
  YARN_OPTS="$YARN_OPTS $YARN_CLIENT_OPTS"
elif [ "$COMMAND" = "resourcemanager" ] ; then
  CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/rm-config/log4j.properties
  CLASS='org.apache.hadoop.yarn.server.resourcemanager.ResourceManager'
  YARN_OPTS="$YARN_OPTS $YARN_RESOURCEMANAGER_OPTS"
  if [ "$YARN_RESOURCEMANAGER_HEAPSIZE" != "" ]; then
    JAVA_HEAP_MAX="-Xmx""$YARN_RESOURCEMANAGER_HEAPSIZE""m"
  fi
elif [ "$COMMAND" = "nodemanager" ] ; then
  CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/nm-config/log4j.properties
  CLASS='org.apache.hadoop.yarn.server.nodemanager.NodeManager'
  YARN_OPTS="$YARN_OPTS -server $YARN_NODEMANAGER_OPTS"
  if [ "$YARN_NODEMANAGER_HEAPSIZE" != "" ]; then
    JAVA_HEAP_MAX="-Xmx""$YARN_NODEMANAGER_HEAPSIZE""m"
  fi
elif [ "$COMMAND" = "proxyserver" ] ; then
  CLASS='org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer'
  YARN_OPTS="$YARN_OPTS $YARN_PROXYSERVER_OPTS"
  if [ "$YARN_PROXYSERVER_HEAPSIZE" != "" ]; then
    JAVA_HEAP_MAX="-Xmx""$YARN_PROXYSERVER_HEAPSIZE""m"
  fi

For example, to debug the ResourceManager code, add the following line to the elif [ "$COMMAND" = "resourcemanager" ] branch:

YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=6001"
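
Where the line goes inside the branch matters: it needs to run before YARN_OPTS is assembled, otherwise the debug options never reach the JVM. The fragment below sketches that placement for the resourcemanager branch only. The three assignments at the top are stand-in values so the fragment runs on its own; in the real launcher script they are set elsewhere.

```shell
# Stand-in values for illustration only; the real script sets these itself.
COMMAND="resourcemanager"
YARN_OPTS=""
YARN_RESOURCEMANAGER_OPTS=""

if [ "$COMMAND" = "resourcemanager" ] ; then
  CLASS='org.apache.hadoop.yarn.server.resourcemanager.ResourceManager'
  # Added line: append the JDWP options before YARN_OPTS is built below.
  YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=6001"
  YARN_OPTS="$YARN_OPTS $YARN_RESOURCEMANAGER_OPTS"
fi

# Show the options that would be passed to the JVM.
echo "$YARN_OPTS"
```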

Other processes can be configured in the same way.

Note: make sure the debug ports do not conflict with each other or with ports already in use.
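
When several components on one host each get a debug port, a quick check of the chosen numbers can catch an accidental reuse. The sketch below only compares the ports you pass in with each other; it does not check whether a port is already taken by another process. `duplicate_ports` is a hypothetical helper name.

```shell
# Hypothetical helper: print every port number that appears more than once.
# Usage: duplicate_ports PORT [PORT ...]
duplicate_ports() {
  # One port per line, sorted numerically, then keep only repeated values.
  printf '%s\n' "$@" | sort -n | uniq -d
}

# Example with the ports used in this article (prints nothing, no duplicates):
# duplicate_ports 8070 8071 8072 8073 6001
```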

Configure MapReduce remote debugging

If you want to debug map or reduce tasks, modifying bin/hadoop is useless, because bin/hadoop does not set the startup parameters for map tasks.

In this case, you need to modify mapred-site.xml:

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx800m -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000</value>
</property>

Only one map task or one reduce task can be started on a TaskTracker at a time; otherwise a port conflict occurs at startup, because every task JVM would try to listen on the same debug port. Therefore, you must modify the following configuration items in conf/hadoop-site.xml:

<property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
</property>
<property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>0</value>
</property>

Usage in Eclipse:
  1. Open Eclipse, find Debug Configurations..., and add a Remote Java Application configuration that points at the host and debug port of the target JVM.

  2. Attach the Hive source code, then click Debug to enter remote debug mode.

  3. Write a JDBC test class and run it. Because no breakpoint has been set on the hive-server2 side yet, the program runs normally to the end.

  4. Set a breakpoint in the Hive code, for example in the execute method of ExecDriver.java, and then run the JDBC test class again.

References
  1. Remote debugging of Hadoop in Eclipse
  2. Hive remote debugging
  3. Use Eclipse to remotely debug Java applications
