Background: A MapReduce program was written, and it turned out to have a very large memory footprint, so a way to analyze its memory usage in detail was needed.
On Linux you can use pmap -d <PID> to view how a process uses its virtual address space, but the output contains many anon (anonymous) regions, which obviously does not satisfy a curious reader.
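To get at least a total for those anonymous regions, the pmap output can be summed with awk. This is a minimal sketch, assuming the usual procps column layout (Address, Kbytes, Mode, Offset, Device, Mapping) with anonymous regions labelled "[ anon ]"; the function name sum_anon_kb is made up for this example.

```shell
# Sum the Kbytes column of all anonymous mappings in `pmap -d` output.
# Reads the pmap output from stdin and prints the total in KB.
# Assumption: column 2 is Kbytes and anonymous regions contain "[ anon ]".
sum_anon_kb() {
  awk '/\[ anon \]/ { total += $2 } END { print total + 0 }'
}

# Usage (replace <PID> with the process id of the JVM):
#   pmap -d <PID> | sum_anon_kb
```

The total shows how much of the footprint is anonymous memory, but it still says nothing about what the JVM keeps there.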
The article "Eclipse remote debugging HDP source code" mentions using JMX to debug HDP remotely. JMX (Java Management Extensions) is, as the name suggests, a mechanism for management. Building on it, the running state of a JVM can be analyzed in real time. Here's how:
1. Edit hadoop-env.sh (vim /usr/hdp/2.3.0.0-2557/hadoop/etc/hadoop/hadoop-env.sh) to add the JMX-related parameters:
Text version (the second export line, line 45 of the file, is the addition):
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx${HADOOP_HEAPSIZE}m $HADOOP_CLIENT_OPTS"
export HADOOP_CLIENT_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=1499 $HADOOP_CLIENT_OPTS"
This opens a port on the machine that runs hadoop jar; the port number is set by the -Dcom.sun.management.jmxremote.port=1499 parameter.
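If the port needs to change from run to run, the same flags can be produced by a small helper instead of being hard-coded. The sketch below is only an illustration: the function name jmx_opts is hypothetical, and it emits exactly the five JMX-related flags used in this article, defaulting to port 1499.

```shell
# Print the JMX flags from this article for a given port (default: 1499).
# The body runs in a subshell so the `port` variable does not leak.
jmx_opts() (
  port="${1:-1499}"
  printf '%s %s %s %s %s\n' \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Dcom.sun.management.jmxremote.ssl=false" \
    "-Dcom.sun.management.jmxremote.local.only=false" \
    "-Djava.net.preferIPv4Stack=true" \
    "-Dcom.sun.management.jmxremote.port=${port}"
)

# Possible use in hadoop-env.sh, in place of the hard-coded line:
#   export HADOOP_CLIENT_OPTS="$(jmx_opts 1499) $HADOOP_CLIENT_OPTS"
```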
2. Start a MapReduce program:
bash-4.1$ hadoop jar /home/yanliming/workspace/mosaictest/videomapreduce/videomapreduce-1.0-snapshot.jar /tmp/yanliming/wildlife.wmv /tmp/ryj/result/output012
On the machine in the cluster that started the MapReduce job, you can see that the port you just configured is now listening:
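One way to make this check from the command line is to filter the output of netstat -tln (or ss -ltn) for a listener on the port. The helper below is a sketch with a made-up name, listening_on; it assumes the listing marks listening sockets with the word LISTEN and prints local addresses as address:port.

```shell
# Succeed if the socket listing on stdin shows a listener on the given port.
# Assumption: listening sockets carry "LISTEN" and addresses look like ADDR:PORT.
listening_on() (
  port="$1"
  grep 'LISTEN' | grep -E ":${port}[[:space:]]" >/dev/null
)

# Usage on the machine running `hadoop jar`:
#   netstat -tln | listening_on 1499 && echo "JMX port is up"
```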
3. Download VisualVM. Address: http://visualvm.java.net/download.html
In VisualVM, add a JMX connection with the IP address and port number of the remote machine to monitor it in real time:
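Besides the host:port form, JMX clients also accept a full service URL. The following sketch builds the standard RMI-based URL for a host and port; the function name jmx_service_url and the address 192.0.2.10 are placeholders for this example, not values from the setup above.

```shell
# Print the RMI-based JMX service URL for a host and port (default port: 1499).
jmx_service_url() (
  host="$1"
  port="${2:-1499}"
  printf 'service:jmx:rmi:///jndi/rmi://%s:%s/jmxrmi\n' "$host" "$port"
)

# Usage (192.0.2.10 is a placeholder for the remote machine's IP):
#   jconsole "$(jmx_service_url 192.0.2.10 1499)"
```

The same string can be pasted into the connection field of the VisualVM "Add JMX Connection" dialog.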
Configuring hadoop jar to use JMX for remote JVM monitoring