JMX and jstatd: the last line of defense for JVM applications in production

Source: Internet
Author: User
Tags: cpu usage, file permissions

Contents

    • How I understand monitoring
    • Monitoring code exceptions
      • Configuring JMX on the remote host
      • Modifying the Java program's startup parameters (JAVA_OPTS)
      • Modifying file permissions
      • Starting jvisualvm
      • Monitoring Java programs on the server
      • Connecting to a remote JVM with jstatd
        • Starting the jstatd service
        • Creating a security policy file
        • Starting jstatd with parameters
      • The difference between JMX connections and jstatd connections
    • Monitoring JVM programs with Linux commands
      • Using top to view CPU usage per process
      • CPU usage per thread in the process
      • Locating what a thread is doing
        • jstack -l [PID]: view all thread information
        • jstack -l [PID] | grep [hexadecimal thread ID]
    • Source code instrumentation
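
A successful Java project is measured by more than whether its business features work. Yet many project teams consider only business functionality during early design and development, and give no thought to the monitoring that later maintenance will need. Without a sound monitoring and operations design, a project is unlikely to live long. A good project makes people want to stay, keep strengthening its features and polish every piece of code; a bad project wears people down, piling up ugly code until it cannot be maintained at all. Of course, from the company's point of view, delivering the business features first is what guarantees that the company can make money, and if the company makes no money, even the best-written code will just sit quietly on a hard disk. I think a responsible developer should not only care about implementing business features, but also make sure that unexpected situations can be monitored once the project is live.

How I understand monitoring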

As I understand it, there are two kinds of monitoring. One is operations monitoring: watching resource usage across the whole cluster and whether each service is alive. The other is development monitoring: watching for problems caused by the code, such as thread deadlocks and OOM, and making the history of business messages traceable.
I am a developer, so here I mainly share my experience with development monitoring, and with how to spare developers unnecessary overtime.

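Monitoring code exceptions

When application code faces all kinds of requests in production, problems such as deadlocks and OOM occur often. How do we investigate them? If we do not want to log in to the remote server, we can connect to the remote program from a local visualization tool and inspect its threads, CPU, GC, heap memory and so on.

Configuring JMX on the remote host

This article only demonstrates the monitoring side of JMX. JMX can also do things such as modify bean attributes at runtime, which are not covered here.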

To set the password, locate the template file $JAVA_HOME/jre/lib/management/jmxremote.password.template, copy it and rename the copy to jmxremote.password. The copy is read-only, so make it writable, then edit jmxremote.password and uncomment the following two lines:

    # monitorRole  QED
    # controlRole   R&D
Modifying the Java program's startup parameters (JAVA_OPTS)

Open catalina.sh in Tomcat's bin directory and add the following (programs other than Tomcat are configured in a similar way):

    JAVA_OPTS="$JAVA_OPTS -Djava.rmi.server.hostname=192.168.19.131 -Dcom.sun.management.jmxremote.port=18999 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"

The authenticate parameter controls whether password authentication is required. When it is set to true, the passwords configured in jmxremote.password are used.

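Modifying file permissions

The jmxremote.password file must be owned by, and readable only by, the user that starts the monitored program. Otherwise the program fails at startup with: Error: Password file read access must be restricted. The comments inside jmxremote.password explain this as well. For example, if the Java program is started by the user intsmaze:

    chown intsmaze jmxremote.password
    chmod 400 jmxremote.password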
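
The same restriction can be applied from Java, which may help when a deployment tool rather than a shell script prepares the file. This is only a sketch: the class and method names are made up for illustration, it works on a temporary file instead of the real jmxremote.password, and it assumes a POSIX file system (it will throw on Windows).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class PasswordFilePermissions {

    /** Restricts a file to owner-read-only, the equivalent of chmod 400. */
    static String restrict(Path file) {
        try {
            Set<PosixFilePermission> ownerReadOnly =
                    PosixFilePermissions.fromString("r--------");
            Files.setPosixFilePermissions(file, ownerReadOnly);
            // Read the permissions back so the caller can verify the result.
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a throwaway file, restricts it and returns the resulting mode string. */
    static String demo() {
        try {
            // Stand-in for jmxremote.password; the real file is not touched.
            Path temp = Files.createTempFile("jmxremote", ".password");
            try {
                return restrict(temp);
            } finally {
                Files.delete(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Changing the owner (the chown step) is not covered here; it normally requires elevated privileges and is easier to do from the shell.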
Starting jvisualvm

First start the program you want to monitor:

    sh startup.sh

In the left-hand panel, right-click "Remote" >> "Add Remote Host".

In the left-hand panel, right-click the remote host you just added >> "Add JMX Connection", and enter the port configured above.


If the JVM options above are not configured, a local Java VisualVM cannot reach the Tomcat service on the remote server. To learn the state of the remote server you would then have to connect to it with a terminal tool such as CRT and use Linux commands to see how the program is running.
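
What VisualVM does over a JMX connection can be approximated with the standard javax.management.remote API. The sketch below is an assumption-laden illustration, not the tool's actual implementation: JmxProbe and its methods are hypothetical names, and the URL pattern is the conventional one for the JDK's built-in RMI connector. Run without arguments it reads the local JVM; run with a host and port (for example 192.168.19.131 18999 from the configuration above) it connects remotely.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxProbe {

    /** Builds the conventional RMI-based JMX service URL for a host and port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Credentials are only needed when jmxremote.authenticate=true. */
    static Map<String, Object> environment(String user, String password) {
        Map<String, Object> env = new HashMap<>();
        if (user != null) {
            env.put(JMXConnector.CREDENTIALS, new String[] {user, password});
        }
        return env;
    }

    /** Reads the live thread count and used heap through the platform MXBeans. */
    static String summarize(MBeanServerConnection conn) throws Exception {
        ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                conn, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
        MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                conn, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
        long usedMb = memory.getHeapMemoryUsage().getUsed() / (1024 * 1024);
        return "threads=" + threads.getThreadCount() + " heapUsedMB=" + usedMb;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            // No remote target given: inspect this JVM's own MBean server.
            System.out.println(summarize(ManagementFactory.getPlatformMBeanServer()));
            return;
        }
        JMXServiceURL url =
                new JMXServiceURL(serviceUrl(args[0], Integer.parseInt(args[1])));
        // Pass a user and password here instead of nulls when authentication is on.
        try (JMXConnector connector =
                     JMXConnectorFactory.connect(url, environment(null, null))) {
            System.out.println(summarize(connector.getMBeanServerConnection()));
        }
    }
}
```

With authenticate=true, the environment map would carry a role and password from jmxremote.password, for example environment("monitorRole", "QED").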

Monitoring Java programs on the server

Add the following parameters to the java -cp command:

    java -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=22222 -cp jmx.jar cn.intsmaze.thread.TestDeadThread

The TestDeadThread class is as follows:

    package cn.intsmaze.thread;

    public class TestDeadThread implements Runnable {
        int a, b;

        public TestDeadThread(int a, int b) {
            this.a = a;
            this.b = b;
        }

        public void run() {
            // Integer.valueOf caches small values, so every thread locks the
            // same two Integer objects, in the order given to the constructor.
            synchronized (Integer.valueOf(a)) {
                synchronized (Integer.valueOf(b)) {
                    System.out.println(a + b);
                }
            }
        }

        public static void main(String[] args) throws InterruptedException {
            Thread.sleep(3000);
            // Threads built with (1, 2) and (2, 1) take the two locks in
            // opposite order, which can deadlock them.
            for (int i = 0; i < 100; i++) {
                new Thread(new TestDeadThread(1, 2)).start();
                new Thread(new TestDeadThread(2, 1)).start();
            }
        }
    }
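
The kind of deadlock that TestDeadThread provokes can also be detected programmatically through ThreadMXBean, which is the same management interface a JMX client reads. The following is a minimal sketch with a hypothetical class name. Unlike TestDeadThread it uses a latch so that the deadlock is guaranteed rather than likely, and daemon threads so that the JVM can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDetector {

    /** Starts a daemon thread that locks first, waits for its peer, then locks second. */
    static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    // Wait until the other thread also holds its first lock.
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    System.out.println("never reached");
                }
            }
        });
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    /** Creates a two-thread deadlock and returns how many threads the JVM reports. */
    static int detect() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        lockInOrder(a, b, bothHoldFirst);
        lockInOrder(b, a, bothHoldFirst);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Poll for up to about five seconds; the result is null until the
        // lock cycle has actually formed.
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threads.findDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                break;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + detect());
    }
}
```

A scheduled task calling findDeadlockedThreads in a real service could raise an alert long before anyone opens VisualVM, though how to wire that up depends on the application.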
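
VisualVM connects to the JVM on the remote server through JMX. Over this connection it can obtain the JVM's basic information (startup arguments, system properties), CPU usage, the overall state of the heap and the overall state of the threads. If you want to use the Visual GC plugin to look further into the individual regions of the heap, however, you will find that the plugin does not work.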


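The Visual GC plugin does not work over a JMX connection because it gets its data through a different channel: it reads the JVM's jvmstat counters, which are served to remote clients over RMI by jstatd. The connection therefore has to be made with jstatd, as described next.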

Connecting to a remote JVM with jstatd

jstatd (JVM jstat Daemon) is a daemon process, an RMI server program that monitors the resource usage of every local JVM from creation to termination. It also provides an interface for monitoring tools (such as VisualVM here) so that they can connect to all JVMs on that machine.

Starting the jstatd service

Start the jstatd service from the ${JAVA_HOME}/bin directory:

    [intsmaze@centos-Reall-131 bin]$ ./jstatd
    Could not create remote object
    access denied ("java.util.PropertyPermission" "java.rmi.server.ignoreSubClasses" "write")
    java.security.AccessControlException: access denied ("java.util.PropertyPermission" "java.rmi.server.ignoreSubClasses" "write")
            at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
            at java.security.AccessController.checkPermission(AccessController.java:884)
            at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
            at java.lang.System.setProperty(System.java:792)
            at sun.tools.jstatd.Jstatd.main(Jstatd.java:139)

The jstatd server does not authenticate remote clients in any way, so a client program that obtains information about all JVMs of the current local user is a potential security risk. For that reason jstatd requires a local security policy to be specified before it starts; without one the jstatd process cannot start and throws the error above.
Creating a security policy file

On the remote host to be monitored, create a security policy file, saved for example as /home/intsmaze/jdk1.8.0_144/bin/jstatd-all.policy, with the following content:

    grant codebase "file:/home/intsmaze/jdk1.8.0_144/lib/tools.jar" {
        permission java.security.AllPermission;
    };
Starting jstatd with parameters

Either of the following commands starts the jstatd server successfully. The first also logs RMI calls; the second runs the server in the background:

    ./jstatd -J-Djava.security.policy=/home/intsmaze/jdk1.8.0_144/bin/jstatd-all.policy -J-Djava.rmi.server.logCalls=true
    ./jstatd -J-Djava.security.policy=/home/intsmaze/jdk1.8.0_144/bin/jstatd-all.policy &

The -J option passes an argument to the JVM launched by the jstatd command (main class: sun.tools.jstatd.Jstatd). For example, -J-Xms48m sets the initial heap size of the jstatd JVM to 48 MB.

In VisualVM, right-click the remote host and add a jstatd connection.

All running JVMs are then listed automatically under the corresponding remote host node.

The difference between JMX connections and jstatd connections

JMX: the remote JVM must enable remote access support at startup, set a JMX port and so on. Each JMX connection reaches exactly one remote JVM.
jstatd: a security policy file must be created on the remote host, and the jstatd process must then be started and kept running. The client can see information about all JVMs of the current user on the remote host, so a single jstatd connection is enough.

Monitoring JVM programs with Linux commands

If neither JMX nor jstatd is configured, jvisualvm cannot monitor the remote JVM program, and we have to log in to the server to see how the program is running.

Using top to view CPU usage per process

    [intsmaze@centos-Reall-131 ~]$ top
    top - 13:04:07 up 3 min,  2 users,  load average: 0.00, 0.01, 0.00
    Tasks: 104 total,   1 running, 103 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:   2086348k total,   224720k used,  1861628k free,    37484k buffers
    Swap:  2064376k total,        0k used,  2064376k free,    91204k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
      385 root      20   0     0    0    0 S  0.3  0.0   0:00.02 flush-8:0
     2211 intsmaze  20   0  858m  25m 9448 S  0.3  1.2   0:00.87 java

The fields are explained below, using sample values:

    1. First line: load average: 0.41, 0.45, 0.43 is the system load, that is, the average length of the run queue over the last 1, 5 and 15 minutes.

    2. Second line: Tasks: 141 total is the total number of processes; 0 zombie is the number of zombie processes.

    3. Third line: CPU information
      6.1% us: percentage of CPU used by user space
      1.5% sy: percentage of CPU used by kernel space
      0.0% ni: percentage of CPU used by user processes whose priority has been changed (niced)
      92.2% id: percentage of CPU that is idle
      0.0% wa: percentage of CPU time spent waiting for I/O
      0.0% hi: hardware interrupts
      0.0% si: software interrupts
      0.0% st: steal time, CPU time taken from this virtual machine by the hypervisor

    4. Fourth and fifth lines: memory information
      Mem: 191272k total: total physical memory
      22052k buffers: memory used as kernel buffers
      Swap: 192772k total: total swap space
      123988k cached: memory used as page cache
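
When these figures need to be collected by a script rather than read by eye, the Cpu(s) line is easy to split into its fields. The sketch below uses a hypothetical class name and assumes the older procps layout shown in this article ("0.0%us, 0.2%sy, ..."); newer versions of top print the line differently and would need a different pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopCpuLine {

    // Matches pairs such as "99.8%id" or "0.0%us".
    private static final Pattern FIELD =
            Pattern.compile("(\\d+(?:\\.\\d+)?)%\\s*([a-z]{2})");

    /** Returns the CPU fields in the order they appear, keyed by their two-letter code. */
    static Map<String, Double> parse(String cpuLine) {
        Map<String, Double> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(cpuLine);
        while (m.find()) {
            fields.put(m.group(2), Double.parseDouble(m.group(1)));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "Cpu(s):  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,"
                + "  0.0%wa,  0.0%hi,  0.0%si,  0.0%st";
        Map<String, Double> cpu = parse(line);
        // Everything that is not idle counts as busy.
        double busy = 100.0 - cpu.get("id");
        System.out.printf("idle=%.1f%% busy=%.1f%%%n", cpu.get("id"), busy);
    }
}
```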

CPU usage per thread in the process

    [intsmaze@centos-Reall-131 ~]$ top -Hp 2461
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     2462 intsmaze  20   0  870m  25m 9416 S  30.0  1.2   0:00.28 java
     2463 intsmaze  20   0  870m  25m 9416 S  0.0  1.2   0:00.00 java
     2464 intsmaze  20   0  870m  25m 9416 S  0.0  1.2   0:00.00 java

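
A rough in-process counterpart to top -Hp is available through ThreadMXBean, which can report the CPU time each thread has consumed. This is a sketch with a hypothetical class name, and two caveats apply: it shows accumulated CPU time rather than a current percentage, and the IDs it prints are Java thread IDs, not the operating-system thread IDs that top displays.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class ThreadCpuReport {

    /** Returns one line per live thread: Java thread id, name and CPU time in ms. */
    static List<String> report() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        if (!threads.isThreadCpuTimeSupported()) {
            return lines; // some JVMs cannot measure per-thread CPU time
        }
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            long cpuNanos = threads.getThreadCpuTime(id);
            if (info == null || cpuNanos < 0) {
                continue; // thread already ended, or measurement is disabled
            }
            lines.add(id + " " + info.getThreadName()
                    + " " + (cpuNanos / 1_000_000) + "ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        report().forEach(System.out::println);
    }
}
```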
Locating what a thread is doing

jstack is a command-line tool shipped with the JDK, used mainly for thread dump analysis. It captures the Java stacks and native stacks of a running Java program, which makes it easy to see what each thread is currently doing.

jstack -l [PID]: view all thread information

    jstack -l 2238 > intsmaze.log
    [intsmaze@centos-Reall-131 ~]$ jstack -l 2461
    "Thread-200":
            at cn.intsmaze.thread.TestDeadThread.run(TestDeadThread.java:29)
            - waiting to lock <0x9d62a3a0> (a java.lang.Integer)
            at java.lang.Thread.run(Thread.java:748)
    "Thread-10":
            at cn.intsmaze.thread.TestDeadThread.run(TestDeadThread.java:30)
            - waiting to lock <0x9d62a390> (a java.lang.Integer)
            - locked <0x9d62a3a0> (a java.lang.Integer)
            at java.lang.Thread.run(Thread.java:748)

The thread dump produced by jstack contains every live thread in the JVM. To analyze one particular thread, you first have to find that thread's call stack in the dump.

jstack -l [PID] | grep [hexadecimal thread ID]

top -Hp [PID] gives the ID of the thread that is consuming the most CPU. Convert that ID to hexadecimal; every thread in the thread dump carries a nid, so search the dump for the matching nid.

Get the hexadecimal value of 2462:

    [intsmaze@centos-Reall-131 ~]$ printf "%x\n" 2462
    99e

Then search the thread dump for it:

    jstack -l 21711 | grep 99e
    "PollIntervalRetrySchedulerThread" prio=10 tid=0x00007f950043e000 nid=0x99e in Object.wait()

The entry with nid=0x99e shows that the CPU is being consumed by the thread named PollIntervalRetrySchedulerThread, which is in Object.wait(). From there, go and inspect the business code you wrote for it.
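
The decimal-to-hexadecimal step and the search can be scripted as well. The sketch below, with hypothetical class and method names, converts an OS thread ID to the nid form and filters thread-dump lines for it. The first sample line is taken from the output above; the second is made up to show a non-match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NidLookup {

    /** Converts a decimal OS thread ID (from top -Hp) to the nid form used in dumps. */
    static String toNid(int osThreadId) {
        return "0x" + Integer.toHexString(osThreadId);
    }

    /** Returns the thread-dump header lines whose nid matches the given OS thread ID. */
    static List<String> findThread(List<String> dumpLines, int osThreadId) {
        // The trailing space keeps 0x99e from also matching a longer value like 0x99ef.
        String needle = "nid=" + toNid(osThreadId) + " ";
        List<String> matches = new ArrayList<>();
        for (String line : dumpLines) {
            if (line.contains(needle)) {
                matches.add(line);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> dump = Arrays.asList(
                "\"PollIntervalRetrySchedulerThread\" prio=10 tid=0x00007f950043e000"
                        + " nid=0x99e in Object.wait()",
                // Made-up line, present only to show that other nids are skipped.
                "\"main\" prio=10 tid=0x00007f9500009000 nid=0x99f runnable");
        System.out.println(toNid(2462));
        findThread(dump, 2462).forEach(System.out::println);
    }
}
```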

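Source code instrumentation

Some time ago I was lucky enough to work on a fairly central red-envelope (reward) service, with a new version released almost every week. Its audience was ordinary consumers: as soon as a user noticed that a purchase had not won a red envelope, they would call customer service, and the complaint would land on my desk. I then had to look up the cause by the customer's transaction ID and give an answer. Had source code instrumentation not been considered during development, this would have been a headache. The relevant field in the outgoing message really did say that no red envelope was won, so I would go and check whether the transaction had failed to satisfy a rule, and after several days I still would not have a convincing answer. In this system, however, the architect had instrumented the source code of all our systems. For every record entering the system, the sequence of conditional checks it passed through and the value of each condition were recorded, then combined into a single message-processing record and stored in HBase. Faced with such a complaint, we only had to query HBase and pull out the record for that message, and its path through the whole system was clear at a glance.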
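
The idea can be illustrated in a few lines. The sketch below is not the system described above: the class, the rule and the transaction IDs are all invented for illustration, and the record is returned as a string where the real system wrote it to HBase. It only shows the core pattern of wrapping each condition so that its value is recorded as a side effect.

```java
import java.util.ArrayList;
import java.util.List;

public class DecisionTrace {

    private final String transactionId;
    private final List<String> steps = new ArrayList<>();

    DecisionTrace(String transactionId) {
        this.transactionId = transactionId;
    }

    /** Records the name and outcome of one condition, then returns the outcome unchanged. */
    boolean check(String condition, boolean result) {
        steps.add(condition + "=" + result);
        return result;
    }

    /** Flattens the trace into a single record, ready to be written to a store. */
    String toRecord() {
        return transactionId + "|" + String.join(";", steps);
    }

    /** Hypothetical rule: a reward needs a minimum amount and a first purchase of the day. */
    static String evaluate(String transactionId, int amount, boolean firstPurchaseToday) {
        DecisionTrace trace = new DecisionTrace(transactionId);
        // Short-circuit evaluation means only the conditions actually reached are recorded.
        boolean awarded = trace.check("amount>=100", amount >= 100)
                && trace.check("firstPurchaseToday", firstPurchaseToday);
        trace.check("awarded", awarded);
        return trace.toRecord();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("tx-1001", 250, false));
    }
}
```

Looking a record up by transaction ID then shows exactly which condition turned the reward down, without re-reading the rules or replaying the request.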
