Zabbix auto-discovery combined with shell scripts: automatically discover the top 10 memory-consuming processes and monitor their resources


Recently I have been thinking about a problem. Our production servers run a variety of services: one may run Nginx, another MySQL, another NFS or something else. If monitoring is done with a script that hard-codes a fixed list of services, then collecting the resource consumption of every listed process on every server wastes Zabbix server resources, and any service that is not on the fixed list produces no data at all.

To solve this, I wondered whether Zabbix automatic discovery could find the N processes that consume the most memory on a server, and then collect memory and CPU data for those processes. That question is how this article came about.

First, we need the output of the top command. The following command redirects it to a file:

top -b -n 1 > /tmp/top.txt

This runs top once in batch mode and redirects the result to /tmp/top.txt.

Add the command to the zabbix user's crontab so that it runs once per minute:

crontab -e
*/1 * * * * top -b -n 1 > /tmp/top.txt

Once the job is in place, a top.txt file is generated in the /tmp directory:

$ head -10 /tmp/top.txt
top - 15:42:01 up 72 days, 22:25,  2 users,  load average: 0.09, 0.08, 0.06
Tasks: 880 total,   1 running, 879 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.8 us,  0.7 sy,  0.0 ni, 96.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 13175284+total, 97396048 free, 20357148 used, 13999640 buff/cache
KiB Swap: 32767996 total, 32452380 free,   315616 used. 11058964+avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 20732 zabbix    20   0  130716   2436   1204 R  11.8  0.0   0:00.03 top
126808 upload    20   0 8375636 945876  27268 S   5.9  0.7  63:33.97 java
127591 upload    20   0 9898.1m 1.078g  27960 S   5.9  0.9  63:58.01 java
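Because cron rewrites /tmp/top.txt every minute while the agent may be reading it, a reader could in principle see a partially written file. The following is a minimal sketch of one way to avoid that, not part of the original setup: write to a temporary file in the same directory and rename it over the target. The function name snapshot_to is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): run a command, write
# its output to a temporary file next to the target, then rename it into place.
# A rename within one filesystem is atomic, so readers see either the old
# file or the new one, never a half-written file.
# Note: mktemp creates the file readable by its owner only, which is fine
# when the same user (here: zabbix) both writes and reads it.
snapshot_to() {
    target=$1
    shift
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if "$@" > "$tmp"; then
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Possible use from a cron-driven wrapper script (assumption):
#   snapshot_to /tmp/top.txt top -b -n 1
```

If the command fails, the previous snapshot is left untouched and the temporary file is removed.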

Once the data has been captured, it needs to be processed. Two scripts are used: one gets the names of the processes that consume the most memory, and the other gets the memory and CPU consumption of a given process. Here is the first script:

$ cat scripts/check_process.sh
#!/bin/bash
# Sum RES (column 6, KB) per command name, convert to MB, keep the top 10 names.
TABLESPACE=$(tail -n +8 /tmp/top.txt | awk '{a[$NF]+=$6} END{for (k in a) print a[k]/1024,k}' | sort -gr | head -10 | cut -d" " -f2)
COUNT=$(echo "$TABLESPACE" | wc -l)
INDEX=0
# Print the names as the JSON structure Zabbix discovery expects.
echo '{"data":['
echo "$TABLESPACE" | while read LINE; do
    echo -n '{"{#TABLENAME}":"'$LINE'"}'
    INDEX=$(expr $INDEX + 1)
    if [ $INDEX -lt $COUNT ]; then
        echo ','
    fi
done
echo ']}'

The key part is the pipeline tail -n +8 /tmp/top.txt | awk '{a[$NF]+=$6} END{for (k in a) print a[k]/1024,k}' | sort -gr | head -10 | cut -d" " -f2. It reads top.txt from line 8 to the end of the file, then uses awk to aggregate the rows by the last column (the command name): for each name the values in column 6 are added up, and the sum is printed together with the name. The result is then sorted with sort. Note that the option here is -gr rather than -nr: the column 6 values are in KB, and if a process occupies more than 10 GB of memory the sum may be printed in scientific notation, which sort -n cannot interpret, so -g is needed. The -r option reverses the order. To keep Zabbix from receiving a value in scientific notation that it cannot recognize, the value is first divided by 1024 to convert it to MB; on the Zabbix side the value is multiplied by 1024*1024 to restore it to bytes. head -10 keeps the 10 most memory-intensive entries, and cut then extracts the 10 process names. The rest of the script prints those names as JSON. The output looks like this:

$ sh ./scripts/check_process.sh
{"data":[{"{#TABLENAME}":"java"},{"{#TABLENAME}":"docker"},{"{#TABLENAME}":"nginx"},{"{#TABLENAME}":"sshd"},{"{#TABLENAME}":"tuned"},{"{#TABLENAME}":"NetworkMa+"},{"{#TABLENAME}":"zabbix_ag+"},{"{#TABLENAME}":"systemd-j+"},{"{#TABLENAME}":"crond"},{"{#TABLENAME}":"rsyslogd"}]}
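One detail worth checking in the aggregation step: the sample top output earlier contains a RES value of 1.078g. awk converts such a string to its numeric prefix (1.078), so a plain sum of column 6 would under-count that process. The sketch below is one possible variant, not the author's script. It assumes top reports unsuffixed values in KiB and uses the suffixes m, g and t for larger units, and it prints fixed-point MB so that no scientific notation appears. The function name top_mem_by_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical variant of the article's pipeline: sum RES per command name,
# converting top's suffixed values (m, g, t) to KiB before adding.
# Reads process rows (no header) on stdin, prints "<MB> <name>", largest first.
top_mem_by_name() {
    awk '
    function to_kib(v,    n, u) {
        n = v + 0                      # numeric prefix, e.g. 1.078 from "1.078g"
        u = substr(v, length(v), 1)    # last character, possibly a unit suffix
        if (u == "m") return n * 1024
        if (u == "g") return n * 1024 * 1024
        if (u == "t") return n * 1024 * 1024 * 1024
        return n                       # plain number: already KiB
    }
    NF >= 12 { sum[$NF] += to_kib($6) }
    END { for (k in sum) printf "%.1f %s\n", sum[k] / 1024, k }
    ' | sort -k1,1nr
}

# Possible use (assumption): tail -n +8 /tmp/top.txt | top_mem_by_name | head -10
```

Because the output is fixed-point, plain numeric sorting is sufficient in this variant.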

As for why the output is JSON: as described in a previous blog post, Zabbix automatic discovery only recognizes return values that are in JSON format.
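The JSON shape shown above can also be produced without a counter loop. The following is a minimal sketch under stated assumptions, not the author's script: it takes one process name per line on standard input, escapes backslashes and double quotes, and emits the same {"data":[...]} structure with the {#TABLENAME} macro. The function name names_to_lld_json is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn one process name per line (stdin) into the
# {"data":[{"{#TABLENAME}":"..."}]} structure used for discovery.
names_to_lld_json() {
    # Escape backslashes and double quotes so the names stay valid JSON strings.
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
    awk '
    BEGIN { printf "{\"data\":[" }
    NF    { printf "%s{\"{#TABLENAME}\":\"%s\"}", sep, $0; sep = "," }
    END   { print "]}" }
    '
}

# Possible use: printf 'java\nnginx\n' | names_to_lld_json
```

With no input at all, this prints an empty list, {"data":[]}, rather than an entry with an empty name.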

The second script gets the CPU and memory resources that a given process consumes:

$ cat ./scripts/processmonitor.sh
#!/bin/bash
process=$1   # process name, e.g. java
name=$2      # metric to report: mem or cpu
case $name in
    mem)
        # Sum of RES (column 6, KB) for the process, converted to MB
        echo "$(tail -n +8 /tmp/top.txt | awk '{a[$NF]+=$6} END{for (k in a) print a[k]/1024,k}' | grep "$process" | cut -d" " -f1)"
        ;;
    cpu)
        # Sum of %CPU (column 9) for the process
        echo "$(tail -n +8 /tmp/top.txt | awk '{a[$NF]+=$9} END{for (k in a) print a[k],k}' | grep "$process" | cut -d" " -f1)"
        ;;
    *)
        echo "Error input:"
        ;;
esac
exit 0

The core of this script is very similar to the previous one, so readers who understood the first script should find this one easy to follow. Here is the result of running it:

$ sh ./scripts/processmonitor.sh java mem
13115.5
$ sh ./scripts/processmonitor.sh java cpu
17.7
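Note that grep matches substrings, so a lookup for java would also pick up a row for a command such as javaws and return more than one value. The following is a minimal sketch of an exact-match lookup, assuming the same column layout as the top output earlier (column 6 is RES in KiB, column 9 is %CPU, the last column is the command name). The function name proc_metric is hypothetical and this is not the author's script.

```shell
#!/bin/sh
# Hypothetical helper: sum one column of top's process rows (stdin) for rows
# whose command name equals $1 exactly. $2 selects the metric: mem or cpu.
proc_metric() {
    awk -v name="$1" -v metric="$2" '
    BEGIN {
        if (metric == "mem")      { col = 6; div = 1024 }   # RES in KiB -> MB
        else if (metric == "cpu") { col = 9; div = 1 }      # %CPU as reported
        else { print "Error input:"; bad = 1; exit }
    }
    $NF == name { total += $col }
    END { if (!bad) printf "%.1f\n", total / div }
    '
}

# Possible use (assumption): tail -n +8 /tmp/top.txt | proc_metric java mem
```

A process that is not present yields 0.0 instead of an empty string, which is easier for a numeric item to store.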

Once the values can be obtained, the corresponding keys need to be configured in zabbix_agentd.conf so that Zabbix can fetch the data. The following configuration is added:

$ tail -3 ./etc/zabbix_agentd.conf
#top_process
UserParameter=process.discovery,/home/zabbix/zabbix-2.4.4/scripts/check_process.sh
UserParameter=process.resource[*],/home/zabbix/zabbix-2.4.4/scripts/processmonitor.sh $1 $2

After adding this configuration, restart zabbix_agentd for it to take effect:

pkill zabbix && zabbix-2.4.4/sbin/zabbix_agentd

The client side is now configured. Next, verify on the server that the data can be retrieved, using the zabbix_get command. Here is the result:

$ zabbix/bin/zabbix_get -s xxx.xxx.xxx.xxx -k "process.discovery"
{"data":[{"{#TABLENAME}":"java"},{"{#TABLENAME}":"docker"},{"{#TABLENAME}":"nginx"},{"{#TABLENAME}":"sshd"},{"{#TABLENAME}":"tuned"},{"{#TABLENAME}":"NetworkMa+"},{"{#TABLENAME}":"zabbix_ag+"},{"{#TABLENAME}":"systemd-j+"},{"{#TABLENAME}":"rsyslogd"},{"{#TABLENAME}":"bash"}]}

The xxx.xxx.xxx.xxx above stands for the IP address of the client, and the argument after -k is the key we added on the client.

$ zabbix/bin/zabbix_get -s xxx.xxx.xxx.xxx -k "process.resource[java,mem]"
13115.6
$ zabbix/bin/zabbix_get -s xxx.xxx.xxx.xxx -k "process.resource[java,cpu]"
0

Testing the client from the server works, and we can get the data. Next, configure the template in the web interface.

Under Configuration > Templates, create a template called Temple Top_process, as shown:

[Image: 1.png, http://s3.51cto.com/wyfs02/M01/71/78/wKioL1XRmXCDb5T_AAJl8lO-ygI159.jpg]

Create an application called Top of process resource, as shown:

[Image: 2.png, http://s3.51cto.com/wyfs02/M01/71/7C/wKiom1XRl7njh0okAAFvOvWQfJc761.jpg]

Once that is created, add the discovery rule, which is the centerpiece of this setup. Create a new discovery rule as shown:

[Image: 4.png, http://s3.51cto.com/wyfs02/M00/71/78/wKioL1XRmhjxpdm4AAKe2Tm1PYg386.jpg]

The key is the one we configured on the client. I set the update interval to 5 minutes, so every 5 minutes Zabbix asks the client for the 10 processes that use the most memory and then collects memory and CPU data for them. Next, configure the item prototypes as shown:

[Image: 5.png, http://s3.51cto.com/wyfs02/M02/71/7C/wKiom1XRmLLDfMjYAAL1QLoZamE055.jpg]

As shown, {#TABLENAME} expands to the list of 10 process names, and process.resource[{#TABLENAME},mem] is the key we configured on the client. The memory value it returns is in MB and has to be converted to bytes, so the value is multiplied by 1024*1024 = 1048576, the unit is set to bytes, and the item is assigned to the Top of process resource application. That completes one item prototype. The following is the item prototype configuration for CPU usage:

[Image: 6.png, http://s3.51cto.com/wyfs02/M01/71/78/wKioL1XRm7mxH4UbAAL-gIBKW3M382.jpg]
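As a quick check of the memory item's multiplier, the conversion from the MB value returned by the agent to bytes can be reproduced on the command line. This is only an arithmetic illustration; the function name mb_to_bytes is hypothetical, and the input 13115.5 is the sample value from the script run earlier.

```shell
#!/bin/sh
# Convert a value in MB to bytes using the same factor as the item
# prototype: 1024 * 1024 = 1048576.
mb_to_bytes() {
    awk -v mb="$1" 'BEGIN { printf "%.0f\n", mb * 1048576 }'
}

# Example: mb_to_bytes 13115.5   prints 13752598528
```

So a reading of 13115.5 MB is stored as 13752598528 bytes, which Zabbix can then display with its own unit scaling.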

After adding the item prototypes, configure the graph prototype as shown:

[Image: 7.png, http://s3.51cto.com/wyfs02/M02/71/78/wKioL1XRm_TxDSUPAAJMwxVjvJU543.jpg]

Once the graph prototype has been added, the template is complete. Link it to a host and data will start to arrive. Because I set the discovery interval to 5 minutes, you need to wait a little more than five minutes for the graphs to appear. Here is what they look like:

[Image: 8.png, http://s3.51cto.com/wyfs02/M01/71/7C/wKiom1XRmx_AOSB-AAEW17BFdNE243.jpg]

These are the resource usage graphs of the 10 most memory-intensive processes. A detailed view follows.

[Image: 9.png, http://s3.51cto.com/wyfs02/M00/71/7C/wKiom1XRm9vy9WhOAAQoSP93UCw084.jpg]

This is the data that was just collected. That completes monitoring the resource usage of the top 10 processes through automatic discovery. It is only a monitoring method I wrote in a hurry, offered here for reference. If there is a better way, I would be glad to discuss it so that we can improve together. I will put the Zabbix template in the attachment for download.

This article is from the "Lemon" blog; please keep this source link: http://xianglinhu.blog.51cto.com/5787032/1685274
