Use shell scripts to monitor Linux system load and CPU usage

Source: Internet
Author: User
I have recently been studying shell scripting and wrote a few scripts that monitor system load and CPU usage. They do not need monitoring software such as Nagios: as long as the server can reach the Internet, it can email the administrator about the system load and CPU usage.
I. Install the mail client msmtp on Linux (a mail-sending tool, similar in role to Foxmail):
1. Download and install: http://downloads.sourceforge.net/msmtp/msmtp-1.4.16.tar.bz2?modtime=1217206451&big_mirror=0
# tar jxvf msmtp-1.4.16.tar.bz2
# cd msmtp-1.4.16
# ./configure --prefix=/usr/local/msmtp
# make
# make install
2. Create the msmtp configuration file and log file (host is the mail domain name; the mail user name is test and the password is 123456)
# vim ~/.msmtprc
    account default
    host 126.com
    from test@126.com
    auth login
    user test
    password 123456
    logfile ~/.msmtp.log
# chmod 600 ~/.msmtprc
# touch ~/.msmtp.log
3. Configure mutt (mutt is installed by default on Linux):
# vim ~/.muttrc
    set sendmail="/usr/local/msmtp/bin/msmtp"
    set use_from=yes
    set realname="moniter"
    set from=test@126.com
    set envelope_from=yes
    set rfc2047_parameters=yes
    set charset="UTF-8"

4. Test sending mail (-s sets the subject, -a attaches a file; the -- ends the attachment list, which newer mutt versions require before the recipient)
# echo "Mail content 123456" | mutt -s "Mail title test mail" -a /scripts/test.txt -- test@126.com
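The test command can be wrapped in a small helper so that later scripts do not repeat the mutt options. This is only a sketch: the function name send_report and its argument order are made up for illustration, and the use of -- before the recipient assumes a mutt version that accepts it as the end of the attachment list.

```shell
#!/bin/sh
# send_report SUBJECT BODY RECIPIENT [ATTACHMENT]
# Hypothetical helper (not from the original article): pipes BODY to mutt
# and attaches ATTACHMENT only when a fourth argument is given.
send_report() {
    subject=$1
    body=$2
    recipient=$3
    attachment=${4:-}

    if [ -n "$attachment" ]; then
        # "--" ends the attachment list so the recipient is not read as a file
        printf '%s\n' "$body" | mutt -s "$subject" -a "$attachment" -- "$recipient"
    else
        printf '%s\n' "$body" | mutt -s "$subject" "$recipient"
    fi
}

# Example (commented out so that sourcing this file sends nothing):
# send_report "Mail title test mail" "Mail content 123456" test@126.com /scripts/test.txt
```

Keeping the mail call in one place means a change of recipient or mail client only has to be made once.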
 
II. Monitor the server's system load:
1. Run the uptime command to view the current load (average load over the last 1, 5, and 15 minutes)
# uptime
15:43:59 up 186 days, 1 user, load average: 0.01, 0.02, 0.00
 
Rules of thumb for system load (from http://www.ruanyifeng.com/blog/2011/07/linux_load_average_explained.html):
(1) Watch the 15-minute system load and use it as the indicator of whether the machine is running normally.
(2) If the 15-minute average load (the system load divided by the number of CPU cores) exceeds 1.0, the problem is persistent rather than temporary.
(3) When the system load stays above 0.7, start investigating where the problem is before the situation gets worse.
(4) When the system load stays above 1.0, you must find a way to bring the value down.
(5) When the system load reaches 5.0, the system has a very serious problem: it has been unresponsive for a long time or is close to crashing.
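The rules above can be written as a small classifier. The sketch below assumes that every threshold applies to the per-core load; the function name classify_load and the four labels are illustrative choices, not part of the cited article.

```shell
#!/bin/sh
# classify_load PER_CORE_LOAD
# Prints one of: ok, investigate, overloaded, critical.
# Thresholds 0.7, 1.0 and 5.0 are taken from the rules of thumb; the labels
# are made up for this example.
classify_load() {
    # awk does the floating-point comparisons that shell arithmetic cannot
    awk -v load="$1" 'BEGIN {
        if (load >= 5.0)      print "critical"
        else if (load > 1.0)  print "overloaded"
        else if (load > 0.7)  print "investigate"
        else                  print "ok"
    }'
}

# Example: classify a per-core load of 0.85
classify_load 0.85
```

A value that sits exactly on 0.7 or 1.0 falls into the lower category, because the rules speak of exceeding those values, while 5.0 itself already counts as critical.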
 
2. View the total number of CPU cores on the server
# grep -c 'model name' /proc/cpuinfo
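Counting 'model name' lines works on typical x86 servers, but that line may be missing from /proc/cpuinfo on some other architectures. A hedged alternative is sketched below: it prefers nproc from GNU coreutils when available and otherwise counts the "processor" lines. The function name count_cores and the fallback value of 1 are assumptions made for this example, and nproc reports the CPUs available to the current process, which can be fewer than the machine has.

```shell
#!/bin/sh
# count_cores [CPUINFO_FILE]
# Counts logical CPUs by the "processor" lines of a cpuinfo-style file.
# Defaults to /proc/cpuinfo; the optional argument only exists to make the
# function easy to try on a saved copy.
count_cores() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Prefer nproc when it is installed, otherwise fall back to /proc/cpuinfo.
if command -v nproc >/dev/null 2>&1; then
    cpu_num=$(nproc)
elif [ -r /proc/cpuinfo ]; then
    cpu_num=$(count_cores)
else
    cpu_num=1   # assumption: treat an unknown machine as single-core
fi
echo "CPU cores: $cpu_num"
```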
 
3. Extract the server's 1-minute, 5-minute, and 15-minute load averages
# uptime | awk '{print $8, $9, $10, $11, $12}'
load average: 0.01, 0.02, 0.00

4. Extract the 15-minute load average
# uptime | awk '{print $12}'
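Field positions in uptime output shift with how long the machine has been up (for example "up 5 min" versus "up 186 days" followed by an hours:minutes field), so $8 to $12 do not always hold the load figures. Two alternatives are sketched here under that assumption: reading /proc/loadavg, whose first three fields are the 1-, 5-, and 15-minute averages, and splitting uptime output on the "load average: " label. The function names are hypothetical.

```shell
#!/bin/sh
# load_15min [LOADAVG_FILE]
# Prints the 15-minute load average, the third field of /proc/loadavg.
# A line of that file looks like: "0.01 0.02 0.00 1/123 4567".
load_15min() {
    awk '{print $3}' "${1:-/proc/loadavg}"
}

# load_from_uptime: reads uptime output on stdin and prints the three
# averages, splitting on the "load average: " label instead of counting
# fields from the start of the line.
load_from_uptime() {
    awk -F'load average: ' '{print $2}'
}

# Print the current value when /proc/loadavg exists (Linux only).
if [ -r /proc/loadavg ]; then
    echo "15-minute load: $(load_15min)"
fi
```

Either function can replace the uptime and awk pipeline in the scripts below without changing the rest of their logic.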
5. Write the system load monitoring script:
# vim /scripts/load-check.sh
    #!/bin/bash
    # Use the uptime command to monitor changes in Linux system load

    # Record the current system time (appended to the file with >>)
    date >> /scripts/datetime-load.txt

    # Extract the server's 1-, 5-, and 15-minute load averages (appended with >>)
    uptime | awk '{print $8, $9, $10, $11, $12}' >> /scripts/load.txt

    # Join the time and load data line by line (the file is rewritten each time with >)
    paste /scripts/datetime-load.txt /scripts/load.txt > /scripts/load_day.txt
# chmod a+x /scripts/load-check.sh
6. Write the script that emails the system load result file:
# vim /scripts/sendmail-load.sh
    #!/bin/bash
    # Email the load_day.txt file generated by system load monitoring to the user

    # Extract the IP address of the current server
    IP=$(ifconfig eth0 | grep "inet addr" | cut -f 2 -d ":" | cut -f 1 -d " ")

    # Get the current date
    today=$(date -d "0 day" +%Y-%m-%d)

    # Send the system load monitoring result email
    echo "This is the system load monitoring report for server $IP on $today. Please download the attachment." | mutt -s "System load monitoring report for server $IP on $today" -a /scripts/load_day.txt -- test@126.com
# chmod a+x /scripts/sendmail-load.sh
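The grep for "inet addr" matches the output of older net-tools ifconfig; newer ifconfig versions print the address without the "addr:" label, and some systems ship only the ip command. The sketch below parses the one-line output of ip -4 -o addr show instead. The function names are hypothetical, and eth0 is kept only because the article uses it; the interface name on a given server may differ.

```shell
#!/bin/sh
# parse_ipv4: reads `ip -4 -o addr show` output on stdin and prints the
# first IPv4 address without its prefix length.
parse_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}

# server_ip [INTERFACE]
# eth0 is the interface name used in the article; it is an assumption here
# too, since many systems name their interfaces differently.
server_ip() {
    ip -4 -o addr show dev "${1:-eth0}" 2>/dev/null | parse_ipv4
}

# Example (commented out; prints nothing if the interface does not exist):
# IP=$(server_ip eth0)
```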
7. Write the system load alarm script:
# vim /scripts/load-warning.sh
    #!/bin/bash
    # Use the uptime command to monitor changes in Linux system load

    # Extract the IP address of the current server
    IP=$(ifconfig eth0 | grep "inet addr" | cut -f 2 -d ":" | cut -f 1 -d " ")

    # Get the total number of CPU cores
    cpu_num=$(grep -c 'model name' /proc/cpuinfo)

    # Get the current 15-minute load average
    load_15=$(uptime | awk '{print $12}')

    # Compute the 15-minute average load per core; bc omits the leading 0 when the result is below 1.0, so print it explicitly
    average_load=$(echo "scale=2; a=$load_15/$cpu_num; if (length(a)==scale(a)) print 0; print a" | bc)

    # Take the integer part of the per-core load
    average_int=$(echo $average_load | cut -f 1 -d ".")

    # Set the warning value for the 15-minute per-core load to 0.70 (that is, more than 70% utilization)
    load_warn=0.70

    # If the 15-minute per-core load is 1.0 or more (integer part greater than 0), send an alarm directly; otherwise compare it with the warning value
    if (($average_int > 0)); then
        echo "The 15-minute average system load of server $IP is $average_load, which exceeds the alarm value of 1.0. Please handle it immediately!!!" | mutt -s "Severe system load alarm for server $IP!!!" test@126.com
    else
        # Compare the per-core load with the warning value (expr prints 1 if it is greater than 0.70, otherwise 0)
        load_now=$(expr $average_load \> $load_warn)

        # If the 15-minute per-core load is above the 0.70 warning value (result 1), email the administrator
        if (($load_now == 1)); then
            echo "The 15-minute average system load of server $IP has reached $average_load, exceeding the warning value of 0.70. Please handle it in time." | mutt -s "System load warning for server $IP" test@126.com
        fi
    fi
# chmod a+x /scripts/load-warning.sh
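The division and the threshold test can also be done with awk, which formats the leading zero itself and compares the values as numbers (expr falls back to string comparison when its operands are not integers). This is a sketch of that alternative, not the article's method; per_core_load and exceeds are hypothetical names, and the sample figures 1.40 and 4 are made up.

```shell
#!/bin/sh
# per_core_load LOAD CORES
# Divides LOAD by CORES and prints the result with two decimals.
# printf "%.2f" always writes the leading zero, e.g. 0.35 rather than .35.
per_core_load() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# exceeds VALUE THRESHOLD
# Returns success (exit status 0) when VALUE is numerically greater than
# THRESHOLD, so it can be used directly in an if statement.
exceeds() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v > t) }'
}

# Sample values only: a 15-minute load of 1.40 on a 4-core machine.
average_load=$(per_core_load 1.40 4)
if exceeds "$average_load" 0.70; then
    echo "load $average_load is above 0.70"
else
    echo "load $average_load is within limits"
fi
```

With the sample values the per-core load is 0.35, so the script reports that the load is within limits.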
III. Monitor the server's CPU usage:
1. Use the top command to view CPU usage on Linux (-b -n 1 runs top in batch mode for a single iteration):
# top -b -n 1 | grep Cpu
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
(the id field is the idle percentage)
2. Extract the percentage of idle CPU (integer part only):
# top -b -n 1 | grep Cpu | awk '{print $5}' | cut -f 1 -d "."
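The layout of top's Cpu line differs between versions (some print a space between each number and its label), so $5 is not always the idle figure. A version-independent sketch is to sample the aggregate "cpu" line of /proc/stat twice and compute the share of idle ticks in between. The function name cpu_idle_percent and the one-second interval are assumptions for this example; only the idle column is counted, to mirror top's id field.

```shell
#!/bin/sh
# cpu_idle_percent FIRST_SAMPLE SECOND_SAMPLE
# Each sample is the aggregate "cpu ..." line of /proc/stat. Prints the
# integer percentage of ticks spent idle between the two samples.
cpu_idle_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            idle[NR] = $5                      # idle column
            total[NR] = 0
            for (i = 2; i <= 9; i++)           # user .. steal
                total[NR] += $i
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print 100; exit }   # no elapsed ticks: report idle
            printf "%d\n", 100 * (idle[2] - idle[1]) / dt
        }'
}

# Take two samples one second apart when /proc/stat is available.
if [ -r /proc/stat ]; then
    first=$(grep '^cpu ' /proc/stat)
    sleep 1
    second=$(grep '^cpu ' /proc/stat)
    echo "idle: $(cpu_idle_percent "$first" "$second")%"
fi
```

The result is truncated to an integer, like the cut -f 1 -d "." step in the top pipeline, so it can be compared with a threshold in the same way.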
3. Write the CPU monitoring script:
# vim /scripts/cpu-check.sh
    #!/bin/bash
    # Use the top command to monitor changes in Linux CPU usage

    # Record the current system time (appended to the file with >>)
    date >> /scripts/datetime-cpu.txt

    # Capture the current CPU values (appended to the file with >>)
    top -b -n 1 | grep Cpu >> /scripts/cpu-now.txt

    # Join the time and CPU data line by line (the file is rewritten each time with >)
    paste /scripts/datetime-cpu.txt /scripts/cpu-now.txt > /scripts/cpu.txt
# chmod a+x /scripts/cpu-check.sh
 
4. View the CPU monitoring result file:
# cat /scripts/cpu.txt
 
5. Write the script that emails the CPU result file:
# vim /scripts/sendmail-cpu.sh
    #!/bin/bash
    # Email the cpu.txt file generated by CPU monitoring to the user

    # Extract the IP address of the current server
    IP=$(ifconfig eth0 | grep "inet addr" | cut -f 2 -d ":" | cut -f 1 -d " ")

    # Get the current date
    today=$(date -d "0 day" +%Y-%m-%d)

    # Send the CPU monitoring result email
    echo "This is the CPU monitoring report for server $IP on $today. Please download the attachment." | mutt -s "CPU monitoring report for server $IP on $today" -a /scripts/cpu.txt -- test@126.com
# chmod a+x /scripts/sendmail-cpu.sh
6. Monitor the system's CPU usage and send an alarm email when it exceeds 80%:
# vim /scripts/cpu-warning.sh
    #!/bin/bash
    # Script that monitors the system's CPU status

    # Extract the IP address of the current server
    IP=$(ifconfig eth0 | grep "inet addr" | cut -f 2 -d ":" | cut -f 1 -d " ")

    # Get the current idle CPU percentage (integer part only)
    cpu_idle=$(top -b -n 1 | grep Cpu | awk '{print $5}' | cut -f 1 -d ".")

    # The alarm value for idle CPU is 20%: if CPU usage exceeds 80% (less than 20% idle), send an email immediately
    if (($cpu_idle < 20)); then
        echo "Server $IP has ${cpu_idle}% CPU idle; usage has exceeded 80%. Please handle it in time." | mutt -s "CPU alarm for server $IP" test@126.com
    fi
# chmod a+x /scripts/cpu-warning.sh
7. Add scheduled tasks: check the system load and CPU usage every ten minutes and send an alarm email as soon as a threshold is exceeded (at most once every ten minutes while the condition lasts); email the load and CPU monitoring results at 8:00 every day.

# crontab -e
    */10 * * * * /scripts/load-check.sh
    */10 * * * * /scripts/load-warning.sh
    0 8 * * * /scripts/sendmail-load.sh

    */10 * * * * /scripts/cpu-check.sh
    */10 * * * * /scripts/cpu-warning.sh
    0 8 * * * /scripts/sendmail-cpu.sh
# service crond restart

Author: My O & M path
