Monitoring Linux system and process resources with shell scripts: basics

Source: Internet
Author: User


While a server is running, its resources often need to be monitored, for example CPU load, disk usage, and the number of processes, so that an alarm can be raised promptly and the system administrator notified when something abnormal happens. This article describes several common monitoring requirements and how to write the corresponding shell scripts on Linux.



Contents:



1. Check whether a process exists
2. Detect process CPU utilization
3. Detect process memory usage
4. Detect process handle usage
5. Check whether a TCP or UDP port is listening
6. Check the number of running processes with a given name
7. Detect system CPU load
8. Detect system disk space
9. Summary



Check if a process exists



When monitoring a process, we generally need its process ID, which uniquely identifies the process. However, several processes with the same name may be running on the server under different users. The function GetPID below returns the ID of the process with the specified name running under the specified user (it currently assumes that the user has started only one process with that name). It takes two parameters, the user name and the process name. It first uses ps to list the user's processes, uses grep to filter out the required process, and finally uses sed and awk to extract the process ID. The function can be modified to suit the actual situation, for example to filter on other information.



Listing 1. Monitoring the process





code as follows:

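function GetPID #User #Name
{
    PsUser=$1
    PsName=$2
    # List the user's processes, drop editors, debuggers and helper scripts,
    # keep the first match and print its PID (column 1)
    pid=`ps -u $PsUser | grep $PsName | grep -v grep | grep -v vi | grep -v dbx \
        | grep -v tail | grep -v start | grep -v stop | sed -n 1p | awk '{print $1}'`
    echo $pid
}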





Sample Demo:



1) Source program (for example, find the ID of the process named Cftestapp running under user root)


 code as follows:

PID=`GetPID root Cftestapp`

echo $PID

2) Result output







 code as follows:

11426
[Dyu@xilinuxbldsrv shell]$





3) Result Analysis



The output above shows that 11426 is the process ID of the Cftestapp program running under the root user.
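
The grep chain in the listing matches the process name anywhere in the line, so a name that merely contains the target string also matches. As a minimal sketch (not the article's method), the same lookup can be done with a single awk filter that compares the command column exactly. The function name first_pid_by_name is hypothetical, and the sketch assumes the default ps -u layout, where the PID is the first column and the command name is the last.

```shell
# Hypothetical helper: read `ps -u USER` output on stdin and print the PID
# of the first process whose command name equals $1 exactly.
# Assumes the default column layout: PID TTY TIME CMD.
first_pid_by_name() {
    awk -v name="$1" 'NR > 1 && $NF == name { print $1; exit }'
}

# Usage (not run here): ps -u root | first_pid_by_name Cftestapp
```

Reading the listing from standard input keeps the filter easy to try against saved ps output before pointing it at a live system.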



4) Command Introduction



1. ps: shows a snapshot of the processes in the system. Parameters: -u <user ID> lists the processes belonging to that user; a user name can be given instead. -p <process ID> selects the process with that ID and lists its status. -o specifies the output format.
2. grep: searches files or standard input for lines that match a pattern. Parameter: -v inverts the selection, showing only the lines that do not match the search pattern.
3. sed: a non-interactive stream editor that edits a file or standard input and writes the result to standard output, processing one line at a time. Parameter: -n suppresses the automatic printing of each input line, so only lines printed explicitly are output. The p command prints the matching line, so sed -n 1p prints only the first line.
4. awk: a programming language for processing text and data on Linux/UNIX. Data can come from standard input, one or more files, or the output of other commands. It supports user-defined functions, dynamic regular expressions, and other advanced features, and is a powerful programming tool on Linux/UNIX. It can be used on the command line, but is more often used in scripts. awk scans its input line by line, from the first line to the last, looking for lines that match a given pattern and performing the specified action on them. If no action is specified, the matching lines are written to standard output (the screen); if no pattern is specified, the action is applied to every line. Parameters: -F fs or --field-separator fs specifies the input field separator; fs is a string or a regular expression, for example -F:.
Sometimes the process has not been started. The following snippet checks whether a process ID was found; if the process is not running, it prints "The process does not exist."


 code as follows:

# Check whether the process exists
if [ "-$PID" = "-" ]
then
{
    echo "The process does not exist."
}
fi
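
The check above only tells us whether a PID was found when the process list was read. A minimal sketch of a complementary check, assuming a PID is already known: kill -0 sends no signal and only reports whether the process can be signalled. The function name process_exists is hypothetical.

```shell
# Hypothetical helper: succeed if a process with PID $1 currently exists.
# kill -0 delivers no signal; it only performs the existence/permission check.
# Caveat: it also fails when the process exists but the caller is not
# allowed to signal it (for example, another user's process).
process_exists() {
    [ -n "$1" ] && kill -0 "$1" 2>/dev/null
}

# Usage (not run here):
#   if ! process_exists "$PID"; then echo "The process does not exist."; fi
```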





Detecting Process CPU Utilization



When maintaining an application service, we often encounter service interruptions caused by excessive CPU usage blocking the service. High CPU usage may result from abnormal conditions such as overload or an infinite loop. Monitoring the CPU of a business process with a script makes it possible to notify maintenance personnel as soon as CPU utilization becomes abnormal, so that they can analyze and locate the problem in time and avoid a service interruption. The following function returns the CPU utilization of the process with the specified ID. It takes the process ID as its parameter. It first uses ps to find the process information, filters out the %CPU header line with grep -v, and finally uses awk to extract the integer part of the CPU utilization percentage (on a system with multiple CPUs, the utilization can exceed 100%).



Listing 2. Real-time monitoring of business process CPUs





 code as follows:

function GetCpu
{
    # Print only the %CPU column, drop the header line, keep the integer part
    CpuValue=`ps -p $1 -o pcpu | grep -v CPU | awk '{print $1}' | awk -F. '{print $1}'`
    echo $CpuValue
}

The following function obtains the CPU utilization of the process through the GetCpu function above, then uses a conditional statement to determine whether it exceeds the limit. If it exceeds 80% (adjust as needed), it outputs an alarm; otherwise it outputs a normal message.





Listing 3. Determine if CPU utilization exceeds limit





 code as follows:

function CheckCpu
{
    PID=$1
    cpu=`GetCpu $PID`
    # Print the measured value so that it appears in the output
    echo "The usage of CPU is $cpu"
    if [ $cpu -gt 80 ]
    then
    {
        echo "The usage of CPU is larger than 80%"
    }
    else
    {
        echo "The usage of CPU is normal"
    }
    fi
}





Sample Demo:



1) Source program (assuming the process ID of Cftestapp was found above to be 11426)


 code as follows:

CheckCpu 11426

2) Result output
 code as follows:

The usage of CPU is 75
The usage of CPU is normal
[Dyu@xilinuxbldsrv shell]$

3) Result Analysis





The output above shows that the current CPU utilization of the Cftestapp program is 75%, which is normal and does not exceed the 80% alarm limit.
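
The header line can also be suppressed by ps itself rather than filtered out afterwards. A minimal sketch, assuming a ps that accepts an empty column header in the form pcpu= (as procps and POSIX ps do); the function names cpu_integer_part and get_cpu_of_pid are hypothetical.

```shell
# Hypothetical helper: print the integer part of a ps pcpu value such as " 75.3".
cpu_integer_part() {
    echo "$1" | awk '{ split($1, parts, "."); print parts[1] }'
}

# Hypothetical helper: print the integer CPU utilization of process $1.
# "pcpu=" gives the column an empty header, so ps prints no header line.
get_cpu_of_pid() {
    cpu_integer_part "$(ps -p "$1" -o pcpu=)"
}

# Usage (not run here): cpu=$(get_cpu_of_pid 11426)
```

Keeping the text handling in its own function means it can be checked with fixed strings, independently of what ps reports at the moment.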



Detecting Process Memory usage



When maintaining an application service, we also often encounter processes that crash because of excessive memory usage, interrupting the service (for example, a 32-bit program can address at most 4 GB of memory, so requests beyond that fail, and physical memory is limited as well). Excessive memory usage may be caused by memory leaks, message backlogs, and so on. Monitoring the memory usage of a business process continuously with a script makes it possible to send an alarm (for example, by SMS) as soon as memory usage becomes abnormal, so that maintenance personnel can deal with it promptly. The following function returns the memory usage of the process with the specified ID. It takes the process ID as its parameter. It first uses ps to find the process information, filters out the VSZ header line with grep -v, and then divides the value by 1000 to obtain the memory usage in megabytes (ps reports VSZ in kilobytes).



Listing 4. Monitor business Process Memory usage





 code as follows:

function GetMem
{
    # VSZ is reported in kilobytes; drop the header line
    MEMUsage=`ps -o vsz -p $1 | grep -v VSZ`
    # Convert to megabytes (approximately)
    (( MEMUsage /= 1000 ))
    echo $MEMUsage
}

The following code obtains the memory usage of the process through the GetMem function above, then uses a conditional statement to determine whether it exceeds the limit. If it exceeds 1.6 GB (adjust as needed), it outputs an alarm; otherwise it outputs a normal message.





Listing 5. Determine if memory usage exceeds limit


 code as follows:

mem=`GetMem $PID`
if [ $mem -gt 1600 ]
then
{
    echo "The usage of memory is larger than 1.6G"
}
else
{
    echo "The usage of memory is normal"
}
fi





Sample Demo:



1) Source program (assuming the process ID of Cftestapp was found above to be 11426)


 code as follows:

mem=`GetMem 11426`

echo "The usage of memory is $mem M"

if [ $mem -gt 1600 ]
then
{
    echo "The usage of memory is larger than 1.6G"
}
else
{
    echo "The usage of memory is normal"
}
fi





2) Result output





 code as follows:

The usage of memory is 248 M
The usage of memory is normal
[Dyu@xilinuxbldsrv shell]$





3) Result Analysis



The output above shows that the current memory usage of the Cftestapp program is 248 MB, which is normal and does not exceed the 1.6 GB alarm limit.
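
Because ps reports VSZ in units of 1024 bytes, dividing by 1000 is only an approximation. A minimal sketch of the exact conversion follows; the function names kib_to_mib and get_vsz_mib are hypothetical, and the sketch assumes a ps that accepts vsz= to suppress the header line.

```shell
# Hypothetical helper: convert a size in KiB (as printed by ps) to whole MiB.
# An empty argument is treated as 0.
kib_to_mib() {
    echo $(( ${1:-0} / 1024 ))
}

# Hypothetical helper: print the virtual memory size of process $1 in MiB.
# "vsz=" suppresses the header line; "rss=" could be used instead to
# measure resident (physical) memory.
get_vsz_mib() {
    kib_to_mib "$(ps -p "$1" -o vsz= | tr -d ' ')"
}

# Usage (not run here): mem=$(get_vsz_mib 11426)
```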



Detecting Process Handle Usage



When maintaining an application service, we also often encounter service interruptions caused by excessive handle usage. Each platform limits the number of handles a process may use. On Linux, for example, the ulimit -n command (open files (-n) 1024) or the contents of /etc/security/limits.conf show the limit. Excessive handle usage may be caused by heavy load, handle leaks, and so on. Monitoring the handle usage of a business process continuously with a script makes it possible to send an alarm (for example, by SMS) when an anomaly occurs, so that maintenance personnel can deal with it promptly. The following function returns the number of handles used by the process with the specified ID. It takes the process ID as its parameter, uses ls to list the process's handles, and then counts them with wc -l.





 code as follows:

function GetDes
{
    # Each entry in /proc/<pid>/fd is one open handle
    DES=`ls /proc/$1/fd | wc -l`
    echo $DES
}





The following code obtains the handle usage of the process through the GetDes function above, then uses a conditional statement to determine whether it exceeds the limit. If it exceeds 900 (adjust as needed), it outputs an alarm; otherwise it outputs a normal message.


 code as follows:

des=`GetDes $PID`
if [ $des -gt 900 ]
then
{
    echo "The number of DES is larger than 900"
}
else
{
    echo "The number of DES is normal"
}
fi





Sample Demo:



1) Source program (assuming the process ID of Cftestapp was found above to be 11426)


 code as follows:

des=`GetDes 11426`

echo "The number of DES is $des"

if [ $des -gt 900 ]
then
{
    echo "The number of DES is larger than 900"
}
else
{
    echo "The number of DES is normal"
}
fi





2) Result output





 code as follows:

The number of DES is 528
The number of DES is normal
[Dyu@xilinuxbldsrv shell]$

3) Result Analysis





The output above shows that the Cftestapp program currently uses 528 handles, which is normal and does not exceed the alarm limit of 900.



4) Command Introduction



wc: counts the bytes, words, and lines in the specified file and displays the result. Parameters: -l counts lines. -c counts bytes. -w counts words.
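
The fixed threshold of 900 presumes a limit of 1024. As a minimal sketch, the threshold can instead be derived from the limit that actually applies to the process. It assumes the Linux /proc/<pid>/limits layout, in which the "Max open files" row carries the soft limit in its fourth field; the function names soft_open_files_limit and fd_usage_percent are hypothetical.

```shell
# Hypothetical helper: read /proc/PID/limits content on stdin and print the
# soft limit for open files (4th whitespace-separated field of that row).
soft_open_files_limit() {
    awk '/^Max open files/ { print $4 }'
}

# Hypothetical helper: print used handles ($1) as a whole percentage of the limit ($2).
# Assumes both arguments are plain integers (a limit of "unlimited" is not handled).
fd_usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Usage (not run here), for a process ID stored in $PID:
#   used=$(ls /proc/$PID/fd | wc -l)
#   limit=$(soft_open_files_limit < /proc/$PID/limits)
#   [ "$(fd_usage_percent "$used" "$limit")" -gt 90 ] && echo "handle usage above 90%"
```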



Checking whether a TCP or UDP port is listening



Port checks come up often in system resource monitoring, and the state of a port is especially important where network communication is involved. Sometimes the process, CPU, and memory are all in a normal state but the port is not, so the service still does not run properly. The following function determines whether the specified port is being listened on. It takes the port to test as its parameter. It first uses netstat to output port usage information, then uses grep, awk, and wc to count the listening TCP ports; the second statement counts the listening UDP ports in the same way. If both the TCP and UDP counts are 0, the function outputs 0; otherwise it outputs 1.



Listing 6. Port detection





 code as follows:

function Listening
{
    # Count TCP sockets in LISTEN state on the port
    TCPListeningnum=`netstat -an | grep ":$1 " | \
        awk '$1 == "tcp" && $NF == "LISTEN" {print $0}' | wc -l`
    # Count UDP sockets bound to the port
    UDPListeningnum=`netstat -an | grep ":$1 " \
        | awk '$1 == "udp" && $NF == "0.0.0.0:*" {print $0}' | wc -l`
    (( Listeningnum = TCPListeningnum + UDPListeningnum ))
    if [ $Listeningnum -eq 0 ]
    then
    {
        echo "0"
    }
    else
    {
        echo "1"
    }
    fi
}

Sample Demo:





1) Source program (for example, check whether port 8080 is listening)





 code as follows:

isListen=`Listening 8080`
if [ $isListen -eq 1 ]
then
{
    echo "The port is listening"
}
else
{
    echo "The port is not listening"
}
fi

2) Result output







 code as follows:

The port is listening
[Dyu@xilinuxbldsrv shell]$

3) Result Analysis





The output above shows that port 8080 on this Linux server is in the listening state.



4) Command Introduction



netstat: displays statistics related to the IP, TCP, UDP, and ICMP protocols, and is typically used to check the network connections of each port on the local machine. Parameters: -a displays all sockets, including listening ones. -n shows IP addresses directly, without resolving them through the domain name server.
The following commands can also be used to check whether a TCP or UDP port is in a normal state.





 code as follows:

TCP: netstat -an | egrep $1 | awk '$6 == "LISTEN" && $1 == "tcp" {print $0}'
UDP: netstat -an | egrep $1 | awk '$1 == "udp" && $5 == "0.0.0.0:*" {print $0}'

Command Introduction





egrep: searches files for the specified pattern. egrep behaves like grep -E, and its syntax and parameters are the same as those of grep. The difference lies in how the pattern is interpreted: egrep uses extended regular expression syntax, while grep uses basic regular expression syntax. Extended regular expressions are more expressive than basic ones.
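
Matching the port with grep or egrep is a substring match, and comparing the first column with "tcp" skips IPv6 listeners that netstat labels tcp6. As a minimal sketch (not the article's method), awk can compare the port part of the local address exactly. The function name count_tcp_listeners is hypothetical, and the sketch assumes the usual Linux netstat -an layout, with the local address in the fourth column and the state in the last.

```shell
# Hypothetical helper: read `netstat -an` output on stdin and print how many
# TCP sockets (IPv4 or IPv6) are in LISTEN state on exactly port $1.
count_tcp_listeners() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")     # local address, e.g. 0.0.0.0:8080 or :::8080
            if (parts[n] == port) count++
        }
        END { print count + 0 }'
}

# Usage (not run here): netstat -an | count_tcp_listeners 8080
```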



Checking the number of running processes with a given name



Sometimes we need to know how many instances of a process have been started on the server. The following command counts the number of running instances of a process, here the process named Cftestapp.


 code as follows:

Runnum=`ps -ef | grep -v vi | grep -v tail | grep "[ /]Cftestapp" | grep -v grep | wc -l`
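
The chain of grep -v filters exists to exclude other commands that mention the name on their command line. A minimal sketch of an alternative, assuming a ps that can print only the command name with -o comm=: the names are compared exactly, so no exclusions are needed. The function name count_by_name is hypothetical; note that on Linux the command name is truncated to 15 characters.

```shell
# Hypothetical helper: read one command name per line on stdin (for example
# from `ps -e -o comm=`) and print how many equal $1 exactly.
count_by_name() {
    awk -v name="$1" '$1 == name { count++ } END { print count + 0 }'
}

# Usage (not run here): Runnum=$(ps -e -o comm= | count_by_name Cftestapp)
```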





Detecting System CPU Load



When maintaining a server, we sometimes encounter service interruptions caused by system CPU overload. Several processes may be running on the server, and the CPU utilization of each individual process may look normal while the CPU load of the whole system is abnormal. Monitoring the system CPU load continuously with a script makes it possible to raise an alarm as soon as an anomaly occurs, so that maintenance personnel can deal with it promptly and prevent incidents. The following function measures system CPU usage. It uses vmstat to sample the system CPU idle value 5 times, takes the average, and subtracts it from 100 to obtain the current CPU usage.





code as follows:

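function GetSysCPU
{
    # Sample CPU statistics 5 times at 1-second intervals, skip the two header
    # lines, and average the idle column (field 15)
    CpuIdle=`vmstat 1 5 | sed -n '3,$p' \
        | awk '{x = x + $15} END {print x/5}' | awk -F. '{print $1}'`
    # Usage is 100 minus the idle percentage
    CpuNum=`echo "100-$CpuIdle" | bc`
    echo $CpuNum
}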





Sample Demo:



1) Source program





 code as follows:

cpu=`GetSysCPU`

echo "The system CPU is $cpu"

if [ $cpu -gt 90 ]
then
{
    echo "The usage of system CPU is larger than 90%"
}
else
{
    echo "The usage of system CPU is normal"
}
fi





2) Result output


 code as follows:

The system CPU is 87
The usage of system CPU is normal
[Dyu@xilinuxbldsrv shell]$





3) Result Analysis



The output above shows that the current system CPU utilization of this Linux server is 87%, which is normal and does not exceed the 90% alarm limit.



4) Command Introduction



vmstat: short for virtual memory statistics; it can monitor the virtual memory, processes, and CPU activity of the operating system.
Parameters: -n displays the output header only once when output is repeated periodically.
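
The idle value is read from a fixed column position, which depends on the vmstat version in use. As a minimal sketch, the column can be located by its header name instead. The function name avg_cpu_usage is hypothetical, and the sketch assumes the usual vmstat layout in which the column-name line starts with "r" and contains "id".

```shell
# Hypothetical helper: read `vmstat INTERVAL COUNT` output on stdin and print
# the average CPU usage (100 minus the mean of the "id" column) as an integer.
avg_cpu_usage() {
    awk '
        $1 == "r" {                       # column-name header line
            for (i = 1; i <= NF; i++) if ($i == "id") col = i
            next
        }
        col && $1 ~ /^[0-9]+$/ { sum += $col; samples++ }
        END { if (samples) printf "%d\n", 100 - sum / samples }'
}

# Usage (not run here): cpu=$(vmstat 1 5 | avg_cpu_usage)
```

Dividing by the number of samples actually read, rather than by a constant 5, keeps the average correct if the interval count is changed.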



Detecting system disk space



Checking system disk space is an important part of system resource monitoring. During system maintenance we often need to check the server's disk space usage, because some services continually write records, logs, or temporary files, and running out of disk space can interrupt the service. The following function checks the disk space usage of a given directory on the current system. Its input parameter is the name of the directory to check. It uses df to output disk space usage information, then uses grep and awk to extract the usage percentage for that directory.




code as follows:


function GetDiskSpc
{
    if [ $# -ne 1 ]
    then
        return 1
    fi

    # Append "$" so that grep matches only the line ending with the directory name
    Folder="$1$"
    DiskSpace=`df -k | grep $Folder | awk '{print $5}' | awk -F% '{print $1}'`
    echo $DiskSpace
}





Sample Demo:



1) Source program (the directory to check is /boot)





 code as follows:

Folder="/boot"

DiskSpace=`GetDiskSpc $Folder`

echo "The system $Folder disk is $DiskSpace%"

if [ $DiskSpace -gt 90 ]
then
{
    echo "The usage of system disk ($Folder) is larger than 90%"
}
else
{
    echo "The usage of system disk ($Folder) is normal"
}
fi





2) Result output





 code as follows:

The system /boot disk is 14%
The usage of system disk (/boot) is normal
[Dyu@xilinuxbldsrv shell]$

3) Result Analysis





The output above shows that 14% of the disk space of the /boot directory on this Linux server is currently used, which is normal and does not exceed the 90% alarm limit.



4) Command Introduction



df: reports file system disk space usage. You can use this command to see how much disk space is used and how much remains. Parameter: -k displays sizes in units of 1 KB.
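
Filtering the full df listing with grep only works when the directory is itself a mount point. A minimal sketch of an alternative, assuming a df that supports the POSIX -P option (one line per file system, fixed columns) and accepts a directory argument: df then reports the file system that contains the directory. The function name disk_used_percent is hypothetical.

```shell
# Hypothetical helper: read `df -P DIRECTORY` output on stdin and print the
# usage percentage (Capacity column) of the data line, without the % sign.
disk_used_percent() {
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Usage (not run here): DiskSpace=$(df -P /boot | disk_used_percent)
```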



Summary



On Linux, monitoring with shell scripts is a simple, convenient, and effective way to keep watch on servers and processes, and is very helpful to system developers and maintenance personnel. Besides monitoring the items above and sending alarms, scripts can also monitor process logs and other information. I hope this article is helpful.
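
The listings in this article print their alarms to the screen. As a minimal sketch of how the checks could share one alarm path, the helper below appends timestamped alarm lines to a log file. Everything in it is an assumption made for illustration: the function names raise_alarm and check_metric, the ALARM_LOG variable, and the default file name. Delivery by SMS or e-mail is not shown.

```shell
# Hypothetical helper: append a timestamped alarm line to the file named by
# ALARM_LOG (an assumed variable; defaults to ./monitor_alarm.log).
raise_alarm() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') ALARM: $*" >> "${ALARM_LOG:-./monitor_alarm.log}"
}

# Hypothetical helper: raise an alarm when value $2 of metric $1 exceeds limit $3.
# Returns 0 when the value is within the limit and 1 when an alarm was raised.
check_metric() {
    if [ "$2" -gt "$3" ]; then
        raise_alarm "$1 is $2, above the limit of $3"
        return 1
    fi
    return 0
}

# Usage (not run here): check_metric "system CPU" "$cpu" 90
```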

