Using Python scripts to monitor Linux servers

Source: Internet
Author: User

Many Linux system monitoring tools, such as inotify-sync (file system security monitoring software) and Glances (a resource monitoring tool), are written in Python. A Linux system administrator can likewise write simple, practical scripts to monitor a server according to its specific situation. This article shows how to use Python scripts to monitor the CPU, system load, memory, and network usage of a Linux server.

How it works: the /proc file system

Linux gives administrators a way to inspect and change kernel settings while the system is running, without rebooting: the /proc virtual file system. /proc is a mechanism that the kernel and kernel modules use to send information to processes (hence the name "/proc"). It is a pseudo-file system that exposes the kernel's internal data structures, so you can obtain useful information about processes and change settings on the fly by modifying kernel parameters. Unlike other file systems, /proc exists in memory rather than on the hard disk. It provides the following information:
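As a rough illustration of reading and changing a kernel parameter on the fly, the sketch below maps a dotted parameter name to a file under /proc/sys, where the tunable parameters live. The helper names (sysctl_path, read_param, write_param) are hypothetical, and writing to the real /proc/sys requires root, so the root argument lets you try the functions against an ordinary directory first.

```python
import os

PROC_SYS = "/proc/sys"  # where the kernel exposes its tunable parameters


def sysctl_path(name, root=PROC_SYS):
    """Map a dotted name such as 'net.ipv4.ip_forward' to its file path."""
    return os.path.join(root, *name.split("."))


def read_param(name, root=PROC_SYS):
    """Return the current value of a kernel parameter as a string."""
    with open(sysctl_path(name, root)) as f:
        return f.read().strip()


def write_param(name, value, root=PROC_SYS):
    """Change a kernel parameter at run time (needs root on a real system)."""
    with open(sysctl_path(name, root), "w") as f:
        f.write(str(value))
```

On a Linux host, read_param("kernel.hostname") should return the host name without any special privileges. Parameter names that themselves contain dots would need extra handling that this sketch leaves out.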

    • Process information: every process in the system has a subdirectory in /proc named after its process ID, containing entries such as cmdline, mem, root, stat, statm, and status. Some of this information, such as the process root directory, is visible only to the superuser. In addition to the per-process directories, /proc contains a special link, self, which points to the directory of the process that is reading it; this is useful for obtaining a process's own command-line information.
    • System information: system-wide statistics are available from /proc/stat, including CPU usage, disk activity, memory swapping, interrupts, and so on.
    • CPU information: the /proc/cpuinfo file provides accurate, current information about the CPU.
    • Load information: the /proc/loadavg file contains the system load averages.
    • Memory information: the /proc/meminfo file contains details about system memory, including the amount of physical memory, free memory, and free swap space.
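To make the /proc/stat entry above more concrete, here is a minimal sketch that estimates CPU utilization from two snapshots of that file. It assumes the usual field order of the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the sample numbers are made up for illustration.

```python
def cpu_times(stat_text):
    """Return (idle, total) jiffies from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # user nice system idle iowait irq softirq steal; the guest
            # fields that may follow are skipped because guest time is
            # already included in user and nice.
            values = [int(v) for v in fields[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            return idle, sum(values)
    raise ValueError("no aggregate 'cpu' line found")


def cpu_usage_percent(before, after):
    """Percentage of non-idle time between two /proc/stat snapshots."""
    idle_0, total_0 = cpu_times(before)
    idle_1, total_1 = cpu_times(after)
    total_delta = total_1 - total_0
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle_1 - idle_0) / total_delta)


# Made-up snapshots in the format of /proc/stat
BEFORE = "cpu  100 0 100 800 0 0 0 0 0 0\ncpu0 100 0 100 800 0 0 0 0 0 0\n"
AFTER = "cpu  150 0 150 900 0 0 0 0 0 0\ncpu0 150 0 150 900 0 0 0 0 0 0\n"
print(cpu_usage_percent(BEFORE, AFTER))
```

On a live system you would read /proc/stat, sleep for a second or so, read it again, and pass both strings to cpu_usage_percent. Counting iowait as idle time is a design choice; some tools report it separately.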

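Table 1 describes the main files in the /proc directory:

Table 1. Main files in the /proc directory
File or directory   Description
apm                 Advanced power management information
cmdline             The command line used to boot the kernel
cpuinfo             Central processing unit information
devices             Available devices (block devices and character devices)
dma                 DMA channels currently in use
filesystems         File systems configured in the kernel
ioports             I/O ports currently in use
interrupts          Interrupts in use, one per line
kcore               Image of the system's physical memory
kmsg                Messages output by the kernel, which are also sent to the log files
mdstat              Information about RAID devices controlled by the md device driver
loadavg             System load averages
meminfo             Memory usage information, including physical memory and swap
modules             Information about loadable kernel modules; lsmod uses it to display module names, sizes, and use counts
net                 Network protocol status information
partitions          Partition table recognized by the system
pci                 PCI device information
scsi                SCSI device information
self                Symbolic link to the /proc directory of the process that is accessing /proc
stat                CPU utilization, disk, memory page, and swap statistics, total interrupts, context switches, and the time of the last boot
swaps               Swap partition usage
uptime              Seconds since the last boot, and how many of those seconds were spent idle
version             A single line describing the running kernel version

These files can be parsed with standard programming techniques to obtain the required system information.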

The examples in the rest of this article all use Python scripts that read the main files in the /proc directory to monitor a Linux server.
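Before the larger scripts, a small warm-up may help show the general pattern of reading a /proc file and parsing its text. The sketch below handles the two numbers in /proc/uptime; the function names and the sample line are made up for illustration.

```python
from datetime import timedelta


def parse_uptime(text):
    """Split the contents of /proc/uptime into (uptime, idle) seconds."""
    uptime, idle = text.split()[:2]
    return float(uptime), float(idle)


def format_uptime(seconds):
    """Render a number of seconds as days and H:MM:SS."""
    return str(timedelta(seconds=int(seconds)))


def read_uptime(path="/proc/uptime"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_uptime(f.read())


# Made-up sample in the same format as /proc/uptime
up, idle = parse_uptime("90061.25 170000.50\n")
print("up for", format_uptime(up))
```

Note that the idle figure is summed over all CPUs, so on a multi-core machine it can be larger than the uptime itself.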

Monitoring the CPU (central processing unit)

Script 1, cpu1.py, obtains information about the CPU.

Listing 1. Getting information about the CPU
    #!/usr/bin/env python
    from __future__ import print_function
    from collections import OrderedDict

    def CPUinfo():
        """Return the information in /proc/cpuinfo
        as a dictionary in the following format:
        cpu_info['proc0'] = {...}
        cpu_info['proc1'] = {...}
        """
        cpuinfo = OrderedDict()
        procinfo = OrderedDict()
        nprocs = 0
        with open('/proc/cpuinfo') as f:
            for line in f:
                if not line.strip():
                    # A blank line marks the end of one processor
                    cpuinfo['proc%s' % nprocs] = procinfo
                    nprocs = nprocs + 1
                    # Reset for the next processor
                    procinfo = OrderedDict()
                else:
                    if len(line.split(':')) == 2:
                        procinfo[line.split(':')[0].strip()] = line.split(':')[1].strip()
                    else:
                        procinfo[line.split(':')[0].strip()] = ''
        return cpuinfo

    if __name__ == '__main__':
        cpuinfo = CPUinfo()
        for processor in cpuinfo.keys():
            print(cpuinfo[processor]['model name'])

A brief explanation of Listing 1: it reads /proc/cpuinfo and returns a dictionary that holds one dictionary per logical processor, keyed 'proc0', 'proc1', and so on. dict is one of Python's built-in data types and defines a mapping between keys and values. (A list, by contrast, is an ordered sequence of elements written in square brackets and can be used like an array indexed from 0.) OrderedDict is a dictionary subclass that remembers the order in which its entries were added and uses that order when creating iterators. In older Python versions a regular dict did not track insertion order, and iteration followed the order in which the keys were stored in the hash table. Since Python 3.7 a regular dict also preserves insertion order, so OrderedDict matters mainly on older interpreters.
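The dictionary-per-processor idea from Listing 1 can also be expressed with a list of plain dicts, which makes it easy to summarize the result. The sketch below is one possible variant, not the article's method: the function names and the sample text are made up, and it uses collections.Counter to count processors per model.

```python
from collections import Counter

# Made-up excerpt in the format of /proc/cpuinfo
SAMPLE = """processor\t: 0
model name\t: Example CPU @ 2.00GHz

processor\t: 1
model name\t: Example CPU @ 2.00GHz

"""


def parse_cpuinfo(text):
    """Return a list with one dict per processor block."""
    processors = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current processor block
            if current:
                processors.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:  # the text did not end with a blank line
        processors.append(current)
    return processors


def count_models(processors):
    """Count logical processors per model name."""
    return Counter(p.get("model name", "unknown") for p in processors)


for model, count in count_models(parse_cpuinfo(SAMPLE)).items():
    print(count, "x", model)
```

Using get with a default avoids a KeyError on platforms whose /proc/cpuinfo has no "model name" entry.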

Monitoring the system load

Script 2, cpu2.py, obtains the load information of the system.

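Listing 2. Getting the load information of the system
    #!/usr/bin/env python
    from __future__ import print_function
    import os

    def load_stat():
        loadavg = {}
        with open("/proc/loadavg") as f:
            con = f.read().split()
        loadavg['lavg_1'] = con[0]     # 1-minute load average
        loadavg['lavg_5'] = con[1]     # 5-minute load average
        loadavg['lavg_15'] = con[2]    # 15-minute load average
        loadavg['nr'] = con[3]         # running/total scheduling entities
        loadavg['last_pid'] = con[4]   # most recently created process ID
        return loadavg

    print("loadavg", load_stat()['lavg_15'])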

A brief explanation of Listing 2: it reads /proc/loadavg and returns its five fields in a dictionary. The script also contains import os, although it does not call anything from that module. Python's import statement loads modules, both those shipped with the system and your own. The basic form is import module [as alias]; to import only specific names, use from module import name, and to import everything a module exports, use from module import *. The os module provides a uniform interface to operating system functions and automatically selects the platform-specific implementation (for example nt or posix), which makes cross-platform code possible.
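Listing 2 keeps every field as a string. As a sketch of a slightly more convenient form, the code below converts the fields to numbers and divides a load average by the CPU count, which is one place where the os module is genuinely useful. The function names and the sample line are made up for illustration.

```python
import os


def parse_loadavg(text):
    """Turn the single line of /proc/loadavg into typed values."""
    fields = text.split()
    running, total = fields[3].split("/")
    return {
        "lavg_1": float(fields[0]),
        "lavg_5": float(fields[1]),
        "lavg_15": float(fields[2]),
        "running": int(running),
        "total": int(total),
        "last_pid": int(fields[4]),
    }


def load_per_cpu(load, cpus=None):
    """Normalize a load average by the number of CPUs."""
    if cpus is None:
        cpus = os.cpu_count() or 1  # cpu_count() may return None
    return load / cpus


sample = "0.20 0.18 0.12 1/80 11206\n"  # made-up sample line
info = parse_loadavg(sample)
print(info["lavg_15"], load_per_cpu(info["lavg_15"], cpus=4))
```

On Unix-like systems os.getloadavg() returns the three load averages directly, so parsing the file is only needed for the remaining fields.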

Getting memory information

Script 3, mem.py, obtains memory usage information.

Listing 3. Getting memory usage
    #!/usr/bin/env python
    from __future__ import print_function
    from collections import OrderedDict

    def meminfo():
        """Return the information in /proc/meminfo
        as a dictionary."""
        info = OrderedDict()
        with open('/proc/meminfo') as f:
            for line in f:
                # Each line looks like "MemTotal:  16384000 kB"
                info[line.split(':')[0]] = line.split(':')[1].strip()
        return info

    if __name__ == '__main__':
        # print(meminfo())
        info = meminfo()
        print('Total memory: {0}'.format(info['MemTotal']))
        print('Free memory: {0}'.format(info['MemFree']))

A brief explanation of Listing 3: it reads /proc/meminfo. The split method of Python strings is used very often. For example, when a long record with a known structure is stored in a single field, a delimiter can mark the boundaries between its parts (JSON is another option), and split breaks the record apart at that delimiter for later processing. Here each line is split at the colon into a name and a value. The strip method removes leading and trailing whitespace from a string. Finally, Listing 3 prints the total memory and the free memory.
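Because Listing 3 stores values such as "16384000 kB" as strings, they cannot be used in arithmetic directly. The following sketch shows one way to apply split and strip so that the values become integers and a usage percentage can be computed. The function names and the sample figures are made up for illustration.

```python
# Made-up excerpt in the format of /proc/meminfo
SAMPLE = """MemTotal:       16384000 kB
MemFree:         4096000 kB
HugePages_Total:       0
"""


def parse_meminfo(text):
    """Return a dict mapping each field name to its integer value."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip anything that is not "name: value"
        # The value is the first token; a trailing "kB" unit is dropped
        info[name.strip()] = int(rest.split()[0])
    return info


def used_percent(info):
    """Share of MemTotal that is not reported as MemFree."""
    total = info["MemTotal"]
    return 100.0 * (total - info["MemFree"]) / total


mem = parse_meminfo(SAMPLE)
print("used: {0:.1f}%".format(used_percent(mem)))
```

MemFree does not count memory used for caches that the kernel can reclaim, so this figure tends to overstate real pressure; on newer kernels the MemAvailable field is usually a better basis.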

Monitoring network interfaces

Script 4, net.py, obtains the traffic of a network interface.

Listing 4. net.py: getting the input and output of a network interface

    #!/usr/bin/env python
    from __future__ import print_function
    import sys
    import time

    # Interface to watch: first command-line argument, eth0 by default
    if len(sys.argv) > 1:
        INTERFACE = sys.argv[1]
    else:
        INTERFACE = 'eth0'
    STATS = []
    print('Interface:', INTERFACE)

    def rx():
        # Field 1 of the matching line is the number of bytes received
        with open('/proc/net/dev') as f:
            ifstat = f.readlines()
        for interface in ifstat:
            if INTERFACE in interface:
                stat = float(interface.split()[1])
                STATS[0:] = [stat]

    def tx():
        # Field 9 of the matching line is the number of bytes transmitted
        with open('/proc/net/dev') as f:
            ifstat = f.readlines()
        for interface in ifstat:
            if INTERFACE in interface:
                stat = float(interface.split()[9])
                STATS[1:] = [stat]

    print('In         Out')
    rx()
    tx()

    while True:
        time.sleep(1)
        rxstat_o = list(STATS)   # counters from one second ago
        rx()
        tx()
        RX = float(STATS[0])
        RX_O = rxstat_o[0]
        TX = float(STATS[1])
        TX_O = rxstat_o[1]
        RX_RATE = round((RX - RX_O) / 1024 / 1024, 3)
        TX_RATE = round((TX - TX_O) / 1024 / 1024, 3)
        print(RX_RATE, 'MB', TX_RATE, 'MB')

A brief description of Listing 4: it reads /proc/net/dev. File operations in Python go through the open function, which is much like fopen in C. open returns a file object, and you then call read(), write(), and similar methods on it. It is also easy to read the contents of a text file into a string variable that can be manipulated. The file object provides three read methods: read(), readline(), and readlines(). Each accepts an optional argument that limits the amount of data read per call, but they are usually called without it. read() reads the entire file and is typically used to place the contents into a single string. That is the most direct representation of the file content, but it is unnecessary for line-oriented processing and impossible if the file is larger than the available memory. Like read(), readlines() reads the entire file at once, but it splits the contents into a list of lines that can be processed with a for ... in ... loop. readline(), by contrast, reads only one line per call, which is usually much slower than readlines(); use it only when there is not enough memory to read the whole file at once. Iterating over the file object itself, as Listings 1 and 3 do, also yields one line at a time without loading the whole file. Finally, Listing 4 prints the input and output of the network interface once per second, in MB.
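Listing 4 selects the interface with a substring test and relies on whitespace after the colon. As a hedged alternative, the sketch below splits each line at the colon first and returns the counters for every interface. The function names and the sample counters are made up for illustration; the column positions follow the header that /proc/net/dev itself prints.

```python
# Made-up excerpt in the format of /proc/net/dev
SAMPLE = """Inter-|   Receive                        |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
    lo: 1000 10 0 0 0 0 0 0 1000 10 0 0 0 0 0 0
  eth0: 2097152 100 0 0 0 0 0 0 1048576 50 0 0 0 0 0 0
"""


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} for every interface."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        # After the colon, field 0 is bytes received and field 8 is
        # bytes transmitted
        stats[name.strip()] = (int(fields[0]), int(fields[8]))
    return stats


def rate_mb(old_bytes, new_bytes, seconds=1.0):
    """Transfer rate in MB per second between two counter readings."""
    return (new_bytes - old_bytes) / 1024.0 / 1024.0 / seconds


stats = parse_net_dev(SAMPLE)
print(stats["eth0"])
```

To monitor a live system, call parse_net_dev on the contents of /proc/net/dev twice, a known number of seconds apart, and pass the two byte counts to rate_mb.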

Python script to monitor Apache server processes

The Apache server process may exit unexpectedly for various reasons, which interrupts the web service. So I wrote a Python script to watch for this:

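Listing 5. crtrl.py: Python script for monitoring Apache server processes

    #!/usr/bin/env python
    from __future__ import print_function
    import os, sys, time

    while True:
        time.sleep(4)
        try:
            # ps prints a header line plus one line per matching process.
            # apache2 is the process name on Debian/Ubuntu; adjust it for your system.
            ret = os.popen('ps -C apache2 -o pid,cmd').readlines()
            if len(ret) < 2:
                print("Apache process quit unexpectedly, restarting in 3 seconds")
                time.sleep(3)
                os.system("service apache2 restart")
        except Exception:
            print("Error", sys.exc_info()[1])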

Make the script executable (chmod +x crtrl.py) and add it to /etc/rc.local, started in the background because it loops forever. It then checks automatically and restarts the Apache server whenever the process exits abnormally. A brief description of Listing 5: this script is not based on the /proc pseudo-file system but on modules that ship with Python itself, namely os, sys, and the built-in time module, which provides various time-related functions.
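The check in Listing 5 can also be written with the subprocess module, which avoids going through a shell. This is only a sketch under stated assumptions: the process name apache2 and the restart command are taken from Listing 5 and may differ on your distribution, and the function names are hypothetical.

```python
import subprocess

PROCESS_NAME = "apache2"  # assumption: process name on Debian/Ubuntu
RESTART_CMD = ["service", "apache2", "restart"]  # assumption: as in Listing 5


def count_processes(ps_output):
    """Count process rows in ps output, ignoring the header line."""
    lines = [line for line in ps_output.splitlines() if line.strip()]
    return max(len(lines) - 1, 0)


def check_once(name=PROCESS_NAME, restart_cmd=RESTART_CMD):
    """Restart the service if no such process exists; return True if restarted."""
    # ps exits with a non-zero status when nothing matches, so the return
    # code is deliberately not treated as an error here.
    result = subprocess.run(
        ["ps", "-C", name, "-o", "pid,cmd"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    if count_processes(result.stdout) == 0:
        subprocess.call(restart_cmd)
        return True
    return False
```

A monitoring loop would call check_once() every few seconds, for example with time.sleep(4) between calls. Keeping the counting logic in its own function makes it possible to test it with canned ps output.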

Summary

In practice, a Linux system administrator can write simple, practical scripts to monitor a server according to its specific situation. This article described how to use Python scripts to monitor the CPU, system load, memory, and network usage of a Linux server.

http://www.ibm.com/developerworks/cn/linux/1312_caojh_pythonlinux/
