An interface system built at my company mainly connects to third-party system interfaces, so it interacts with many other companies' projects. This brings a very painful problem: with so many companies' interfaces, stability varies widely from one to the next, and under heavy traffic the poorly built ones throw all kinds of errors.
The interface system has only just been developed and sits at the edge of the overall system. Unlike other projects, which have both a log library and SMS alerts, problems here are in many cases only discovered through user feedback. So my idea was to pick up Python and write a monitor for this project. If a large number of errors occur while calling a third-party interface, the interface has a problem, and we can act on it faster.
This project does have a log library as well: all INFO and ERROR logs are scanned and stored every minute. The log library uses MySQL, and its table has several particularly important fields:
- level: log level
- message: log content
- file_name: Java source file
- log_time: log time
Since there is a log library, there is no need to scan and analyze logs in the production environment; the monitor can start directly from the log library. Because the log library is refreshed from production every 1 minute, I scan the log library every 2 minutes. If a scan finds a certain number of ERROR logs, it raises an alarm; one or two errors can be ignored. In other words, a large number of error logs in a short time means the system has a problem. Alarms are sent by email, so there are a few things to do:
1. Operate MySQL.
2. Send email.
3. Run a scheduled task.
4. Write logs.
5. Run the script.
With these items clear, we can get started.
Operating the database
The MySQLdb driver is used to operate the database directly; the work is mainly queries.
Getting a database connection:
import MySQLdb

def get_con():
    # Connection settings for the log library
    host = "127.0.0.1"
    port = 3306
    logsdb = "logsdb"
    user = "root"
    password = "never tell you"
    con = MySQLdb.connect(host=host, user=user, passwd=password, db=logsdb, port=port, charset="utf8")
    return con
Fetch from the log library the data from the 2 minutes before the current time. First, compute the cutoff time from the current time. An earlier version of this calculation was faulty and has since been corrected.
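import time
from datetime import datetime

def calculate_time():
    # Current time as a Unix timestamp, minus two minutes
    now = time.mktime(datetime.now().timetuple()) - 60 * 2
    # Format it the way MySQL expects a datetime literal
    result = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(now))
    return result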
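The same cutoff can also be computed with datetime arithmetic alone. The following is only a sketch of that alternative, not the script's own code; the function name cutoff_time and its now and minutes parameters are made up for illustration, so the result can be checked with a fixed time.

```python
from datetime import datetime, timedelta


def cutoff_time(now=None, minutes=2):
    """Return the time `minutes` before `now` as 'YYYY-MM-DD HH:MM:SS'."""
    if now is None:
        now = datetime.now()
    return (now - timedelta(minutes=minutes)).strftime('%Y-%m-%d %H:%M:%S')


# A fixed time makes the result predictable, even across midnight.
print(cutoff_time(datetime(2015, 1, 1, 0, 1, 0)))  # 2014-12-31 23:59:00
```

Passing the current time in as a parameter keeps the calculation easy to test, which helps with the kind of mistake mentioned above.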
Then, the log library is queried for data based on time and log levels.
def get_data():
    select_time = calculate_time()
    logger.info("select time:" + select_time)
    # Only ERROR-level rows newer than the cutoff, newest first
    sql = ("select file_name, message from logsdb.app_logs_record "
           "where log_time > '" + select_time + "' "
           "and level = 'ERROR' "
           "order by log_time desc")
    conn = get_con()
    cursor = conn.cursor()
    cursor.execute(sql)
    results = cursor.fetchall()
    cursor.close()
    conn.close()
    return results
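The query can also be written with placeholders instead of string concatenation. The sketch below uses the standard library's sqlite3 with an in-memory table as a stand-in for the MySQL log library, so it runs without a server. The helper name fetch_errors and the sample rows are invented for illustration. sqlite3 uses ? as its placeholder, while MySQLdb uses %s.

```python
import sqlite3


def fetch_errors(conn, since):
    """Return (file_name, message) rows for ERROR logs newer than `since`."""
    sql = ("select file_name, message from app_logs_record "
           "where log_time > ? and level = ? "
           "order by log_time desc")
    cursor = conn.cursor()
    # Values are passed separately; the driver handles the quoting.
    cursor.execute(sql, (since, 'ERROR'))
    rows = cursor.fetchall()
    cursor.close()
    return rows


# In-memory stand-in for the log library, filled with made-up rows.
conn = sqlite3.connect(':memory:')
conn.execute("create table app_logs_record "
             "(level text, message text, file_name text, log_time text)")
conn.executemany(
    "insert into app_logs_record values (?, ?, ?, ?)",
    [
        ('ERROR', 'timeout', 'PayClient.java', '2015-01-01 10:01:00'),
        ('INFO', 'ok', 'PayClient.java', '2015-01-01 10:01:30'),
        ('ERROR', 'old failure', 'PayClient.java', '2015-01-01 09:00:00'),
    ],
)
print(fetch_errors(conn, '2015-01-01 10:00:00'))
```

Only the recent ERROR row comes back; the INFO row and the older ERROR row are filtered out by the where clause.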
Send mail
Sending mail with Python is simple using the standard library smtplib.
A 163 mailbox is used to send here. You can use another mailbox or a corporate mailbox, as long as the host and port are set correctly.
import smtplib
from email.mime.text import MIMEText

def send_email(content):
    sender = "sender_monitor@163.com"
    receiver = ["rec01@163.com", "rec02@163.com"]
    host = 'smtp.163.com'
    port = 465
    msg = MIMEText(content)
    msg['From'] = "sender_monitor@163.com"
    msg['To'] = "rec01@163.com,rec02@163.com"
    msg['Subject'] = "system error warning"
    try:
        # Port 465 expects an SSL connection from the start
        smtp = smtplib.SMTP_SSL(host, port)
        smtp.login(sender, '123456')
        smtp.sendmail(sender, receiver, msg.as_string())
        logger.info("send email success")
    except Exception as e:
        logger.error(e)
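Building the message can be separated from sending it, so the headers can be checked without contacting a mail server. This is only a sketch of that split: the helper name build_alert_message is hypothetical, and the explicit utf-8 charset is an assumption that the log content may contain non-ASCII text.

```python
from email.mime.text import MIMEText


def build_alert_message(sender, receivers, content):
    """Build the alert email without sending it."""
    # Explicit charset so non-ASCII log content is encoded correctly.
    msg = MIMEText(content, 'plain', 'utf-8')
    msg['From'] = sender
    # The To header is a single string; the receiver list is used by sendmail.
    msg['To'] = ", ".join(receivers)
    msg['Subject'] = "system error warning"
    return msg


msg = build_alert_message(
    "sender_monitor@163.com",
    ["rec01@163.com", "rec02@163.com"],
    "recharge error: timeout\n",
)
print(msg['To'])
```

The returned object can then be passed to smtp.sendmail(sender, receivers, msg.as_string()) exactly as in send_email.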
Scheduled Tasks
A separate thread scans every 2 minutes and sends an email notification if the number of ERROR-level log entries exceeds 5.
def task():
    while True:
        logger.info("monitor running")
        results = get_data()
        # Alert only when more than 5 error rows were found
        if results is not None and len(results) > 5:
            content = "recharge error:"
            logger.info("a lot of error,so send mail")
            for r in results:
                # r is (file_name, message); collect the messages
                content += r[1] + '\n'
            send_email(content)
        time.sleep(2 * 60)
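The threshold decision inside the loop can be pulled out into a small function that needs neither the database nor the mail server. This is a sketch only; the name build_alert and the threshold and prefix parameters are assumptions made for illustration, with defaults matching the values used above.

```python
def build_alert(results, threshold=5, prefix="recharge error:"):
    """Return the email body if more than `threshold` rows exist, else None."""
    if results is None or len(results) <= threshold:
        return None
    # Each row is (file_name, message); only the message goes into the body.
    return prefix + "".join(row[1] + "\n" for row in results)


few = [("PayClient.java", "timeout")] * 2
many = [("PayClient.java", "timeout")] * 6

print(build_alert(few))   # None, too few errors to alert
print(build_alert(many))
```

With this split, the loop body reduces to fetching the rows, calling build_alert, and sending the email only when the result is not None.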
Log
Configure logging for this little script in log.py so that logs go to both a file and the console.
# coding=utf-8
import logging

logger = logging.getLogger('mylogger')
logger.setLevel(logging.DEBUG)

# File output
fh = logging.FileHandler('monitor.log')
fh.setLevel(logging.INFO)

# Console output
ch = logging.StreamHandler()
ch.setLevel(logging.INFO)

formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
fh.setFormatter(formatter)
ch.setFormatter(formatter)

logger.addHandler(fh)
logger.addHandler(ch)
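Because the monitor logs a line every 2 minutes for as long as it runs, the log file grows without limit. One possible variation, which is not part of the original script, is to swap FileHandler for the standard library's RotatingFileHandler. In the sketch below, the make_logger helper, the 1 MB size limit, the backup count of 3, and the temporary directory are all assumptions chosen for illustration.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_logger(path, name='mylogger'):
    """Create a logger that writes to `path` and rotates the file by size."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    formatter = logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    # Start a new file after about 1 MB and keep at most 3 old files.
    fh = RotatingFileHandler(path, maxBytes=1024 * 1024, backupCount=3)
    fh.setLevel(logging.INFO)
    fh.setFormatter(formatter)
    logger.addHandler(fh)
    return logger


# A temporary directory keeps the demo from touching the real monitor.log.
log_path = os.path.join(tempfile.mkdtemp(), 'monitor.log')
demo_logger = make_logger(log_path, name='demo_monitor')
demo_logger.info("monitor running")
```

The console handler from log.py can be added to the same logger unchanged.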
So, in the end, the complete monitoring script app_monitor.py looks like this:
# coding=utf-8
import threading
import MySQLdb
from datetime import datetime
import time
import smtplib
from email.mime.text import MIMEText
from log import logger


def get_con():
    host = "127.0.0.1"
    port = 3306
    logsdb = "logsdb"
    user = "root"
    password = "never tell you"
    con = MySQLdb.connect(host=host, user=user, passwd=password, db=logsdb, port=port, charset="utf8")
    return con


def calculate_time():
    # Two minutes before the current time, formatted for MySQL
    now = time.mktime(datetime.now().timetuple()) - 60 * 2
    result = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(now))
    return result


def get_data():
    select_time = calculate_time()
    logger.info("select time:" + select_time)
    sql = ("select file_name, message from logsdb.app_logs_record "
           "where log_time > '" + select_time + "' "
           "and level = 'ERROR' "
           "order by log_time desc")
    conn = get_con()
    cursor = conn.cursor()
    cursor.execute(sql)
    results = cursor.fetchall()
    cursor.close()
    conn.close()
    return results


def send_email(content):
    sender = "sender_monitor@163.com"
    receiver = ["rec01@163.com", "rec02@163.com"]
    host = 'smtp.163.com'
    port = 465
    msg = MIMEText(content)
    msg['From'] = "sender_monitor@163.com"
    msg['To'] = "rec01@163.com,rec02@163.com"
    msg['Subject'] = "system error warning"
    try:
        smtp = smtplib.SMTP_SSL(host, port)
        smtp.login(sender, '123456')
        smtp.sendmail(sender, receiver, msg.as_string())
        logger.info("send email success")
    except Exception as e:
        logger.error(e)


def task():
    while True:
        logger.info("monitor running")
        results = get_data()
        if results is not None and len(results) > 5:
            content = "recharge error:"
            logger.info("a lot of error,so send mail")
            for r in results:
                content += r[1] + '\n'
            send_email(content)
        time.sleep(2 * 60)


def run_monitor():
    # Run the scan loop in its own thread
    monitor = threading.Thread(target=task)
    monitor.start()


if __name__ == "__main__":
    run_monitor()
Run the script
The script runs on the server and is managed with supervisor.
Install supervisor on the server (CentOS 6), then add the following configuration to /etc/supervisor.conf:
[program:app-monitor]
command = python /root/monitor/app_monitor.py
directory = /root/monitor
user = root
Then run supervisord in the terminal to start supervisor.
Run supervisorctl in the terminal to enter its shell, then run status to view the running state of the script.
Summary
The idea behind this small monitor is very clear, and it can be extended further, for example by monitoring specific interfaces or sending SMS notifications.
Because there is a log library, I was spared the trouble of scanning logs in the production environment. Without a log library, you would have to scan the production environment yourself, and in production you must be very careful.