Background: A while ago my leader asked me to find out how heavily each application in one of our systems is actually used in the production environment.
Requirements:
Count the number of clicks on each button;
The analysis must not affect the production environment;
The counts should keep accumulating, not just cover a fixed time window;
The data must be stored permanently and never lost;
Idea: This can be done by analyzing the Nginx logs. Each user action maps one-to-one to a request that the backend Nginx receives, so everything we need can be extracted from the Nginx logs.
Plan:
Analyze the Nginx request logs;
Run the analysis in the early morning so production is not affected;
Drive it with a scheduled task;
Store the results in MongoDB;
Summary: Analyze the logs in the early hours of each day and store the processing results in an offline MongoDB database.
Code implementation:
MongoDB connection code (imported by the scripts below as systemmongo) ↓
# coding=utf-8
# auth: xinsir
# date: 2017/10/02
# version: 3.0
from pymongo import MongoClient
import pickle

# establish the MongoDB connection
client = MongoClient('192.168.1.197', 27017)
# connect to the database we need; nginxlog is the database name
db = client.nginxlog
# connect to the collection (what we usually call a table); recording is the collection name
collection = db.recording

# deserialize data from a file object
def deserialization(name):
    data = pickle.load(name)
    return data

# store a document in MongoDB
def insterdata(data):
    collection.insert(data)

# query whether a link already exists in MongoDB
def sechmongo(link):
    for u in collection.find({'Link': link}):
        return True
    return False

# update the click counter ('cunt') of a link in MongoDB
def update(wherelink):
    data = collection.find_one({'Link': wherelink})
    collection.update({'Link': wherelink}, {'$set': {'cunt': int(data['cunt']) + 1}})
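A side note: the insert() and update() calls above were deprecated in PyMongo 3.x and removed in PyMongo 4.0. Below is a minimal sketch of the same two helpers on a current PyMongo, using the $inc operator so the counter update happens atomically on the server; the address and collection names are the ones from the code above.

# Sketch for PyMongo 3.x/4.x; same database/collection layout as above.
from pymongo import MongoClient

client = MongoClient('192.168.1.197', 27017)
collection = client.nginxlog.recording

def insterdata(data):
    collection.insert_one(data)

def update(wherelink):
    # $inc increments the counter in one atomic server-side operation,
    # so the separate find_one() read is no longer needed. Note that $inc
    # requires 'cunt' to be stored as a number, so the template loader
    # should insert int(a[3]) rather than the raw string.
    collection.update_one({'Link': wherelink}, {'$inc': {'cunt': 1}})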
Template insertion code ↓
# coding=utf-8
# auth: xinsir
# date: 2017/10/02
# version: 3.0
# Import the request-link template: before any log analysis, the template
# of known buttons/links must first be inserted into the MongoDB collection.
import systemmongo

actionlog = './nginxlog/result.txt'

def sech(file):
    sechdic = {}
    with open(file, 'r', encoding='UTF-8') as actionlogfile:
        for line in actionlogfile.readlines():
            # each template line is tab-separated; strip the trailing newline
            a = line.split('\t')
            b = a[-1].split('\n')
            del a[-1]
            a.append(b[0])
            sechdic[a[0]] = {
                'ModuleName': a[0],
                'ButtonName': a[1],
                'Link': a[2],
                'cunt': a[3],
            }
            systemmongo.insterdata(sechdic[a[0]])

if __name__ == '__main__':
    sech(actionlog)
Look at the format of the action template
[Screenshot: the action template file]
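For reference, each line of result.txt is tab-separated in the order the loader reads it: ModuleName, ButtonName, Link, and the initial count. A hypothetical example line (the module, button, and URL are placeholders):

OrderModule	SubmitOrder	http://example.com/order/submit.action	0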
Look at the inserted collection.
[Screenshot: the documents inserted into the MongoDB collection]
Nginx log analysis code ↓
# _*_ coding:utf-8 _*_
# auth: xinsir
# date: 2017/10/02
# version: 3.0
import os
import systemmongo

# Determine whether a file exists; returns a Boolean
def judgment_file(filename):
    if os.path.exists(filename):
        return True
    else:
        return False

# Cut a string into a list and return it
def splitstr(strname, format):
    return strname.split(format)

# Read the log file, pick out every record containing '.action', and write the results to a new file
def read_file(file, new_file):
    readposition = 0
    lastline = 0
    filesize = 0
    filedic = {}
    # Open two files: the log file and a temporary file the result dict is serialized into
    with open(file, 'r') as log_file, open(new_file, 'w') as new_log_file:
        # check the current size of the log file
        filesizenew = os.path.getsize(file)
        log_file.seek(readposition)
        if filesize < filesizenew:
            for (num, line) in enumerate(log_file):
                if '.action' in line.strip():
                    # fields are separated by '" "'; strip the outer quotes of the first and last field
                    new_line = line.strip().split(sep='" "')
                    listfirstvalue = splitstr(new_line[0], '"')
                    listmostvalue = splitstr(new_line[-1], '"')
                    del new_line[0]
                    del new_line[-1]
                    new_line.insert(0, listfirstvalue[1])
                    new_line.append(listmostvalue[0])
                    # '$request' looks like 'POST /xxx/yyy.action?a=1 HTTP/1.1'
                    method = str(new_line[3]).split()[0]
                    request = str(new_line[3]).split()[1]
                    httpversion = str(new_line[3]).split()[2]
                    if '?' in request:
                        uri = request.split(sep='?')[0]
                        query_string = request.split(sep='?')[1]
                    else:
                        uri = request
                        query_string = ''
                    if '.action' in uri:
                        filedic[num + 1 + lastline] = {
                            'remote_addr': new_line[0],
                            'host': new_line[1],
                            'time_local': new_line[2],
                            'request': request,
                            'uri': uri,
                            'query_string': query_string,
                            'method': method,
                            'httpversion': httpversion,
                            'body_bytes_sent': new_line[4],
                            'http_referer': new_line[5],
                            'http_user_agent': new_line[6],
                            'http_x_forwarded_for': new_line[7],
                            'server_addr': new_line[8],
                            'status': new_line[9],
                            'request_time': new_line[10],
                        }
                    else:
                        print('Static request, not recorded!')
                        continue
                    link = 'http://' + filedic[num + 1 + lastline]['host'] + filedic[num + 1 + lastline]['uri']
                    # only update the counter if the link was inserted from the template
                    if systemmongo.sechmongo(link):
                        systemmongo.update(link)
                        print('Update record succeeded: ' + link)
                    else:
                        print('This action request is not in the template, not recorded!')
                else:
                    print('Static request, not handled!')
        else:
            print('Log file has not changed!')
        new_log_file.write(str(filedic))

if __name__ == '__main__':
    logpath_new = './nginxlog/b.txt'
    # walk all log files under the log directory
    for fpathe, dirs, fs in os.walk(r'E:\python project\Jlj-nginx-log-web\jlj-nginx-log-web\nginxlog\log'):
        for f in fs:
            logpath = os.path.join(fpathe, f)
            read_file(logpath, logpath_new)
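To make the field splitting above concrete, here is a tiny standalone demo on one hypothetical log line in the log format defined further below (the IP, host, and timings are made up):

# Minimal demo of the same '" "'-based splitting on a made-up sample line.
sample = '"1.2.3.4" "example.com" "28/Sep/2017:15:04:05 +0800" "POST /app/click.action?x=1 HTTP/1.1" "312" "-" "Mozilla/5.0" "-" "10.0.0.2" "200" "0.005"'

fields = sample.strip().split('" "')
fields[0] = fields[0].lstrip('"')    # remove the leading quote of the first field
fields[-1] = fields[-1].rstrip('"')  # remove the trailing quote of the last field

method, request, httpversion = fields[3].split()
uri, _, query_string = request.partition('?')
print(method, uri, query_string)     # POST /app/click.action x=1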
The logs are fetched with a Linux crontab job that scps the production Nginx logs over every morning. One detail is worth calling out here: our Nginx traffic is heavy, with peaks of 3,000+ concurrent connections, so a single day's log can run to 800-900 MB. An earlier version of the script opened the file once and read every row into memory, which exhausted the machine's RAM, so the Nginx logs are now rotated into date-and-hour directories.
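A hypothetical crontab entry for the copy step (the user, host, and paths are placeholders, not the real production values):

# pull the previous day's rotated logs from the production host at 01:00
0 1 * * * scp -r nginx@prod-host:/disk/log/nginx/$(date -d yesterday +\%Y/\%m/\%d) /data/nginxlog/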
Here is the Nginx log rotation script:
# cat /etc/nginx/sbin/cut_log_hour.sh
#!/bin/bash
# nginx log path
log_dir="/disk/log/nginx"
# current time, e.g. 2017/09/28/15
date_dir=`date +%Y/%m/%d/%H`
# create the time directory
/bin/mkdir -p ${log_dir}/${date_dir} >/dev/null 2>&1
# move the current log into the corresponding directory and rename it
/bin/mv ${log_dir}/access.log ${log_dir}/${date_dir}/access.log
# tell nginx to reopen a new log file
kill -USR1 `cat /var/run/nginx.pid`
Here is the Nginx log format: each field is wrapped in double quotes, which makes the backend log parsing much easier.
log_format main '"$remote_addr" "$host" "$time_local" "$request" "$body_bytes_sent" "$http_referer" "$http_user_agent" "$http_x_forwarded_for" "$server_addr" "$status" "$request_time"';
This is the first time I have written up code like this, so if anything is unclear or the ideas seem messy, please leave your pointers in the comment section. Everyone is also welcome to add my QQ so we can learn together.
qq:894747821
This article is from the "Learning to change the Destiny" blog, please make sure to keep this source http://xinsir.blog.51cto.com/5038915/1970198