Comparing awk and Python: [File] web log statistics and [Command] netstat status statistics

Source: Internet
Author: User

Statistical analysis of a web log file and of netstat command-line output:
a comparison of how awk and Python handle each task.

1. Web log content: file form

[email protected]:/var/log/nginx# cat access.log
192.168.1.3 - - [04/Feb/2018:19:59:42 +0800] "GET / HTTP/1.1" 200 396 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36"
192.168.1.3 - - [04/Feb/2018:19:59:43 +0800] "GET /favicon.ico HTTP/1.1" 404 208 "http://192.168.1.111/" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36"
192.168.1.3 - - [04/Feb/2018:19:59:54 +0800] "GET /png HTTP/1.1" 404 208 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36"
192.168.1.3 - - [04/Feb/2018:19:59:58 +0800] "GET /a.jpg HTTP/1.1" 404 208 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36"
192.168.1.3 - - [04/Feb/2018:20:06:56 +0800] "GET /a.jpg HTTP/1.1" 404 208 "-" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36"

1.1 Processing with awk
Here $7 is the requested file name and $10 is the response size. a[$7]+=$10 accumulates the size for each file name. An awk array can be thought of as a Python dictionary.
++b[$7] counts the number of requests for each file name.
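
The field numbers can be checked with a short Python sketch. It splits one request line from the sample log on whitespace, which is also awk's default behaviour. The variable names are illustrative, and the user-agent string is shortened for readability. awk fields are 1-based, so $7 corresponds to index 6 and $10 to index 9.

```python
# One request line from the sample access.log (user agent shortened).
line = ('192.168.1.3 - - [04/Feb/2018:19:59:43 +0800] '
        '"GET /favicon.ico HTTP/1.1" 404 208 "http://192.168.1.111/" '
        '"Mozilla/5.0 (Windows NT 6.1; Win64; x64)"')

fields = line.split()      # whitespace splitting, like awk's default FS

path = fields[6]           # awk $7 (awk fields are 1-based)
size = int(fields[9])      # awk $10

print(path, size)          # /favicon.ico 208
```

Because the timestamp contains a space, it occupies two fields ($4 and $5), which is why the path ends up in $7 rather than $6.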

[email protected]:/var/log/nginx# awk '{a[$7]+=$10;++b[$7];total+=$10}END{for(x in a)print b[x],x,a[x]}' access.log
1 /png 208
2 /a.jpg 416
1 / 396
1 /favicon.ico 208

1.2 Processing with Python
collections.Counter is used here. A key that has not been seen yet counts as 0, so values can be accumulated directly without initialization. The accumulation logic is essentially the same as in awk.
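
A minimal sketch of that default-to-zero behaviour, using (path, size) pairs taken from the sample log in section 1. The names hits, size and samples are illustrative and are not part of the original script.

```python
from collections import Counter

hits = Counter()           # number of requests per path
size = Counter()           # total bytes per path

# (path, bytes) pairs taken from the sample log in section 1
samples = [('/', 396), ('/favicon.ico', 208), ('/png', 208),
           ('/a.jpg', 208), ('/a.jpg', 208)]

for path, nbytes in samples:
    hits[path] += 1        # a missing key counts as 0, so no initialization
    size[path] += nbytes

print(hits['/a.jpg'], size['/a.jpg'])   # 2 416
print(hits['/missing'])                 # 0, and no KeyError is raised
```

Looking up a missing key returns 0 without storing it, so the lookup in the last line does not add '/missing' to the counter.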

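#!/usr/bin/env python3
# author infaaf
from collections import Counter

c = Counter()   # number of requests per file name
s = Counter()   # total size per file name

with open('access.log') as f:
    for line in f:
        fields = line.split()
        key = fields[6]      # file name (awk $7)
        value = fields[9]    # size (awk $10)
        c[key] += 1
        s[key] += int(value)

print("Count:  %s" % c)
print("Size:  %s" % s)
for i in c:
    print(i.center(30), c[i], s[i])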

Results

2. TCP status statistics: command and pipe form

Here tcp6 and tcp connections are counted together.

2.1 awk Processing
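NF is the number of fields in the current line, so $NF refers to the last field, that is, the last column.
(In awk's built-in variable names, N stands for number, F for field, R for record (row) and S for separator. For example, RS is the record separator and NR is the number of records read so far.)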
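
For comparison, a small Python sketch of what NF and $NF mean. The netstat line below is a hypothetical example written for illustration, not captured output.

```python
# A typical "netstat -an" TCP line (hypothetical values).
line = "tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN"

fields = line.split()
nf = len(fields)           # awk NF: number of fields in the line
last = fields[-1]          # awk $NF: the last field

print(nf, last)            # 6 LISTEN
```

Using the last field rather than a fixed index means the state is still found if the number of columns differs between lines.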

[email protected]:~# netstat -an | awk '/^tcp/{++s[$NF]}END{for(i in s){print i,s[i]}}'
LISTEN 6
ESTABLISHED 1

2.2 Python Processing
Counter is not used here, so the code has to check whether the key has already appeared. If it has not, the key is initialized to 1 manually.
The script receives the command's standard output through a Linux pipe and reads it with the fileinput library.

File net.py

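#!/usr/bin/env python3
import fileinput

d = {}
for line in fileinput.input():
    fields = line.split()
    # Skip blank lines; match both tcp and tcp6
    if fields and fields[0].startswith('tcp'):
        key = fields[5]          # connection state column
        if key in d:
            d[key] += 1
        else:
            d[key] = 1           # first occurrence: initialize to 1

# Print each state with its count (assumed output step)
for key in d:
    print(key, d[key])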

Call net.py on the server through a pipe, for example: netstat -an | python3 net.py

Summary: the two approaches follow the same idea and both rely on a dictionary-like structure (an awk array or a Python dictionary).
Number of occurrences: each time the same element is found, add 1.
Size: each time the same element is found, add its size.
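
The counting pattern for the TCP states can also be written with Counter instead of a manually initialized dictionary. This is a minimal sketch, not the author's script: the function name count_states and the sample lines are hypothetical, and the input is assumed to look like netstat -an output.

```python
from collections import Counter

def count_states(lines):
    """Count TCP connection states, treating tcp and tcp6 alike."""
    states = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and non-TCP lines (headers, udp, unix sockets).
        if fields and fields[0].startswith('tcp'):
            states[fields[-1]] += 1    # last column, like awk's $NF
    return states

# Hypothetical sample input; real data would come from "netstat -an".
sample = [
    "Proto Recv-Q Send-Q Local Address           Foreign Address         State",
    "tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN",
    "tcp6       0      0 :::80                   :::*                    LISTEN",
    "tcp        0     36 192.168.1.111:22        192.168.1.3:50123       ESTABLISHED",
    "udp        0      0 0.0.0.0:68              0.0.0.0:*",
]

for state, n in count_states(sample).items():
    print(state, n)
```

Passing fileinput.input() instead of sample would let the same function read piped input.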

awk wins on brevity. In complex cases, however, its statements become hard to read, and the syntax is easy to forget after a long time without using it.
Python wins on simplicity and clarity. The code is slightly longer but easier to maintain.
