Daily Shell Scripting Exercises (02)

Source: Internet
Author: User

1. Problem

The following is an excerpt from a log file named 1.log:

112.111.12.248 - [25/Sep/2013:16:08:31 +0800]formula-x.haotui.com  "/seccode.php?update=0.5593110133088248" 200"http://formula-x.haotui.com/registerbbs.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;)"
61.147.76.51 - [25/Sep/2013:16:08:31 +0800]xyzdiy.5d6d.com "/attachment.php?aid=4554&k=9ce51e2c376bc861603c7689d97c04a1&t=1334564048&fid=9&sid=zgohwYoLZq2qPW233ZIRsJiUeu22XqE8f49jY9mouRSoE71" 301"http://xyzdiy.×××thread-1435-1-23.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)"

Count the number of requests made by each IP address.

2. Problem analysis

The log shows that the IP address is the first field of every line, so the task is to extract the first field of 1.log and then count how many times each IP appears.

Use awk to extract the first field. To count the requests per IP, first sort the addresses with the sort command so that identical ones sit next to each other, then count each group with uniq.
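
The steps above can be tried out one at a time on a tiny log. The following is only a sketch: the file name sample.log, the IP addresses, and the host names are made up for illustration and do not come from 1.log.

```shell
# Build a small, made-up log in a temporary directory (not the real 1.log).
tmpdir=$(mktemp -d)
log="$tmpdir/sample.log"
printf '%s\n' \
  '10.0.0.1 - [25/Sep/2013:16:08:31 +0800] a.example.com "/index.php" 200' \
  '10.0.0.2 - [25/Sep/2013:16:08:32 +0800] a.example.com "/index.php" 200' \
  '10.0.0.1 - [25/Sep/2013:16:08:33 +0800] b.example.com "/login.php" 301' \
  > "$log"

# Step 1: keep only the first whitespace-separated field (the IP address).
ips=$(awk '{print $1}' "$log")
printf 'Extracted IPs:\n%s\n' "$ips"

# Step 2: sort so identical IPs are adjacent, then count each group.
counts=$(printf '%s\n' "$ips" | sort | uniq -c)
printf 'Requests per IP:\n%s\n' "$counts"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

Each line of the final output is a count followed by an IP; the amount of leading padding before the count depends on the uniq implementation.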

3. Specific Shell commands

A single shell command line is enough for this problem:

awk '{print $1}' 1.log | sort -n | uniq -c | sort -n

Explanation:

    1. The awk command is well suited to splitting lines into fields. Here {print $1} prints the first field. A delimiter can be specified with -F; if none is given, awk splits on whitespace (spaces and tabs). In this log, the IP address is the first field.
    2. The sort command sorts the lines, and the -n option makes it compare them numerically. Without -n, lines are compared as strings, character by character. Note that -n only looks at the leading number of each line, so it does not order IP addresses octet by octet, but identical addresses still end up next to each other, which is all the next step needs.
    3. The uniq command removes duplicates: when several adjacent lines are identical, it keeps only one of them. The -c option prefixes each line with the number of times it occurred, so uniq -c produces exactly the request count per IP. Because uniq compares only adjacent lines, its input must be sorted first.
    4. The final sort -n sorts by request count in ascending order, so the IPs with the most requests appear last. Adding the -r option (sort -nr) reverses the order and puts the busiest IPs first.
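
Points 3 and 4 can be checked with a short experiment. This is a minimal sketch using a made-up list of IPs rather than the real log; the variable names are chosen only for illustration.

```shell
# Made-up IPs in arrival order; no two identical addresses are adjacent.
ips='10.0.0.1
10.0.0.2
10.0.0.1
10.0.0.3
10.0.0.1
10.0.0.2'

# Without sorting, uniq merges nothing, because it only compares adjacent lines.
unsorted_groups=$(printf '%s\n' "$ips" | uniq -c | awk 'END {print NR}')

# After sorting, each distinct IP collapses into a single counted line.
sorted_groups=$(printf '%s\n' "$ips" | sort | uniq -c | awk 'END {print NR}')

# sort -nr orders by count, largest first, so the first line is the busiest IP.
top_ip=$(printf '%s\n' "$ips" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}')

echo "groups without sort: $unsorted_groups"
echo "groups with sort:    $sorted_groups"
echo "busiest IP:          $top_ip"
```

Piping the sort -nr result through head is a convenient way to list only the top few IPs when the log is large.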

4. Conclusion

There is another way to solve this problem; it will be posted tomorrow.
