awk array counting and deduplication: counting domain name visits

Source: Internet
Author: User

1. awk arrays

Suppose we have a hotel.

Hotel <===> greenhotel

The hotel has several rooms: 515, 516, 517 and 519.

Room 515 <===> greenhotel[515]
Room 516 <===> greenhotel[516]
Room 517 <===> greenhotel[517]
Room 519 <===> greenhotel[519]

Guests stay in the hotel rooms.

xiaowei stays in room 515 <===> greenhotel[515]="xiaowei"
dakai stays in room 516 <===> greenhotel[516]="dakai"
xiaoguangdong stays in room 517 <===> greenhotel[517]="xiaoguangdong"
dabaojian stays in room 519 <===> greenhotel[519]="dabaojian"

In other words, the array name is the hotel, each subscript is a room number, and each value is the guest staying in that room.

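Example:

$ awk 'BEGIN{greenhotel[515]="xiaowei"; greenhotel[516]="dakai"; greenhotel[517]="xiaoguangdong"; greenhotel[519]="dabaojian"; for (hotel in greenhotel) print hotel, green[hotel]}'
516
517
519
515

Only the room numbers are printed, because green[hotel] refers to a different array that is empty. Printing greenhotel[hotel] shows the guests as well:

$ awk 'BEGIN{greenhotel[515]="xiaowei"; greenhotel[516]="dakai"; greenhotel[517]="xiaoguangdong"; greenhotel[519]="dabaojian"; for (hotel in greenhotel) print hotel, greenhotel[hotel]}'
516 dakai
517 xiaoguangdong
519 dabaojian
515 xiaowei

The for (hotel in greenhotel) loop visits the subscripts in an order that awk does not guarantee, which is why 515 appears last here.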
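Because awk does not promise any particular order when looping over an array with for (key in array), the listing can come out shuffled. A minimal sketch of one way to get a stable listing, assuming a standard sort command is available, is to pipe the output through sort -n. The variable name rooms is only illustrative.

```shell
# Build the same hotel array as in the example, then sort the lines
# numerically by room number so the order no longer depends on awk.
rooms=$(awk 'BEGIN {
    greenhotel[515] = "xiaowei"
    greenhotel[516] = "dakai"
    greenhotel[517] = "xiaoguangdong"
    greenhotel[519] = "dabaojian"
    # for (key in array) visits every subscript, in no guaranteed order
    for (room in greenhotel)
        print room, greenhotel[room]
}' | sort -n)
printf '%s\n' "$rooms"
```

With the sort step the rooms are listed from 515 up to 519 regardless of the awk implementation.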

Enterprise interview question 1: Count the number of domain name visits

Process the following file: extract the domain names, count how many times each one appears, and sort the results by that count (a Baidu and Sohu interview question).

http://www.etiantian.org/index.html
http://www.etiantian.org/1.html
http://post.etiantian.org/index.html
http://mp3.etiantian.org/index.html
http://www.etiantian.org/3.html
http://post.etiantian.org/2.html

Ideas:
1) Use the slash as the field separator and take the second column (the domain name)
2) Create an array
3) Use the second column (the domain name) as the array subscript
4) Count with a form similar to i++
5) Print the results once counting is finished

Process demo:
Step one: Look at the content

$ awk -F "[/]+" '{print $2}' file
www.etiantian.org
www.etiantian.org
post.etiantian.org
mp3.etiantian.org
www.etiantian.org
post.etiantian.org

Command description: this is the content we need to count.
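To see why the domain name ends up in the second column, it may help to print every field of one line. The sketch below splits a single URL from the sample file with the same "[/]+" separator (one or more slashes). The variable name fields is only illustrative.

```shell
# Split one sample URL on runs of slashes and number each resulting field.
fields=$(printf '%s\n' 'http://www.etiantian.org/index.html' |
    awk -F '[/]+' '{ for (n = 1; n <= NF; n++) print n ": " $n }')
printf '%s\n' "$fields"
```

The two slashes after "http:" are treated as a single separator, so field 1 is "http:", field 2 is the domain name and field 3 is the page.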

Step two: Count

$ awk -F "[/]+" '{i++; print $2, i}' file
www.etiantian.org 1
www.etiantian.org 2
post.etiantian.org 3
mp3.etiantian.org 4
www.etiantian.org 5
post.etiantian.org 6

Command description: i++: i is empty at first, and each time awk reads a line, i increases by 1.
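The counter works without any setup because an awk variable that was never assigned is empty as a string and 0 as a number. A small sketch of that behaviour, using made-up three-line input rather than the sample file:

```shell
# Before any assignment, i prints as an empty string and counts as 0.
initial=$(awk 'BEGIN { print "[" i "]", i + 0 }')
# i++ runs once per input line, so three lines leave i at 3.
total=$(printf 'a\nb\nc\n' | awk '{ i++ } END { print i }')
printf 'before any input: %s\n' "$initial"
printf 'after three lines: %s\n' "$total"
```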

Step three: Replace i with an array

$ awk -F "[/]+" '{h[$2]++; print $2, h["www.etiantian.org"]}' file
www.etiantian.org 1
www.etiantian.org 2
post.etiantian.org 2
mp3.etiantian.org 2
www.etiantian.org 3
post.etiantian.org 3

Command description:
1) Replace i with h[$2]. This creates an array h[] and uses $2 as the room number, so the rooms are h["www.etiantian.org"], h["post.etiantian.org"] and h["mp3.etiantian.org"]. At first every room is empty.
2) h[$2]++ works like i++: it starts putting things into the rooms, and each time the same domain appears again, the count in that room increases by 1.
3) print h["www.etiantian.org"] prints the count of that one room. It only increases on lines whose domain is www.etiantian.org, so the last number printed is 3.
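Step three always prints the same room. As a variation, printing h[$2] instead shows each domain's own running count, which makes it easier to see that every subscript keeps a separate counter. This is a sketch with the sample URLs held in a shell variable (urls and running are illustrative names) instead of a file.

```shell
# The six sample URLs from the question.
urls='http://www.etiantian.org/index.html
http://www.etiantian.org/1.html
http://post.etiantian.org/index.html
http://mp3.etiantian.org/index.html
http://www.etiantian.org/3.html
http://post.etiantian.org/2.html'

# Increment the counter for the current domain, then print that same counter.
running=$(printf '%s\n' "$urls" |
    awk -F '[/]+' '{ h[$2]++; print $2, h[$2] }')
printf '%s\n' "$running"
```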

Step four: Print the final counts

$ awk -F "[/]+" '{h[$2]++} END{for (i in h) print i, h[i]}' file
mp3.etiantian.org 1
post.etiantian.org 2
www.etiantian.org 3

Command description: What we ultimately need is the count for each domain after deduplication, so the output goes in the END block. for (i in h) traverses the array, with i holding each room number in turn. print i, h[i] prints each room number and the content of the room (the count).
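The question also asks for the result to be sorted by count, and awk's for (i in h) loop does not guarantee any order. One possible way to finish the task, assuming a standard sort command, is to sort on the second column numerically in descending order. The names urls and ranked are illustrative, and the URLs are held in a shell variable instead of a file.

```shell
# The six sample URLs from the question.
urls='http://www.etiantian.org/index.html
http://www.etiantian.org/1.html
http://post.etiantian.org/index.html
http://mp3.etiantian.org/index.html
http://www.etiantian.org/3.html
http://post.etiantian.org/2.html'

# Count per domain, then sort by column 2 (the count), highest first;
# ties would fall back to the domain name in column 1.
ranked=$(printf '%s\n' "$urls" |
    awk -F '[/]+' '{ h[$2]++ } END { for (i in h) print i, h[i] }' |
    sort -k2,2nr -k1,1)
printf '%s\n' "$ranked"
```

The most visited domain, www.etiantian.org with 3 visits, comes first, followed by post.etiantian.org with 2 and mp3.etiantian.org with 1.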

