Getting Started with Shell awk

Source: Internet
Author: User

awk: a useful data-processing tool

awk is also a great data-processing tool! Whereas sed usually operates on a whole line at a time, awk prefers to split each line into several fields (columns) and work on those. That makes awk well suited to small data-processing jobs. awk is typically run like this:

# awk 'condition1 {action1} condition2 {action2} ...' filename


awk is followed by a pair of single quotes, and inside them braces {} enclose the action you want to perform on the data. awk can process the files named after the program, or it can read the standard output of a preceding command through a pipe. As mentioned earlier, awk mainly deals with the fields within each line, and the default field separator is a space or a [tab]! For example, let's use last to pull out the login records; the result looks like this:

# last -n 5    <== show only the first five lines
root     pts/1   192.168.1.100  Tue Feb 10 11:21   still logged in
root     pts/1   192.168.1.100  Tue Feb 10 00:46 - 02:28  (01:41)
root     pts/1   192.168.1.100  Mon Feb  9 11:41 - 18:30  (06:48)
dmtsai   pts/1   192.168.1.100  Mon Feb  9 11:41 - 11:41  (00:00)
root     tty1                   Fri Sep  5 14:09 - 14:10  (00:01)

If I want to pull out the account and the login IP, with a [tab] between them, it goes like this:

# last -n 5 | awk '{print $1 "\t" $3}'
root    192.168.1.100
root    192.168.1.100
root    192.168.1.100
dmtsai  192.168.1.100
root    Fri


The example above is the most common use of awk: listing field data with the print action! Fields are split on spaces or [tab]s. Since I want to process every line regardless of its content, no "condition type" is needed. What I wanted was the first and third columns, but the fifth line of the output looks odd. That is a data-format problem: the tty1 line has no IP column, so the day of the week ends up as its third field. So when you use awk, check your data first! A value that is meant to be a single field must not contain spaces or [tab]s, or it will be misread just as in this example.
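The column shift can be reproduced without depending on the local login history. The following is a minimal sketch that feeds awk two fixed lines modeled on the last output; the variable names sample and result are made up for illustration, and the day-of-month value in the first line is an assumption.

```shell
# Two sample lines modeled on `last` output (illustrative data only).
# The second line has no IP column, so its fields shift left.
sample='root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in
root tty1 Fri Sep 5 14:09 - 14:10 (00:01)'

# Print field 1 and field 3 of every line, separated by a tab.
result=$(printf '%s\n' "$sample" | awk '{print $1 "\t" $3}')
printf '%s\n' "$result"
```

On the second line awk reports Fri as the third field, because it counts fields purely by position and knows nothing about the missing IP.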

You can also see from the example above that each field in a line has a variable name: $1, $2, and so on. In the example, root is $1 because it is the first column, and 192.168.1.100 is $3 because it is the third column; the rest follow the same way. There is one more variable: $0, which stands for the whole line. In the example above, $0 for the first line is the entire "root ..." line. So for the five lines above, awk's whole processing flow is:

    1. Read the first line and fill its data into the variables $0, $1, $2, and so on;
    2. Check the "condition type" to decide whether the following "action" needs to be performed;
    3. Work through all the condition types and actions;
    4. If more lines of data remain, repeat steps 1 to 3 until all the data has been read.
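The steps above can be watched in action with a small sketch. It assumes three made-up input lines and two condition/action pairs; the comparison on the account name is only an illustrative condition, and the names sample and result are hypothetical.

```shell
# Illustrative input: three short login-style lines.
sample='root pts/1 192.168.1.100
dmtsai pts/1 192.168.1.100
root tty1'

# For each line awk fills $0 (whole line) and $1, $2, ... (fields),
# then tests every condition in order and runs the matching actions.
# The first pair has no condition, so its action runs for every line.
result=$(printf '%s\n' "$sample" | awk '
    { print "line=[" $0 "] first=[" $1 "]" }
    $1 == "dmtsai" { print "  -> matched dmtsai" }
')
printf '%s\n' "$result"
```

The extra "matched" line appears only after the second input line, which shows that both pairs are evaluated once per line before awk moves on to the next one.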

From these steps you can see that awk processes one line at a time, and that the field is its smallest unit of processing. So how does awk know which line it is on, and how many columns that line has? That is where awk's built-in variables come in.

Variable name   Meaning
NF              Total number of fields in each line ($0)
NR              The number of the line awk is currently processing
FS              The current field separator; the default is a space
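As a hedged illustration of FS, the sketch below splits colon-separated records instead of space-separated ones. The sample records are hypothetical (written in the style of /etc/passwd), and FS is assigned in a BEGIN block, which awk runs before it reads any input, so the new separator already applies to the first line.

```shell
# Hypothetical colon-separated records in the style of /etc/passwd.
sample='root:x:0:0:root:/root:/bin/bash
dmtsai:x:1000:1000::/home/dmtsai:/bin/bash'

# Set FS before any input is read, then print the line number,
# the field count, and the first field of each line.
result=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = ":" } { print NR, NF, $1 }')
printf '%s\n' "$result"
```

With a single-character separator such as the colon, an empty field still counts, so both records report the same number of fields.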

Continuing with the last -n 5 example above, suppose I want to:

    • List the account on each line (that is, $1);
    • List the number of the line currently being processed (that is, the NR variable in awk);
    • Show how many columns that line has (that is, the NF variable in awk).

You can do this:

Tips:
Note that the actions passed to awk are all enclosed in single quotes ('). Because single and double quotes must each be paired, any literal (non-variable) text in awk's output, including the format strings for printf mentioned in the previous section, must be enclosed in double quotes! The single quotes are already taken by the awk program itself.


# last -n 5 | awk '{print $1 "\t lines:" NR "\t columns:" NF}'
root     lines:1         columns:10
root     lines:2         columns:10
root     lines:3         columns:10
dmtsai   lines:4         columns:10
root     lines:5         columns:9

Note that NR, NF and the other awk variables are written in uppercase and do not take a leading $ sign!
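A reproducible sketch of such a report, using fixed sample lines instead of live last output, might look like this. The data is illustrative (the day-of-month in the first line is an assumption), and sample and result are hypothetical names.

```shell
# Three sample lines modeled on `last` output; the tty1 line has one field fewer.
sample='root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in
dmtsai pts/1 192.168.1.100 Mon Feb 9 11:41 - 11:41 (00:00)
root tty1 Fri Sep 5 14:09 - 14:10 (00:01)'

# Literal text sits in double quotes inside the single-quoted awk program.
# NR counts lines read so far; NF is recomputed for every line.
result=$(printf '%s\n' "$sample" | awk '{print $1 "\t lines:" NR "\t columns:" NF}')
printf '%s\n' "$result"
```

NR rises by one with every line read, while NF depends only on the line currently being processed, which is why the tty1 line reports a smaller column count.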
So, can you tell NR and NF apart? Remember that $0 represents the whole line, while $1 represents the first field. OK, next let's talk about the so-called "condition type".
