A small example that uses sed and awk to process text

Last Update: 2015-04-29 Source: Internet

Author: User

Over the past two days I ran into a problem in the homework for my Linux operating system course. I found it interesting, and it is a good test of awk proficiency, so I am sharing it here.

First, the question is as follows:

RECORD
#This is the redundant comment line one
#record_type students
#This is the redundant comment line two
F sno 1111111
F name Wang tie egg
F gender Male
F age 20
F class network project 01
F region Hubei Province Wuhan city
.
RECORD
#This is a redundant comment line one
#record_type scores
#This is a redundant comment line two
F sno 1205110606
F mathematics 92
F english 88
F chinese 86
F history 79
F politics 83

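Then we need to use sed and awk to turn it into a report in which every record opens with a header line such as Record(1) "students", lists each field on its own tab-indented line as a "name"="value" pair, and closes with a line such as end of Record(1).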

How can we do this? First, we need to remove the redundant comment lines. That is easy with a sed regular expression: if a line contains a # that is not immediately followed by the character r, delete the line. The command is: sed '/#[^r].*/d' recordmda.txt

In this way, the redundant comment lines can be removed.
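As a quick check of that filter, the sketch below runs the same sed expression on a shortened copy of the input. The sample text is abridged from the question, and the variable names sample and cleaned are only for illustration.

```shell
# Abridged sample input; the real data lives in recordmda.txt.
sample='RECORD
#This is the redundant comment line one
#record_type students
#This is the redundant comment line two
F sno 1111111
.'

# Delete every line in which "#" is followed by a character other than "r".
cleaned=$(printf '%s\n' "$sample" | sed '/#[^r].*/d')
printf '%s\n' "$cleaned"
```

Only the two comment lines disappear; the #record_type line survives because its # is followed by r.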

Now for the main part: processing the text with awk.

From the expected result we can see that awk has to read the input as two records. The record separator RS therefore can no longer be the newline; it has to be the period (".") that separates the records in the original text. At the same time, each line inside a record should become one field, so the field separator FS has to be changed to the newline. These are the two assignments the awk BEGIN block has to make: RS = "." and FS = "\n".
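The effect of the two separators can be tried in isolation. The following is a minimal sketch on an abridged input; the variable names and the wording of the summary line are made up for this illustration and are not part of the homework solution.

```shell
# Two abridged records separated by a "." line, as in the question.
sample='RECORD
#record_type students
F sno 1111111
.
RECORD
#record_type scores
F mathematics 92'

# With RS="." each record ends at the period; with FS="\n" each line is one field.
summary=$(printf '%s\n' "$sample" | awk '
BEGIN {
    RS = "."
    FS = "\n"
}
{
    # Count only non-empty fields: a newline next to the "." leaves an empty one.
    n = 0
    for (i = 1; i <= NF; i++)
        if ($i != "") n++
    print "record " NR " has " n " lines"
}')
printf '%s\n' "$summary"
```

This should report two records of three lines each, which confirms that NR now counts RECORD blocks rather than lines.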

Next we have to work out how to print the opening line, Record(1) "students". The number in the parentheses is the current record number (NR), and the students at the end is taken from the #record_type line. How do we get it? It is actually very simple.

Use a for loop over the fields of the record and test each one. If a field starts with #, it is the #record_type line, and we process it with the gsub function.

What is the gsub function? It is a replacement function: it searches the third argument for the regular expression given as the first argument and replaces every match with the second argument. In this example it looks for the text that starts with # and ends with a space and replaces it with an empty string, so it effectively works as a delete. After this step, #record_type students has become students, and we can print it. Note that in a print statement, strings and variables written next to each other are concatenated, so the second print argument outputs "students", including the double quotes.
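To see the replacement in action, here is a small sketch that applies gsub to a single #record_type line and prints the header. It works on $0 instead of a loop field, purely to keep the example short; the variable name header is illustrative.

```shell
# gsub(regex, replacement, target) edits target in place.
# The greedy /#.* / removes everything from "#" through the last space.
header=$(printf '%s\n' '#record_type students' | awk '
{
    gsub(/#.* /, "", $0)
    # The comma inserts a space (OFS); adjacent strings are concatenated.
    print "Record(" NR ")", "\"" $0 "\""
}')
printf '%s\n' "$header"
```

The result is the line Record(1) "students".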

After the header is printed, we print the body of the record. Every content line starts with "F", so that is what we test for.

First, check whether the field starts with F. If it does, process it.

The processing uses the split function, which splits the string $i at the separator given as the third argument (here a space) and stores the pieces in the array named by the second argument. We then print the second and third pieces as a "name"="value" pair.
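The split step can also be checked on one line. This is a sketch under the assumption that the line has the shape F name value; the array name A matches the full script, while the variable pair is only for illustration.

```shell
# split(string, array, separator) returns the number of pieces;
# awk arrays are 1-based, so A[1] is "F", A[2] the name and A[3] the value.
pair=$(printf '%s\n' 'F english 88' | awk '
/^F / {
    n = split($0, A, " ")
    print "\t\"" A[2] "\"=\"" A[3] "\""
}')
printf '%s\n' "$pair"
```

The line comes out as a tab followed by "english"="88". A value that itself contains spaces, such as Wang tie egg, is spread over A[3], A[4] and so on, so only its first word is printed by this form.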

After the for loop finishes, print the end of Record line, and the task is done.

Finally, here is the complete script; I hope it helps:

#!/bin/bash
# History:
# Michael    April 27, 2015

sed '/#[^r].*/d' recordmda.txt | \
awk '
BEGIN {
    RS = "."     # records are separated by a period
    FS = "\n"    # each line of a record is one field
}
{
    for (i = 1; i <= NF; i++)
    {
        if ($i ~ /^#/)
        {
            gsub(/#.* /, "", $i)
            print "Record(" NR ")", "\"" $i "\""
        }
        else if ($i ~ /^F/)
        {
            split($i, A, " ")
            print "\t" "\"" A[2] "\"" "=" "\"" A[3] "\""
        }
    }
    print "end of Record(" NR ")"
}
'

# gsub searches the third argument for the regular expression given as the
# first argument and replaces each match with the second argument. In this
# example it deletes everything up to and including the space in the string
# "#record_type scores".

# split cuts the first argument at the separator given as the third argument
# and stores the pieces in the array named by the second argument.

This article is an English version of an article originally written in Chinese on aliyun.com.
