A small example of using sed and awk to process text

Source: Internet
Author: User


Over the past two days I ran into a problem in the homework for my Linux operating system course. I found it interesting and a good test of awk proficiency, so I am sharing it here.

 

First, the question is as follows:

    RECORD
    # This is the redundant comment line one
    # record_type students
    # This is the redundant comment line two
    F sno 1111111
    F name Wang tie egg
    F gender Male
    F age 20
    F class network project 01
    F region Hubei Province Wuhan city
    .
    RECORD
    # This is a redundant comment line one
    # record_type scores
    # This is a redundant comment line two
    F sno 1205110606
    F mathematics 92
    F english 88
    F chinese 86
    F history 79
    F politics 83

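Then we need to use sed and awk to turn it into a report of the following shape: each record starts with a header line such as Record (1) "students", each F line becomes a tab-indented "key"="value" line, and each record closes with a line such as end of Record (1).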

  

  

How can we do this? First, we need to remove the redundant comment lines. That part is easy, because a sed regular expression can pick them out: if a line starts with # and the first character after the # (ignoring an optional space) is not r, delete the line. The specific command is sed '/^# *[^ r]/d' recordmda.txt, where recordmda.txt is the input file.

  

In this way, the redundant comment lines can be removed.
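As a minimal sketch of this step, the deletion can be tried on a few lines of input. The function name strip_comments and the shortened sample lines are made up for illustration; the pattern assumes that only the record_type line has r as its first character after the #.

```shell
# Sketch: delete comment lines but keep the "# record_type ..." line.
# strip_comments is a hypothetical helper name used only for this illustration.
strip_comments() {
    # ^#     the line starts with a hash
    #  *     optionally followed by spaces
    # [^ r]  then a character that is neither a space nor "r"
    sed '/^# *[^ r]/d'
}

printf '%s\n' \
    'RECORD' \
    '# This is the redundant comment line one' \
    '# record_type students' \
    '# This is the redundant comment line two' \
    'F sno 1111111' | strip_comments
```

Only the RECORD line, the # record_type students line, and the F sno line should remain. A comment whose first word happened to begin with a lowercase r would also be kept, which is a limitation of this simple pattern.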

Next we come to the main topic: using awk to process the text.

From the expected result we can see that awk has to read this input as two records. That means the record separator RS can no longer be the default newline; we have to use the period (.) that ends a record in the original text. At the same time, each line within a record should become one field, so the field separator FS has to be changed to the newline. The BEGIN block of the awk program therefore needs two assignments: RS = "." and FS = "\n".
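The effect of these two settings can be seen in a small sketch that prints every non-empty field together with its record number. The function name list_fields and the two tiny records are assumptions made for this illustration only.

```shell
# Sketch: split input into period-terminated records whose fields are lines.
# list_fields is a hypothetical helper name used only for this illustration.
list_fields() {
    awk '
    BEGIN {
        RS = "."     # a period ends each record
        FS = "\n"    # each line inside a record is one field
    }
    {
        for (i = 1; i <= NF; i++)
            if ($i != "")            # skip the empty fields next to the period
                print NR ": " $i
    }'
}

printf '%s\n' 'RECORD' 'F sno 1' '.' 'RECORD' 'F sno 2' | list_fields
```

Because the newline that follows the period now belongs to the next record, the second record begins with an empty field; the check for an empty string skips it. Note also that a period anywhere in the data, such as a decimal point, would end a record early with this choice of RS.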

  

Next we have to work out how to print the opening line, Record (1) "students". The number in parentheses is the current record number (NR), and the students at the end has to be taken from the record_type line. How do we get it? It is actually quite simple.

  

Use a for loop to visit each field and check whether it starts with #. Because the other comment lines have already been deleted, such a field must be the # record_type line, and we process it with the gsub function.
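The loop described here can be sketched as follows. The function name print_headers and the shortened sample records are made up for illustration; the RS and FS settings are the ones chosen earlier.

```shell
# Sketch: print one header line per record from its "# record_type ..." field.
# print_headers is a hypothetical helper name used only for this illustration.
print_headers() {
    awk '
    BEGIN { RS = "."; FS = "\n" }
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^#/) {                  # the record_type line
                gsub(/^#.* /, "", $i)         # keep only the word after the last space
                print "Record (" NR ")", "\"" $i "\""
            }
        }
    }'
}

printf '%s\n' \
    'RECORD' '# record_type students' 'F age 20' '.' \
    'RECORD' '# record_type scores' 'F english 88' | print_headers
```

For this input the sketch prints Record (1) "students" followed by Record (2) "scores".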

What is the gsub function? It is a replacement function: it searches its third argument for matches of the regular expression given as the first argument and replaces each match with the second argument. In this example it looks for the text that starts with # and ends with a space and replaces it with the empty string, so it really acts as a delete. After this step, # record_type students has become students, and we can print it. Note that awk's print does not need a format string the way C's printf does: strings written next to each other are simply concatenated, so the second argument of the print statement, "\"" $i "\"", outputs "students" with the quotation marks included.
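To see gsub in isolation, here is a small sketch that works on a copy of the input line. The function name show_gsub and the variable names are assumptions for this illustration.

```shell
# Sketch: gsub edits its third argument in place and returns the match count.
# show_gsub is a hypothetical helper name used only for this illustration.
show_gsub() {
    awk '{
        line = $0
        n = gsub(/^#.* /, "", line)   # delete from "#" through the last space
        print n, "\"" line "\""       # the comma inserts a space; adjacent strings concatenate
    }'
}

printf '%s\n' '# record_type students' | show_gsub
```

This prints 1 "students": one replacement was made, and the variable line now holds only the word after the last space. Since .* is greedy, a record type made of several words would be cut down to its last word.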

  

With the header printed, we turn to the body of the record. Every content line starts with "F", so that is what we test for.

  

First, check whether the field starts with F. If it does, process it.

The split function in the middle splits the string $i, using the space given as the third argument as the separator, and stores the pieces in the array named by the second argument. We then print the pieces we need.
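A minimal sketch of this step, run on plain lines rather than on the fields of a record, looks like this. The function name format_fields is made up for illustration; the array name A follows the complete script.

```shell
# Sketch: turn "F key value" lines into tab-indented "key"="value" lines.
# format_fields is a hypothetical helper name used only for this illustration.
format_fields() {
    awk '/^F / {
        split($0, A, " ")    # A[1] = "F", A[2] = key, A[3] = first word of the value
        print "\t" "\"" A[2] "\"=" "\"" A[3] "\""
    }'
}

printf '%s\n' 'F sno 1205110606' 'F english 88' | format_fields
```

Because the value is taken from A[3] alone, a value that contains spaces, such as Wang tie egg, is reduced to its first word.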

After the for loop finishes, print the end of Record line, and the task is done.

 

Finally, here is the complete code; I hope it helps:

    #!/bin/bash
    # History:
    # Michael  April 27, 2015

    # Delete the comment lines; the "# record_type ..." line is kept because
    # its first character after the "#" (ignoring a space) is "r".
    sed '/^# *[^ r]/d' recordmda.txt | \
    awk '
    BEGIN {
        RS = "."     # a period ends each record
        FS = "\n"    # each line of a record is one field
    }
    {
        for (i = 1; i <= NF; i++)
        {
            if ($i ~ /^#/)
            {
                gsub(/^#.* /, "", $i)
                print "Record (" NR ")", "\"" $i "\""
            }
            else if ($i ~ /^F /)
            {
                split($i, A, " ")
                print "\t" "\"" A[2] "\"=" "\"" A[3] "\""
            }
        }
        print "end of Record (" NR ")"
    }
    '

    # gsub searches its third argument for matches of the regular expression
    # given as the first argument and replaces them with the second argument.
    # Here it deletes everything up to and including the space in the
    # "# record_type scores" string.

    # split divides its first argument at the separator given as the third
    # argument and stores the pieces in the array named by the second argument.

 

  
