grep, awk, and sed

Last Update:2018-12-05 Source: Internet

Author: User


1. grep introduction

grep (global search regular expression (RE) and print out the line) is a powerful text search tool. It searches text with regular expressions and prints the matching lines. The UNIX grep family includes grep, egrep, and fgrep. egrep and fgrep differ only slightly from grep. egrep is an extension of grep that supports more regular expression metacharacters. fgrep means fixed grep or fast grep: it treats every character of the pattern literally, so regular expression metacharacters lose their special meaning. Linux uses GNU grep, which is more powerful and provides the egrep and fgrep behavior through the -E and -F command-line options (-G selects the default basic syntax).

grep searches one or more files for a pattern. If the pattern contains spaces, it must be quoted. All arguments after the pattern are treated as file names. The results are written to the screen and the original files are not modified.

grep can be used in shell scripts because it returns an exit status that reports the outcome of the search: 0 if a match was found, 1 if no match was found, and 2 if an error occurred, for example when a searched file does not exist. These return values can be used to automate text processing.

2. grep regular expression metacharacters (basic set)

^          Anchors the start of a line. For example, '^grep' matches all lines that begin with grep.
$          Anchors the end of a line. For example, 'grep$' matches all lines that end with grep.
.          Matches any single character except a newline. For example, 'gr.p' matches gr followed by any one character, followed by p.
*          Matches zero or more of the preceding character. For example, ' *grep' matches zero or more spaces followed by grep.
.*         Matches any sequence of characters.
[]         Matches one character from the specified set. For example, '[Gg]rep' matches Grep and grep.
[^]        Matches one character that is not in the specified set. For example, '[^A-FH-Z]rep' matches a character outside A-F and H-Z, followed by rep.
\(..\)     Marks the matched characters. For example, in '\(love\)', love is marked as \1.
\<         Anchors the start of a word. For example, '\<grep' matches lines containing a word that begins with grep.
\>         Anchors the end of a word. For example, 'grep\>' matches lines containing a word that ends with grep.
x\{m\}     Repeats the character x exactly m times. For example, 'o\{5\}' matches lines containing 5 consecutive o characters.
x\{m,\}    Repeats the character x at least m times. For example, 'o\{5,\}' matches lines with at least 5 consecutive o characters.
x\{m,n\}   Repeats the character x at least m times and at most n times. For example, 'o\{5,10\}' matches lines with 5 to 10 consecutive o characters.
\w         Matches a word character, that is, [A-Za-z0-9_]. For example, 'G\w*p' matches G followed by zero or more word characters, then p.
\W         The inverse of \w: matches a non-word character, such as a period.
\b         Word boundary. For example, '\bgrep\b' matches only the word grep.

3. Extended metacharacter set for egrep and grep -E

+          Matches one or more of the preceding character. For example, '[a-z]+able' matches one or more lowercase letters followed by able, such as loveable, enable, and disable.
?          Matches zero or one of the preceding character. For example, 'gr?p' matches g, followed by an optional r, followed by p.
a|b|c      Matches a, b, or c. For example, 'grep|sed' matches grep or sed.
()         Grouping. For example, 'love(able|rs)ov+' matches loveable or lovers, followed by o and one or more v.
x{m}, x{m,}, x{m,n}   Behave the same as x\{m\}, x\{m,\}, and x\{m,n\}.

4. POSIX character classes

POSIX (the Portable Operating System Interface) adds special character classes so that character matching stays consistent across the character encodings used in different countries. For example, [:alnum:] is another way of writing A-Za-z0-9. A class must be placed inside [] to form a regular expression, such as [A-Za-z0-9] or [[:alnum:]].
In Linux, all grep variants except fgrep support the POSIX character classes.

[:alnum:]    alphanumeric characters
[:alpha:]    alphabetic characters
[:digit:]    digit characters
[:graph:]    visible characters (excluding spaces and control characters)
[:lower:]    lowercase characters
[:cntrl:]    control characters
[:print:]    printable characters (including the space)
[:punct:]    punctuation characters
[:space:]    all whitespace characters (newlines, spaces, tabs)
[:upper:]    uppercase characters
[:xdigit:]   hexadecimal digits (0-9, a-f, A-F)

5. grep command options

-NUM                       Also show NUM lines above and below each matching line. For example, grep -2 pattern filename shows two lines above and below each match.
-b, --byte-offset          Print the byte offset of each matching line before the line.
-c, --count                Print only the number of matching lines, not their content.
-f file, --file=file       Read the patterns from file. An empty file contains zero patterns, so nothing matches.
-h, --no-filename          Do not prefix matches with the file name when multiple files are searched.
-i, --ignore-case          Ignore case differences.
-q, --quiet                Print nothing and return only the exit status. 0 means a matching line was found.
-l, --files-with-matches   Print the names of the files that contain a match.
-L, --files-without-match  Print the names of the files that contain no match.
-n, --line-number          Print the line number before each matching line.
-s, --no-messages          Do not print error messages about nonexistent or unreadable files.
-v, --invert-match         Print only the lines that do not match.
-w, --word-regexp          Match whole words only, as if the pattern were enclosed in \< and \>.
-V, --version              Print the version of the software.

6. Examples

Using grep well comes down to writing regular expressions, so this section does not demonstrate every feature of grep. It lists only a few examples that illustrate how to write a regular expression.
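As a first runnable illustration, here is a minimal sketch that combines a few of the options above with the exit statuses from the introduction. The sample file and its three lines are made up for this sketch and are not part of the original article.

```shell
#!/bin/sh
# Hypothetical sample data; the contents are illustrative only.
tmp=$(mktemp)
printf '%s\n' 'grep is a search tool' 'Sed edits streams' 'awk processes fields' > "$tmp"

# -c prints only the number of matching lines; -i ignores case.
count=$(grep -ci 'sed' "$tmp")
echo "lines mentioning sed: $count"

# -n prefixes each matching line with its line number.
numbered=$(grep -n '^awk' "$tmp")
echo "$numbered"

# -q prints nothing, so the exit status alone reports the outcome.
grep -q 'awk' "$tmp" && found=0 || found=$?
grep -q 'perl' "$tmp" && missing=0 || missing=$?
# -s hides the error message for the nonexistent file.
grep -qs 'awk' "$tmp.does-not-exist" && error=0 || error=$?
echo "found=$found missing=$missing error=$error"

rm -f "$tmp"
```

In a script, the same statuses are usually consumed directly, for example with `if grep -q pattern file; then ...; fi`.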
$ ls -l | grep '^a'
    Filters the output of ls -l through a pipe and shows only the lines that begin with a.
$ grep 'test' d*
    Shows all lines containing test in the files whose names begin with d.
$ grep 'test' aa bb cc
    Shows the lines matching test in the files aa, bb, and cc.
$ grep '[a-z]\{5\}' aa
    Shows all lines of aa that contain a string of at least five consecutive lowercase characters.
$ grep 'w\(es\)t.*\1' aa
    If west is matched, es is stored in memory and marked as \1. grep then looks for any characters (.*) followed by another es (\1) and shows the line if it finds them. With egrep or grep -E, the parentheses do not need to be escaped and the pattern is written 'w(es)t.*\1'.

awk usage: awk 'pattern {action}' file

Variable    Meaning
ARGC        number of command-line arguments
ARGV        array of command-line arguments
FILENAME    name of the current input file
FNR         record number within the current file
FS          input field separator, a space by default
RS          input record separator
NF          number of fields in the current record
NR          number of records read so far
OFS         output field separator
ORS         output record separator

Usage introduction:

1. Pattern matching

awk '/zqy/' filea
    Finds the lines containing zqy in filea. Equivalent to awk '$0 ~ /zqy/' filea.
awk '$1 ~ /88/' filea
    Finds the lines whose first field contains 88.
awk '$1 ~ /88/ {print $2}' filea
    Finds the lines whose first field contains 88 and prints only their second field.

2. Operations on individual fields

awk '$2 > 25 && $2 <= 55' filea
    Finds the lines whose second field meets the condition. Add {print $n} to print any field.

$ less fileb
884 46 1 8 5 944
734 41 0 10 2 787
647 29 1 8 1 686
536 26 1 9 0 572

$ awk '{print NR, NF, $NF}' fileb
1 6 944
2 6 787
3 6 686
4 6 572
    NR is the current record number (here, the line number), NF is the number of fields in the record (here, the number of columns), and $NF is therefore the value of the last field.

3. The -F option changes the field separator. FS sets the input separator and OFS sets the output separator. awk also works with pipes.

df | awk '$4 > 1000000'
    Reads its input through a pipe and shows the lines whose 4th field meets the condition.
awk -F "|" '{print $1}' file
    Splits fields on the new separator "|".
awk 'BEGIN {FS = "[: \t|]"} {print $1, $2, $3}' file
    Changes the input separator by setting FS. BEGIN marks the actions performed before any line is processed.
awk 'BEGIN {OFS = "%"} {print $1, $2, $3}' file
    Changes the output format by setting the output separator OFS.
Sep="|"; awk -F "$Sep" '{print $1}' file
    Uses the value of the shell variable Sep as the separator.
awk -F '[ :\t|]' '{print $1}' file
    Uses a regular expression as the separator. Here the space, :, tab, and | all act as separators.
awk -F '[][]' '{print $1}' file
    Uses a regular expression as the separator. Here the separators are [ and ].

4. awk -f awkfile file
    Processes file according to the commands in awkfile, in order.

$ cat awkfile
/101/ {print "\047 hello! \047"}
    When a line matches 101, prints ' hello! '. \047 is the single quote.
{print $1, $2}
    Has no pattern, so it prints the first two fields of every line.

5. awk 'BEGIN {max = 100; print "max=" max} {max = ($1 > max ? $1 : max); print $1, "Now max is " max}' file
    Tracks the maximum value of the first field, starting from 100.
awk '{print ($1 > 4 ? "high " $1 : "low " $1)}' file

6. awk '$1 == "Chi" {$3 = "China"; print}' file
    Finds the matching lines, replaces the 3rd field, and then prints the line (record).
awk '{$7 %= 3; print $7}' file
    Divides the 7th field by 3, assigns the remainder to the 7th field, and prints it.

7. awk '/tom/ {wage = $2 + $3; print wage}' file
    Finds the matching lines, assigns a value to the variable wage, and prints it.

8. awk '/tom/ {count++} END {print "tom was found " count " times"}' file
    END marks the actions performed after all input lines have been processed.

9. awk '{gsub(/\$/, ""); gsub(/,/, ""); cost += $4} END {print "The total is $" cost > "filename"}' file
    The gsub function replaces $ and , with empty strings. The total of the 4th field is then written to the file named filename.

Sample input:
1 2 3 $1,200.00
1 2 3 $2,300.00
1 2 3 $4,000.00

awk '{gsub(/\$/, ""); gsub(/,/, ""); if ($4 > 1000 && $4 < 2000) c1 += $4; else if ($4 > 2000 && $4 < 3000) c2 += $4; else if ($4 > 3000 && $4 < 4000) c3 += $4; else c4 += $4} END {printf "c1=[%d];c2=[%d];c3=[%d];c4=[%d]\n", c1, c2, c3, c4}' file
    Uses if and else if conditional statements.
awk '{gsub(/\$/, ""); gsub(/,/, ""); if ($4 > 3000 && $4 < 4000) exit; else c4 += $4} END {printf "c1=[%d];c2=[%d];c3=[%d];c4=[%d]\n", c1, c2, c3, c4}' file
    Stops reading input with exit when the condition is met. The END action is still executed.
awk '{gsub(/\$/, ""); gsub(/,/, ""); if ($4 > 3000) next; else c4 += $4} END {printf "c4=[%d]\n", c4}' file
    Uses next to skip the current line when the condition is met and continue with the next line.

10. awk '{print FILENAME, $0}' file1 file2 file3 > fileall
    Writes the full content of file1, file2, and file3 to fileall, prefixing every line with the name of its file.

11. awk '$1 != previous {close(previous); previous = $1} {print substr($0, index($0, " ") + 1) > $1}' fileall
    Splits the merged file back into three files that are identical to the originals.

12. awk 'BEGIN {"date" | getline d; print d}'
    Sends the output of date to getline through a pipe, assigns it to the variable d, and prints it.

13. awk 'BEGIN {system("echo \"Input your name:\\c\""); getline d; print "\nYour name is", d, "\b!\n"}'
    Reads a name interactively with getline and displays it.
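The pattern, field, and separator examples above can be tried end to end with a short sketch like the following. It reuses the fileb sample data from the text; the thresholds, the variable names, and the inline `a|b|c` input are assumptions made for this illustration only.

```shell
#!/bin/sh
# The four fileb records shown earlier, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
884 46 1 8 5 944
734 41 0 10 2 787
647 29 1 8 1 686
536 26 1 9 0 572
EOF

# Pattern only: print the lines whose second field is above 25 and at most 40.
in_range=$(awk '$2 > 25 && $2 <= 40' "$tmp")

# Pattern plus action: where the first field contains a 4, print the second field.
second=$(awk '$1 ~ /4/ {print $2}' "$tmp")

# BEGIN sets the output separator, END runs after the last record.
summary=$(awk 'BEGIN {OFS = "%"} {sum += $NF} END {print NR, sum}' "$tmp")

# -F changes the input separator, here for data arriving through a pipe.
piped=$(printf 'a|b|c\n' | awk -F '|' '{print $2}')

printf '%s\n' "$in_range" "$second" "$summary" "$piped"
rm -f "$tmp"
```

Capturing each result in a shell variable is only a convenience for inspecting the output; the awk programs themselves are the same one-liners used throughout this section.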
awk 'BEGIN {FS = ":"; while ((getline < "/etc/passwd") > 0) {if ($1 ~ "050[0-9]_") print $1}}'
    Prints the user names in /etc/passwd that contain 050x_.

14. awk '{i = 1; while (i …

\>         Anchors the end of a word. For example, /love\>/ matches lines containing a word that ends with love.
x\{m\}     Repeats the character x exactly m times. For example, /o\{5\}/ matches lines containing 5 consecutive o characters.
x\{m,\}    Repeats the character x at least m times. For example, /o\{5,\}/ matches lines with at least 5 consecutive o characters.
x\{m,n\}   Repeats the character x at least m times and at most n times. For example, /o\{5,10\}/ matches lines with 5 to 10 consecutive o characters.

5. Examples

5.1 Delete: d command
* $ sed '2d' example ----- Deletes the second line of the example file.
* $ sed '2,$d' example ----- Deletes all lines from the second line to the end of the example file.
* $ sed '$d' example ----- Deletes the last line of the example file.
* $ sed '/test/d' example ----- Deletes all lines of the example file that contain test.

5.2 Replace: s command
* $ sed 's/test/mytest/g' example ----- Replaces test with mytest everywhere in each line. Without the g flag, only the first test matched in each line is replaced with mytest.
* $ sed -n 's/^test/mytest/p' example ----- The -n option and the p flag together print only the lines in which a replacement was made. That is, a line is printed only if the test at its beginning was replaced with mytest.
* $ sed 's/^192.168.0.1/&localhost/' example ----- In the replacement string, & stands for the matched text. In every line that begins with 192.168.0.1, the match is replaced by itself followed by localhost, giving 192.168.0.1localhost.
* $ sed -n 's/\(love\)able/\1rs/p' example ----- love is marked as \1, every loveable is replaced with lovers, and the changed lines are printed.
* $ sed 's#10#100#g' example ----- Whatever character follows the s command is taken as the delimiter, so "#" is the delimiter here instead of the default "/". All occurrences of 10 are replaced with 100.
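The delete and substitute commands above can be checked with a small sketch such as this one. The four input lines are invented for the illustration, and the dots in the address are escaped here so that they match literal dots only.

```shell
#!/bin/sh
# Hypothetical input, one feature per line.
tmp=$(mktemp)
printf '%s\n' 'test one' 'a test and a test' 'loveable' '192.168.0.1 host' > "$tmp"

# d deletes the addressed line; everything else is printed.
deleted=$(sed '2d' "$tmp")

# Without g only the first match on the line changes; with g all of them do.
first_only=$(sed -n '2s/test/mytest/p' "$tmp")
every=$(sed -n '2s/test/mytest/gp' "$tmp")

# \( \) remembers part of the match as \1; & stands for the whole match.
backref=$(sed -n 's/\(love\)able/\1rs/p' "$tmp")
amp=$(sed -n 's/^192\.168\.0\.1/&localhost/p' "$tmp")

# Any character placed after s can serve as the delimiter.
hash=$(printf '10 apples\n' | sed 's#10#100#g')

printf '%s\n' "$deleted" "$first_only" "$every" "$backref" "$amp" "$hash"
rm -f "$tmp"
```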
5.3 Selecting a range of lines: comma
* $ sed -n '/test/,/check/p' example ----- Prints all lines in the range delimited by the patterns test and check.
* $ sed -n '5,/^test/p' example ----- Prints all lines from the fifth line up to the first line that begins with test.
* $ sed '/test/,/check/s/$/sed test/' example ----- For the lines between the patterns test and check, appends the string sed test to the end of each line.

5.4 Multiple edits: -e option
* $ sed -e '1,5d' -e 's/test/check/' example ----- The -e option allows several commands to be given on the same command line. In this example, the first command deletes lines 1 to 5 and the second command replaces test with check. The order of the commands affects the result. If both commands are substitutions, the first substitution affects the input of the second.
* $ sed --expression='s/test/check/' --expression='/love/d' example ----- --expression is the long form of -e. Each occurrence passes one expression to sed.

5.5 Reading from a file: r command
* $ sed '/test/r file' example ----- The content of file is read and shown after each line that matches test. If several lines match, the content of file is shown below every matching line.

5.6 Writing to a file: w command
* $ sed -n '/test/w file' example ----- All lines of example that contain test are written to file.

5.7 Append: a command
* $ sed '/^test/a\\--->this is a example' example ----- The text after the backslash is appended after each line that begins with test. sed requires the a command to be followed by a backslash.

5.8 Insert: i command
* $ sed '/test/i\\new line-------------------------' example ----- If test is matched, the text after the backslash is inserted before the matching line.
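A minimal sketch of the range, -e, and w behavior described above follows. The five input lines and the output file are assumptions for this illustration. The a and i commands are left out because their one-line form differs between sed implementations.

```shell
#!/bin/sh
# Hypothetical input with one test line and one check line.
tmp=$(mktemp)
out=$(mktemp)
printf '%s\n' 'one' 'test start' 'middle' 'check end' 'after' > "$tmp"

# A comma between two addresses selects everything from the first match to the second.
range=$(sed -n '/test/,/check/p' "$tmp")

# The same range can limit a substitution; s/$/.../ appends to the end of the line.
tagged=$(sed '/test/,/check/s/$/ sed test/' "$tmp")

# -e commands run in order, so the first substitution feeds the second one.
ordered=$(sed -e 's/test/check/' -e 's/check/done/' "$tmp")

# w writes the matching lines to a file; -n suppresses the normal output.
sed -n "/test/w $out" "$tmp"
written=$(cat "$out")

printf '%s\n' "$range" "$tagged" "$ordered" "$written"
rm -f "$tmp" "$out"
```

Note how the line that originally contained test ends up as "done start": it was first rewritten to check and then matched by the second command.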
Next: n command
* $ sed '/test/{n; s/aa/bb/;}' example ----- If test is matched, sed moves to the line after the matching line, replaces aa with bb in that line, prints it, and continues.

5.9 Transform: y command
* $ sed '1,10y/abcde/ABCDE/' example ----- Converts every a, b, c, d, and e in lines 1 to 10 to uppercase. Note that regular expression metacharacters cannot be used with this command.

5.10 Quit: q command
* $ sed '10q' example ----- Quits sed after printing the 10th line.

5.11 Hold and get: h command and G command
* $ sed -e '/test/h' -e '$G' example ----- While sed processes a file, each line is stored in a temporary buffer called the pattern space. Unless the line is deleted or output is suppressed, every processed line is printed to the screen. The pattern space is then cleared and the next line is read into it. In this example, when a line matching test is found, it is in the pattern space, and the h command copies it into a special buffer called the hold space. The second expression means that when the last line is reached, the G command takes the content of the hold space and appends it to the line already in the pattern space, which here is the last line. Simply put, a copy of the line containing test is appended to the end of the file. Because h overwrites the hold space, only the most recent matching line is kept.

5.12 Hold and exchange: h command and x command
* $ sed -e '/test/h' -e '/check/x' example ----- Exchanges the content of the pattern space and the hold space. That is, the line containing check is replaced by the held line containing test.

6. Scripts

A sed script is a list of sed commands. When sed is started, the -f option gives the file name of the script. sed is very picky about the commands in a script: there must be no whitespace or text after the end of a command. Several commands on one line are separated with semicolons. Comment lines begin with # and cannot span lines.

7. Tips

* Use double quotes instead of the usual single quotes when referencing shell variables on the sed command line. The following script deletes the zone section named by the name variable from the named.conf file:

name='zone\ "localhost"'
sed "/$name/,/};/d" named.conf
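The quoting tip can be seen in action with a sketch like the one below. The configuration fragment is a made-up stand-in for named.conf, and the pattern is written without the backslash before the space, which a quoted shell string does not need.

```shell
#!/bin/sh
# Hypothetical stand-in for named.conf with two zone sections.
conf=$(mktemp)
cat > "$conf" <<'EOF'
// hypothetical named.conf fragment
zone "localhost" {
    type master;
};
zone "example.com" {
    type master;
};
EOF
original=$(cat "$conf")

name='zone "localhost"'

# Double quotes: the shell expands $name before sed parses the script,
# so the range from the localhost zone to its closing }; is deleted.
kept=$(sed "/$name/,/};/d" "$conf")

# Single quotes: sed receives the literal text $name, which matches no line here.
unchanged=$(sed '/$name/,/};/d' "$conf")

printf '%s\n' "$kept"
rm -f "$conf"
```

If the variable can contain characters that are special to sed, such as / or regular expression metacharacters, they have to be escaped before the variable is expanded into the script.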

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
