On the text processing tools of Linux
awk, the eldest of the three
"Feature description"
A language for text processing (selecting rows, filtering), with support for regular expressions
NR is the current row (record) number, $n is the nth column, and $NF is the last column
NR==20,NR==30 selects rows 20 through 30
FS cuts vertically: the column (field) delimiter
RS cuts horizontally: the row (record) separator
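As a rough sketch of how these variables fit together, the snippet below runs awk over three made-up, colon-separated rows. The sample data and the shell variable names data and result are assumptions for illustration only.

```shell
# Hypothetical sample data: three rows, columns separated by ":".
data='alice:x:1000
bob:x:1001
carol:x:1002'

# -F sets FS; NR is the row number, $1 the first column, $NF the last column.
# The range pattern NR==2,NR==3 selects rows 2 through 3.
result=$(printf '%s\n' "$data" | awk -F ':' 'NR==2,NR==3 {print NR, $1, $NF}')
echo "$result"
```

Each output line should carry the row number, the first column, and the last column of the selected rows.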
"Syntax format"
awk [-F "[delimiter]"] '{print $1, $NF}' [target file]
awk 'BEGIN{FS="[column delimiter]+"; RS="[row delimiter]+"; print "-begin-"} NR==n{action} END{print "-end-"}' xxx.txt
"Built-in variables"
$n  The nth field of the current record; fields are separated by FS.
$0  The complete input record.
ARGC  The number of command-line arguments.
ARGIND  The position of the current file in the command line (starting at 0).
ARGV  An array containing the command-line arguments.
CONVFMT  Number conversion format (default %.6g).
ENVIRON  Associative array of environment variables.
ERRNO  A description of the last system error.
FIELDWIDTHS  List of field widths (separated by spaces).
FILENAME  The current file name.
FNR  Like NR, but relative to the current file.
FS  The field delimiter (default: any whitespace).
IGNORECASE  If true, matching ignores case.
NF  The number of fields in the current record.
NR  The number of records read so far.
OFMT  The output format for numbers (default %.6g).
OFS  The output field delimiter (default: a space).
ORS  The output record delimiter (default: a newline).
RLENGTH  The length of the string matched by the match function.
RS  The record delimiter (default: a newline).
RSTART  The start position of the string matched by the match function.
SUBSEP  The array subscript delimiter (default \034).
"Operators"
= += -= *= /= %= ^= **=  assignment
?:  C-style conditional expression
||  logical OR
&&  logical AND
~ !~  matches / does not match a regular expression
< <= > >= != ==  relational operators
(space)  string concatenation
+ -  addition, subtraction
* / %  multiplication, division, remainder
+ - !  unary plus, unary minus, logical NOT
^ **  exponentiation
++ --  increment, decrement (as prefix or suffix)
$  field reference
in  array membership
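A few of these operators can be seen working together in the small sketch below. The three input rows and the names status and hits are invented for illustration.

```shell
# Hypothetical input: a host name and a status code per row.
result=$(printf '%s\n' 'web01 200' 'web02 404' 'db01 500' | awk '{
    status = ($2 >= 400) ? "bad" : "ok"   # ?: conditional expression
    if ($1 ~ /^web/ && $1 !~ /02$/)       # ~ matches, !~ does not match, && is logical AND
        print $1 "=" status               # adjacent strings are concatenated
    hits += ($2 >= 400)                   # += assignment; a comparison yields 1 or 0
}
END { print "errors=" hits }')
echo "$result"
```

Only web01 passes both regular expression tests, and two rows have a code of 400 or more.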
"String Function"
sub  Matches the leftmost, longest substring for the regular expression in the record and replaces it with the replacement string. If no target string is given, the entire record is used. Only the first match is replaced.
gsub  Like sub, but replaces every match in the target.
index  Returns the position of the first occurrence of the substring; positions start at 1.
substr  Returns the substring starting at the given position (positions start at 1); if the specified length exceeds the actual length, the rest of the string is returned.
split  Splits the string into an array by the given delimiter. If no delimiter is given, the current FS value is used.
length  Returns the number of characters in the string (the record, if no argument is given).
match  Returns the position in the string where the regular expression matches, or 0 if it is not found. match sets the built-in variable RSTART to the start of the matched substring and RLENGTH to its length in characters; substr can use these variables to extract the match.
toupper, tolower  Convert a string to uppercase or lowercase. They are available in gawk and in other POSIX-compliant awk implementations.
"Math functions"
atan2(y, x)  arctangent of y/x, in radians
cos(x)  cosine
exp(x)  exponential (e to the power x)
int(x)  integer part; truncates, does not round
log(x)  natural logarithm
rand()  random number greater than or equal to 0 and less than 1
sin(x)  sine
sqrt(x)  square root
srand(x)  sets x as the seed for rand()
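The functions listed above can be tried out with a short sketch like the following. The input line 'hello world hello', the seed 42, and the variable names are arbitrary choices for illustration.

```shell
# String functions, applied to a made-up input line.
strings=$(echo 'hello world hello' | awk '{
    s = $0; n = gsub(/hello/, "HI", s)   # replace every match; returns the count
    t = $0; sub(/hello/, "HI", t)        # replace only the first match
    print n, s
    print t
    print index($0, "world"), substr($0, 7, 5), length($0)
    if (match($0, /wor[a-z]+/))          # sets RSTART and RLENGTH
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    k = split("a:b:c", parts, ":")       # returns the number of pieces
    print k, parts[3], toupper(parts[1])
}')
echo "$strings"

# Math functions.
maths=$(awk 'BEGIN {
    print int(3.9), int(-3.9)            # truncates, does not round
    print sqrt(16), exp(0), log(1)
    printf "%.4f\n", atan2(1, 1) * 4     # atan2(y, x); 4*atan(1) approximates pi
    srand(42); r = rand()
    print (r >= 0 && r < 1)              # rand() lies in [0, 1)
}')
echo "$maths"
```

Note that sub and gsub modify their target in place and return a count, which is why the sketch copies $0 into s and t first.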
"Use Example"
1. View only lines 20 through 30 of the file ett.txt (100 lines in total)
awk 'NR>19&&NR<31' ett.txt
awk '{if (NR>19&&NR<31) print $0}' ett.txt
2. Add a line number to each line of the file
awk '{print NR,$0}' /etc/inittab
3. Print line 24 together with its line number
awk 'NR==24 {print NR,$0}' /etc/inittab
4. Standard notation
awk -F '[:]+' 'NR==2{print $(NF-1)}' /etc/passwd
is equivalent to
awk 'BEGIN{FS="[:]+"} NR==2{print $(NF-1)}' /etc/passwd
awk 'BEGIN{RS="/"} {print $0}' /etc/passwd
5. Using one or more / as the row separator and the default whitespace as the column delimiter, print the second column of the second row together with the row number
awk 'BEGIN{RS="[/]+"} NR==2{print NR,$2}' test
awk supports regular expressions:
6. Using : as the delimiter, print every whole line whose 5th column begins with s
awk -F ":" '$5~/^s/{print $0}' /etc/passwd
7. Using / as the delimiter, print every whole line whose second-to-last column matches bin, with or without a preceding s
awk -F "/" '$(NF-1)~/(s|)bin/' /etc/passwd
8. Print the first two columns of lines whose first column is exactly ssh, ftp, or mysql (anchored at both the beginning and the end)
awk '$1~/^(ssh|ftp|mysql)$/{print $1,$2}' /etc/services
9. Produce the output 6 0 1 2
echo "6@@@@@@@@@@@@@@@0=============1##############2" | awk -F '[@=#]+' '{print $1,$2,$3,$4}'
10. Print a header and a footer with BEGIN and END
awk 'BEGIN{print "---BEGIN---"} NR==2{print $0} END{print "---END----"}' xxx.conf
11. Computing percentages with awk
Example 1:
The log lines look like this:
http://youku.com 200
http://youku.com 302
http://youku.com 403
http://youku.com 502
http://baidu.com 302
http://baidu.com 404
Use awk to compute, per domain, the percentage of return codes greater than or equal to 400. For example, youku has 4 lines in total, 2 of which have a return code greater than or equal to 400, so the result is 50%.
awk '{count[$1]++; if ($2>=400) above400[$1]++} END{for (i in count) {print i, count[i], above400[i]/count[i]*100 "%"}}' < xxx.txt
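To check the arithmetic, here is a small self-contained sketch that feeds sample log lines of the shape shown above through the same counting logic. The array name bad and the trailing sort are additions for illustration; sort is only there because the order of a for-in loop over an awk array is unspecified.

```shell
# Sample log lines in the shape described above.
log='http://youku.com 200
http://youku.com 302
http://youku.com 403
http://youku.com 502
http://baidu.com 302
http://baidu.com 404'

result=$(printf '%s\n' "$log" | awk '
    { count[$1]++; if ($2 >= 400) bad[$1]++ }     # per-domain totals and errors
    END {
        for (d in count)
            printf "%s %d %d%%\n", d, count[d], bad[d] / count[d] * 100
    }' | sort)
echo "$result"
```

Both domains have half of their lines at 400 or above, so both come out at 50%.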
Example 2:
Compute the percentage of lines in a file that contain error
awk '/error/{err++} END{print err,NR,err/NR*100 "%"}' < xxx.txt
12. Associative array lookups
a.txt and b.txt both have the same two fields (id|money). Output the lines of b.txt whose id also appears in a.txt and whose money value is larger in b.txt.
cat >>a.txt <<EOF
1|1
3|3
5|5
7|7
9|9
EOF
cat >>b.txt <<EOF
1|1
2|2
3|30
4|4
5|5
6|6
7|70
8|8
9|9
10|10
EOF
awk -F '|' 'BEGIN{while ((getline < "a.txt") > 0) {user_map[$1] = $2}} {if ($1 in user_map) {if (user_map[$1]+0 < $2+0) print $0}}' b.txt
Note: if a.txt does not exist, getline returns -1, and a bare while (getline < "a.txt") then loops forever; testing the result with > 0 avoids this. I once had a program hang for exactly this reason, so it is worth pointing out.
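The return value of getline can be inspected directly. The sketch below assumes the path /nonexistent/a.txt does not exist on the system (a hypothetical path chosen for illustration) and shows that a loop guarded with > 0 ends immediately instead of spinning.

```shell
# A file name that is assumed not to exist (hypothetical path).
missing=/nonexistent/a.txt

result=$(awk -v f="$missing" 'BEGIN {
    rc = (getline line < f)            # 1 = read a line, 0 = end of file, -1 = error
    print "getline returned", rc
    n = 0
    while ((getline line < f) > 0)     # guarded loop: stops on 0 and on -1
        n++
    print "lines read:", n
}')
echo "$result"
```

Because -1 is non-zero and therefore true, an unguarded while (getline < file) never terminates when the file cannot be opened.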
13. The 9x9 multiplication table
awk 'BEGIN{for (i=1;i<10;i++) {for (j=1;j<=i;j++) printf "%d%s%d%s%d\t", i, "x", j, "=", i*j; print ""}}'
14. Counting Tomcat connections by state
netstat -an | grep 10050 | awk '{count[$6]++} END{for (i in count) print i, count[i]}'
sed, the second of the three
"Feature description"
sed is short for stream editor and is a powerful tool for manipulating, filtering, and transforming text. Its common uses are adding, deleting, changing, and querying text, filtering, and extracting lines.
Parameters
-n  suppress the default output
-r  use extended regular expressions
-i  write the changes back to the file on disk
-e  run multiple sed commands
-f  read the commands from a file
sed commands
a  append
i  insert
d  delete
c  replace the specified line
s  substitute the first match on each line
g  replace all matches (flag of s)
p  print
w  write to a file
e  execute a bash command
q  quit without reading any further
Summary of the process: sed reads one line from a file or pipe, processes it, and prints it; then it reads the next line, processes it, prints it, and so on.
Add, delete, change, query
a appends text after the specified line
i inserts text before the specified line
Add
Adding a single line
sed '2a 106,dandan,CSO' person.txt
sed '2i 106,dandan,CSO' person.txt
Adding multiple lines
sed '2a 106,dandan,CSO\n107,bingbing,CCO' person.txt
Enterprise case 1: optimizing the SSH configuration (adding several parameters with a single command)
One item in system optimization is changing the configuration of the SSH remote login service. The main step is to add the following 5 lines of text to the SSH configuration file. (The meaning of each parameter is covered in other courses.)
Port 52113
PermitRootLogin no
PermitEmptyPasswords no
UseDNS no
GSSAPIAuthentication no
We could edit the file with vi, but that is tedious. How can a single command insert these 5 lines before line 13?
sed -i -r '13i ####Chris-sshd-2016.5.4-youhua######\nPort 52113\nPermitRootLogin no\nPermitEmptyPasswords no\nUseDNS no\nGSSAPIAuthentication no\n#####--end--#######\n' /etc/ssh/sshd_config
Addresses are separated by a comma, as in n1,n2; each address can be a number, a regular expression, or a combination of the two.
Other usage examples
10{sed-commands}  operate on line 10
10,20{sed-commands}  operate on lines 10 through 20, inclusive
10,+20{sed-commands}  operate on lines 10 through 30 (10+20), inclusive
1~2{sed-commands}  operate on lines 1, 3, 5, 7, ...
10,${sed-commands}  operate on line 10 through the last line ($ means the last line), inclusive
/oldboy/{sed-commands}  operate on lines matching oldboy
/oldboy/,/alex/{sed-commands}  operate from the line matching oldboy through the line matching alex
/oldboy/,${sed-commands}  operate from the line matching oldboy through the last line
/oldboy/,10{sed-commands}  operate from the line matching oldboy through line 10. Note: if oldboy is not matched within the first 10 lines, sed still acts on any later line that matches oldboy, if there is one.
1,/alex/{sed-commands}  operate from line 1 through the line matching alex
/oldboy/,+2{sed-commands}  operate on the line matching oldboy and the 2 lines after it
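The address forms above can be checked quickly against numbered input. In this sketch, seq 1 10 stands in for a 10-line file and xargs joins the printed lines onto one line; both are choices made for illustration. The first~step and addr,+N forms are GNU sed extensions.

```shell
# seq 1 10 stands in for a 10-line file; -n plus p prints only the addressed lines.
by_number=$(seq 1 10 | sed -n '2,4p' | xargs)       # lines 2 through 4
to_last=$(seq 1 10 | sed -n '8,$p' | xargs)         # line 8 through the last line
by_regex=$(seq 1 10 | sed -n '/3/,/5/p' | xargs)    # from a line matching 3 to a line matching 5
stepped=$(seq 1 10 | sed -n '1~3p' | xargs)         # GNU: every 3rd line, starting at line 1
plus_n=$(seq 1 10 | sed -n '/7/,+1p' | xargs)       # GNU: the match and 1 following line
past_end=$(seq 1 10 | sed -n '/6/,2p' | xargs)      # end line already passed: only the match
echo "$by_number | $to_last | $by_regex | $stepped | $plus_n | $past_end"
```

The last case corresponds to the note on /oldboy/,10: when the match comes after the ending line number, only the matching line itself is selected.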
Delete
d deletes the specified lines
sed 'd' person.txt #delete all lines
sed '2d' person.txt #delete line 2
sed '2,5d' person.txt #delete lines 2 through 5
sed '3,$d' person.txt #delete from line 3 to the end
sed '1~2d' person.txt #delete lines 1, 3, 5, ...
sed '1,+2d' person.txt #delete lines 1, 2, 3
sed '/zhangyao/d' person.txt #delete lines matching zhangyao
sed '/oldboy/,/alex/d' person.txt #delete from the line matching oldboy through the line matching alex
sed '/oldboy/,3d' person.txt #delete from the line matching oldboy through line 3
Enterprise case 2: print the contents of the file, excluding lines that contain oldboy
sed '/oldboy/d' person.txt #delete lines containing "oldboy"
Change: replacing whole lines
c replaces the old line with a new line
sed '2c 106,dandan,CSO' person.txt #replace the content of line 2
Text substitution
s: used alone, replaces the first matched string on each line
g: replaces every match on each line
-i: modifies the file contents in place
sed substitution template (the box ▇ is replaced with the triangle ▲)
sed -i 's/▇/▲/g' oldboy.log
sed -i 's#▇#▲#g' oldboy.log
Enterprise case 3: modifying a specific line of a configuration file
Specifying the line lets you modify a configuration file precisely and avoids changing other places by mistake.
sed '3s#0#9#' person.txt
Variable substitution
x=a
y=b
echo $x $y
sed "s#$x#$y#g" test.txt
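Whether the shell expands the variables depends on the quoting around the sed expression. The following sketch uses the made-up input word banana to contrast single and double quotes; the variable names single and double are illustrative.

```shell
x=a
y=b

# Single quotes: the shell leaves $x and $y alone, so sed searches for the
# literal text "$x", which does not occur in the input.
single=$(echo 'banana' | sed 's#$x#$y#g')

# Double quotes: the shell expands the variables first, so sed sees s#a#b#g.
double=$(echo 'banana' | sed "s#$x#$y#g")

echo "$single $double"
```

With double quotes every a becomes b; with single quotes the input passes through unchanged.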
Using grouping \(\) and \1 in substitutions
The \(\) feature of sed remembers part of what the regular expression matched: \1 is the first remembered pattern, that is, the match for the first pair of parentheses; \2 is the second remembered pattern, the match for the second pair of parentheses; sed can remember up to 9.
Example: echo I am oldboy teacher. To keep only the word oldboy from this line and delete the rest, mark the part you want to keep with parentheses.
echo I am oldboy teacher. | sed 's#^.*am \([a-z].*\) tea.*$#\1#g'
echo I am oldboy teacher. | sed -r 's#^.*am ([a-z].*) tea.*$#\1#g'
echo I am oldboy teacher. | sed -r 's#I (.*) (.*) teacher.#\1\2#g'
Command explanation
Idea: replace the string I am oldboy teacher. with the string oldboy.
In the explanation below, - is written in place of a space
- ^.*am- --> matches from the start of the line, through any characters, up to am-; in this input it matches the string I-am-.
- \([a-z].*\) --> the outer shell is the parentheses \(\); inside them, [a-z] matches any one of the 26 lowercase letters, and [a-z].* together matches a letter followed by any number of characters. Here it matches the string oldboy; because oldboy is what we want to keep, the pattern is enclosed in parentheses and fetched later with \1.
- -tea.*$ --> matches a space, then tea, then any characters up to the end of the line; here it matches -teacher., the text that follows oldboy.
- The \1 in the replacement stands for whatever the preceding parentheses matched, which is the oldboy string we want.
- ( ) are metacharacters of extended regular expressions. sed uses basic regular expressions by default, so to group in that mode you must escape them with \, that is, \(\).
- With the -r option sed uses extended regular expressions; in that mode \( and \) match literal parentheses, so using them for grouping is an error.
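The difference between the two syntaxes can be seen side by side in the sketch below, which reorders the fields of a made-up record (the record 106,dandan,CSO and the variable names bre and ere are assumptions for illustration).

```shell
line='106,dandan,CSO'   # hypothetical sample record

# Basic regular expressions: groups are written \( \).
bre=$(echo "$line" | sed 's#^\([0-9]*\),\([a-z]*\),\(.*\)$#\2 \1 \3#')

# Extended regular expressions (-r): groups are written ( ) without backslashes.
ere=$(echo "$line" | sed -r 's#^([0-9]+),([a-z]+),(.*)$#\2 \1 \3#')

echo "$bre"
echo "$ere"
```

Both commands capture the same three groups and print them in the order \2 \1 \3.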
Enterprise case 4: optimizing the services started at boot
chkconfig --list | grep "3:on" | grep -vE "sshd|crond|network|rsyslog|sysstat" | awk '{print $1}' | sed -r 's#^(.*)#chkconfig \1 off#g' | bash
chkconfig --list | grep "3:on"
The special symbol & stands for the text that was matched
#→ Replace C in lines 1 through 3 with --C--
sed '1,3s#C#--&--#g' person.txt #→ here & stands for C
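Since & expands to whatever the pattern matched, it is most useful when the match differs from line to line. A minimal sketch with an invented input string:

```shell
# Wrap every run of digits in angle brackets; & is the text matched by [0-9]+.
result=$(echo 'id 106 and id 107' | sed -r 's#[0-9]+#<&>#g')
echo "$result"
```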
Enterprise case 5: batch renaming files
for i in `seq 5`; do touch stu_102999_${i}_finished.jpg; done
ls | sed -r 's/(.*)_finished(.*)/mv &
Query
p prints the specified content, but by default each matched line is then printed twice, so use -n to suppress the default output
Query by line
sed '2p' person.txt
sed -n '2p' person.txt
sed -n '2,3p' person.txt
sed -n '1~2p' person.txt
sed -n 'p' person.txt
Query by string
sed -n '/CTO/p' person.txt
sed -n '/CTO/,/CFO/p' person.txt
Mixed query
sed -n '2,/CFO/p' person.txt
sed -n '/feixue/,2p' person.txt
#Special case: if feixue is not matched within the first two lines, sed keeps looking further down and prints the line where feixue is matched.
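The effect of -n on the p command is easy to see on a three-line input. In this sketch the input a, b, c is made up, and xargs is used only to join the output lines for display.

```shell
# Without -n, sed prints every line once by default, and p prints line 2 again.
with_default=$(printf 'a\nb\nc\n' | sed '2p' | xargs)

# With -n the default output is suppressed, so only the p output remains.
only_match=$(printf 'a\nb\nc\n' | sed -n '2p' | xargs)

echo "$with_default"
echo "$only_match"
```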
Other features
Backup
sed -i.bak '$a 1111111111' xxx.txt
Back up xxx.txt as xxx.txt.bak, then modify the source file by appending 1111111111 as the last line
Saving to a file
Write every whole line on which SB was replaced to new.txt
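Uppercase and lowercase conversion
\L #convert everything that follows to lowercase
\l #convert the next single character to lowercase
\U #convert everything that follows to uppercase
\u #convert the next single character to uppercase
\E #used together with \U and \L to turn their effect off
sed -r 's/(.*),(.*),(.*)/\L\3,\E\1,\U\2/g' xxx.txt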
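A compact way to see all of these escapes at once is the sketch below. It assumes GNU sed, since the case-conversion escapes are a GNU extension, and the input alice,BOB,carol is invented for illustration.

```shell
# \u uppercases one character, \L...\E lowercases a span, \U uppercases the rest.
result=$(echo 'alice,BOB,carol' | sed -r 's/(.*),(.*),(.*)/\u\1,\L\2\E,\U\3/')
echo "$result"
```

Here \E ends the lowercase span before the second comma, so the \U that follows applies only to the third field.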
Running multiple sed commands
sed -e '3,$d' -e 's#10#01#g' xxx.txt
sed '3,$d; s#10#01#g' xxx.txt
Printing invisible characters: l
sed -n 'l' xxx.txt
Replace abc with ABC (character by character)
tr 'abc' 'ABC' < xxx.txt
sed 'y#abc#ABC#' xxx.txt
sed can operate on multiple files
sed 'y#abc#ABC#' xxx.txt 222.txt
Simulating other commands
Automatically remove the # and modify the paths when creating an SVN repository
sed -i -r '12,13s/#//g' svnserve.conf
sed -i -r '20s/^# (.*)/\1/g' svnserve.conf
sed -i -r '27s/^# (.*)/\1/g' svnserve.conf
sed -i -r '12,13s/^# (.*)/\1/g' svnserve.conf
sed -i -r '32s/# (.*=) (.*)/\1 \/usr\/svndata\//' svnserve.conf
Doing it all in one command
SvnPath='zhangzhicheng'
sed -i -r -e '20s/^# (.*)/\1/g' -e '27s/^# (.*)/\1/g' -e '12,13s/^# (.*)/\1/g' -e "32s/# (.*=) (.*)/\1 \/usr\/svndata\/$SvnPath/" svnserve.conf
grep, the third of the three
"Feature description"
The third of the Three Musketeers. Searches text and filters text strings; -v inverts the match
"Option description"
Option | Description (※ marks key points)
-v | Exclude the lines that match the specified content
-A | Print the n lines after each match
-B | Print the n lines before each match
-C | Print the n lines before and after each match
-n | Print line numbers
-E (egrep) | Use extended regular expressions
-o | Print only the matched part
-i | Ignore case
-a | Add -a when grep treats the file as a binary file
"Basic examples"
Example 1: the file test.txt contains:
test
liyao
oldboy
Give a command that prints the contents of test.txt without the lines that contain the string oldboy.
grep -v oldboy test.txt
Example 2: filter out the lines of /etc/services that contain the database port 3306 or 1521
grep -E "3306|1521" /etc/services
Example 3:
"Tips"
Remove blank lines from a file:
grep -v '^$' test.txt
egrep -o "^[^:]+" xxx.txt #match the leading run of non-: characters on each line and print only the matched part (-o does not print the whole line)
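The main grep options described in this section can be compared on a few lines of sample text. The sketch below reuses the test, liyao, oldboy lines from Example 1 and adds one made-up uppercase line; xargs only joins the output for display, and the variable names are illustrative.

```shell
text='test
liyao
oldboy
OLDBOY teacher'

inverted=$(printf '%s\n' "$text" | grep -v oldboy | xargs)            # lines without oldboy
numbered=$(printf '%s\n' "$text" | grep -n liyao)                     # line number prefix
nocase=$(printf '%s\n' "$text" | grep -i oldboy | xargs)              # case-insensitive match
only_part=$(printf '%s\n' "$text" | grep -o boy)                      # only the matched text
after=$(printf '%s\n' "$text" | grep -A1 test | xargs)                # match plus 1 line after
either=$(printf '%s\n' "$text" | grep -E 'test|liyao' | xargs)        # extended regex alternation
echo "$inverted | $numbered | $nocase | $only_part | $after | $either"
```

Note that -v is case-sensitive here, so the uppercase OLDBOY line survives the inverted match unless -i is added as well.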