Stream Editor: sed
sed 's/pattern/replace_string/' file  # Match text with a regular expression and replace the first match on each line
sed 's/text/replace/' file > newfile  # Replace the first match on each line and redirect the result to a new file
sed -i 's/test/replace/' file  # The -i option writes the result back to the source file (in-place edit)
sed 's/pattern/replace_string/g' file  # The g suffix replaces every match on each line, not only the first
sed 's:test:replace:g'  # Use : instead of / as the delimiter. Any other character works as well, but a delimiter that appears inside the pattern must be escaped with \
sed 'expression1; expression2'  # Combine multiple expressions
sed '/^$/d' file  # Remove blank lines: ^$ matches an empty line and the d command deletes every matching line
echo thisthisthisthis | sed 's/this/THIS/2g'  # The 2g suffix replaces from the 2nd match onward (inclusive); use Ng to start at the Nth match (GNU sed). Result: thisTHISTHISTHIS
cat file | sed 's/pattern/replace_string/'  # Read input from stdin and replace the first match on each line
echo This was an example | sed 's/\w\+/[&]/g'  # & stands for the matched text. The regular expression \w\+ matches each word, which is replaced with [&]. Result: [This] [was] [an] [example]
echo this was digit 7 in a number | sed 's/digit \([0-9]\)/\1/'  # \1 (the digit one) refers to the text matched by the first \( \) group, so "digit 7" is replaced with "7"
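The substitution forms above can be compared side by side. The following is a minimal sketch; the sample input and the variable names (`input`, `first`, `all`, `wrapped`, `digit`) are made up for illustration and do not come from the original notes.

```shell
#!/bin/sh
# Hypothetical sample input, used only for illustration.
input='cat cat cat'

# No suffix: only the first match on the line is replaced.
first=$(printf '%s\n' "$input" | sed 's/cat/dog/')

# g suffix: every match on the line is replaced.
all=$(printf '%s\n' "$input" | sed 's/cat/dog/g')

# & in the replacement stands for whatever the pattern matched.
wrapped=$(printf '%s\n' "$input" | sed 's/cat/[&]/g')

# \1 in the replacement stands for the text captured by the first \( \) group.
digit=$(printf 'digit 7 in a number\n' | sed 's/digit \([0-9]\)/\1/')

printf '%s\n' "$first" "$all" "$wrapped" "$digit"
# dog cat cat
# dog dog dog
# [cat] [cat] [cat]
# 7 in a number
```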
---------------------------------------------------------------
text=hello
echo hello world | sed "s/$text/HELLO/"  # Double quotes let the shell expand $text before sed runs. Output: HELLO world
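The quoting style decides whether the shell expands the variable. A small sketch of the difference, assuming the same `text` variable as above (`expanded` and `literal` are illustrative names):

```shell
#!/bin/sh
text=hello

# Double quotes: the shell expands $text, so sed receives s/hello/HELLO/.
expanded=$(echo "hello world" | sed "s/$text/HELLO/")

# Single quotes: sed receives the literal characters $text, which do not
# occur in the input, so the line passes through unchanged.
literal=$(echo "hello world" | sed 's/$text/HELLO/')

printf '%s\n' "$expanded" "$literal"
# HELLO world
# hello world
```

If the variable may contain the delimiter or regex metacharacters, it needs escaping before it is placed in the expression.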
---------------------------------------------------------------
Restoring cluttered text to a readable layout (replacing spaces, line breaks, tabs, etc.)
cat test.js | sed 's/;/;\n/g; s/{/{\n\n/g; s/}/\n\n}/g'  # s/;/;\n/g replaces ; with ;\n, s/{/{\n\n/g replaces { with {\n\n, and s/}/\n\n}/g replaces } with \n\n}
cat test.js | sed 's/;/;\n/g' | sed 's/{/{\n\n/g' | sed 's/}/\n\n}/g'  # Same as above
sed 's/[^.]*mobile phones[^.]*\.//g' test.txt  # Remove every sentence containing the words "mobile phones" from test.txt
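To see what the two techniques do on concrete input, here is a small sketch. The sample strings are invented for illustration, and the `\n` in the replacement is assumed to be interpreted as a newline, which holds for GNU sed but not for every sed implementation.

```shell
#!/bin/sh
# Split statements onto separate lines (assumes GNU sed for \n in the replacement).
split=$(printf 'a=1;b=2;\n' | sed 's/;/;\n/g')
printf '%s\n' "$split"
# a=1;
# b=2;

# Remove the sentence that mentions "mobile phones".
# [^.]* cannot cross a period, so the match stays inside one sentence.
cleaned=$(printf 'I like tea. I sell mobile phones daily. Bye.\n' |
    sed 's/[^.]*mobile phones[^.]*\.//g')
printf '%s\n' "$cleaned"
# I like tea. Bye.
```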
Data stream tool: awk
How it works: awk 'BEGIN { print "start" } pattern { commands } END { print "end" }' file
The BEGIN block runs first. awk then reads one line at a time from the file or stdin and runs pattern { commands } on it, repeating until all input has been read. When the end of the input stream is reached, the END { commands } block runs. All three blocks are optional. If a pattern is given without a { commands } block, every line that matches the pattern is printed by default.
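The order of execution described above can be traced with a short sketch. The fruit names and the `count` variable are illustrative only.

```shell
#!/bin/sh
out=$(printf 'apple\nbanana\navocado\n' | awk '
BEGIN { print "start" }                # runs once, before any input is read
/^a/  { count++; print NR ": " $0 }    # runs for each line that matches the pattern
END   { print "matched " count }       # runs once, after the last line was read
')
printf '%s\n' "$out"
# start
# 1: apple
# 3: avocado
# matched 2
```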
Special variables for awk:
NR: the number of records read so far; during execution it corresponds to the current line number.
NF: the number of fields in the current line.
$NF: the last field of the current line. $(NF-1) is the second-to-last field, and so on.
$0: the text of the current line.
$1: the text of the first field.
$2: the text of the second field, and so on.
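A quick way to get a feel for these variables is to print them for a couple of lines. This is a minimal sketch with made-up input:

```shell
#!/bin/sh
# For each line print: line number, field count, first field,
# last field, and second-to-last field.
out=$(printf 'a b c\nd e\n' | awk '{ print NR, NF, $1, $NF, $(NF-1) }')
printf '%s\n' "$out"
# 1 3 a c b
# 2 2 d e d
```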
awk 'BEGIN { i=0 } { i++ } END { print i }' filename  # Read the file line by line and print the number of lines
echo -e "line1\nline2" | awk 'BEGIN { print "Start" } { print } END { print "End" }'
echo | awk '{ var1="v1"; var2="v2"; var3="v3"; print var1 "-" var2 "-" var3; }'
echo -e "line1 f2 f3\nline2 f4 f5\nline3 f6 f7" | awk '{ print "Line no:" NR ", No. of fields:" NF, "$0=" $0, "$1=" $1, "$2=" $2, "$3=" $3 }'
awk '{ print $3, $2 }' file  # Print the 3rd and 2nd fields of every line in the file
awk 'END { print NR }' file  # Count the lines in the file. With only an END block, NR is printed after the last line has been read, when it holds the total line count
awk 'NR < 5' file  # Print the lines whose line number is less than 5
awk 'NR==2,NR==5' file  # Print lines 2 through 5
awk '/linux/' file  # Print the lines that contain the pattern linux (the pattern can be a regular expression)
awk '!/linux/' file  # Print the lines that do not contain the pattern linux
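The pattern-only forms (line-number test, range, and negated regex) can be checked against a small input. In this sketch, `sample` is a hypothetical helper that prints six test lines:

```shell
#!/bin/sh
# Hypothetical helper that emits the test input.
sample() {
    printf 'one\ntwo linux\nthree\nfour linux\nfive\nsix\n'
}

first_two=$(sample | awk 'NR < 3')          # lines 1 and 2
range=$(sample | awk 'NR==2,NR==4')         # lines 2 through 4
without=$(sample | awk '!/linux/')          # lines that do not contain linux

printf '%s\n---\n%s\n---\n%s\n' "$first_two" "$range" "$without"
```

With no { commands } block, each selected line is printed unchanged.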
awk -F: '{ print $NF }' /etc/passwd  # Read /etc/passwd and print the last field of each line, with the delimiter set to ':'; the default delimiter is whitespace
var1='test'; var2='text'  # (1) External variables
echo | awk '{ print v1, v2 }' v1=$var1 v2=$var2  # (2) Pass several external variables to awk and print them, with input from stdin
awk '{ print v1, v2 }' v1=$var1 v2=$var2 filename  # (3) The same, with input from a file
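An alternative to the operand form `v1=$var1` is the standard `-v` option, which is not used in the notes above. The sketch below assumes the same `var1` and `var2` values; the other names are illustrative. The main practical difference is that `-v` assignments are already set when the BEGIN block runs, while operand assignments are only processed when awk reaches them in the argument list.

```shell
#!/bin/sh
var1='test'; var2='text'

# -v assigns the variables before the program starts.
out=$(echo | awk -v v1="$var1" -v v2="$var2" '{ print v1, v2 }')

# A -v assignment is visible inside BEGIN.
with_v=$(awk -v v1="$var1" 'BEGIN { print "[" v1 "]" }')

# An operand assignment is not yet processed when BEGIN runs.
operand=$(awk 'BEGIN { print "[" v1 "]" }' v1="$var1")

printf '%s\n' "$out" "$with_v" "$operand"
# test text
# [test]
# []
```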
"cat test.txt" | getline output  # Inside an awk program: read a line of the output of cat into the variable output
awk 'BEGIN { FS=":" } { print $NF }' /etc/passwd  # Setting FS="delimiter" in the BEGIN block sets the input field separator
awk '{ arr[$1]+=1 } END { for (i in arr) { print arr[i] "\t" i } }' file_name | sort -rn  # Count how often each word (the first field of each line) occurs and sort by frequency
seq 5 | awk 'BEGIN { sum=0; print "Summation:" } { print $1 "+"; sum+=$1 } END { print "=="; print sum }'  # Add up the first field of every line and print the sum in the given format
echo | awk '{ "grep root /etc/passwd" | getline cmdout; print cmdout }'  # Read the output of an external shell command into the variable cmdout via getline. A single getline call stores the first output line of grep root /etc/passwd in cmdout, which is then printed.
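Because one `getline` call reads one line, reading the whole output of a command takes a loop. A minimal sketch, in which the command string and the variable names (`cmd`, `first`, `line`, `rest`) are placeholders:

```shell
#!/bin/sh
out=$(awk 'BEGIN {
    cmd = "echo one; echo two; echo three"   # placeholder command
    cmd | getline first                      # a single call reads only the first line
    while ((cmd | getline line) > 0)         # keep reading until the output ends
        rest = rest " " line
    close(cmd)                               # release the pipe
    print first ":" rest
}')
printf '%s\n' "$out"
# one: two three
```

The explicit `> 0` test matters: getline returns -1 on error, which a bare `while (cmd | getline line)` would treat as true.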
Loops and built-in functions in awk
for (i=0; i<10; i++) { print $i; }  or  for (i in array) { print array[i]; }
length(string): returns the length of string.
index(string, search_string): returns the position at which search_string first appears in string, or 0 if it does not appear.
split(string, array, delimiter): splits string on delimiter and stores the resulting pieces in array.
substr(string, start_position, length): returns the substring of string that starts at start_position (counting from 1) and is at most length characters long.
sub(regex, replacement_str, string): replaces the first match of regex in string with replacement_str.
gsub(regex, replacement_str, string): replaces every match of regex in string with replacement_str.
match(string, regex): tests whether regex matches string; returns the position of the match (a nonzero value) if it does, and 0 otherwise.
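The functions listed above can be tried out in a single BEGIN block. This is a sketch with an arbitrary test string; the variable names `s`, `t`, `n`, and `parts` are illustrative.

```shell
#!/bin/sh
out=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                         # 11
    print index(s, "world")                 # 7
    print substr(s, 1, 5)                   # hello
    n = split("a:b:c", parts, ":")          # returns the number of pieces
    print n, parts[1], parts[3]             # 3 a c
    t = s; sub(/o/, "0", t);  print t       # hell0 world
    t = s; gsub(/o/, "0", t); print t       # hell0 w0rld
    print match(s, /wor/), RSTART, RLENGTH  # 7 7 3
}')
printf '%s\n' "$out"
```

Note that sub and gsub modify their third argument in place and return the number of replacements, which is why the string is copied into `t` first.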
Replacement tool: tr
echo 12345 | tr '0-9' '9876543210'  # Encrypt
echo 87654 | tr '9876543210' '0-9'  # Decrypt
echo "Hello 123 world 456" | tr -d '0-9'  # The -d option deletes the digits from stdin and prints the result
cat test.txt | tr -d '0-9'  # Same as above
echo "hello 1 char 2 next 3" | tr -d -c '0-9 \n'  # The -c option uses the complement set: delete every character from stdin except digits, spaces, and newlines (the deleted characters are the complement of the set '0-9 \n')
echo "This   is   a   test!" | tr -s ' '  # The -s option squeezes each run of repeated spaces into a single space
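The effect of each option is easier to see with the results captured in variables. A minimal sketch; the variable names are illustrative.

```shell
#!/bin/sh
# Map each digit to its counterpart in the reversed set, then map it back.
encrypted=$(echo 12345 | tr '0-9' '9876543210')
decrypted=$(echo "$encrypted" | tr '9876543210' '0-9')

# -d deletes every character in the set.
deleted=$(echo "Hello 123 world 456" | tr -d '0-9')

# -d -c deletes every character NOT in the set (digits, space, newline are kept).
kept=$(echo "hello 1 char 2 next 3" | tr -d -c '0-9 \n')

# -s squeezes runs of the listed character into one.
squeezed=$(echo "This   is   a   test!" | tr -s ' ')

printf '[%s]\n' "$encrypted" "$decrypted" "$deleted" "$kept" "$squeezed"
# [87654]
# [12345]
# [Hello  world ]
# [ 1  2  3]
# [This is a test!]
```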
------------------------------------------------
tr can use several character classes as sets:
alnum: letters and digits
alpha: letters
cntrl: control (non-printing) characters
digit: digits
graph: graphic characters
lower: lowercase letters
print: printable characters
punct: punctuation
space: whitespace characters
upper: uppercase letters
xdigit: hexadecimal digits
How to use:
tr '[:class:]' '[:class:]'
For example:
tr '[:lower:]' '[:upper:]'  # Convert all lowercase letters to uppercase
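Character classes combine with the -d and -c options as well. A short sketch with invented input strings and illustrative variable names:

```shell
#!/bin/sh
# Translate lowercase to uppercase.
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')

# Delete everything that is not a digit (this also removes the newline).
digits=$(echo "abc123def" | tr -cd '[:digit:]')

# Delete punctuation.
plain=$(echo "a,b.c!" | tr -d '[:punct:]')

printf '%s\n' "$upper" "$digits" "$plain"
# HELLO WORLD
# 123
# abc
```

Quoting the class keeps the shell from treating the brackets as a glob pattern.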
----------------------------------------------------
Linux learning: a summary of sed, awk, and tr usage