This article describes the most commonly used tools for processing text from the shell on Linux:
find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, awk.
The examples and options shown are the most commonly used and most practical ones.
My rule for shell scripts is to write them as command-line one-liners and to keep them to no more than 2 lines;
if the task is more complex than that, consider Python.
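find: file lookup
find . \( -name "*.txt" -o -name "*.pdf" \) -print    # find .txt and .pdf files
find . -regex ".*\(\.txt\|\.pdf\)$"    # the same search with a regular expression matched against the whole path
-iregex: like -regex, but ignores case
find . ! -name "*.txt" -print    # negation: everything that is not a .txt file
find . -maxdepth 1 -type f    # limit the search depth: regular files in the current directory only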
Custom search
Search by type:
find . -type d -print    # list directories only
-type f matches regular files; -type l matches symbolic links
Search by time:
-atime: access time (in days; for minutes use -amin, and likewise for the options below)
-mtime: modification time (content was modified)
-ctime: change time (metadata or permissions were changed)
All files that have been accessed within the last 7 days:
find . -atime -7 -type f -print
Search by size (files larger than 2 KiB):
find . -type f -size +2k
Search by permissions:
find . -type f -perm 644 -print    # find all files whose permission bits are exactly 644
Search by user:
find . -type f -user weber -print    # find the files owned by the user weber
Actions on the files found
find . -type f -name "*.swp" -delete    # delete all .swp files under the current directory
find . -type f -user root -exec chown weber {} \;    # change the owner of root-owned files under the current directory to weber
Note: {} is a special string; for each matching file, {} is replaced with that file's name.
e.g. copy all the files found into another directory:
find . -type f -mtime +10 -name "*.txt" -exec cp {} old \;
-exec ./commands.sh {} \;    # -exec can also call a script for each file
Delimiter used by -print
By default, '\n' is used as the delimiter between file names;
-print0 uses '\0' as the delimiter instead, so file names that contain spaces can be handled;
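As a small sketch of why the NUL delimiter matters, the following builds a scratch directory and counts lines across .txt files, one of which has a space in its name. The file names are made up for illustration, and GNU find and xargs are assumed.

```shell
# Scratch directory with hypothetical file names, one of which contains a space.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/plain.txt"
printf 'c\n' > "$workdir/with space.txt"
printf 'd\n' > "$workdir/notes.pdf"

# NUL-delimited names survive the space; xargs -0 splits on NUL only.
txt_lines=$(find "$workdir" -type f -name "*.txt" -print0 | xargs -0 cat | wc -l)
echo "lines in .txt files: $txt_lines"

# Negation: every regular file that is not a .txt file.
others=$(find "$workdir" -type f ! -name "*.txt" -print)
echo "$others"

rm -rf "$workdir"
```

With plain -print and xargs, "with space.txt" would be split into two arguments and cat would fail on both.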
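grep: text search
grep match_pattern file    # prints the matching lines by default
grep -c "text" filename    # count the lines that contain the text
-n: print the line numbers of matching lines
-i: ignore case when searching
-l: print only the names of matching files
grep "class" . -r -n    # search recursively under the current directory
grep -e "class" -e "virtual" file    # match any of several patterns
grep "test" file* -lZ | xargs -0 rm    # -Z ends each file name with '\0', for use with xargs -0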
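The options above can be tried on a few throwaway files. This is only a sketch: the file names and contents are invented for the example.

```shell
# Hypothetical sample files in a scratch directory.
workdir=$(mktemp -d)
printf 'class Foo\nint x;\nclass Bar\n' > "$workdir/a.cpp"
printf 'virtual void f();\n' > "$workdir/b.cpp"
printf 'nothing here\n' > "$workdir/c.txt"

# -c counts matching lines, not individual matches.
count=$(grep -c "class" "$workdir/a.cpp")
echo "matching lines: $count"

# -n prefixes each matching line with its line number.
numbered=$(grep -n "class" "$workdir/a.cpp")
echo "$numbered"

# Several -e patterns are ORed; -l lists only the names of matching files.
matched=$(grep -l -e "class" -e "virtual" "$workdir"/* | wc -l)
echo "files matching either pattern: $matched"

rm -rf "$workdir"
```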
xargs: converting input into command-line arguments
xargs converts its input data into command-line arguments for another command, so it can be combined with many commands, such as grep and find;
Convert multi-line output to single-line output
cat file.txt | xargs
\n is the delimiter between the lines of the input
Convert a single line into multiple lines of output
cat single.txt | xargs -n 3
-n: specifies the number of fields to put on each line
xargs options
-d: defines the input delimiter (by default, blanks and newlines separate the items)
-n: specifies the maximum number of arguments per command line, so the output is split across multiple lines
-I {}: specifies a replacement string; when xargs builds the command, each {} is replaced with the input item. Use it when the command to be executed needs several arguments
e.g.
cat file.txt | xargs -I {} ./command.sh -p {} -1
-0: use '\0' as the input delimiter
e.g. count the lines of source code:
find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
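A minimal sketch of the three conversions described above, using inline input instead of files so that the result is easy to check:

```shell
# Join multiple lines into one line (xargs runs echo by default).
one_line=$(printf '1\n2\n3\n4\n5\n6\n' | xargs)
echo "$one_line"

# Re-split a single line into rows of at most 3 arguments each.
rows=$(echo "1 2 3 4 5 6 7" | xargs -n 3)
echo "$rows"

# -I {} runs the command once per input line, substituting {} each time.
tagged=$(printf 'a\nb\n' | xargs -I {} echo "item={}")
echo "$tagged"
```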
sort: sorting
Options:
-n: sort numerically; -d: sort in dictionary order
-r: sort in reverse order
-k N: sort by the Nth column
e.g.
sort -nrk 1 data.txt
sort -bd data    # -b ignores leading whitespace such as spaces
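The difference between numeric and textual ordering is easiest to see on a tiny data set. The rows below are invented for the sketch.

```shell
# Hypothetical data: a count in column 1, a name in column 2.
data=$(printf '3 pear\n10 apple\n2 fig\n')

# -n compares column 1 as numbers, -r reverses, -k 1 picks the key column.
numeric_desc=$(echo "$data" | sort -nrk 1)
echo "$numeric_desc"

# Without -n the comparison is textual, so "10" sorts before "2".
textual=$(echo "$data" | sort -k 1)
echo "$textual"
```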
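uniq: eliminating duplicate lines
sort unsort.txt | uniq    # remove duplicate lines
sort unsort.txt | uniq -c    # count how many times each line occurs
sort unsort.txt | uniq -d    # print only the duplicated lines
You can restrict which part of each line is compared: -s N skips the first N characters, and -w N compares at most N characters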
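A short sketch of these options on inline input. The -w option is assumed to be available, which is the case for GNU uniq.

```shell
words=$(printf 'b\na\nb\nc\na\nb\n')

# uniq only collapses adjacent duplicates, hence the sort first.
unique=$(echo "$words" | sort | uniq)

# -d prints one copy of each line that occurs more than once.
dups=$(echo "$words" | sort | uniq -d)

# -c prefixes a count; sorting that numerically gives a frequency table.
top=$(echo "$words" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2 " x" $1}')

# -w 2 (GNU uniq) compares only the first 2 characters of each line.
by_prefix=$(printf 'ab1\nab2\ncd1\n' | uniq -w 2)

echo "$unique"; echo "$dups"; echo "$top"; echo "$by_prefix"
```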
tr: character translation
echo 12345 | tr '0-9' '9876543210'    # simple encryption/decryption: each character is replaced with its counterpart
cat text | tr '\t' ' '    # tab to space
cat file | tr -d '0-9'    # delete all digits
-c: use the complement of the set
cat file | tr -cd '0-9'    # keep only the digits in the file
cat file | tr -d -c '0-9 \n'    # delete everything except digits, spaces, and newlines
cat file | tr -s ' '    # squeeze runs of repeated spaces into one
e.g. tr '[:lower:]' '[:upper:]'    # character classes: convert lowercase to uppercase
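The tr operations above, sketched on inline strings so that each result can be checked by eye:

```shell
# Map each digit to its "mirror" digit; applying the same mapping twice restores the input.
encoded=$(echo 12345 | tr '0-9' '9876543210')
decoded=$(echo "$encoded" | tr '0-9' '9876543210')
echo "$encoded -> $decoded"

# -d deletes the listed characters; -c complements the set first,
# so -cd deletes everything that is not a digit.
digits_only=$(echo "tel: 555-0100" | tr -cd '0-9')
echo "$digits_only"

# -s squeezes runs of the same character into one.
squeezed=$(echo "a    b   c" | tr -s ' ')
echo "$squeezed"

# Character classes work as sets too.
upper=$(echo "hello" | tr '[:lower:]' '[:upper:]')
echo "$upper"
```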
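cut: splitting text by column
cut -f2,4 filename    # extract the 2nd and 4th columns
cut -f3 --complement filename    # extract every column except the 3rd
cut -f2 -d ";" filename    # -d specifies the delimiter
cut -c1-5 file    # print the 1st through 5th characters
cut -c-2 file    # print the first 2 characters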
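A sketch of field and character selection on made-up semicolon-separated records. The --complement option is assumed to be available, as it is in GNU cut.

```shell
# Hypothetical semicolon-separated records.
records=$(printf 'id;name;city;zip\n1;colin;Paris;75001\n')

# -d sets the delimiter, -f selects fields (here the 2nd and 4th).
two_four=$(echo "$records" | cut -d ';' -f 2,4)
echo "$two_four"

# --complement (GNU cut) inverts the selection: all fields except the 3rd.
no_third=$(echo "$records" | cut -d ';' -f 3 --complement)
echo "$no_third"

# -c works on character positions instead of fields.
first_five=$(echo "abcdefgh" | cut -c 1-5)
echo "$first_five"
```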
paste: joining text by column
Join two files side by side, column by column;
cat file1
1
2
cat file2
colin
book
paste file1 file2
1 colin
2 book
The default delimiter is a tab, which can be changed with -d
paste file1 file2 -d ","
1,colin
2,book
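The same two files can be recreated in a scratch directory to try both delimiters. This is a sketch; the directory is temporary and removed at the end.

```shell
workdir=$(mktemp -d)
printf '1\n2\n' > "$workdir/file1"
printf 'colin\nbook\n' > "$workdir/file2"

# Default delimiter is a tab.
tabbed=$(paste "$workdir/file1" "$workdir/file2")
echo "$tabbed"

# -d picks another delimiter.
comma=$(paste -d ',' "$workdir/file1" "$workdir/file2")
echo "$comma"

rm -rf "$workdir"
```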
wc: counting lines, words, and characters
wc -l file    # count lines
wc -w file    # count words
wc -c file    # count bytes (use -m to count characters)
sed: text replacement tool
sed 's/text/replace_text/' file    # replace the first match on each line
sed 's/text/replace_text/g' file    # replace every match
By default sed prints the result of the substitution; to modify the original file in place, use -i:
sed -i 's/text/replace_text/g' file
sed '/^$/d' file    # delete empty lines
echo this is an example | sed 's/\w\+/[&]/g'    # & stands for the matched string
$> [this] [is] [an] [example]
sed 's/hello\([0-9]\)/\1/'    # \1 stands for the first \( \) group
sed 's/$var/hlloe/'    # inside single quotes the shell does not expand $var
When double quotes are used, shell variables can appear in the sed pattern and in the replacement string;
e.g.
p=pattern
r=replaced
echo "line con a pattern" | sed "s/$p/$r/g"
$> line con a replaced
sed 's/^.\{3\}/&\//g' file    # insert a / after the first 3 characters of each line
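The substitutions above can be sketched on inline strings. The input strings are invented for the example.

```shell
# Replace only the first match on each line, then every match (g flag).
first=$(echo "aaa bbb aaa" | sed 's/aaa/X/')
all=$(echo "aaa bbb aaa" | sed 's/aaa/X/g')
echo "$first | $all"

# & stands for the whole match; \1 for the first \( \) group.
amp=$(echo "abc" | sed 's/b/[&]/')
group=$(echo "hello7 world" | sed 's/hello\([0-9]\)/[\1]/')
echo "$amp | $group"

# Double quotes let the shell expand variables before sed sees the script.
p=pattern
r=replaced
expanded=$(echo "line con a pattern" | sed "s/$p/$r/g")
echo "$expanded"

# Delete empty lines.
compact=$(printf 'x\n\ny\n' | sed '/^$/d')
echo "$compact"
```

Variables interpolated this way become part of the sed script, so a value containing / or & would change its meaning.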
awk: data stream processing tool
awk script structure
awk 'BEGIN{ statements } statements2 END{ statements }'
How it works
1. Execute the BEGIN statement block;
2. Read a line from the file or stdin and execute statements2; repeat until the input has been read completely;
3. Execute the END statement block;
print: printing the current line
echo -e "line1\nline2" | awk 'BEGIN{print "start"} {print} END{print "End"}'    # print without arguments prints the current line
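When the arguments of print are separated by commas, the output is separated by spaces:
echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; \
print var1, var2, var3; }'
$> v1 V2 v3
Arguments written next to each other are concatenated, so another separator can be used:
echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; \
print var1 "-" var2 "-" var3; }'
$> v1-V2-v3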
Special variables: NR, NF, $0, $1, $2
NR: the number of records; during execution it corresponds to the current line number;
NF: the number of fields; during execution it corresponds to the number of fields in the current line;
$0: the text of the current line;
$1: the text of the first field;
$2: the text of the second field;
echo -e "line1 f2 f3\n line2 \n line 3" | awk '{print NR":"$0"-"$1"-"$2}'
awk '{print $2, $3}' file    # print the second and third fields of each line
awk 'END {print NR}' file    # count the lines in the file
echo -e "1\n 2\n 3\n 4\n" | awk 'BEGIN{sum = 0;
print "begin";} {sum += $1;} END {print "=="; print sum}'    # add up the first field of every line
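To make NR, NF, and the field variables concrete, here is a sketch on two invented records: a name followed by a varying number of scores.

```shell
# Hypothetical whitespace-separated data: a name followed by scores.
scores=$(printf 'ann 3 4\nbob 5\n')

# NR is the current line number, NF the field count, $NF the last field.
report=$(echo "$scores" | awk '{ print NR ": " $1 " has " (NF - 1) " score(s), last=" $NF }')
echo "$report"

# Accumulate in the main block, report once in END.
total=$(echo "$scores" | awk '{ for (i = 2; i <= NF; i++) sum += $i } END { print sum }')
echo "$total"
```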
Passing external variables
var=1000
echo | awk '{print vara}' vara=$var    # input from stdin
awk '{print vara}' vara=$var file    # input from a file
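A trailing name=value operand is only assigned once awk reaches it in the argument list, which is after BEGIN has run. The standard -v option is an alternative when the value is needed in BEGIN. A sketch, with the variable names limit and threshold chosen for illustration:

```shell
threshold=10

# -v assigns before BEGIN runs, so the value is visible there.
in_begin=$(awk -v limit="$threshold" 'BEGIN { print limit * 2 }')
echo "$in_begin"

# The trailing operand is not yet set in BEGIN, but is set for the main block.
late=$(echo | awk 'BEGIN { print "[" limit "]" } { print limit }' limit="$threshold")
echo "$late"
```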
Filtering the lines that awk processes with a pattern
awk 'NR < 5'    # lines whose line number is less than 5
awk 'NR==1,NR==4 {print}' file    # print lines 1 through 4
awk '/linux/'    # lines containing the text linux (any regular expression can be used, which is very powerful)
awk '!/linux/'    # lines not containing the text linux
Setting the delimiter
Use -F to set the field delimiter (the default is whitespace)
awk -F: '{print $NF}' /etc/passwd
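A sketch of -F on passwd-style records. The two records are written inline for the example and are not read from the real /etc/passwd.

```shell
# Hypothetical passwd-style records.
accounts=$(printf 'root:x:0:0:root:/root:/bin/bash\nmail:x:8:8:mail:/var/mail:/usr/sbin/nologin\n')

# -F: splits on colons; $NF is the last field (the login shell).
shells=$(echo "$accounts" | awk -F: '{ print $1 " -> " $NF }')
echo "$shells"
```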
Reading command output
Using getline, a line of output from an external shell command is read into the variable cmdout;
echo | awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'
Using loops in awk
for(i=0;i<10;i++){print $i;}
for(i in array){print array[i];}
e.g.
Print lines in reverse order (an implementation of the tac command):
seq 9 | \
awk '{lifo[NR] = $0; lno = NR} \
END{ for(; lno > 0; lno--){ print lifo[lno]; }
}'
Implementing the head and tail commands with awk
awk 'NR<=10{print}' filename    # head: the first 10 lines
awk '{buffer[NR%10] = $0;} END{for(i = (NR > 10 ? NR - 9 : 1); i <= NR; i++){ \
print buffer[i%10]} }' filename    # tail: the last 10 lines
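The tail idea is a ring buffer: each line overwrites the slot of the line n positions before it, so at the end only the last n lines remain. A sketch with a configurable size, where n is a name introduced for this example:

```shell
# Last 3 lines of a 7-line input, using a ring buffer of size n.
n=3
last=$(seq 7 | awk -v n="$n" '
  { buffer[NR % n] = $0 }                 # overwrite the oldest slot
  END {
    start = (NR > n) ? NR - n + 1 : 1     # first line still in the buffer
    for (i = start; i <= NR; i++) print buffer[i % n]
  }')
echo "$last"
```

Starting the END loop at the oldest surviving line number keeps the output in the original order and avoids printing empty slots when the input is shorter than n.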
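Printing a specified column
ls -lrt | awk '{print $6}'    # with awk
ls -lrt | tr -s ' ' | cut -d ' ' -f6    # with cut: squeeze the spaces first, because cut splits on a single delimiter character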
Printing a specified range of text
seq 100 | awk 'NR==4,NR==6{print}'    # select by line number
awk '/start_pattern/,/end_pattern/' filename    # select by pattern
e.g.
seq 100 | awk '/13/,/15/'
cat /etc/passwd | awk '/mai.*mail/,/news.*news/'
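Both kinds of range can be checked on a short sequence. A minimal sketch:

```shell
# Line-number range: lines 4 through 6.
by_number=$(seq 10 | awk 'NR==4, NR==6')

# Regex range: starts at the first line matching /3/ and ends at the next line matching /5/.
by_regex=$(seq 10 | awk '/3/, /5/')

echo "$by_number"; echo "$by_regex"
```

A range opens again each time the start pattern matches later in the input, so on longer input the regex form can print more than one block.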
Commonly used awk built-in functions
index(string, search_string): returns the position at which search_string appears in string
sub(regex, replacement_str, string): replaces the first match of the regex with replacement_str;
match(string, regex): checks whether the regular expression matches the string;
length(string): returns the length of the string
echo | awk '{"grep root /etc/passwd" | getline cmdout; print length(cmdout)}'
printf: formats the output, similar to printf in C
e.g.
seq 10 | awk '{printf "->%4s\n", $1}'
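The functions listed above, sketched on one inline string so that each return value is visible:

```shell
out=$(echo "hello world" | awk '{
  print index($0, "world")                      # 1-based position of the substring
  print length($0)                              # number of characters
  if (match($0, /o w/)) print RSTART, RLENGTH   # match() sets RSTART and RLENGTH
  s = $0; sub(/o/, "0", s); print s             # sub() replaces only the first match
  printf "->%4s\n", NR                          # right-align in a 4-character field
}')
echo "$out"
```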
Iterating over the lines, words, and characters in a file
1. Iterate over each line in a file
while read line;
do
echo $line;
done < file.txt
Using a subshell instead:
cat file.txt | (while read line; do echo $line; done)
2. Iterate over each word in a line
for word in $line;
do
echo $word;
done
3. Iterate over each character
${string:start_pos:num_of_chars}: extracts a substring from a string (bash string slicing)
${#word}: returns the length of the variable word
for((i=0;i<${#word};i++))
do
echo ${word:i:1};
done
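A sketch combining the line and word loops on a throwaway file. The file contents are invented, and IFS= together with read -r is an addition here to keep whitespace and backslashes intact.

```shell
workdir=$(mktemp -d)
printf 'first line\n  second   line  \n' > "$workdir/file.txt"

line_count=0
word_count=0
# IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
while IFS= read -r line; do
  line_count=$((line_count + 1))
  # Unquoted $line is split on whitespace into words.
  for word in $line; do
    word_count=$((word_count + 1))
  done
done < "$workdir/file.txt"
echo "lines=$line_count words=$word_count"

rm -rf "$workdir"
```

Redirecting the file into the loop keeps it in the current shell, so the counters are still set afterwards; with the cat-and-pipe form the loop runs in a subshell and the counts are lost.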