Common Shell Text Processing Commands

Source: Internet
Author: User
Tags egrep

grep
To use an extended regular expression, add the -E option: grep -E "[a-z]+"   # use a regular expression
or use egrep "[a-z]+"

-A n, -B n print n lines after or before each matching line; -C n prints n lines on both sides
-e specifies a pattern and can be repeated to match several patterns, such as grep -e "cat" -e "dog" file
-i ignores case
-o prints only the part of each line that matches
-c counts the matching lines (not the number of matches); to count matches, use echo -e "1 2 3 4\nhello\n5 6" | egrep -o "[0-9]" | wc -l
-l prints the names of the files that contain the string
-n prints the line number of each match
-R searches directories recursively
--include restricts the search to specific files, such as --include=*.{c,cpp} (this relies on the shell's brace expansion)
-q quiet mode: prints nothing and reports the result through the exit status
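
The difference between -c (matching lines) and -o piped to wc -l (individual matches) is easy to miss, so here is a minimal sketch of it. The sample text and the variable names are made up for illustration; -o is assumed to be available, as it is in GNU and BSD grep.

```shell
# Sample input: three lines, two of which contain digits.
sample=$(printf '1 2 3 4\nhello\n5 6\n')

# -c counts matching LINES: lines 1 and 3 contain a digit.
line_count=$(printf '%s\n' "$sample" | grep -c '[0-9]')

# -o prints every match on its own line, so wc -l counts MATCHES.
match_count=$(printf '%s\n' "$sample" | grep -o '[0-9]' | wc -l | tr -d ' ')

# -i ignores case and -n prefixes each hit with its line number.
numbered=$(printf '%s\n' "$sample" | grep -in 'HELLO')

echo "matching lines: $line_count"
echo "matches: $match_count"
echo "numbered hit: $numbered"
```

The tr -d ' ' step only strips the padding that some wc implementations add around the number.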


cut
Slices files by column.
-f1,2,3 shows the first, second, and third fields
-d specifies the field delimiter

-c1-5,7-9 --output-delimiter "," shows characters 1-5 and 7-9, separated by ","
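
As a rough sketch of field and character selection, the following uses a made-up passwd-style record (the record and the variable names are assumptions for illustration only). It sticks to -d, -f, and -c, since --output-delimiter is a GNU extension.

```shell
# Hypothetical passwd-style record, used only for illustration.
record='root:x:0:0:root:/root:/bin/bash'

# -d sets the field delimiter; -f1,7 keeps fields 1 and 7.
name_shell=$(printf '%s\n' "$record" | cut -d: -f1,7)

# -c selects by character position: characters 1-4 here.
first_chars=$(printf '%s\n' "$record" | cut -c1-4)

echo "$name_shell"
echo "$first_chars"
```

When several fields are selected, cut joins them with the input delimiter unless told otherwise.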


sed
sed is a stream editor that processes its input one line at a time. It is used to replace, delete, add, and select lines, and for other line-oriented tasks.
Common options:
-n: quiet (silent) mode. By default sed prints every input line to the screen; with -n, only the lines explicitly handled by a sed command (such as p) are printed.
-e: gives the sed script directly on the command line;
-f: reads the sed script from a file; -f filename runs the sed commands stored in filename;
-r: uses extended regular expression syntax (the default is basic regular expressions);
-i: edits the file in place instead of printing the result to the screen.

Common commands:
a: append. a is followed by a string, which is added as a new line after the current line.
c: change. c is followed by a string, which replaces the lines in the range n1,n2.
d: delete. d usually takes no argument.
i: insert. i is followed by a string, which is added as a new line before the current line.
p: print the selected lines. p is normally used together with sed -n.
s: substitute. s performs a replacement directly and is usually combined with a regular expression, for example 1,20s/old/new/g.


Examples (suppose we have a file named ab):
Delete lines
$ sed '1d' ab      # delete the first line
$ sed '$d' ab      # delete the last line
$ sed '1,2d' ab    # delete the first through second lines
$ sed '2,$d' ab    # delete the second through last lines


Show lines
$ sed -n '1p' ab      # show the first line
$ sed -n '$p' ab      # show the last line
$ sed -n '1,2p' ab    # show the first through second lines
$ sed -n '2,$p' ab    # show the second through last lines


Query with a pattern
$ sed -n '/ruby/p' ab    # show all lines containing the keyword ruby
$ sed -n '/\$/p' ab      # show all lines containing $; the backslash \ removes its special meaning


Add one or more lines
$ cat ab
Hello!
ruby is me,welcome to my blog.
End
$ sed '1a drink tea' ab    # append the string "drink tea" after the first line
Hello!
drink tea
ruby is me,welcome to my blog.
End
$ sed '1,3a drink tea' ab    # append the string "drink tea" after each of lines 1 through 3
Hello!
drink tea
ruby is me,welcome to my blog.
drink tea
End
drink tea
$ sed '1a drink tea\nor coffee' ab    # append several lines after the first line, separated by the newline \n
Hello!
drink tea
or coffee
ruby is me,welcome to my blog.
End


Replace one or more lines
$ sed '1c Hi' ab      # replace the first line with Hi
Hi
ruby is me,welcome to my blog.
End
$ sed '1,2c Hi' ab    # replace the first through second lines with Hi
Hi
End


Replace part of a line
Format: sed 's/string to replace/new string/g' (the string to replace can be a regular expression)
$ sed -n '/ruby/p' ab | sed 's/ruby/bird/g'    # replace ruby with bird
$ sed -n '/ruby/p' ab | sed 's/ruby//g'        # delete ruby
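
To show what the trailing g and a line address change in an s command, here is a small sketch. The two-line sample text and the variable names are invented for this example and are not the ab file used in this section.

```shell
# Made-up sample text: the first line contains "ruby" twice.
text=$(printf 'ruby likes ruby\nno match here\n')

# Without g, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$text" | sed 's/ruby/bird/')

# With g, every match on the line is replaced.
all=$(printf '%s\n' "$text" | sed 's/ruby/bird/g')

# A leading address limits the command; here only line 2 is touched.
line2=$(printf '%s\n' "$text" | sed '2s/no/a/')

printf '%s\n---\n%s\n---\n%s\n' "$first_only" "$all" "$line2"
```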


Insert
$ sed -i '$a bye' ab    # append "bye" after the last line, directly in the file ab
$ cat ab
Hello!
ruby is me,welcome to my blog.
End
bye




awk

awk is a powerful text analysis tool. Compared with the searching of grep and the editing of sed, it is especially strong at analyzing data and generating reports.

Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default delimiter, and then processes the resulting fields.

awk has three different versions: awk, nawk, and gawk. Unless stated otherwise, awk here refers to gawk, the GNU version of awk.
Basic syntax: awk 'pattern {action}' filenames


$ last -n 5 | awk '{print $1}'
root
root
root
dmtsai
root

The awk workflow is as follows: awk reads one record, terminated by a '\n' newline, then splits the record into fields using the field delimiter. $0 represents the whole record, $1 the first field, and $n the nth field. The default field delimiter is a space or a tab, so here $1 is the login user, $3 is the login user's IP, and so on.
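
The field splitting described above can be sketched on a single line of text. The line below is a made-up record shaped like the output of last (user, terminal, host); it and the variable names are assumptions for illustration.

```shell
# Hypothetical line shaped like `last` output: user, terminal, host, ...
line='root pts/1 192.168.1.100 Tue Feb 10 11:21 still logged in'

# $1 is the first whitespace-separated field, $3 the third.
user=$(printf '%s\n' "$line" | awk '{print $1}')
host=$(printf '%s\n' "$line" | awk '{print $3}')

# $0 is the whole record, unchanged.
whole=$(printf '%s\n' "$line" | awk '{print $0}')

echo "user=$user host=$host"
echo "$whole"
```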


$ cat /etc/passwd | awk -F ':' 'BEGIN {print "name,shell"} {print $1","$7} END {print "blue,/bin/nosh"}'
name,shell
root,/bin/bash
daemon,/bin/sh
bin,/bin/sh
sys,/bin/sh
...

With BEGIN and END blocks, the workflow is: awk first executes BEGIN, then reads the file one record at a time (each terminated by a \n newline), splits the record into fields using the field delimiter ($0 is the whole record, $1 the first field, $n the nth field), and runs the action of every pattern the record matches. It then reads the next record, and so on; once all records have been read, the END block is executed.
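
The BEGIN, per-record, END order can be sketched without touching /etc/passwd by feeding awk two made-up passwd-style records. The records and the "records read" footer are assumptions for illustration.

```shell
# Two hypothetical passwd-style records.
records=$(printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/bin/sh\n')

report=$(printf '%s\n' "$records" | awk -F: '
    BEGIN { print "name,shell" }        # runs once, before any input is read
          { print $1 "," $7 }           # runs once for every record
    END   { print "records read: " NR } # runs once, after the last record
')

echo "$report"
```

A rule with no pattern, like the middle one, matches every record.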


awk built-in variables
awk has many built-in variables that hold environment information, and they can be changed. Some of the most commonly used ones are listed below.
ARGC      number of command-line arguments
ARGV      array of command-line arguments
ENVIRON   array that gives access to the system environment variables
FILENAME  name of the file awk is currently reading
FNR       record number within the current file
FS        input field delimiter, equivalent to the command-line -F option
NF        number of fields in the current record
NR        number of records read so far
OFS       output field delimiter
ORS       output record delimiter
RS        input record delimiter
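
A short sketch of how some of these variables behave, using two throwaway files created in a temporary directory (the file names and contents are made up for illustration):

```shell
# Two small colon-delimited files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a:b:c\nd:e\n' > "$tmpdir/one.txt"
printf 'f:g:h:i\n' > "$tmpdir/two.txt"

# NR keeps counting across files; FNR restarts for each file;
# NF is the number of fields in the current record.
counts=$(awk -F: '{ print NR, FNR, NF }' "$tmpdir/one.txt" "$tmpdir/two.txt")

# Assigning to a field makes awk rebuild $0 using OFS.
rejoined=$(awk 'BEGIN { FS = ":"; OFS = "-" } { $1 = $1; print }' "$tmpdir/one.txt")

rm -r "$tmpdir"

echo "$counts"
echo "$rejoined"
```

Setting FS in BEGIN has the same effect as passing -F on the command line.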




Print, for /etc/passwd, the file name, the line number of each line, the number of fields in each line, and the full content of the line:
$ awk -F ':' '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF ",linecontent:" $0}' /etc/passwd
filename:/etc/passwd,linenumber:1,columns:7,linecontent:root:x:0:0:root:/root:/bin/bash
filename:/etc/passwd,linenumber:2,columns:7,linecontent:daemon:x:1:1:daemon:/usr/sbin:/bin/sh
filename:/etc/passwd,linenumber:3,columns:7,linecontent:bin:x:2:2:bin:/bin:/bin/sh
filename:/etc/passwd,linenumber:4,columns:7,linecontent:sys:x:3:3:sys:/dev:/bin/sh

Count the number of bytes used by the files in a directory
$ ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end]size is", size}'
[end]size is 8657198


To display the result in units of M:
$ ls -l | awk 'BEGIN {size=0;} {size=size+$5;} END {print "[end]size is", size/1024/1024, "M"}'
[end]size is 8.25889 M

Note that the total does not include the contents of subdirectories.


Conditional statements
The conditional statements in awk are borrowed from the C language and take the following forms:
if (expression) {
    statement;
    statement;
    ... ...
}


if (expression) {
    statement1;
} else {
    statement2;
}


if (expression) {
    statement1;
} else if (expression1) {
    statement2;
} else {
    statement3;
}
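
A minimal sketch of the if / else if / else form, classifying three made-up size values (the values and the labels are assumptions for illustration):

```shell
# Three hypothetical sizes, one per line.
sizes=$(printf '10\n4096\n5000\n')

labels=$(printf '%s\n' "$sizes" | awk '{
    if ($1 == 4096) {
        print $1, "skipped"     # exactly 4096: treated as a directory entry
    } else if ($1 > 4096) {
        print $1, "large"
    } else {
        print $1, "small"
    }
}')

echo "$labels"
```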



Count the bytes used by the files in a directory, skipping entries whose size is 4096 (typically directories):
$ ls -l | awk 'BEGIN {size=0; print "[start]size is", size} {if ($5 != 4096) {size=size+$5;}} END {print "[end]size is", size/1024/1024, "M"}'
[start]size is 0
[end]size is 8.22339 M


Looping statements
The looping statements in awk are also borrowed from C: while, do/while, for, break, and continue are supported, with the same semantics as in C.
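
The loop forms can be tried entirely inside a BEGIN block, with no input file. The following is only a sketch, and the variable names are made up for illustration.

```shell
# for: add the numbers 1 through 5.
sum_for=$(awk 'BEGIN {
    for (i = 1; i <= 5; i++) total += i
    print total
}')

# while with continue and break: stop at the first even number.
first_even=$(awk 'BEGIN {
    i = 0
    while (1) {
        i++
        if (i % 2) continue   # odd: go back to the top of the loop
        break                 # even: leave the loop
    }
    print i
}')

# do/while: the body runs before the condition is checked.
do_count=$(awk 'BEGIN {
    n = 0
    do { n++ } while (n < 3)
    print n
}')

echo "$sum_for $first_even $do_count"
```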


Array
Because an array subscript in awk can be a number or a string, the subscript is often called the key. Keys and values are stored internally in a hash table. Since a hash table is not stored sequentially, the contents of an array may not be displayed in the order you expect. Arrays, like variables, are created automatically when they are first used, and awk automatically determines whether they store numbers or strings. In general, an array in awk is used to collect information from records, for example to calculate sums, count words, or track how many times a pattern is matched.
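
As a sketch of string keys and of the unordered traversal mentioned above, the following counts words in made-up input. The sample words and the array name seen are assumptions for illustration; the output is piped through sort because the order of for (key in array) is not defined.

```shell
# Hypothetical input: a few repeated words on two lines.
words=$(printf 'cat dog cat\nbird dog cat\n')

counts=$(printf '%s\n' "$words" | awk '
    { for (i = 1; i <= NF; i++) seen[$i]++ }   # each word is used as a key
    END { for (w in seen) print w, seen[w] }   # traversal order is unspecified
' | sort)

echo "$counts"
```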


Show the accounts in /etc/passwd
$ awk -F ':' 'BEGIN {count=0;} {name[count] = $1; count++;} END {for (i = 0; i < NR; i++) print i, name[i]}' /etc/passwd
0 root
1 daemon
2 bin
3 sys
4 sync
5 games
...

This uses a for loop to iterate over the array.
