One of the Three Musketeers of Linux-->grep

Source: Internet
Author: User
Tags: egrep

========================================================================================
*######------Linux Regular Expressions------######
*######------The Three Musketeers: grep, the text filter------######
========================================================================================
Linux wildcard characters
# Purpose: wildcards are used mainly to match file names, whereas regular expressions are used mainly to match strings
Common wildcard characters include * ? ^ [] {} and so on
Symbol    Effect
*         matches any string of characters, including an empty one
?         matches any single character
^         negates a bracket set, as in [^a-z]
[a-z]     matches any one lowercase letter from a to z
{}        expands to each item in a comma-separated set of alternatives
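
The wildcard forms listed above can be tried out with a small sketch like the following. It assumes a Bash shell; the directory and the file names are made up for illustration, and LC_ALL=C is set only so that the sort order and the [a-z] range behave predictably.

```shell
#!/usr/bin/env bash
# Minimal sketch of shell wildcards; all file names here are hypothetical.
export LC_ALL=C                 # predictable ordering and [a-z] range
start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.txt b.txt ab.txt 1.txt a.log

star=$(echo *.txt)              # * matches any run of characters
question=$(echo ?.txt)          # ? matches exactly one character
range=$(echo [a-z].txt)         # [a-z] matches one lowercase letter
negated=$(echo [^a-z].txt)      # ^ inside brackets inverts the set
braces=$(echo a.{txt,log})      # {} expands to each listed alternative

printf '%s\n' "$star" "$question" "$range" "$negated" "$braces"

cd "$start_dir" && rm -rf "$demo_dir"
```

Note that the shell expands these patterns against file names before the command runs, which is why a regular expression passed to grep should be quoted.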

========================================================================================
Linux special symbols

\a      Alert (BEL)
\b      Backspace
\f      Form feed (new page); affects printers only
\n      Newline (line feed)
\r      Carriage return; moves to the beginning of the line
\t      Horizontal tab (usually 4 or 8 columns)
\v      Vertical tab; affects printers only
\0      Null character
\xNN    The character whose hexadecimal value is NN
\ddd    The character whose octal value is ddd
\c      Suppresses the trailing newline
\e      Escape (the ESC key)
\'      Single quotation mark; inside double quotes a single quote can be written as-is
\"      Double quotation mark
\\      Backslash
\?      Question mark
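
As a rough illustration of the escape sequences listed above, the sketch below passes a few of them to printf, which interprets backslash escapes in its format string. It assumes Bash's built-in printf; the \x form in particular is not guaranteed by every shell.

```shell
#!/usr/bin/env bash
# Minimal sketch: a few backslash escapes as interpreted by Bash's printf.
tabbed=$(printf 'name\tvalue')    # \t inserts a horizontal tab
two_lines=$(printf 'one\ntwo')    # \n starts a new line
hex_char=$(printf '\x41')         # \xNN: hexadecimal 41 is the letter A
octal_char=$(printf '\101')       # \ddd: octal 101 is also the letter A
backslash=$(printf '\\')          # \\ produces one literal backslash

printf '%s\n' "$tabbed" "$two_lines" "$hex_char" "$octal_char" "$backslash"
```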


========================================================================================
# What is a regular expression? #
1: A regular expression is a set of rules and methods, built from special symbols, for processing large numbers of strings
2: With these special symbols a system administrator can quickly filter, replace, and output the strings they need. On Linux, regular expressions are generally applied one line at a time
In short:
A set of rules and methods defined for processing large numbers of strings
Linux regular expressions generally work line by line
# Shell commands and tools that commonly use regular expressions: the Linux Three Musketeers (grep, sed, awk)
# grep, vi, and sed belong to the BRE (Basic Regular Expression) family; to stay compatible with ERE, the extended symbols can be written there with a \ in front of them
# egrep and awk belong to the ERE (Extended Regular Expression) family
** grep supports: BREs, EREs, PREs
** egrep supports: BREs, EREs
grep without an option uses BREs; egrep without an option uses EREs
grep -E uses EREs; egrep -P uses PREs
grep -P uses PREs
grep -G uses BREs
grep -F treats the pattern as a fixed string, with no regular-expression metacharacters
** sed (works line by line): BRE by default; sed -r turns on extended regular expressions (ERE)
** awk (gawk) (works on columns): extended regular expressions (EREs) by default
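
To make the difference between the flavors concrete, here is a minimal sketch that runs the same pattern through several of the tools above. It assumes GNU grep, GNU sed, and any POSIX awk; the sample words are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Minimal sketch: one pattern, several regex flavors. Sample data is hypothetical.
sample=$(mktemp)
printf '%s\n' 'good' 'god' 'go+d' > "$sample"

bre=$(grep 'go+d' "$sample")             # BRE: + is an ordinary character
ere=$(grep -E 'go+d' "$sample")          # ERE: + means one or more o
fixed=$(grep -F 'go+d' "$sample")        # fixed string: nothing is special
sed_bre=$(sed -n '/go+d/p' "$sample")    # sed defaults to BRE
sed_ere=$(sed -n -r '/go+d/p' "$sample") # sed -r switches to ERE (GNU sed)
awk_ere=$(awk '/go+d/' "$sample")        # awk uses ERE by default

printf 'grep (BRE): %s\n' "$bre"
printf 'grep -E:    %s\n' "$(echo $ere)"
printf 'grep -F:    %s\n' "$fixed"
printf 'sed:        %s\n' "$sed_bre"
printf 'sed -r:     %s\n' "$(echo $sed_ere)"
printf 'awk:        %s\n' "$(echo $awk_ere)"

rm -f "$sample"
```

In BRE mode only the literal line go+d matches, while the ERE tools match good and god and skip the literal line.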

#****** Basic regular expressions (BRE, Basic Regular Expression)
^word     matches at the start of a line: finds lines beginning with word (for example, script comment lines that begin with #: grep '^#' cdly.txt)
word$     matches at the end of a line: finds lines ending with word (for example, lines ending with '.': grep '\.$' cdly.txt)
^$        matches an empty line
*         repeats the preceding character zero or more times (for example, to match ggle, gogle, google, gooogle and so on: grep -n 'go*gle' cdly.txt)
.         matches any single character (for example, "cd.y" matches cdly, cdqy, cdmy)
.*        matches any run of characters, including none (for example, ggle, gogle, google, gooogle and so on: grep -n 'g.*gle' cdly.txt)
[a-z]     matches any one character from the bracketed set (for example, to match gl or gf: grep 'g[lf]' cdly.txt)
[0-9]     matches any one character from the given range (for example, to match digits: grep '[0-9]' cdly.txt)
[^a-z]    matches any one character other than a-z (for example, g followed by a character that is not c: grep 'g[^c]' cdly.txt)
\         escape character: removes the special meaning of the next character (for example, * has a special meaning in a regular expression and must be escaped to be searched for literally: grep '\*' cdly.txt)
\b        matches at a word boundary, that is, at the start or the end of a word. In awk, \b means backspace, so awk (gawk) uses \y for this feature
\B        matches at any position that is not a word boundary
The \d, \D, \t, \v, \n, \f, \r, \cx and \xn escapes below come from Perl syntax; GNU grep accepts them only with -P
\d        matches any digit 0-9, equivalent to [0-9]
\D        matches any non-digit, equivalent to [^0-9]
\w        matches any word character, equivalent to [[:alnum:]_] or [A-Za-z0-9_]
\W        matches any non-word character, equivalent to [^[:alnum:]_] or [^A-Za-z0-9_]
\s        matches any whitespace character, including spaces, tabs, form feeds, and so on, equivalent to [ \f\n\r\t\v]
\S        matches any non-whitespace character, equivalent to [^ \f\n\r\t\v]
\t        matches a horizontal tab, equivalent to \x09 or \cI
\v        matches a vertical tab, equivalent to \x0b or \cK
\n        matches a line feed, equivalent to \x0a or \cJ
\f        matches a form feed, equivalent to \x0c or \cL
\r        matches a carriage return, equivalent to \x0d or \cM
\\        matches the escape character \ itself
\cx       matches the control character named by x; for example, \cM matches Control-M, a carriage return. x must be one of A-Z or a-z; otherwise the c is treated as a literal 'c'
\xn       matches n, where n is a hexadecimal escape value exactly two digits long; for example, '\x41' matches 'A', and '\x041' is equivalent to '\x04' followed by '1'. This lets ASCII codes be used in a regular expression
\num      matches num, where num is a positive integer: a back-reference to the text captured by the num-th group
\`        matches the start of the buffer (Emacs syntax), similar to ^
\'        matches the end of the buffer (Emacs syntax), similar to $
\<        matches at the start of a word (for example, words that begin with g: grep '\<g' cdly.txt)
\>        matches at the end of a word (for example, words that end with g: grep 'g\>' cdly.txt)
\<...\>   anchors both the start and the end of a word, which is equivalent to matching a whole word
\{n,m\}   the preceding character appears at least n and at most m times (for example, to match google or gooogle: grep 'go\{2,3\}gle' cdly.txt)
\{n\}     the preceding character appears exactly n times
\{n,\}    the preceding character appears at least n times (for example, o appearing at least twice: grep 'go\{2,\}gle' cdly.txt)
\{,m\}    the preceding character appears at most m times; some versions do not support this form, in which case use \{0,m\}
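
A few of the basic expressions above can be checked against a throwaway file, as in this minimal sketch. It assumes GNU grep; the sample lines are invented, and each variable simply captures what grep prints.

```shell
#!/usr/bin/env bash
# Minimal sketch: anchors, *, intervals and escaping in BRE mode.
sample=$(mktemp)
printf '%s\n' '# comment' 'ggle' 'gogle' 'google' 'gooogle' 'end.' '' 'a*b' > "$sample"

comment_lines=$(grep -c '^#' "$sample")     # lines starting with #
dot_ending=$(grep '\.$' "$sample")          # lines ending with a literal dot
empty_lines=$(grep -c '^$' "$sample")       # empty lines
star_matches=$(grep -c 'go*gle' "$sample")  # zero or more o between g and gle
interval=$(grep 'go\{2,3\}gle' "$sample")   # two or three o
literal_star=$(grep '\*' "$sample")         # escaped *, matched literally

echo "comment lines: $comment_lines, empty lines: $empty_lines"
echo "lines matching go*gle: $star_matches"
echo "lines ending in a dot: $dot_ending"
echo "two or three o:" $interval
echo "literal star: $literal_star"

rm -f "$sample"
```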


#****** Extended regular expressions (ERE, Extended Regular Expression)
?         the preceding character appears 0 or 1 times (for example, to match gd or god: grep -E 'go?d' cdly.txt or grep 'go\?d' cdly.txt)
+         the preceding character appears 1 or more times (for example, to match god, good, goood, etc.: grep -E 'go+d' cdly.txt or grep 'go\+d' cdly.txt)
()        groups the enclosed expression so that it is treated as a single unit (for example, to find good or glad: grep -E 'g(oo|la)d' cdly.txt)
|         means "or", matching any one of several alternatives (for example, to match god or good: grep -E 'god|good' cdly.txt or grep 'god\|good' cdly.txt)
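
The four extended operators above can be compared side by side with a sketch such as the one below. It assumes GNU grep, whose BRE mode also accepts the backslashed form \| used in the last command; the word list is made up.

```shell
#!/usr/bin/env bash
# Minimal sketch: ?, +, () and | with grep -E. Sample words are hypothetical.
sample=$(mktemp)
printf '%s\n' 'gd' 'god' 'good' 'goood' 'glad' > "$sample"

optional=$(grep -E 'go?d' "$sample")        # zero or one o
one_or_more=$(grep -E 'go+d' "$sample")     # one or more o
grouped=$(grep -E 'g(oo|la)d' "$sample")    # group with alternatives
either=$(grep -E 'god|good' "$sample")      # either whole alternative
bre_either=$(grep 'god\|good' "$sample")    # same search in GNU BRE syntax

echo "go?d      :" $optional
echo "go+d      :" $one_or_more
echo "g(oo|la)d :" $grouped
echo "god|good  :" $either

rm -f "$sample"
```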


# POSIX character classes (usable with BRE, ERE, and PRE); they must be written inside double brackets, as in [[:alnum:]]
# Class       Meaning
[:alnum:]     uppercase and lowercase English letters and digits, i.e. [A-Za-z0-9]
[:alpha:]     any English letter, upper or lower case, i.e. [A-Za-z]
[:lower:]     lowercase letters, i.e. [a-z]
[:upper:]     uppercase letters, i.e. [A-Z]
[:digit:]     digits, i.e. [0-9]
[:blank:]     the space character or the [Tab] character
[:cntrl:]     control characters, such as CR, LF, Tab, Del, etc.
[:graph:]     every visible character, that is, all characters other than the space and [Tab]
[:print:]     any printable character
[:punct:]     punctuation symbols, such as " ' ? ! ; : # $ ...
[:space:]     any whitespace character, including the space, [Tab], CR, and so on
[:xdigit:]    hexadecimal digits: 0-9, A-F, a-f
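
The character classes above might be exercised as follows. This is a minimal sketch assuming GNU grep and the C locale; the sample lines are invented, and note that every class is written inside a second pair of brackets.

```shell
#!/usr/bin/env bash
# Minimal sketch: POSIX character classes inside bracket expressions.
export LC_ALL=C
sample=$(mktemp)
printf '%s\n' 'abc' 'ABC' '123' 'a1' 'x y' '#!' > "$sample"

digits_only=$(grep -E '^[[:digit:]]+$' "$sample")   # lines made only of digits
has_upper=$(grep -c '[[:upper:]]' "$sample")        # lines with an uppercase letter
has_space=$(grep '[[:space:]]' "$sample")           # lines containing whitespace
punct_only=$(grep -E '^[[:punct:]]+$' "$sample")    # lines made only of punctuation
has_alnum=$(grep -c '[[:alnum:]]' "$sample")        # lines with a letter or digit

echo "digits only: $digits_only"
echo "lines with uppercase: $has_upper"
echo "line with whitespace: $has_space"
echo "punctuation only: $punct_only"
echo "lines with a letter or digit: $has_alnum"

rm -f "$sample"
```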

========================================================================================
**grep command

# Function: searches each line of the input files for a string; enclose the search string in single quotation marks
# Purpose: searches with full regular expressions and prints the lines found
# Basic usage: grep [-acinv] [--color=auto] [-A n] [-B n] 'search string' filename
# Options:
-a: treat binary files as text (same as --binary-files=text)
-b: print the byte offset of each matching line
-c: print the number of matching lines
-i: ignore case
-n: prefix each output line with its line number
-o: print only the matched part of each line
-E: interpret the pattern as an extended regular expression
-f: read the patterns from a file: write the grep expressions to a file, then reference it with -f
-A num: "after"; also print the num lines that follow each matching line
-B num: "before"; also print the num lines that precede each matching line
-C num: also print the num lines before and after each matching line
-v: invert the match: matching lines are hidden and only non-matching lines are printed
-l: print only the names of files that contain a match
-L: print only the names of files that contain no match
-h: suppress file names in the output
-q: quiet: print nothing and only return the match status; check it with echo $? (0 means a matching line was found, non-zero means none was found)
-r: search recursively through subdirectories
-w: whole-word search: lists the lines in which the pattern matches a whole word, equivalent to \<...\>
--color: highlight the matched text in color (alias grep='grep --color=auto', preferably added to .bashrc or .bash_profile)
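
Several of the options above are easiest to understand by comparing their output on the same input. The following is a minimal sketch assuming GNU grep; the four sample lines are hypothetical.

```shell
#!/usr/bin/env bash
# Minimal sketch: common grep options applied to one small file.
sample=$(mktemp)
printf '%s\n' 'north' 'North wind' 'northern' 'south' > "$sample"

count=$(grep -c 'north' "$sample")           # number of matching lines
count_nocase=$(grep -c -i 'north' "$sample") # same, ignoring case
numbered=$(grep -n 'south' "$sample")        # line number before the line
offset=$(grep -b 'south' "$sample")          # byte offset before the line
only=$(grep -o 'north' "$sample")            # only the matched text
inverted=$(grep -v -i 'north' "$sample")     # lines that do not match
whole_word=$(grep -w 'north' "$sample")      # whole-word matches only

echo "-c: $count   -c -i: $count_nocase"
echo "-n: $numbered   -b: $offset"
echo "-o:" $only
echo "-v -i: $inverted   -w: $whole_word"

rm -f "$sample"
```

Here -w leaves out northern because the match must be a complete word, while -o prints north once for each of the two lines that contain it.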

To make grep testing easier, define the following aliases so that matches are shown in red:
[/cdly/tmp]# cat >> ~/.bashrc <<EOF
alias egrep='egrep --color=auto'
alias grep='grep --color=auto'
EOF

[/cdly/tmp]# zcat aa.tar.gz | grep --binary-files=text 'one'   # filter the required content out of the files in the archive
[/cdly/tmp]# zcat aa.tar.gz | grep -a 'one'   # same filter, using the short option

[/cdly/tmp]# grep -c '^$' file   # count the empty lines
[/cdly/tmp]# grep -c '^ *$' file   # count the lines that are empty or contain only spaces
[/cdly/tmp]# grep NW file   # print all lines containing the regular expression NW
[/cdly/tmp]# grep NW d*   # print the lines containing NW in every file whose name begins with d
[/cdly/tmp]# grep '^n' file   # print all lines beginning with n; ^ anchors the start of the line
[/cdly/tmp]# grep '4$' file   # print all lines ending with 4; $ anchors the end of the line
[/cdly/tmp]# grep TB Savage file   # search the files Savage and file for lines containing TB
[/cdly/tmp]# grep 'TB Savage' file   # print all lines containing TB Savage
[/cdly/tmp]# grep '5\..' file   # a 5, followed by a literal dot, and then any one character
[/cdly/tmp]# grep '\.5' file   # print all lines containing the string ".5"
[/cdly/tmp]# grep '^[we]' file   # print all lines beginning with w or e
[/cdly/tmp]# grep '[^0-9]' file   # [^...] stands for any character outside the bracketed range, so this prints lines containing a non-digit
[/cdly/tmp]# grep '[A-Z][A-Z] [A-Z]' file   # print all lines containing two uppercase letters, then a space, then another uppercase letter, for example TB Savage and AM Main
[/cdly/tmp]# grep 'ss* ' file   # print all lines containing one or more s followed by a space, for example Charles and Dalsass when a space follows them
[/cdly/tmp]# grep '[a-z]\{9\}' file   # print all lines containing a run of at least 9 consecutive lowercase letters
[/cdly/tmp]# grep '\(3\)\.[0-9].*\1 *\1' file   # a 3, then a period, then any digit, then any number of arbitrary characters, then a 3, then any number of spaces, then another 3. Because the 3 is inside a pair of parentheses, it can be referenced later as \1
# Example matches: 3.2cdly3 3 or 3.5aaa3 3
[/cdly/tmp]# grep '\<north' file   # all lines containing a word that starts with north
[/cdly/tmp]# grep '\bnorth\b' file   # all lines containing north as a whole word
[/cdly/tmp]# grep '^n\w*\W' file   # the first character is n, followed by any number of letters, digits, or underscores, then one non-word character; \w and \W are the standard word-matching escapes
[/cdly/tmp]# grep '\<[a-z].*n\>' file   # a word starting with a lowercase letter, then any number of arbitrary characters, ending with an n at the end of a word. Note that .* stands for any characters, including spaces
[/cdly/tmp]# ls -l | grep '^[^d]'   # drop the lines that begin with d, which removes all directories from the listing
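
The back-reference and word-anchor examples above can be verified with a small sketch like this one. It assumes GNU grep (for \< and \b); the sample lines reuse the strings from the examples plus one invented non-matching line.

```shell
#!/usr/bin/env bash
# Minimal sketch: back-references and word anchors in BRE mode.
sample=$(mktemp)
printf '%s\n' '3.2cdly3 3' '3.5aaa3 3' '3.x3 3' 'north' 'northern' > "$sample"

# \(3\) captures the 3; each \1 must then match that same text again.
backref=$(grep '\(3\)\.[0-9].*\1 *\1' "$sample")
# \< anchors only the start of a word, so northern also matches.
word_start=$(grep -c '\<north' "$sample")
# \b on both sides requires north to be a complete word.
whole_word=$(grep -c '\bnorth\b' "$sample")

echo "back-reference matches:"
echo "$backref"
echo "words starting with north: $word_start"
echo "north as a whole word:     $whole_word"

rm -f "$sample"
```

The line 3.x3 3 is skipped because the character after the period is not a digit.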

Example: the file contents are as follows
[/cdly/tmp]# seq -w 10 > file; cat file
01
02
03
04
05
06
07
08
09
10

[/cdly/tmp]# grep -n -A 2 '03' file   # -A 2 shows the matching line together with the two lines after it
3:03
4-04
5-05
[/cdly/tmp]# grep -n -B 2 '03' file   # -B 2 shows the matching line together with the two lines before it
1-01
2-02
3:03
[/cdly/tmp]# grep -n -C 2 '03' file   # -C 2 shows the matching line together with the two lines before and after it
1-01
2-02
3:03
4-04
5-05

[/cdly/grep]# echo 1111 > a1.txt
[/cdly/grep]# echo -e "aaa\n222" > a2.txt
[/cdly/grep]# grep -L '^[a-z]' *   # -L shows the names of files with no matching content
a1.txt
[/cdly/grep]# grep -l '^[a-z]' *   # -l shows the names of files with matching content
a2.txt
[/cdly/grep]# grep -q '08' file; echo $?   # a match was found, so status 0 is returned
0
[/cdly/grep]# grep -q '088' file; echo $?   # no match was found, so a non-zero status is returned
1
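
Because -q communicates only through the exit status, it is mostly used inside a conditional. The sketch below shows one possible way to do that; the helper name has_line is made up for this example, and seq is assumed to be available as in the example above.

```shell
#!/usr/bin/env bash
# Minimal sketch: using grep -q as a condition.
sample=$(mktemp)
seq -w 10 > "$sample"            # lines 01 through 10

# Hypothetical helper: succeeds when the pattern matches any line of the file.
has_line() {
    grep -q -- "$1" "$2"
}

if has_line '08' "$sample"; then found_08=yes; else found_08=no; fi
if has_line '088' "$sample"; then found_088=yes; else found_088=no; fi

echo "08 found:  $found_08"
echo "088 found: $found_088"

rm -f "$sample"
```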
