Learn a Linux command every day: the grep and wc commands (6/25)

Source: Internet
Author: User
Tags: character classes, control characters, posix, svn, egrep

The grep command on Linux is a powerful text search tool that uses regular expressions to search text and print the matching lines. The name grep stands for Global Regular Expression Print, and the command is available to all users.

wc

1. Command format:

wc [options] file ...

2. Command function:

Counts the bytes, words, and lines in the specified files and prints the result. If no file name is given, wc reads from standard input. When several files are specified, wc also prints a total for all of them.

3. Command parameters:

-c # Count bytes.
-l # Count lines.
-m # Count characters. This flag cannot be used together with -c.
-w # Count words. A word is a string delimited by spaces, tabs, or newlines.
-L # Print the length of the longest line.
--help # Display help information.
--version # Display version information.

Example: count the number of lines in file a that contain Hello:

grep Hello a | wc -l

Count the number of occurrences of Hello in file a:

grep -o Hello a | wc -l
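The difference between counting matching lines and counting occurrences is easy to miss, so here is a minimal sketch. The sample file content is made up for illustration; only the file name a comes from the text above.

```shell
# Hypothetical sample data, written to a temporary directory.
tmpdir=$(mktemp -d)
printf 'Hello world\nHello Hello again\nno greeting here\n' > "$tmpdir/a"

# Lines that contain Hello: each matching line is counted once.
line_count=$(grep Hello "$tmpdir/a" | wc -l | tr -d ' ')

# Occurrences of Hello: -o prints every match on its own line.
match_count=$(grep -o Hello "$tmpdir/a" | wc -l | tr -d ' ')

# wc can also count words (-w) and bytes (-c); reading from stdin
# keeps the file name out of the output.
word_count=$(wc -w < "$tmpdir/a" | tr -d ' ')
byte_count=$(wc -c < "$tmpdir/a" | tr -d ' ')

echo "lines containing Hello: $line_count"
echo "occurrences of Hello:   $match_count"
echo "words: $word_count, bytes: $byte_count"

rm -rf "$tmpdir"
```

The tr -d ' ' step only strips the padding that some wc implementations add in front of the number.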

On the command line, grep and wc can be connected with |, so that the output of the first command becomes the input of the second. This | is called a pipe.


1. A pipe only passes along the standard output of the previous command; it does not handle its error output.

2. The command on the right side of the pipe must be able to read from standard input.

cat test1.sh test.sh 2>/dev/null | grep -n 'good'
1:echo very good
2:echo good
5:echo good
# The "file not found" error for test1.sh is redirected to /dev/null; the normal output is sent to grep through the pipe.

> is redirection

| is a pipe

The differences are:

1. With |, the command on the left must produce standard output and the command on the right must accept standard input.
With >, the command on the left must produce standard output and the right side can only be a file.
With <, the command on the left must read standard input and the right side can only be a file.

2. A pipe starts two child processes to run the programs on both sides of |, whereas a redirection is carried out within a single process.
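To see that a pipe carries only standard output, the following sketch compares a plain pipe with one where standard error is merged in first. The file names missing.sh and test.sh and their content are assumptions made for the demonstration.

```shell
tmpdir=$(mktemp -d)
printf 'echo very good\necho bad\n' > "$tmpdir/test.sh"

# missing.sh does not exist, so cat writes an error message to stderr.
# The error never enters the pipe (here it is discarded via /dev/null);
# grep only sees the content of test.sh on its standard input.
piped=$(cat "$tmpdir/missing.sh" "$tmpdir/test.sh" 2>/dev/null | grep -n 'good')

# 2>&1 merges stderr into stdout before the pipe, so now grep
# can see the error text as well.
merged=$(cat "$tmpdir/missing.sh" 2>&1 | grep -c 'missing.sh')

echo "$piped"
echo "error lines seen by grep after 2>&1: $merged"

rm -rf "$tmpdir"
```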

grep is useful in shell scripts because it reports the result of a search through its exit status: 0 if the pattern was found, 1 if it was not found, and 2 if the file being searched does not exist. These return values can be used to automate text-processing tasks.
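A minimal sketch of the three exit statuses, and of how a script might branch on them. The file words.txt and its content are hypothetical.

```shell
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/words.txt"

# -q suppresses output; only the exit status is of interest here.
grep -q 'alpha' "$tmpdir/words.txt"
found=$?        # pattern present

grep -q 'gamma' "$tmpdir/words.txt"
not_found=$?    # pattern absent

grep -q 'alpha' "$tmpdir/no_such_file" 2>/dev/null
missing=$?      # file does not exist

echo "found=$found not_found=$not_found missing=$missing"

# In a script the status is usually tested directly with if.
if grep -q 'alpha' "$tmpdir/words.txt"; then
    message="alpha is present"
else
    message="alpha is absent"
fi
echo "$message"

rm -rf "$tmpdir"
```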

1. Command format:

grep [options] pattern file

2. Command function:

Filters or searches for specific strings. grep supports regular expressions and can be combined with many other commands, which makes it very flexible.

3. Command parameters:

-a --text # Do not ignore binary data.
-A <number of lines> --after-context=<number of lines> # In addition to the matching line, also show the given number of lines after it.
-b --byte-offset # Before each matching line, show the byte offset of the first character of that line.
-B <number of lines> --before-context=<number of lines> # In addition to the matching line, also show the given number of lines before it.
-c --count # Count the number of matching lines.
-C <number of lines> --context=<number of lines> or -<number of lines> # In addition to the matching line, also show the given number of lines before and after it.
-d <action> --directories=<action> # Needed when the target of the search is a directory rather than a file; otherwise grep reports a message and stops.
-e <pattern> --regexp=<pattern> # Specify the string to use as the search pattern.
-E --extended-regexp # Interpret the pattern as an extended regular expression.
-f <pattern file> --file=<pattern file> # Read one or more patterns from a file, one pattern per line, and find the lines that match any of them.
-F --fixed-strings # Treat the pattern as a list of fixed strings.
-G --basic-regexp # Interpret the pattern as a basic regular expression.
-h --no-filename # Do not prefix matching lines with the name of the file they belong to.
-H --with-filename # Prefix matching lines with the name of the file they belong to.
-i --ignore-case # Ignore differences between upper and lower case.
-l --files-with-matches # List the names of files whose content matches the pattern.
-L --files-without-match # List the names of files whose content does not match the pattern.
-n --line-number # Prefix each matching line with its line number.
-q --quiet or --silent # Do not print anything.
-r --recursive # Same effect as "-d recurse".
-s --no-messages # Do not print error messages.
-v --invert-match # Show all lines that do not contain the matching text.
-V --version # Show version information.
-w --word-regexp # Show only lines where the pattern matches a whole word.
-x --line-regexp # Show only lines where the pattern matches the whole line.
-y # Same effect as "-i".
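A few of the options above in action, as a rough sketch. The file fruits.txt and its content are invented for the demonstration.

```shell
tmpdir=$(mktemp -d)
printf 'apple\nBanana\ncherry\napple pie\ndate\n' > "$tmpdir/fruits.txt"

# -c: count matching lines instead of printing them.
count=$(grep -c 'apple' "$tmpdir/fruits.txt")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'apple' "$tmpdir/fruits.txt")

# -v with -c: count the lines that do NOT match.
inverted=$(grep -vc 'apple' "$tmpdir/fruits.txt")

# -i: ignore case, so the lowercase pattern finds "Banana".
nocase=$(grep -i 'banana' "$tmpdir/fruits.txt")

# -A 1: also print one line after each match.
after=$(grep -A 1 'Banana' "$tmpdir/fruits.txt")

echo "count=$count inverted=$inverted"
echo "$numbered"
echo "$nocase"
echo "$after"

rm -rf "$tmpdir"
```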

4. Regular expressions:

Regular expressions for grep:

^ # Anchors the start of a line. For example, '^grep' matches all lines that begin with grep.
$ # Anchors the end of a line. For example, 'grep$' matches all lines that end with grep.
. # Matches any single character other than a newline. For example, 'gr.p' matches gr followed by any one character, followed by p.
* # Matches zero or more of the preceding character. For example, ' *grep' matches all lines with zero or more spaces followed by grep.
.* # Used together, these match any sequence of characters.
[] # Matches one character from the specified set. For example, '[Gg]rep' matches Grep and grep.
[^] # Matches one character that is not in the specified set. For example, '[^a-fh-z]rep' matches lines containing a character outside a-f and h-z immediately followed by rep.
\(..\) # Marks the matched characters. For example, in '\(love\)' the string love is marked as group 1 and can be referenced as \1.
\< # Anchors the start of a word. For example, '\<grep' matches lines containing a word that begins with grep.
\> # Anchors the end of a word. For example, 'grep\>' matches lines containing a word that ends with grep.
x\{m\} # Repeats the character x exactly m times. For example, 'o\{5\}' matches lines containing 5 consecutive o characters.
x\{m,\} # Repeats the character x at least m times. For example, 'o\{5,\}' matches lines containing at least 5 consecutive o characters.
x\{m,n\} # Repeats the character x at least m and at most n times. For example, 'o\{5,10\}' matches lines containing 5 to 10 consecutive o characters.
\w # Matches a letter or digit character, that is, [A-Za-z0-9]. For example, 'G\w*p' matches G followed by zero or more letter or digit characters, followed by p.
\W # The inverse of \w: matches a non-word character, such as a period.
\b # Word boundary. For example, '\bgrep\b' matches only the whole word grep.
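The metacharacters above can be tried out on a small sample. This is only a sketch: the file sample.txt and its lines are made up, and the counts in the comments hold for this data only.

```shell
tmpdir=$(mktemp -d)
printf 'grep\nGrep\ngreat grep\ngoooooal\ngood\nlove is love\n' > "$tmpdir/sample.txt"

# ^ anchors the start of the line: only the line "grep" starts with grep.
starts=$(grep -c '^grep' "$tmpdir/sample.txt")

# $ anchors the end of the line: "grep" and "great grep" end with grep.
ends=$(grep -c 'grep$' "$tmpdir/sample.txt")

# A bracket expression matches one character from the set.
either_case=$(grep -c '[Gg]rep' "$tmpdir/sample.txt")

# \{5,\} requires at least five consecutive occurrences of o.
long_run=$(grep 'o\{5,\}' "$tmpdir/sample.txt")

# \(love\) marks a group; \1 requires the same text to appear again.
backref=$(grep '\(love\).*\1' "$tmpdir/sample.txt")

echo "starts=$starts ends=$ends either_case=$either_case"
echo "$long_run"
echo "$backref"

rm -rf "$tmpdir"
```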

POSIX character classes:

POSIX (the Portable Operating System Interface) adds special character classes so that matching stays consistent across the character encodings used in different countries. For example, [:alnum:] is another notation for A-Za-z0-9. To use them in a regular expression they must be placed inside [], as in [A-Za-z0-9] or [[:alnum:]]. Under Linux, grep supports POSIX character classes; fgrep does not.

[:alnum:] # Alphanumeric characters
[:alpha:] # Alphabetic characters
[:digit:] # Digit characters
[:graph:] # Visible characters (excluding spaces and control characters)
[:lower:] # Lowercase characters
[:cntrl:] # Control characters
[:print:] # Printable characters (visible characters, including the space)
[:punct:] # Punctuation characters
[:space:] # All whitespace characters (newline, space, tab)
[:upper:] # Uppercase characters
[:xdigit:] # Hexadecimal digits (0-9, a-f, A-F)
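As an illustration of the double-bracket form, here is a small sketch that uses several POSIX classes. The sample lines are hypothetical, and LC_ALL=C is set only to keep the result independent of the local language settings.

```shell
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)
printf 'abc\n123\nabc123\n   \nHELLO!\n' > "$tmpdir/classes.txt"

# Lines made up of digits only.
digits_only=$(grep -c '^[[:digit:]]\{1,\}$' "$tmpdir/classes.txt")

# Lines made up of letters and digits only.
alnum_only=$(grep -c '^[[:alnum:]]*$' "$tmpdir/classes.txt")

# Lines containing at least one uppercase letter.
has_upper=$(grep '[[:upper:]]' "$tmpdir/classes.txt")

# Lines containing punctuation.
has_punct=$(grep -c '[[:punct:]]' "$tmpdir/classes.txt")

# Lines consisting of whitespace only.
blank=$(grep -c '^[[:space:]]\{1,\}$' "$tmpdir/classes.txt")

echo "digits_only=$digits_only alnum_only=$alnum_only"
echo "has_upper=$has_upper has_punct=$has_punct blank=$blank"

rm -rf "$tmpdir"
```

Note that the class name alone, as in '[:digit:]', is read as an ordinary bracket expression containing the characters :, d, i, g and t, which is why the outer brackets are required.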

5. Usage examples:

Example 1: Find a specified process

Command: ps -ef | grep svn

Note: The first record is the process that was found; the second is the grep process itself, not a process you are actually looking for.

Example 2: Count the specified processes

Command:

ps -ef | grep svn -c

ps -ef | grep -c svn

Example 3: Search using keywords read from a file

Command: cat test.txt | grep -f test2.txt

Prints the lines of test.txt that contain any of the keywords read from test2.txt.

Example 4: Search using keywords read from a file and show line numbers

Command: cat test.txt | grep -nf test2.txt

Prints the lines of test.txt that contain any of the keywords read from test2.txt, each prefixed with its line number.
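The keyword-file search can be reproduced with two small files. The names test.txt and test2.txt follow the examples above; their content is an assumption for this sketch.

```shell
tmpdir=$(mktemp -d)
printf 'linux is free\nwindows\nredhat linux\nunix\n' > "$tmpdir/test.txt"
# One keyword (pattern) per line.
printf 'linux\nunix\n' > "$tmpdir/test2.txt"

# -f reads the patterns from test2.txt; -n adds line numbers.
# grep can read test.txt directly, so cat is not strictly needed.
matches=$(grep -nf "$tmpdir/test2.txt" "$tmpdir/test.txt")
match_count=$(grep -cf "$tmpdir/test2.txt" "$tmpdir/test.txt")
first_match=$(printf '%s\n' "$matches" | head -n 1)

echo "$matches"
echo "matching lines: $match_count"

rm -rf "$tmpdir"
```

One thing to watch for: an empty line in the keyword file acts as an empty pattern, which matches every line.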

Example 5: Find a keyword in a file

Command: grep 'linux' test.txt

Example 6: Find a keyword in multiple files

Command:

grep 'linux' test.txt test2.txt

With multiple files, each matching line is printed with the name of its file at the start of the line, followed by ":" as a separator.

Example 7: Keep grep from showing its own process

Command:

ps aux | grep '[s]sh'

ps aux | grep ssh | grep -v "grep"
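Why the [s]sh form hides grep itself can be shown without a live process list. The two listings below are simplified, hypothetical stand-ins for ps output: the second line of each is what the grep process would look like in the listing.

```shell
# What ps would show while "grep ssh" is running.
listing_plain='root 101 /usr/sbin/sshd
user 202 grep ssh'

# What ps would show while "grep [s]sh" is running.
listing_bracket='root 101 /usr/sbin/sshd
user 202 grep [s]sh'

# The plain pattern also matches grep's own command line.
naive=$(printf '%s\n' "$listing_plain" | grep -c 'ssh')

# The regex [s]sh still matches the text "ssh", but grep's own command
# line now contains the literal text "[s]sh", which has no "ssh" in it.
trick=$(printf '%s\n' "$listing_bracket" | grep -c '[s]sh')

# The alternative from the text: filter out lines containing grep.
filtered=$(printf '%s\n' "$listing_plain" | grep 'ssh' | grep -vc 'grep')

echo "naive=$naive trick=$trick filtered=$filtered"
```

The grep -v "grep" variant is simpler to read, but it would also drop any real process whose command line happens to contain the word grep.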

Example 8: Find lines beginning with u

Command: cat test.txt | grep '^u'

Example 9: Print lines that do not begin with u

Command: cat test.txt | grep '^[^u]'

Example 10: Print lines ending in hat

Command: cat test.txt | grep 'hat$'

Example 11: Print the IP address

Command: ifconfig eth0 | grep -E "([0-9]{1,3}\.){3}[0-9]"
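Since ifconfig output differs between systems, the sketch below applies a similar pattern to a fixed, made-up sample instead. It also adds -o, which is not used in the example above, to print only the matched addresses rather than the whole line.

```shell
# Hypothetical text resembling one line of ifconfig output.
sample='inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0
link encap:Ethernet'

# Three groups of "1-3 digits plus a dot", followed by 1-3 digits.
# -E enables extended syntax so {1,3} and ( ) need no backslashes;
# -o prints each match on its own line.
ips=$(printf '%s\n' "$sample" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}')
ip_count=$(printf '%s\n' "$ips" | wc -l | tr -d ' ')
first_ip=$(printf '%s\n' "$ips" | head -n 1)

echo "$ips"
echo "addresses found: $ip_count"
```

The pattern only checks the shape of an address. It does not verify that each number is in the range 0 to 255, so a string such as 999.1.1.1 would also match.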

Example 12: Show lines containing ed or at

Command: cat test.txt | grep -E "ed|at"

Example 13: Show all lines, in the files ending in .txt in the current directory, that contain a string of at least 7 consecutive lowercase letters

Command: grep '[a-z]\{7\}' *.txt

Example 14: A log file is too large to read comfortably, and we want to extract only what interests us, or only one kind of data, such as the entries without a 404 status

Command: grep '.' access1.log | grep -Ev '404' > access2.log

grep '.' access1.log | grep -Ev '(404|/photo/|/css/)' > access2.log

grep '.' access1.log | grep -E '404' > access2.log

Output: # grep "." access1.log | grep -Ev "404" > access2.log

Note: The first two of the three commands above read access1.log in the current directory, select the lines that do not contain 404 (the second also excludes /photo/ and /css/), and write them to access2.log. The third drops the v, so the lines that do contain 404 are written to access2.log.
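The three log-filtering commands can be checked against a tiny log. The log lines below are invented, and the results are counted rather than written to access2.log to keep the sketch short.

```shell
tmpdir=$(mktemp -d)
# Hypothetical access log with one empty line in the middle.
printf '%s\n' \
    'GET /index.html 200' \
    'GET /missing 404' \
    '' \
    'GET /photo/a.jpg 200' \
    'GET /css/site.css 200' > "$tmpdir/access1.log"

# grep '.' keeps only lines with at least one character (drops empty lines).
non_empty=$(grep -c '.' "$tmpdir/access1.log")

# -v inverts the match: keep the lines without 404.
kept_no_404=$(grep '.' "$tmpdir/access1.log" | grep -Evc '404')

# With -E, ( | ) lists alternatives; -v excludes lines matching any of them.
kept_strict=$(grep '.' "$tmpdir/access1.log" | grep -Evc '(404|/photo/|/css/)')

# Without -v: keep only the 404 lines.
only_404=$(grep '.' "$tmpdir/access1.log" | grep -Ec '404')

echo "non_empty=$non_empty kept_no_404=$kept_no_404"
echo "kept_strict=$kept_strict only_404=$only_404"

rm -rf "$tmpdir"
```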

About the OR, AND, and NOT operations of the grep command

The grep command has options equivalent to the OR and NOT operators, but it has no AND operator. However, AND can be simulated with patterns. Here are some examples of how OR, AND, and NOT are used with grep on Linux.

1 OR semantics

grep 'pattern1\|pattern2' filename

2 AND semantics

grep -E 'pattern1.*pattern2' filename

3 NOT semantics

grep -v 'pattern1' filename
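The three forms can be compared on a small sample. The file colors.txt and its lines are hypothetical. The sketch also shows a limitation of the AND form: 'pattern1.*pattern2' only matches when the patterns appear in that order.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'red apple' 'green apple' 'red car' 'blue sky' 'apple is red' \
    > "$tmpdir/colors.txt"

# OR: lines containing red or green.
or_count=$(grep -cE 'red|green' "$tmpdir/colors.txt")

# AND, order-sensitive: red must come before apple on the line.
and_ordered=$(grep -cE 'red.*apple' "$tmpdir/colors.txt")

# AND, either order: list both orders as alternatives ...
and_either=$(grep -cE 'red.*apple|apple.*red' "$tmpdir/colors.txt")

# ... or chain two greps through a pipe.
and_piped=$(grep 'apple' "$tmpdir/colors.txt" | grep -c 'red')

# NOT: lines that do not contain apple.
not_count=$(grep -vc 'apple' "$tmpdir/colors.txt")

echo "or=$or_count and_ordered=$and_ordered and_either=$and_either"
echo "and_piped=$and_piped not=$not_count"

rm -rf "$tmpdir"
```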

Matching lines that meet any of several criteria

egrep -i '^(From|Subject|Date):' mailbox

Using multiple search criteria with grep (OR). Recommended method: the \| notation

# grep 'usrquota\|grpquota' /etc/fstab

Other methods: (1) use multiple -e options

For example:

netstat -an | grep -E "ESTABLISHED|WAIT"

Note that this can also be written as:

netstat -an | grep -e EST -e WAIT

Several -e options used side by side act as an OR condition.

(2) Use the extended option -E

netstat -an | grep -E "ESTABLISHED|WAIT"

The -E is uppercase here, and the pattern must be quoted.

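Here are some interesting and commonly used command-line options:
grep -i pattern files: search case-insensitively (for example: grep -i "hello" ./test.txt). The default is case-sensitive.
grep -l pattern files: list only the names of the matching files.
grep -L pattern files: list the names of the files that do not match.
grep -w pattern files: match whole words only, not parts of a string (for example, match 'magic' but not 'magical').
grep -C number pattern files: show [number] lines of context before and after each match.
grep -E 'pattern1|pattern2' files: show lines that match pattern1 or pattern2.
grep pattern1 files | grep pattern2: show lines that match both pattern1 and pattern2.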

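Here are some special symbols for searching:

\< and \> mark the start and the end of a word, respectively.
For example:
grep man * matches 'batman', 'manic', 'man', and so on;
grep '\<man' * matches 'manic' and 'man', but not 'batman';
grep '\<man\>' * matches only 'man', not 'batman', 'manic', or other strings.
'^': the matched string is at the start of the line.
'$': the matched string is at the end of the line.

The ^ symbol means different things inside and outside a character class (the brackets []). Inside [] it means negation; outside [] it anchors the match to the start of the line.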
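The batman, manic, and man example can be run as follows. The three words are placed in a temporary file, whose name words.txt is an assumption, instead of searching every file in the current directory.

```shell
tmpdir=$(mktemp -d)
printf 'batman\nmanic\nman\n' > "$tmpdir/words.txt"

# Plain pattern: matches man anywhere in the line.
anywhere=$(grep -c 'man' "$tmpdir/words.txt")

# \< requires man to start a word: manic and man, but not batman.
word_start=$(grep '\<man' "$tmpdir/words.txt")

# \< and \> together require a whole word: only man.
whole_word=$(grep '\<man\>' "$tmpdir/words.txt")

# -w is the option form of the same whole-word restriction.
with_w=$(grep -w 'man' "$tmpdir/words.txt")

echo "anywhere=$anywhere"
echo "$word_start"
echo "whole_word=$whole_word with_w=$with_w"

rm -rf "$tmpdir"
```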

The any-character symbol . and the repetition symbol *

In regular expressions these two symbols have the following meanings:

. (period): there must be exactly one arbitrary character. * (asterisk): repeat the preceding character zero or more times. The two can be combined.

Suppose I need to find strings of the form g??d, that is, four characters beginning with g and ending with d. I can do this:

# grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.
16:The world <Happy> is the same with "glad".

Because there must be exactly two characters between g and d, the god on line 13 and the gd on line 14 are not listed.

If I want to find lines that contain a g, then any characters (possibly none), then another g:

# grep -n 'g.*g' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
20:go! go! Let's go.

Because the pattern only requires a leading g and a trailing g, with anything in between, lines 1, 14, and 20 are all acceptable. The RE .*, meaning any characters, is very common.

What if I want to find lines containing any digits? Because only digits are involved, it becomes:

# grep -n '[0-9][0-9]*' regular_express.txt

Limiting the repetition range of an RE character: {}

We can combine . with * to match zero to infinitely many repeated characters, but what if I want to limit the number of repetitions to a range?

For example, how do I find strings with two to five consecutive o characters? This is when the range qualifier {} is used. In grep's basic regular expressions, however, { and } on their own are ordinary characters, so each must be preceded by the escape character \ to act as a qualifier. The syntax is as follows, assuming I want to find strings with two o characters:

# grep -n 'o\{2\}' regular_express.txt
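One detail of the {} qualifier is worth checking by hand: an unanchored 'o\{2\}' also matches longer runs of o, because any run of three or more contains two in a row. The sketch below uses made-up words to show this and how surrounding characters make the limit effective.

```shell
tmpdir=$(mktemp -d)
printf 'god\ngood\ngoood\ngooooood\n' > "$tmpdir/o.txt"

# Two consecutive o characters anywhere: every line except god matches.
two_anywhere=$(grep -c 'o\{2\}' "$tmpdir/o.txt")

# Pinning the run between g and d makes the count exact.
exactly_two=$(grep 'go\{2\}d' "$tmpdir/o.txt")

# A range: two to three o characters between g and d.
two_to_three=$(grep -c 'go\{2,3\}d' "$tmpdir/o.txt")

# With -E the braces are written without backslashes.
two_to_three_ere=$(grep -cE 'go{2,3}d' "$tmpdir/o.txt")

echo "two_anywhere=$two_anywhere exactly_two=$exactly_two"
echo "two_to_three=$two_to_three two_to_three_ere=$two_to_three_ere"

rm -rf "$tmpdir"
```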

Extended grep (grep -E or egrep):

The main benefit of extended grep is its additional set of regular expression metacharacters.

Print all lines that contain NW or EA. If you use grep rather than egrep here, no results are found.

# egrep 'NW|EA' testfile
northwest       NW      Charles Main        3.0     .98     3       34
eastern         EA      TB Savage           4.4     .       5       20

With standard grep, if an extended metacharacter is preceded by \, grep treats it with its extended meaning, as if the -E option were enabled.

# grep 'NW\|EA' testfile
northwest       NW      Charles Main        3.0     .98     3       34
eastern         EA      TB Savage           4.4     .       5       20

Search for all lines that contain one or more 3 characters.

# egrep '3+' testfile
# grep -E '3+' testfile
# grep '3\+' testfile        # these 3 commands return the same result
northwest       NW      Charles Main        3.0     .98     3       34
western         WE      Sharon Gray         5.3     .       5       23
northeast       NE      AM Main Jr.         5.1     .94     3       13
central         CT      Ann Stephens        5.7     .94     5       13

Search for all lines that contain a 2, followed by zero or one decimal point, followed by a digit.

# egrep '2\.?[0-9]' testfile
# grep -E '2\.?[0-9]' testfile
# grep '2\.\?[0-9]' testfile
# First a 2, followed by zero or one period, followed by a digit between 0 and 9.
western         WE      Sharon Gray         5.3     .       5       23
southwest       SW      Lewis Dalsass       2.7     .8      2       18
eastern         EA      TB Savage           4.4     .       5       20

Search for lines that contain one or more consecutive occurrences of no.

# egrep '(no)+' testfile
# grep -E '(no)+' testfile
# grep '\(no\)\+' testfile   # the 3 commands return the same result
northwest       NW      Charles Main        3.0     .98     3       34
northeast       NE      AM Main Jr.         5.1     .94     3       13
north           NO      Margot Weber        4.5     .       5       9

Searching without regular expressions

fgrep is faster than grep but less flexible: it can only search for fixed text, not regular expressions.

To find the lines that contain an asterisk character in a file or in command output:

fgrep '*' /etc/profile
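To make the difference between fixed text and a regular expression concrete, here is a small sketch. The sample lines are hypothetical, and grep -F is used as the option form of fgrep (recent GNU grep versions print a warning when fgrep itself is invoked).

```shell
tmpdir=$(mktemp -d)
printf 'a*b\naab\nab\n' > "$tmpdir/stars.txt"

# Fixed string: the asterisk is an ordinary character,
# so only the line that literally contains a*b matches.
fixed=$(grep -F 'a*b' "$tmpdir/stars.txt")

# Regular expression: a*b means "zero or more a, then b",
# which every line in this sample satisfies.
regex_count=$(grep -c 'a*b' "$tmpdir/stars.txt")

# A lone asterisk searched as fixed text.
star_lines=$(grep -Fc '*' "$tmpdir/stars.txt")

echo "fixed=$fixed regex_count=$regex_count star_lines=$star_lines"

rm -rf "$tmpdir"
```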
