A collection of practical grep text search and filtering techniques

Source: Internet
Author: User
Tags grep regular expression egrep

I. Introduction to grep:

grep is a text search tool. It scans the target files line by line for the pattern the user specifies and prints every line that matches. Combined with regular expressions, it becomes a powerful text-processing tool; the examples below introduce the regular expression syntax along the way.

II. The grep family of tools

The commonly used variants are grep, egrep, and fgrep.

Differences:

grep: with no extra options, the pattern is interpreted as a basic regular expression (BRE).

egrep: equivalent to grep -E, which uses extended regular expressions. The biggest difference from plain grep is escaping. For example, to match a repetition count, grep requires \{n,m\}, whereas egrep accepts {n,m} directly, which is more convenient.

fgrep: equivalent to grep -F. It does not interpret regular expressions at all; the pattern is matched as a literal string.
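
The difference between the three modes can be seen on a tiny sample. The following is a minimal sketch; the file name sample.txt and its contents are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical sample data, created in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'aaa' 'a{2}' 'ab' > "$tmp/sample.txt"

# BRE (plain grep): the interval braces must be escaped.
bre=$(grep 'a\{2\}' "$tmp/sample.txt")
# ERE (grep -E): the same interval is written without backslashes.
ere=$(grep -E 'a{2}' "$tmp/sample.txt")
# Fixed string (grep -F): 'a{2}' is searched for literally.
fixed=$(grep -F 'a{2}' "$tmp/sample.txt")

printf 'BRE:   %s\nERE:   %s\nfixed: %s\n' "$bre" "$ere" "$fixed"
rm -rf "$tmp"
```

The first two commands select the line with two consecutive a characters, while the third selects the line that literally contains a{2}.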

III. grep options:

Format: grep [OPTIONS] PATTERN [FILE...]

Common options:

--color=auto: highlights the matched text in color.

-c: prints only the count of matching lines.

-i: ignores case when matching.

-h: when searching multiple files, does not print the file names.

-l: when searching multiple files, prints only the names of files that contain a match.

-n: prints matching lines together with their line numbers.

-s: suppresses error messages about nonexistent or unreadable files.

-v: prints all lines that do not match the pattern.
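
The options above can be tried on two small files. This is only a sketch: the file names one.txt, two.txt, and missing.txt and their contents are invented for the example.

```shell
#!/bin/sh
# Hypothetical sample files in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'alpha test' 'beta' 'Gamma TEST' > "$tmp/one.txt"
printf '%s\n' 'nothing here' > "$tmp/two.txt"

count=$(grep -c 'test' "$tmp/one.txt")        # count of matching lines
icount=$(grep -ci 'test' "$tmp/one.txt")      # same count, ignoring case
numbered=$(grep -n 'beta' "$tmp/one.txt")     # line number, colon, line
inverted=$(grep -vi 'test' "$tmp/one.txt")    # lines that do NOT match
# -l prints file names only; run inside the directory to keep names short.
names=$(cd "$tmp" && grep -li 'test' one.txt two.txt)
# -h drops the "file:" prefix that multi-file searches normally add.
bare=$(grep -h 'nothing' "$tmp/one.txt" "$tmp/two.txt")
# -s hides the "No such file" message; the exit status still signals an error.
missing=$(grep -s 'test' "$tmp/missing.txt" 2>&1)
status=$?

printf '%s\n' "$count" "$icount" "$numbered" "$inverted" "$names" "$bare" "$status"
rm -rf "$tmp"
```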

Main metacharacters in a regular expression pattern:

\: removes the special meaning of the metacharacter that follows.

^: anchors the match at the start of a line.

$: anchors the match at the end of a line.

\<: anchors the match at the start of a word.

\>: anchors the match at the end of a word.

[]: matches a single character from the set. For example, [A] matches A.

[-]: a range. For example, [A-Z] matches any one of A, B, C, and so on up to Z.

.: matches any single character.

*: matches zero or more occurrences of the preceding item.
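
A short sketch of how these metacharacters behave, using an invented word list (words.txt is a hypothetical file):

```shell
#!/bin/sh
# Hypothetical word list in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' 'cat' 'concat' 'cats' 'ct' 'c.t' > "$tmp/words.txt"

starts=$(grep -c '^cat' "$tmp/words.txt")       # lines beginning with cat
ends=$(grep -c 'cat$' "$tmp/words.txt")         # lines ending with cat
dot=$(grep -c 'c.t' "$tmp/words.txt")           # c, any one character, t
literal=$(grep 'c\.t' "$tmp/words.txt")         # backslash makes the dot literal
star=$(grep -c 'ca*t' "$tmp/words.txt")         # c, zero or more a, t
bracket=$(grep '^c[a-z]t$' "$tmp/words.txt")    # exactly c, one lowercase letter, t

printf '%s\n' "$starts" "$ends" "$dot" "$literal" "$star" "$bracket"
rm -rf "$tmp"
```

Note that 'c.t' does not match ct, because the dot requires exactly one character, while 'ca*t' does, because the star allows zero occurrences.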

IV. Simple grep examples

$ grep 'test' d*

Displays all lines containing test in files whose names start with d.

$ grep 'test' aa bb cc

Displays the lines matching test in the files aa, bb, and cc.

$ grep '[a-z]\{5\}' aa

Displays all lines that contain a run of at least five consecutive lowercase letters.

$ grep 'w\(es\)t.*\1' aa

If west is matched, es is saved as group 1. The pattern then skips any number of characters (.*) and requires another es (\1). Lines where this is found are displayed. With egrep or grep -E, the parentheses do not need to be escaped with "\"; you can write 'w(es)t.*\1' directly.
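
The back-reference and interval examples can be checked against a small file. This sketch invents the contents of aa; it also assumes a grep such as GNU grep, where back-references are accepted in -E mode (POSIX only defines them for basic regular expressions).

```shell
#!/bin/sh
# Hypothetical contents for the file "aa" used in the examples.
tmp=$(mktemp -d)
printf '%s\n' 'west of the nest' 'west and east' 'best guess' > "$tmp/aa"

# BRE: escaped parentheses capture "es"; \1 demands a second "es" later on.
bre=$(grep 'w\(es\)t.*\1' "$tmp/aa")
# ERE: same pattern with unescaped parentheses (back-reference support assumed).
ere=$(grep -E 'w(es)t.*\1' "$tmp/aa")
# Five or more consecutive lowercase letters: only "guess" qualifies.
five=$(grep '[a-z]\{5\}' "$tmp/aa")

printf '%s\n' "$bre" "$ere" "$five"
rm -rf "$tmp"
```

The line "west and east" is not selected by the first two commands because no second "es" follows "west".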

V. More complex grep examples

Suppose you are searching the '/usr/src/linux/Documentation' directory for files containing the string 'magic':

$ grep magic /usr/src/linux/Documentation/*

sysrq.txt:* How do I enable the magic SysRQ key?

sysrq.txt:* How do I use the magic SysRQ key?

The file 'sysrq.txt' contains the string; it discusses the SysRQ feature.

By default, 'grep' searches only the files it is given and does not descend into directories. If the directory contains subdirectories, 'grep' reports each of them like this:

grep: sound: Is a directory

This can make the output of 'grep' hard to read. There are two solutions:

Search subdirectories recursively: grep -r

Or ignore subdirectories: grep -d skip

If there is a lot of output, you can pipe it to 'less' for reading:

$ grep magic /usr/src/linux/Documentation/* | less

This makes the output easier to read.

Note that you must specify which files to search (for example, * for all files). If you forget, 'grep' reads from standard input and waits until it is interrupted. If this happens, press <CTRL c> and try again.
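
The two ways of dealing with subdirectories can be compared on a small tree. This is a sketch with invented names (top.txt, sound/deep.txt); -d skip and -r are used as documented for GNU grep.

```shell
#!/bin/sh
# Hypothetical directory tree with one nested file.
tmp=$(mktemp -d)
mkdir "$tmp/sound"
printf 'magic key\n' > "$tmp/top.txt"
printf 'more magic\n' > "$tmp/sound/deep.txt"

# -d skip: directories among the arguments are silently ignored.
flat=$(cd "$tmp" && grep -l -d skip 'magic' *)
# -r: descend into subdirectories; sort only makes the order predictable.
deep=$(cd "$tmp" && grep -rl 'magic' . | sort)

printf 'skip:\n%s\nrecursive:\n%s\n' "$flat" "$deep"
rm -rf "$tmp"
```

With -d skip only top.txt is reported; with -r the file inside sound is found as well.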

Here are some useful command-line options:

grep -i pattern files: searches case-insensitively. The default is case-sensitive.

grep -l pattern files: lists only the names of files that match.

grep -L pattern files: lists the names of files that do not match.

grep -w pattern files: matches whole words only, not parts of a string (for example, matches 'magic' but not 'magical').

grep -C number pattern files: displays [number] lines of context around each match.

grep 'pattern1\|pattern2' files: displays lines that match pattern1 or pattern2 (GNU grep syntax; with grep -E, write 'pattern1|pattern2').

grep pattern1 files | grep pattern2: displays lines that match both pattern1 and pattern2.

grep -n pattern files: displays line numbers with the matches.

grep -c pattern files: displays the number of matching lines.
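
The "either pattern" and "both patterns" forms are easy to confuse, so here is a minimal sketch on invented data (things.txt is a hypothetical file):

```shell
#!/bin/sh
# Hypothetical sample data.
tmp=$(mktemp -d)
printf '%s\n' 'red apple' 'green apple' 'red brick' 'blue sky' > "$tmp/things.txt"

# OR: one grep with an alternation (extended syntax).
either=$(grep -E 'red|green' "$tmp/things.txt")
# AND: the second grep filters the output of the first.
both=$(grep 'red' "$tmp/things.txt" | grep 'apple')
# -L: names of files WITHOUT a match (run in the directory for short names).
nomatch=$(cd "$tmp" && grep -L 'purple' things.txt)

printf 'either:\n%s\nboth:\n%s\nno match in: %s\n' "$either" "$both" "$nomatch"
rm -rf "$tmp"
```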

Here are some special symbols used in searches:

\< and \> mark the start and end of a word, respectively.

For example:

grep man * matches 'Batman', 'manic', 'man', and so on.

grep '\<man' * matches 'manic' and 'man', but not 'Batman'.

grep '\<man\>' * matches only 'man', not other strings such as 'Batman' or 'manic'.

'^': matches at the start of a line.

'$': matches at the end of a line.
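
The word-boundary examples can be reproduced with a three-line file. A minimal sketch, assuming a grep that supports the \< and \> extensions (GNU grep does); names.txt is a made-up file name.

```shell
#!/bin/sh
# Hypothetical file holding the three words from the examples.
tmp=$(mktemp -d)
printf '%s\n' 'Batman' 'manic' 'man' > "$tmp/names.txt"

anywhere=$(grep -c 'man' "$tmp/names.txt")    # substring match: all three lines
wordstart=$(grep '\<man' "$tmp/names.txt")    # "man" at the start of a word
wholeword=$(grep '\<man\>' "$tmp/names.txt")  # "man" as a complete word
dashw=$(grep -w 'man' "$tmp/names.txt")       # -w gives the same result here

printf '%s\n' "$anywhere" "$wordstart" "$wholeword" "$dashw"
rm -rf "$tmp"
```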

grep command usage

1. Options:

-i: case-insensitive matching.

-c: prints the number of matching lines.

-l: when searching multiple files, prints only the names of files that contain a match.

-v: prints the lines that do not match.

-n: prints matching lines with their line numbers.

2. RE (regular expressions)

\ removes the special meaning of a metacharacter

^ anchors the match at the start of a line

$ anchors the match at the end of a line

\< anchors the match at the start of a word

\> anchors the match at the end of a word

[] a single character from the set. For example, [A] matches A.

[-] a range. For example, [A-Z] matches any one of A, B, C, and so on up to Z.

. any single character

* zero or more occurrences of the preceding item

3. Examples

# ps -ef | grep in.telnetd

root 19955 181 0 13:43:53 ? 0:00 in.telnetd

# more size.txt     (contents of the size.txt file)

b124230

b034325

a081016

m7187998

m7282064

a022021

a061048

m9324822

b103303

a013386

b044525

m8987131

B081016

M45678

B103303

BADc2345

# more size.txt | grep '[a-b]'     (a range: [a-b] matches a or b, just as [A-Z] matches any one of A, B, C through Z)

b124230

b034325

a081016

a022021

a061048

b103303

a013386

b044525

# more size.txt | grep '[a-b]*'     (* allows zero occurrences, so every line matches)

b124230

b034325

a081016

m7187998

m7282064

a022021

a061048

m9324822

b103303

a013386

b044525

m8987131

B081016

M45678

B103303

BADc2345

# more size.txt | grep 'b'     (a single character: only lines containing a lowercase b match)

b124230

b034325

b103303

b044525

# more size.txt | grep '[bB]'

b124230

b034325

b103303

b044525

B081016

B103303

BADc2345

# grep 'root' /etc/group

root:0:root

bin:2:root,bin,daemon

sys:3:root,bin,sys,adm

adm:4:root,adm,daemon

uucp:5:root,uucp

mail:6:root

tty:7:root,tty,adm

lp:8:root,lp,adm

nuucp:9:root,nuucp

daemon:12:root,daemon

# grep '^root' /etc/group     (^ anchors the match at the start of a line)

root:0:root

# grep 'uucp' /etc/group

uucp:5:root,uucp

nuucp:9:root,nuucp

# grep '\<uucp' /etc/group     (\< anchors the match at the start of a word)

uucp:5:root,uucp

# grep 'root$' /etc/group     ($ anchors the match at the end of a line)

root:0:root

mail:6:root

# more size.txt | grep -i 'b1..*3'     (-i: case-insensitive)

b124230

b103303

B103303

# more size.txt | grep -iv 'b1..*3'     (-v: prints the lines that do not match)

b034325

a081016

m7187998

m7282064

a022021

a061048

m9324822

a013386

b044525

m8987131

B081016

M45678

BADc2345

# more size.txt | grep -in 'b1..*3'     (-n: prefixes each match with its line number)

1:b124230

9:b103303

15:B103303
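
The size.txt examples above can be replayed as a script. This sketch assumes the file holds the sixteen lines listed earlier, with the first twelve starting with a lowercase letter (which the line-numbered output 1:b124230 and 9:b103303 implies), and it sets LC_ALL=C so that the range [a-b] covers only lowercase a and b.

```shell
#!/bin/sh
# Recreate size.txt in a throwaway directory (contents assumed as stated above).
tmp=$(mktemp -d)
printf '%s\n' b124230 b034325 a081016 m7187998 m7282064 a022021 a061048 \
    m9324822 b103303 a013386 b044525 m8987131 B081016 M45678 B103303 \
    BADc2345 > "$tmp/size.txt"
f="$tmp/size.txt"

range=$(LC_ALL=C grep -c '[a-b]' "$f")     # lines containing a or b
star=$(LC_ALL=C grep -c '[a-b]*' "$f")     # zero occurrences allowed: all lines
lower=$(grep -c 'b' "$f")                  # lowercase b only
either=$(grep -c '[bB]' "$f")              # b in either case
numbered=$(grep -in 'b1..*3' "$f")         # case-insensitive, with line numbers
inverted=$(grep -civ 'b1..*3' "$f")        # count of lines that do not match

printf '%s\n' "$range" "$star" "$lower" "$either" "$numbered" "$inverted"
rm -rf "$tmp"
```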

# grep '$' /etc/init.d/nfs.server | wc -l

128

# grep '\$' /etc/init.d/nfs.server | wc -l     (\ removes the special meaning of $, so only lines containing a literal $ are counted)

15

# grep '\$' /etc/init.d/nfs.server

case "$1" in

>/tmp/sharetab.$$

[ "x$fstype" != xnfs ] &&

echo "$path\t$res\t$fstype\t$opts\t$desc"

>/tmp/sharetab.$$

/usr/bin/touch -r /etc/dfs/sharetab /tmp/sharetab.$$

/usr/bin/mv -f /tmp/sharetab.$$ /etc/dfs/sharetab

if [ -f /etc/dfs/dfstab ] && /usr/bin/egrep -v '^[ ]*(#|$)'

if [ $startnfsd -eq 0 -a -f /etc/rmmount.conf ] &&

if [ $startnfsd -ne 0 ]; then

elif [ ! -n "$_INIT_RUN_LEVEL" ]; then

while [ $wtime -gt 0 ]; do

wtime=`expr $wtime - 1`

if [ $wtime -eq 0 ]; then

echo "Usage: $0 { start | stop }"

# more size.txt

the test file

their are files

The end

# grep 'the' size.txt

the test file

their are files

# grep '\<the' size.txt

the test file

their are files

# grep 'the\>' size.txt

the test file

# grep '\<the\>' size.txt

the test file

# grep '\<[Tt]he\>' size.txt

the test file

The end

========================================================== ======================================

1. Introduction

grep is a multi-purpose text search tool that uses regular expressions. Its name comes from a command/filter in the ed editor:

g/re/p -- global - regular expression - print.

Basic format

grep pattern [file...]

(1) grep search-string [filename]

(2) grep regular-expression [filename]

grep finds every place where pattern occurs in the file. The pattern can be either a literal string to search for or a regular expression.

Note: it is best to enclose a search string in double quotation marks, and to enclose a regular expression in single quotation marks so that the shell does not interpret its special characters.
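
The quoting advice matters because the shell processes double-quoted text before grep sees it. A minimal sketch, using an invented group file and a hypothetical variable called name:

```shell
#!/bin/sh
# Hypothetical file in the style of /etc/group.
tmp=$(mktemp -d)
printf '%s\n' 'root:0:root' 'mail:6:root' 'bin:2:root,bin' > "$tmp/group"

name=root
# Double quotes: the shell replaces $name first, so grep receives ^root:
expanded=$(grep "^$name:" "$tmp/group")
# Single quotes: the pattern reaches grep untouched; $ stays an end-of-line anchor.
anchored=$(grep -c 'root$' "$tmp/group")

printf '%s\n%s\n' "$expanded" "$anchored"
rm -rf "$tmp"
```

Double quotes are the natural choice when the pattern is built from a shell variable; single quotes are safer for patterns that contain $, \, or other characters the shell would otherwise interpret.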

2. grep options

-c: prints only the count of matching lines.

-i: ignores case.

-n: prints the line number with each matching line.

-v: prints only the lines that do not match.

-s: suppresses error messages.

-E: uses extended regular expressions.

For more options, see man grep.

3. Common grep examples

(1) Search multiple files

grep "sort" *.doc     # *.doc is expanded by the shell's file name matching

(2) Line matching: print the count of matching lines

grep -c "48" data.doc     # prints the number of lines in the file that contain the string 48

(3) Display matching lines with line numbers

grep -n "48" data.doc     # shows every line that matches 48, with its line number

(4) Display non-matching lines

grep -vn "48" data.doc     # prints every line that does not contain 48, with its line number

(5) Case-insensitive matching

grep -i "AB" data.doc     # prints every line containing ab, AB, or any other case variant

4. Applying regular expressions

(1) Using a regular expression (note: it is best to enclose regular expressions in single quotes)

grep '[239].' data.doc     # prints every line that contains 2, 3, or 9 followed by at least one more character

(2) Negated match

grep '^[^48]' data.doc     # prints the lines whose first character is neither 4 nor 8

(3) Extended pattern matching

grep -E '2017|100' data.doc     # prints the lines that contain 2017 or 100

(4) ...

Mastering regular expressions takes continual practice and review.

5. Using class names

You can use the following character class names for internationalized pattern matching:

[[:upper:]]     [A-Z]

[[:lower:]]     [a-z]

[[:digit:]]     [0-9]

[[:alnum:]]     [0-9a-zA-Z]

[[:space:]]     whitespace such as space or tab

[[:alpha:]]     [a-zA-Z]

(1) Usage

grep '5[[:upper:]][[:upper:]]' data.doc     # finds lines containing a 5 followed by two uppercase letters
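
A minimal sketch of the class names in use. The contents written to data.doc here are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical contents for data.doc.
tmp=$(mktemp -d)
printf '%s\n' '5AB' '5Ab' 'x5XYz' '5 AB' > "$tmp/data.doc"

# A 5 followed directly by two uppercase letters.
upper=$(grep '5[[:upper:]][[:upper:]]' "$tmp/data.doc")
# Lines whose first character is a digit.
digits=$(grep -c '^[[:digit:]]' "$tmp/data.doc")
# Lines containing whitespace.
space=$(grep '[[:space:]]' "$tmp/data.doc")

printf '%s\n' "$upper" "$digits" "$space"
rm -rf "$tmp"
```

Note the doubled brackets: the class name [:upper:] is only valid inside a bracket expression, so the complete form is [[:upper:]].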
