Linux Learning Notes: grep, egrep

Source: Internet
Author: User
Tags expression engine line editor egrep

The three musketeers of text processing:

grep family: grep, egrep, fgrep. Text search tools that search the given text for lines matching a PATTERN; the grep family matches greedily by default.

sed: stream editor; a line-oriented text editing tool;

awk: gawk (GNU awk); a text formatting tool, text report generator, and text-processing programming language;

grep family:

grep: searches the input globally with a regular expression and prints the matching lines;

grep [OPTIONS] PATTERN [FILE ...]

PATTERN: the filter condition, made up of regular expression metacharacters and literal text characters;

Metacharacters of regular expressions:

Characters that the regular expression engine interprets with a special meaning rather than literally;

PCRE (Perl Compatible Regular Expressions): a regular expression engine modeled on the Perl language; the most feature-rich of the engines covered here


Basic regular expressions: BRE

Extended regular expressions: ERE


grep supports only basic regular expressions by default;

egrep supports only extended regular expressions by default;

fgrep does not enable the regular expression engine by default;


Text characters:

Characters that carry only their literal meaning;

Common options:

-i, --ignore-case: ignore the case of text characters;

-v, --invert-match: invert the match; the output is the lines that the pattern does not match;

-c, --count: count; print only the number of lines that match the pattern;

-o, --only-matching: print only the parts of each line that the pattern matches, one match per output line;

-q, --quiet, --silent: quiet mode; print nothing and report the result through the exit status;

--color=auto: highlight the matched text in the output;

-E, --extended-regexp: use extended regular expressions; equivalent to egrep;

-F, --fixed-strings (--fixed-regexp is an obsolete alias in older GNU grep): treat the pattern as fixed strings; grep -F is equivalent to fgrep;

-G, --basic-regexp: use basic regular expressions; this is grep's default;

-P, --perl-regexp: use the PCRE engine;

-A NUM: also display NUM lines after each line that matches the pattern;

-B NUM: also display NUM lines before each line that matches the pattern;

-C NUM: also display NUM lines both before and after each line that matches the pattern;
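
The common options can be tried on a few lines of made-up input. This is a minimal sketch assuming GNU grep; the sample text and the variable names are illustrative only and do not come from the notes.

```shell
# Hypothetical sample input, five lines.
sample=$(printf '%s\n' 'Root login' 'root shell' 'daemon' 'bin' 'nobody root')

# -i with -c: count the lines that match "root" in any letter case.
count_i=$(printf '%s\n' "$sample" | grep -ic 'root')

# -v: print the lines that do NOT match.
inverted=$(printf '%s\n' "$sample" | grep -iv 'root')

# -o: print only the matched text, one match per output line.
only=$(printf '%s\n' "$sample" | grep -o 'root')

# -A 1: each matching line plus one line after it.
after=$(printf '%s\n' "$sample" | grep -A 1 'daemon')

# -B 1: each matching line plus one line before it.
before=$(printf '%s\n' "$sample" | grep -B 1 'daemon')

# -q: no output; the exit status says whether anything matched.
if printf '%s\n' "$sample" | grep -q 'bin'; then found=yes; else found=no; fi

printf '%s\n' "$count_i" "$inverted" "$only" "$after" "$before" "$found"
```

Because -q prints nothing, it is mostly useful inside conditions such as the if statement above.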


PATTERN:

Regular expression metacharacters:

Basic regular expression metacharacters:

Globbing (shell wildcards), a simplified cousin of regular expressions: [] ? *


Character matching:

.: matches any single character

All of the following character sets can be placed inside [] to match a single character.

[]: matches any single character within the specified range;

[^]: matches any single character outside the specified range;

[:lower:]

[:upper:]

[:digit:]

[:alpha:]

[:space:]

[:alnum:]

[:punct:]

[:blank:]

[:xdigit:]: all hexadecimal digits;

a-z: all lowercase letters

A-Z: all uppercase letters

0-9: all decimal digits
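
A short sketch of the bracket expressions, assuming GNU grep. Note that a named class such as [:upper:] is itself written inside the outer brackets, giving [[:upper:]]. The input lines are invented for the example.

```shell
# Hypothetical sample input.
data=$(printf '%s\n' 'abc' 'ABC' '123' 'a1' 'x y')

# Lines containing at least one uppercase letter.
upper=$(printf '%s\n' "$data" | grep '[[:upper:]]')

# Lines containing a whitespace character.
space=$(printf '%s\n' "$data" | grep '[[:space:]]')

# Lines whose first character is NOT a letter ([^...] negates the set).
nonalpha_start=$(printf '%s\n' "$data" | grep '^[^[:alpha:]]')

# Number of lines containing a decimal digit, using a range.
digit_lines=$(printf '%s\n' "$data" | grep -c '[0-9]')

# Lines made up only of letters and digits.
alnum_only=$(printf '%s\n' "$data" | grep '^[[:alnum:]]*$')

printf '%s\n' "$upper" "$space" "$nonalpha_start" "$digit_lines" "$alnum_only"
```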


Number of occurrences: specifies how many times the character before the quantifier may appear;

*: the preceding character may appear any number of times (0, 1, or more);

\?: the preceding character is optional (0 or 1 times);

\+: the preceding character appears at least once (1 or more times);

\{m\}: the preceding character must appear exactly m times;

\{m,n\}: the preceding character appears at least m times and at most n times (m <= n);

\{0,n\}: the preceding character appears at least 0 times and at most n times;

\{m,\}: the preceding character appears at least m times, with no upper limit;


In a regular expression, any sequence of characters of any length is written as: .*
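
The quantifiers can be compared side by side on one small input. This sketch assumes GNU grep, where \? and \+ are available in basic regular expressions as extensions; the input is invented for the example.

```shell
# Hypothetical sample input: "a", then zero to three "b", then "c".
data=$(printf '%s\n' 'ac' 'abc' 'abbc' 'abbbc')

star=$(printf '%s\n' "$data" | grep 'ab*c')              # b zero or more times
optional=$(printf '%s\n' "$data" | grep 'ab\?c')         # b zero or one time
plus=$(printf '%s\n' "$data" | grep 'ab\+c')             # b one or more times
exact=$(printf '%s\n' "$data" | grep 'ab\{2\}c')         # b exactly twice
range=$(printf '%s\n' "$data" | grep 'ab\{2,3\}c')       # b two or three times
at_least=$(printf '%s\n' "$data" | grep 'ab\{3,\}c')     # b three or more times
anything=$(printf '%s\n' "$data" | grep -c 'a.*c')       # any text between a and c

printf '%s\n' "$star" "$optional" "$plus" "$exact" "$range" "$at_least" "$anything"
```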


Positional anchor characters:

Line anchors:

Beginning-of-line anchor: ^

End-of-line anchor: $

Word anchors:

Beginning-of-word anchor: \< or \b

End-of-word anchor: \> or \b

\b: matches at either edge of a word, so it does not say which edge is meant; \< and \> are the recommended forms;


For the regular expression engine, a word is a continuous string of word characters (letters, digits, and underscores);
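
The difference between line anchors and word anchors shows up clearly on input where "root" appears in different positions. A minimal sketch, assuming GNU grep (the word anchors are GNU extensions) and invented sample lines:

```shell
# Hypothetical sample input.
data=$(printf '%s\n' 'root:x:0' 'chroot jail' 'rooted tree' 'my root')

line_start=$(printf '%s\n' "$data" | grep '^root')     # line begins with root
line_end=$(printf '%s\n' "$data" | grep 'root$')       # line ends with root
word_start=$(printf '%s\n' "$data" | grep '\<root')    # a word begins with root
word_end=$(printf '%s\n' "$data" | grep 'root\>')      # a word ends with root
whole_word=$(printf '%s\n' "$data" | grep '\<root\>')  # root as a complete word

printf '%s\n' "$line_start" "$line_end" "$word_start" "$word_end" "$whole_word"
```

In "root:x:0" the colon is not a word character, so "root" counts as a complete word there.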


Grouping and back-reference characters:

\(PATTERN\): all characters matched by PATTERN are treated as a single unit.


The regular expression engine keeps a series of built-in variables that hold the text matched by each group, for use in back references; these variables are \1, \2, \3, ...


PATTERN1\(PATTERN2\)PATTERN3\(PATTERN4\(PATTERN5\)\)


\1: the text matched by the first group, \(PATTERN2\);

\2: the text matched by the second group, \(PATTERN4\(PATTERN5\)\), which includes the nested group;

\3: the text matched by the third group, \(PATTERN5\); groups are numbered by the order of their opening parentheses.
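
A back reference matches the same text that its group matched earlier, not merely the same pattern. The sketch below assumes GNU grep; the sentences and the nested-group pattern are invented for illustration.

```shell
# Hypothetical sample input.
data=$(printf '%s\n' 'he loves his lover' 'she likes her liker' 'he likes his lover')

# \(l..e\) captures a four-letter string; \1 requires the SAME string again,
# so "like ... love" on the third line does not match.
same_word=$(printf '%s\n' "$data" | grep '\(l..e\).*\1')

# Nested groups are numbered by their opening parenthesis:
# \1 is "ab" (outer group), \2 is "a" (inner group).
nested=$(printf '%s\n' 'abab-ab' 'abab-bb' | grep '\(\(a\)b\)\1-\2b')

printf '%s\n' "$same_word" "$nested"
```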


Or:

\|



Note: the or operator treats everything on its left and everything on its right as whole alternatives;


a\|American: matches a or American, not american or American
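
To limit the alternation to part of the pattern, combine it with grouping. A minimal sketch assuming GNU grep (\| in basic regular expressions is a GNU extension), with invented input:

```shell
# Hypothetical sample input.
data=$(printf '%s\n' 'Cat' 'cat' 'C' 'at')

# 'C\|cat' means "C" OR "cat": the whole left side against the whole right side.
whole=$(printf '%s\n' "$data" | grep 'C\|cat')

# Grouping restricts the choice to the first letter: "Cat" or "cat".
grouped=$(printf '%s\n' "$data" | grep '\(C\|c\)at')

printf '%s\n' "$whole" "$grouped"
```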


By default, grep accepts only one pattern on the command line;

To give more than one pattern in a single grep search, use the -e option, with an -e in front of each pattern;


Alternatively, write the patterns to a file, one pattern per line, and pass the file with -f FILE to search for all of them.
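
Both ways of supplying several patterns can be sketched as follows. The input and the temporary pattern file are hypothetical; mktemp is assumed to be available.

```shell
# Hypothetical sample input.
data=$(printf '%s\n' 'apple' 'banana' 'cherry' 'date')

# One -e per pattern: a line is printed if it matches any of them.
multi=$(printf '%s\n' "$data" | grep -e 'apple' -e 'cherry')

# Patterns stored in a file, one per line, and read with -f.
patfile=$(mktemp)
printf '%s\n' 'banana' 'date' > "$patfile"
from_file=$(printf '%s\n' "$data" | grep -f "$patfile")
rm -f "$patfile"

printf '%s\n' "$multi" "$from_file"
```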



egrep

egrep [OPTIONS] PATTERN [FILE ...]

Extended regular expression meta-characters:

Character Matching:

.

[]

[^]

Number of matches:

*

?

+

{m}

{m,n}

{m,}

{0,n}


Location anchoring:

^

$

\<,\b

\>,\b


Grouping and referencing:

()

\1,\2,\3 ...


Or:

|


fgrep: all characters in PATTERN are treated as literal text characters;
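
The three modes can be compared on the same input. This sketch uses grep -E and grep -F in place of the egrep and fgrep commands, since recent GNU grep releases print an obsolescence warning for the latter two; the input is invented for the example.

```shell
# Hypothetical sample input.
data=$(printf '%s\n' 'ac' 'abc' 'abbc' 'a+c')

# Extended syntax: quantifiers and groups need no backslashes.
optional=$(printf '%s\n' "$data" | grep -E 'ab?c')     # b zero or one time
plus=$(printf '%s\n' "$data" | grep -E 'ab+c')         # b one or more times
exact=$(printf '%s\n' "$data" | grep -E 'ab{2}c')      # b exactly twice
backref=$(printf '%s\n' "$data" | grep -E '(b)\1')     # the same b twice in a row

# The same pattern text means different things in each mode.
ere_plus=$(printf '%s\n' "$data" | grep -E 'a+c')      # + is a quantifier
bre_plus=$(printf '%s\n' "$data" | grep 'a+c')         # + is a literal character
fixed=$(printf '%s\n' "$data" | grep -F 'a+c')         # everything is literal

printf '%s\n' "$optional" "$plus" "$exact" "$backref" "$ere_plus" "$bre_plus" "$fixed"
```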



Other text-processing commands:

wc:

wc [OPTION]... [FILE]...

-l: display only the number of lines

-w: display only the number of words

-c: display only the number of bytes (use -m to count characters)
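
A quick sketch of the three counters on two lines of invented text. Some wc implementations pad the number with spaces, so the padding is stripped with tr before the value is stored.

```shell
# Two lines, three words, 14 bytes including the two newlines.
lines=$(printf 'one two\nthree\n' | wc -l | tr -d '[:space:]')
words=$(printf 'one two\nthree\n' | wc -w | tr -d '[:space:]')
bytes=$(printf 'one two\nthree\n' | wc -c | tr -d '[:space:]')

printf '%s\n' "$lines" "$words" "$bytes"
```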


cut:

The files that cut processes are usually text documents with a regular structure or format.

cut OPTION... [FILE]...

-d, --delimiter=DELIM: specifies the field delimiter to cut on; the default is the TAB character;

-f, --fields=LIST: selects the fields to output, numbered according to the delimiter.

Ways to write LIST:

#: a single field

#,#: several discrete fields

#-#: a consecutive range of fields

--output-delimiter=STRING: specifies the output delimiter
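
The field list forms can be tried on a colon-separated record. The record below imitates the /etc/passwd layout but is made up for the example; --output-delimiter assumes GNU cut.

```shell
# Hypothetical passwd-style record with seven colon-separated fields.
line='root:x:0:0:root:/root:/bin/bash'

single=$(printf '%s\n' "$line" | cut -d: -f1)        # one field
discrete=$(printf '%s\n' "$line" | cut -d: -f1,7)    # fields 1 and 7
range=$(printf '%s\n' "$line" | cut -d: -f1-3)       # fields 1 through 3

# Replace the delimiter in the output with a space.
spaced=$(printf '%s\n' "$line" | cut -d: -f1,7 --output-delimiter=' ')

printf '%s\n' "$single" "$discrete" "$range" "$spaced"
```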


awk:

awk -F "DELIMITER" '[/PATTERN/]{print $1,$2,$3 ... $NF}' FILE ...

-F "DELIMITER": specifies the field delimiter, which defaults to whitespace;

$1, $2, $3, ..., $NF: the text fragments produced by splitting each line on the field delimiter are stored in these built-in variables; $NF is the last field;
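
The usage line above, filled in with concrete values. The two passwd-style records are invented for the example; the optional /PATTERN/ restricts the action to matching lines.

```shell
# Hypothetical passwd-style records.
data=$(printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
                     'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin')

# Print the first and the last field of every line.
first_last=$(printf '%s\n' "$data" | awk -F ':' '{print $1, $NF}')

# Only for lines that start with "root", print fields 1 and 3.
root_only=$(printf '%s\n' "$data" | awk -F ':' '/^root/{print $1, $3}')

printf '%s\n' "$first_last" "$root_only"
```

The comma in print joins the values with a space, awk's default output field separator.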


sort: sort lines of text files. Sorting is done line by line; the default order follows the current locale's collation (ASCII order in the C locale), and the options below change it.

-r, --reverse: reverse the sort order

-R, --random-sort: random sort by a hash of the keys, so identical lines stay together; a fairly rudimentary shuffle

-u, --unique: print only one of each set of repeated lines

-n, --numeric-sort: sort by numeric value

-t, --field-separator=SEP: specifies the field delimiter

-k, --key=KEYDEF: specifies which key field to sort by; generally used together with -t
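
The effect of -n is easiest to see when the same key is sorted both as text and as a number. A minimal sketch with invented data; LC_ALL=C is set only to make the ordering independent of the local collation rules.

```shell
# Hypothetical name:number records, with one repeated line.
data=$(printf '%s\n' 'c:10' 'a:9' 'b:100' 'a:9')

plain=$(printf '%s\n' "$data" | LC_ALL=C sort)             # whole line, ascending
reversed=$(printf '%s\n' "$data" | LC_ALL=C sort -r)       # whole line, descending
unique=$(printf '%s\n' "$data" | LC_ALL=C sort -u)         # repeated lines collapsed

# Field 2 compared as text: "10" < "100" < "9".
by_text=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2)

# Field 2 compared as a number: 9 < 10 < 100.
by_number=$(printf '%s\n' "$data" | LC_ALL=C sort -t: -k2 -n)

printf '%s\n' "$plain" "$reversed" "$unique" "$by_text" "$by_number"
```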


uniq: report or omit repeated lines

-d, --repeated: display only the repeated lines, one line per group of repeats

-u, --unique: display only the lines that are not repeated

-c, --count: prefix each line with the number of times it occurs
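
uniq compares only adjacent lines, so its input is normally sorted first. The sketch below uses invented, already sorted input; the sed step merely strips the leading padding that uniq -c puts before each count.

```shell
# Hypothetical input, already sorted so that repeats are adjacent.
data=$(printf '%s\n' 'a' 'a' 'b' 'c' 'c' 'c')

collapsed=$(printf '%s\n' "$data" | uniq)         # one line per group
repeated=$(printf '%s\n' "$data" | uniq -d)       # only groups that repeat
singles=$(printf '%s\n' "$data" | uniq -u)        # only lines that never repeat
counted=$(printf '%s\n' "$data" | uniq -c | sed 's/^ *//')   # count, then the line

printf '%s\n' "$collapsed" "$repeated" "$singles" "$counted"
```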


diff: compare files line by line

Typically used to compare different modified versions of the same file;


patch: apply changes to files

patch [-R] [-i patchfile] [file]
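
diff and patch are usually used as a pair: diff records the changes between two versions, and patch replays or reverses them. A minimal sketch, assuming both commands and mktemp are installed; all file names and contents are hypothetical.

```shell
# Work in a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' 'gamma' > "$dir/old.txt"
printf '%s\n' 'alpha' 'BETA' 'gamma' > "$dir/new.txt"

# diff exits with status 1 when the files differ, which is expected here.
diff "$dir/old.txt" "$dir/new.txt" > "$dir/change.patch" || true

# Apply the recorded change to a copy of the old version.
cp "$dir/old.txt" "$dir/work.txt"
patch -i "$dir/change.patch" "$dir/work.txt" > /dev/null
patched=$(cat "$dir/work.txt")

# -R applies the same patch in reverse, restoring the old content.
patch -R -i "$dir/change.patch" "$dir/work.txt" > /dev/null
reverted=$(cat "$dir/work.txt")

rm -rf "$dir"
printf '%s\n' "$patched" "$reverted"
```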

