Linux Regular Expressions

Source: Internet
Author: User

Chapter 1. What Is a Regular Expression
    1. A regular expression is a set of rules and methods defined for processing large amounts of text and strings.
    2. With the help of the special symbols these rules define, a system administrator can quickly filter, replace, or output the strings they need. Linux regular expressions generally process text one line at a time.

Simply put:

    • A set of rules and methods defined for processing large amounts of text and strings
    • Processing happens line by line, one line at a time

A regular expression is a pattern that describes a set of strings. Like arithmetic expressions, regular expressions are built by combining smaller expressions with various operators.

Chapter 2. Why Regular Expressions Are Used

Linux operations and maintenance work involves filtering large volumes of logs, and regular expressions make that complex work simple.
They are simple and efficient.
Regular expressions are an advanced tool, and all of the "Three Musketeers" (grep, sed, awk) support them.

Chapter 3. Two Easily Confused Points
    • Regular expressions are widely used across languages and tools: PHP, Perl, grep, sed, and awk all support them. By contrast, the * in a command such as ls * is a wildcard.
    • Here we are learning regular expressions on Linux, where the commands that use them most often are grep (egrep), sed, and awk.
    • Regular expressions and wildcards are fundamentally different.

Regular expressions are used to find file content: text and strings. Among the common commands, generally only the Three Musketeers support them.
Wildcards are used to find file names, and most common commands support them.
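
The distinction above can be sketched in a few lines of shell. This is only an illustration: the scratch directory and the file names app.log and notes.txt are made up for the example and are not part of the original article.

```shell
# Work in a scratch directory so the example does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Hypothetical sample files for this sketch.
printf 'error: disk full\nall ok\n' > app.log
printf 'error: timeout\n' > notes.txt

# Wildcard: the shell expands *.log to matching FILE NAMES before ls runs.
names=$(ls *.log)
echo "$names"

# Regular expression: grep searches file CONTENT for lines that start with "error".
lines=$(grep '^error' app.log)
echo "$lines"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

The wildcard never looks inside the files, and the regular expression never looks at the file names.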

Chapter 4. Notes on Using Regular Expressions
    1. Linux regular expressions process strings one line at a time.

    2. To make the filtered strings easy to see, practice with the grep/egrep commands.

    3. Pay attention to the character set: whatever you are doing, set it first with export LC_ALL=C.
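
A minimal sketch of why the character set matters, assuming a typical Linux grep. In the C locale a range such as [a-z] covers exactly the ASCII lowercase letters; in other locales the same range may behave differently, which is the reason for setting LC_ALL first.

```shell
# Force the C (POSIX) locale for this shell and its child processes.
export LC_ALL=C

# In the C locale, [a-z] covers exactly the 26 ASCII lowercase letters,
# so the uppercase-only line is not matched.
lower_only=$(printf 'abc\nABC\n' | grep '^[a-z]*$')
echo "$lower_only"
```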
Chapter 5. Classification of Regular Expressions

The POSIX specification divides regular expressions into two types:

    • Basic regular expressions (BRE, Basic Regular Expression)
    • Extended regular expressions (ERE, Extended Regular Expression), which add more advanced features
5.1 The only difference between BRE and ERE is the set of metacharacters:
    • BRE (basic regular expressions) recognizes only ^ $ . [ ] * as metacharacters. All other characters are treated as ordinary characters unless escaped, as in \( \).
    • ERE (extended regular expressions) adds ( ) { } ? + | and so on.
    • In BRE, the characters ( ) { } are treated as metacharacters only when escaped with a backslash "\". In ERE, any metacharacter preceded by a backslash is instead treated as an ordinary character.
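
The escaping rule can be checked directly with grep. The sketch below assumes a POSIX-style grep where plain grep uses BRE and grep -E uses ERE; the sample strings are invented for illustration.

```shell
# BRE (plain grep): ( ) and { } are metacharacters only when escaped.
bre=$(printf 'abab\nab\n' | grep '\(ab\)\{2\}')

# ERE (grep -E): the same pattern without backslashes.
ere=$(printf 'abab\nab\n' | grep -E '(ab){2}')

# ERE: a backslash turns the metacharacter back into a literal parenthesis.
literal=$(printf 'f(x)\nfx\n' | grep -E 'f\(x\)')

echo "$bre"
echo "$ere"
echo "$literal"
```

Both of the first two commands select only the line abab, and the third selects only f(x).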
Chapter 6. How to Tell Wildcards from Regular Expressions
    1. The no-thinking rule: patterns used with the Three Musketeers (awk, sed, grep, egrep) are regular expressions; everywhere else they are wildcards.
    2. The simplest way to tell a wildcard from a regular expression:

(1) File and directory names ===> wildcard
(2) File content (strings, text) ===> regular expression

    3. Both wildcards and regular expressions use "*", "?", and "[]". In a wildcard, "*" and "?" stand for characters by themselves, whereas in a regular expression "*" and "?" only say how many times the character in front of them may repeat.
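
As a rough illustration of the last point, the sketch below uses the shell's case statement for wildcard matching and grep for regular expression matching. The test strings are invented for the example.

```shell
# Wildcard: * by itself stands for any run of characters, so a*c matches "abbbc".
case "abbbc" in
  a*c) glob_result="match" ;;
  *)   glob_result="no match" ;;
esac

# Regular expression: * only repeats the character before it, so ^ab*c$
# means "a", then zero or more "b", then "c". The line "axc" is rejected.
regex_hits=$(printf 'ac\nabbbc\naxc\n' | grep '^ab*c$')

echo "$glob_result"
echo "$regex_hits"
```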
Chapter 7. Basic Regular Expressions
7.1 Basic regular expression metacharacters
Character            Description
^                    ^word matches lines that begin with "word"
$                    word$ matches lines that end with "word"
^$                   Matches an empty line (a line with no characters at all, not a line of spaces)
.                    Matches exactly one arbitrary character (so it does not match an empty line)
\                    Escape character: strips a character of its special meaning so that it matches literally; for example, \. matches only a literal period
*                    Repeats the preceding character zero or more consecutive times
.*                   Matches any number of arbitrary characters
^.*                  Matches from the start of the line through any number of characters; .* is greedy and matches as much as possible

Bracket expressions
[abc] [0-9] [\.,/]   Matches any one character from the set. [abc] matches a, b, or c and can also be written [a-c]; [a-z] matches any one lowercase letter. The brackets act as a single unit with many possible contents. (In grep, a backslash inside brackets is an ordinary character.)
[^abc]               Matches any one character other than a, b, or c. Here ^ negates the set, which is different from ^ as a line anchor.
a\{n,m\}             Repeats the preceding character a between n and m times (with egrep or sed -r, drop the backslashes)
a\{n,\}              Repeats the preceding character a at least n times (with egrep or sed -r, drop the backslashes)
a\{n\}               Repeats the preceding character a exactly n times (with egrep or sed -r, drop the backslashes)
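
The table above can be tried out with plain grep, which uses basic regular expressions by default. This is a sketch only: the sample text and the helper function try are hypothetical names introduced for the example.

```shell
# Hypothetical sample text; note the empty third line.
sample='hello world
say hello

3.14
3x14
good
goood
gd'

# Helper (illustrative only): run grep with the given arguments over the sample.
try() { printf '%s\n' "$sample" | grep "$@"; }

starts=$(try '^hello')          # lines that start with "hello"
ends=$(try 'hello$')            # lines that end with "hello"
blank=$(try -c '^$')            # number of empty lines
dot=$(try '^3.14$')             # . matches any one character
escaped=$(try '^3\.14$')        # \. matches only a literal period
star=$(try '^go*d$')            # zero or more "o"
interval=$(try '^go\{2,3\}d$')  # two to three "o"
negated=$(try '^[^a-z]')        # first character is not a lowercase letter

echo "$dot"
echo "$escaped"
```

Comparing dot with escaped shows the effect of the backslash: the unescaped period also accepts 3x14.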
Chapter 8. Extended Regular Expressions (ERE)
Special character    Meaning and examples
+                    Repeats the preceding character one or more times; useful for pulling out runs of consecutive characters
?                    Repeats the preceding character zero or one time (by comparison, . matches exactly one character)
|                    Pipe character: means "or", filtering for several alternatives at the same time
()                   Grouping: whatever is enclosed is treated as a single unit (like one character); also used for back references
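
A short sketch of the four extended symbols, using grep -E. The sample words are invented. The back reference line assumes GNU grep: POSIX defines back references only for BRE, so \1 under -E is an extension.

```shell
# + : one or more "o" ("gd" is rejected).
plus=$(printf 'gd\ngod\ngood\n' | grep -E '^go+d$')

# ? : zero or one "u".
optional=$(printf 'color\ncolour\ncolouur\n' | grep -E '^colou?r$')

# | inside a group: either alternative.
either=$(printf 'cat\ndog\nbird\n' | grep -E '^(cat|dog)$')

# Back reference (GNU extension under -E): \1 repeats whatever group 1 matched.
doubled=$(printf 'abab\nabcd\n' | grep -E '^(ab)\1$')

echo "$plus"
echo "$optional"
echo "$either"
echo "$doubled"
```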

Chapter 9. Regular Expression Summary
    • Basic regular expressions (BRE):
      ^   $   .   *   .*   [abc]   [^abc]

    • Extended regular expressions (ERE):
      +   |   ?   ()   {}   a{n,m}   a{n,}   a{n}

    • Escape character \: changes the meaning of a character. It gives regex meaning to a character that does not have it in the current syntax (for example \{ in BRE), and it turns a character that does have regex meaning into an ordinary character (for example \.).

Note:

  • By default grep uses basic regular expressions, so the extended symbols are treated as ordinary characters. To use them with plain grep you must escape them with a backslash, as in \{ \}.
  • grep -E makes grep recognize the extended symbols directly, with no escaping needed.
  • egrep is equivalent to grep -E and recognizes the extended symbols natively.
  • sed -r makes sed support extended regular expressions.
  • A backup copy can be made with cp filename{,.bak}, which avoids typing the file name twice. This is shell brace expansion, not a regular expression.
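
The effect of sed -r can be sketched as follows, assuming GNU sed (which accepts both -r and the more portable spelling -E). The input string is invented for the example.

```shell
# Default sed uses BRE: the group and the interval braces must be escaped.
bre_out=$(echo 'foo boo' | sed 's/\(o\{2,\}\)/[\1]/g')

# sed -r switches to ERE: the same substitution without backslashes.
ere_out=$(echo 'foo boo' | sed -r 's/(o{2,})/[\1]/g')

echo "$bre_out"
echo "$ere_out"
```

Both commands wrap each run of two or more "o" characters in square brackets and produce the same result.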
Chapter 10. Differences Between Basic and Extended Regular Expressions
Basic (BRE)    Extended (ERE)
\?             ?
\+             +
\{\}           {}
\( \)          ()
\|             |

A basic regular expression is one in which these characters must be escaped to act as metacharacters. The extended mode widens what the command recognizes so that it understands the symbols directly (egrep, sed -r, and awk support this directly). Note that the backslashed forms \?, \+, and \| are GNU extensions to BRE.
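
To illustrate the table, the sketch below runs one pattern three ways. It assumes GNU grep, because \| and \? in basic mode are GNU extensions; the word list is invented for the example.

```shell
# awk regular expressions are extended, so no escaping is needed.
awk_hits=$(printf 'cat\ndogs\nbird\n' | awk '/^(cat|dog)s?$/')

# grep -E: the same pattern, unescaped.
ere_hits=$(printf 'cat\ndogs\nbird\n' | grep -E '^(cat|dog)s?$')

# Plain grep (BRE): every extended symbol carries a backslash.
bre_hits=$(printf 'cat\ndogs\nbird\n' | grep '^\(cat\|dog\)s\?$')

echo "$awk_hits"
```

All three commands select the same two lines, cat and dogs.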

Chapter 11. Supplement
11.1 Predefined character classes
Class        Description                                                          Example
[:alnum:]    Any one letter or digit, same as [a-zA-Z0-9]                         [[:alnum:]]+
[:alpha:]    Any one alphabetic character (uppercase or lowercase)                [[:alpha:]]{4}
[:blank:]    Space or horizontal tab                                              [[:blank:]]*
[:digit:]    Any one digit                                                        [[:digit:]]?
[:lower:]    Any one lowercase letter                                             [[:lower:]]{5,}
[:upper:]    Any one uppercase letter                                             ([[:upper:]]+)?
[:punct:]    Any one punctuation character                                        [[:punct:]]
[:space:]    Any whitespace character, including newline and carriage return      [[:space:]]+
[:graph:]    Any one visible, printable character (space excluded)                [[:graph:]]
[:xdigit:]   Any one hexadecimal digit                                            [[:xdigit:]]+
[:cntrl:]    Any one control character (in ASCII, the first 32 characters and DEL) [[:cntrl:]]
[:print:]    Any one printable character (space included)                         [[:print:]]
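
A few of these classes in use, as a minimal sketch with invented input lines. A class name such as [:digit:] is valid only inside a bracket expression, which is why the patterns use double brackets.

```shell
# Lines made up only of digits.
digits=$(printf 'abc\n123\na1\n' | grep -E '^[[:digit:]]+$')

# Lines made up of exactly four letters, in either case.
words=$(printf 'abcd\nab1d\nABCD\n' | grep -E '^[[:alpha:]]{4}$')

# Lines that begin with an uppercase letter.
caps=$(printf 'Hello\nhello\n' | grep '^[[:upper:]]')

echo "$digits"
echo "$words"
echo "$caps"
```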
11.2 Metacharacters

These metacharacters come from Perl-style regular expressions. Only some text-processing tools support them, not all.

Regular expression   Description                                    Example
\b                   Word boundary                                  \bcool\b matches cool, does not match coolant
\B                   Non-word boundary                              cool\B matches coolant, does not match cool
\d                   Single digit character                         b\db matches b2b, does not match bcb
\D                   Single non-digit character                     b\Db matches bcb, does not match b2b
\w                   Single word character (letter, digit, or _)    \w matches 1 or a, does not match &
\W                   Single non-word character                      \W matches &, does not match 1 or a
\n                   Newline                                        \n matches a line break
\s                   Single whitespace character                    x\sx matches "x x", does not match xx
\S                   Single non-whitespace character                x\Sx matches xkx, does not match "x x"
\r                   Carriage return                                \r matches a carriage return
\t                   Horizontal tab                                 \t matches a horizontal tab
\v                   Vertical tab                                   \v matches a vertical tab
\f                   Form feed (page break)                         \f matches a form feed
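
Some of these escapes can be tried with grep, with a caveat: the sketch below assumes GNU grep, which accepts \b, \B, \w, \W, \s, and \S as extensions. \d is left out because plain GNU grep does not accept it; it needs a Perl-compatible mode such as grep -P, where that is available.

```shell
# \b : "cool" as a whole word only.
whole=$(printf 'cool\ncoolant\n' | grep '\bcool\b')

# \B : "cool" followed by more word characters.
inside=$(printf 'cool\ncoolant\n' | grep 'cool\B')

# \W : lines containing a non-word character.
nonword=$(printf '1\na\n&\n' | grep '\W')

# \s : whitespace between the two x characters.
spaced=$(printf 'x x\nxx\n' | grep 'x\sx')

echo "$whole"
echo "$inside"
echo "$nonword"
echo "$spaced"
```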
Chapter 12. Summary of Regular Expressions
      • Use egrep/grep to get a feel for regular expressions: they make it easy to see what a pattern does and what it returns.
      • Use the -o option of egrep/grep to see exactly what a pattern matched.
      • Practice a lot: regular expressions are most powerful when combined with grep, egrep, sed -r, and awk.
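
The -o tip can be seen in a two-line sketch. The input string is invented for illustration.

```shell
# Without -o, grep prints the whole matching line.
whole_line=$(echo 'id=42 name=bob id=7' | grep -E '[0-9]+')

# With -o, grep prints only the matched parts, one per output line.
only_matches=$(echo 'id=42 name=bob id=7' | grep -oE '[0-9]+')

echo "$whole_line"
echo "$only_matches"
```

Seeing only the matched text makes it much easier to check whether a pattern is too greedy or too narrow.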
