Tempting Regular Expressions

Source: Internet
Author: User
Tags: egrep

A regular expression (often abbreviated in code as regex, RegExp, or RE) is a concept in computer science. A regular expression uses a single string to describe and match a set of strings that conform to a certain syntactic rule. In many text editors, regular expressions are used to search for and replace text that matches a pattern. --Wikipedia

The regular expression is a display of computer intelligence: she lets us find exactly what we want in a mass of complicated text. Behind her orderly, hazy veil is an elegant and mischievous face, and everyone who first steps through the gate of Linux ends up bowing at the hem of her pomegranate skirt. What this article sets out to do is show how to capture her heart.

First, get to know her story

In 1956, the American mathematician Stephen Kleene, building on the earlier work of Warren McCulloch and Walter Pitts, published a paper titled "Representation of Events in Nerve Nets and Finite Automata". The paper described their model using a mathematical notation called regular sets and introduced the concept of regular expressions. A regular expression was the expression used to describe what he called "the algebra of regular sets", and that is where the term "regular expression" comes from. Some time later, people found that this work could be applied elsewhere. Ken Thompson applied it to some early research on computational search algorithms; he is one of the principal creators of Unix, the famous father of Unix. He brought this notation into the editor QED, then into the Unix editor ed, and eventually into grep. Over roughly the following 20 years, the ideas and applications of regular expressions also came to be supported and embedded in most developer toolkits in the Windows camp. In today's mainstream development languages (PHP, C#, Java, C++, VB, JavaScript, Ruby, Python, and so on) and in countless applications, you can see regular expressions dancing gracefully.

Second, lift her veil

After the previous article on inodes, here is another topic that is hard to know where to start with, but we have to learn it anyway, so let's take a look at what regular expressions are made of:

Basic regular expression metacharacters:

Character matching:

.: matches any single character

[]: matches any single character within the specified set or range

[^]: matches any single character outside the specified set or range

[0-9], [[:digit:]]: digits; [^0-9], [^[:digit:]]: anything that is not a digit

[a-z], [[:lower:]]: lowercase letters

[A-Z], [[:upper:]]: uppercase letters

[[:space:]]: whitespace characters

[[:punct:]]: punctuation characters

[0-9a-zA-Z], [[:alnum:]]: letters and digits

[a-zA-Z], [[:alpha:]]: letters

Number of matches: a quantifier placed after a character specifies how many times that preceding character may occur

*: any number of times: 0, 1, or more;

.*: any characters, of any length

Quantifiers work in greedy mode: they match as much as possible

\?: 0 or 1 times; the character to its left is optional;

\+: 1 or more times; the character to its left appears at least once;

\{m\}: m times; the character to its left appears exactly m times;

\{m,n\}: at least m times, at most n times;

\{0,n\}: at most n times;

\{m,\}: at least m times;

Position anchoring:

^: anchors the beginning of the line

$: anchors the end of the line

^$: matches an empty line;

Word anchoring (a word is a continuous string of non-special characters):

\<: anchors the beginning of a word; \b can also be used

\>: anchors the end of a word; \b can also be used

\<PATTERN\>: matches a whole word that PATTERN matches

Grouping: \(\)

\1, \2, ..., \n: back-references to the text matched by the 1st, 2nd, ..., nth group
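To make the list above a little more concrete, here is a minimal sketch that tries a few of these basic metacharacters. It assumes GNU grep and uses made-up sample lines; the variable names are only for illustration.

```shell
# Made-up sample input, one candidate per line (the second line is empty).
sample=$(printf '%s\n' 'ab' '' 'aab' 'aaab' 'b' 'Tab 9')

# \{2,3\}: the preceding "a" must appear two or three times, then "b";
# ^ and $ anchor the pattern to the whole line.
repeat=$(printf '%s\n' "$sample" | grep '^a\{2,3\}b$')

# [[:upper:]] at the start of the line, [[:digit:]] at the end.
classes=$(printf '%s\n' "$sample" | grep '^[[:upper:]].*[[:digit:]]$')

# ^$ matches empty lines; -c counts the matching lines.
blank=$(printf '%s\n' "$sample" | grep -c '^$')

printf 'repeat:\n%s\nclasses: %s\nblank lines: %s\n' "$repeat" "$classes" "$blank"
```

The first pattern should pick out aab and aaab, the second the line "Tab 9", and the third should count one empty line.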


Extended regular expression metacharacters:

Character matching:

.

[]

[^]

Number of matches:

*: any number of times

?: 0 or 1 times

+: at least 1 time

{m}: exactly m times;

{m,n}: at least m times, at most n times;

{m,}: at least m times;

{0,n}: at most n times;

Position anchoring:

^

$

\<, \b

\>, \b

Grouping:

()

Back-references: \1, \2, ...

Or:

a|b: a or b

The | applies to the entire expression on each side of it, unless a group limits its scope;
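The scope of | and the difference in escaping between the two syntaxes are easier to see on a small example. The following is a sketch with made-up words, assuming a grep that supports -E and -x (match whole lines only):

```shell
words=$(printf '%s\n' 'Cat' 'cat' 'C' 'at' 'dog')

# Without a group, | splits the whole pattern: the line is "C" or "cat".
ungrouped=$(printf '%s\n' "$words" | grep -E -x 'C|cat')

# With a group, only the first letter alternates: "Cat" or "cat".
grouped=$(printf '%s\n' "$words" | grep -E -x '(C|c)at')

# The same repetition written in basic and in extended syntax.
bre=$(printf '%s\n' 'aab' 'ab' | grep 'a\{2\}b')
ere=$(printf '%s\n' 'aab' 'ab' | grep -E 'a{2}b')

printf 'ungrouped:\n%s\ngrouped:\n%s\nbre: %s, ere: %s\n' \
    "$ungrouped" "$grouped" "$bre" "$ere"
```

The last two commands should give the same result (aab); only the backslashes differ.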

Third, steal a kiss

I know it may seem a bit sudden to go this far, but a kiss is the quickest way to get a response, right? For that we need our hands and our mouth, so let's see what they can do. While getting to know the goddess, we learned that the father of Unix introduced grep. The Unix grep family includes grep, egrep, and fgrep, and egrep and fgrep differ from grep only slightly. egrep is an extension of grep that supports more metacharacters. fgrep treats every character as a literal: the metacharacters of regular expressions go back to their own literal meaning and are no longer special.

Well, our hands are ready to go:

Let's start with a simple move: find the lines in /proc/meminfo that start with s or S

This uses one hand, grep [OPTIONS] PATTERN [FILE...], where ^ anchors the beginning of the line and [Ss] means a single character chosen from the set of the two letters S and s.
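A minimal sketch of this move follows. So that it runs anywhere, it uses a few made-up lines in the format of /proc/meminfo instead of the real file; the values and the lowercase swap_example line are invented for illustration.

```shell
# Made-up lines in the style of /proc/meminfo.
meminfo_sample='MemTotal: 1019644 kB
SwapCached: 0 kB
Shmem: 228 kB
swap_example: 0 kB'

# ^ anchors the start of the line; [Ss] accepts either S or s there.
matched=$(printf '%s\n' "$meminfo_sample" | grep '^[Ss]')
printf '%s\n' "$matched"

# On a Linux system the same pattern can be run against the real file:
#   grep '^[Ss]' /proc/meminfo
```

Only the MemTotal line should be left out.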

Let's do one more: find the one- or two-digit numbers in the /etc/passwd file. So that she doesn't look so pale, let's also give her some color.

The --color option highlights the matches in color; the \< and \> pair delimits a whole word; [0-9] matches one digit in the range 0-9; and \? means the preceding character may appear once or not at all.
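Here is a sketch of that search, assuming GNU grep (for \<, \> and \?). It runs on made-up passwd-style lines and uses -o instead of --color so the matches show up as plain output.

```shell
# Made-up lines in the format of /etc/passwd.
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:2:2:daemon:/sbin:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
apache:x:480:487:Apache:/var/www:/sbin/nologin'

# \< and \> require the digits to form a whole word, so the
# three-digit IDs 480 and 487 are not matched in part.
numbers=$(printf '%s\n' "$passwd_sample" | grep -o '\<[0-9][0-9]\?\>')
printf '%s\n' "$numbers"

# Against the real file, with highlighting instead of -o:
#   grep --color '\<[0-9][0-9]\?\>' /etc/passwd
```

Without the word anchors, 48 and 7 would also be reported from the last line, which is why the \< \> pair matters here.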

Our moves are getting more and more skilled. Let's feed in a few sentences and see how to use grouping:

We printed four sentences, two of which are very similar, and used a group and a back-reference to find them: \(l..e\).*\1. Here l..e means an l, followed by any two characters, followed by an e; the \( \) around it treats that as one unit; .* means any characters of any length; and \1 refers to whatever the first \( \) group matched.

If you want color output all the time, defining an alias saves some trouble:
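A sketch of the grouping idea is below. The four sentences are made up for illustration and are not necessarily the ones used originally.

```shell
# Four made-up sentences; only the first two repeat the same l..e word.
sentences='He loves his lover
She likes her liker
He likes his lover
She loves her liker'

# \(l..e\) captures e.g. "love" or "like"; \1 demands the very same
# text again later in the line, not merely another l..e word.
repeated=$(printf '%s\n' "$sentences" | grep '\(l..e\).*\1')
printf '%s\n' "$repeated"
```

The third and fourth sentences each contain two l..e words, but they differ (like/love), so \1 does not match them.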


alias grep='grep --color'

Let's raise the difficulty a little: extract the directory name from a path

grep -o prints only the matched part. Here we use grep twice: in the first, the trailing [^/] makes the match end at the last character that is not a slash; in the second, /.*/ matches everything from the first slash to the last one. To get output in the same format as the dirname command, we can use:

echo '/etc/rc.d/init.d/' | grep -o '/.*[^/]' | grep -o '/.*/' | grep -o '/.*[^/]'

Give it a try.
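To see what each grep in that pipeline contributes, the stages can be run one at a time. This is just the same pipeline split up; the stage1..stage3 variable names are mine, and the result is compared with the dirname command.

```shell
path='/etc/rc.d/init.d/'

# Stage 1: greedy match ending at the last non-slash character,
# which strips the trailing slash.
stage1=$(echo "$path" | grep -o '/.*[^/]')      # /etc/rc.d/init.d

# Stage 2: everything from the first slash to the last slash.
stage2=$(echo "$stage1" | grep -o '/.*/')       # /etc/rc.d/

# Stage 3: strip the trailing slash again.
stage3=$(echo "$stage2" | grep -o '/.*[^/]')    # /etc/rc.d

printf '%s\n%s\n%s\n' "$stage1" "$stage2" "$stage3"
printf 'dirname says: %s\n' "$(dirname "$path")"
```

Note that this only mimics dirname for absolute paths with at least two components; a path such as /etc or a relative path would need different handling.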

We can't always use one hand, so it's time for the other one: extract the base name of a path

egrep is equivalent to grep -E, which uses extended regular expressions. + means the preceding [^/] appears one or more times, ? works as explained earlier, and $ anchors the end of the line. If the base name comes out with a path separator at the end, we can use cut to trim it off.
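A sketch of the base-name extraction, reconstructed from the description above; the exact pattern '[^/]+/?$' and the cut options are my assumption of how it fits together.

```shell
path='/etc/rc.d/init.d/'

# [^/]+ : one or more non-slash characters
# /?    : an optional trailing slash
# $     : anchored at the end of the line
tail_part=$(echo "$path" | grep -E -o '[^/]+/?$')   # init.d/

# cut splits on "/" and keeps the first field, dropping the slash.
base=$(echo "$tail_part" | cut -d/ -f1)             # init.d

printf '%s\n%s\n' "$tail_part" "$base"
printf 'basename says: %s\n' "$(basename "$path")"
```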

Let's try two more exercises to consolidate what we have learned:

The purpose of the first is to find the content in the strings that is consistent and the content that is not, and the second is to find the content in a string that matches the e-mail address format. Work through them yourself; I won't explain.
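For readers who want a starting point on the second exercise, here is a rough sketch. The pattern is a deliberately simplified assumption of what "e-mail format" means (some name characters, an @, and a dotted domain), not a full address validator, and the addresses are made up.

```shell
text='contact alice@example.com or bob.smith@mail.example.org
not an address: carol@ or @example.com'

# local part: letters, digits, dot, underscore, hyphen
# domain:     a label followed by one or more ".label" groups
emails=$(printf '%s\n' "$text" |
    grep -E -o '[[:alnum:]._-]+@[[:alnum:]-]+(\.[[:alnum:]-]+)+')
printf '%s\n' "$emails"
```

The fragments carol@ and @example.com are skipped because one side of the @ is missing.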

To avoid becoming "three-handed", I won't introduce fgrep; it is actually very simple: it just matches the characters literally.

After these three steps of courtship I think you should be able to capture her heart, but if you want to marry this cunning goddess and take her home, you will need to put in deeper effort. Know yourself and know your enemy, and you will never be defeated in a hundred battles!

I hope this article helps you. If you spot any mistakes in it, please point them out; I will humbly accept corrections!

PS: for how to use grep, egrep, and fgrep, look them up with the man command; this article will not repeat that.

This article is from the "Linuxlove" blog; please keep this source when reposting: http://linuxlover.blog.51cto.com/2470728/1629004
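The difference is quick to see with a small sketch, using grep -F (the option form of fgrep) on made-up lines:

```shell
lines=$(printf '%s\n' 'a.c' 'abc' 'a*c')

# As a regular expression, "." matches any single character.
regex_hits=$(printf '%s\n' "$lines" | grep 'a.c')

# With -F the pattern is a fixed string, so "." is just a dot.
fixed_hits=$(printf '%s\n' "$lines" | grep -F 'a.c')

printf 'regex:\n%s\nfixed:\n%s\n' "$regex_hits" "$fixed_hits"
```

The regular-expression search returns all three lines, while the fixed-string search returns only a.c.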
