Regular Expression Writing Rules Explained

Source: Internet
Author: User
Tags: uppercase letter

I. Description

I have read countless articles on regular expressions, yet whenever I need to write one I still feel I can't, and as always I end up searching Baidu for "IP regular expression" or "URL regular expression".

Reflecting on why, I see two reasons. The first is that many articles hand you a fish: they simply tell you that the regular expression for a URL is "[a-zA-z]+://[^\s]*". Different articles also often give different expressions for the same thing (URLs, for example), partly because some filter more strictly and completely than others, and partly because regular expressions really do have many equivalent notations. None of this sticks in memory.

The second is that most tutorials that do try to teach you to fish merely copy the official documentation. They are long-winded and do not highlight the key points, so they do not stick either.

II. Key points of writing regular expressions

There are only three key points in writing regular expressions. First, every basic expression matches one and only one character. Second, unless a qualifier specifies a different count, each basic expression matches exactly once. Third, a qualifier acts only on the expression immediately before it. (Of the locators, ^ acts only on the expression that follows it, and $ acts only on the expression that precedes it.)
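The three key points can be checked directly. The following is a minimal sketch using Python's built-in re module (the article itself does not name a language, so Python and all variable names here are assumptions made for illustration).

```python
import re

# Key point 1: a basic expression consumes exactly one character.
one_char = re.fullmatch(r"[abc]", "b") is not None        # True
two_chars = re.fullmatch(r"[abc]", "bc") is not None      # False

# Key point 2: without a qualifier, each basic expression matches once.
# "ab" is two basic expressions, so it matches exactly the text "ab".
exactly_once = re.fullmatch(r"ab", "abb") is not None     # False

# Key point 3: a qualifier applies only to the expression right before it.
# In "ab+", the "+" repeats "b" only, not "ab".
repeats_b = re.fullmatch(r"ab+", "abbb") is not None      # True
repeats_ab = re.fullmatch(r"ab+", "abab") is not None     # False

# Parentheses turn "ab" into one subexpression, so "+" repeats the group.
group_repeats = re.fullmatch(r"(ab)+", "abab") is not None  # True
```

re.fullmatch is used instead of re.search so that each result reflects the whole string, which makes the "exactly once" behaviour visible.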

2.1 Basic Expressions

Printable-character basic expressions:

Basic expression    Description
a                   A single character. "a" here is just a placeholder; it can be any non-special character.
a|b                 Matches a or b. "a" and "b" are just placeholders; they can be any non-special characters.
[abc]               Matches a or b or c. "a", "b" and "c" are just placeholders; they can be any non-special characters.
[^abc]              Matches any character other than a, b or c.
[a-z]               Matches any one character from a to z.
\d                  Matches a digit character. Equivalent to [0-9].
\D                  Matches a non-digit character. Equivalent to [^0-9].
\w                  Matches a letter, digit or underscore. Equivalent to [a-zA-Z0-9_].
\W                  Matches any character that is not a letter, digit or underscore. Equivalent to [^a-zA-Z0-9_].
\s                  Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S                  Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
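To see each row of the table at work, here is a small sketch in Python's re module. The sample string and variable names are made up for illustration. One caveat worth stating: for Python str patterns, \d, \w and \s also match non-ASCII digits, letters and whitespace unless the re.ASCII flag is given, so the bracket equivalents in the table hold exactly only for ASCII input like the one below.

```python
import re

# Characters in order: a, 1, _, space, B, tab, 9
sample = "a1_ B\t9"

digits = re.findall(r"\d", sample)          # digit characters
non_digits = re.findall(r"\D", sample)      # everything that is not a digit
word_chars = re.findall(r"\w", sample)      # letters, digits, underscore
non_word = re.findall(r"\W", sample)        # the space and the tab
whitespace = re.findall(r"\s", sample)      # the space and the tab
in_set = re.findall(r"[a1]", sample)        # either "a" or "1"
not_in_set = re.findall(r"[^a1]", sample)   # anything except "a" and "1"
in_range = re.findall(r"[a-z]", sample)     # lowercase letters only
either = re.findall(r"a|B", sample)         # "a" or "B"
```

Every pattern here is a single basic expression with no qualifier, which is why findall returns one-character strings only.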

Non-printable-character basic expressions:

Basic expression    Description
\cx                 Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in A-Z or a-z; otherwise c is treated as a literal 'c' character.
\f                  Matches a form feed. Equivalent to \x0c and \cL.
\n                  Matches a line break. Equivalent to \x0a and \cJ.
\r                  Matches a carriage return. Equivalent to \x0d and \cM.
\s                  Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v]. Note that Unicode regular expressions also match the full-width space character.
\S                  Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t                  Matches a tab character. Equivalent to \x09 and \cI.
\v                  Matches a vertical tab. Equivalent to \x0b and \cK.
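The hexadecimal equivalences in this table can be verified with a short sketch. It uses Python's re module, which is an assumption on my part; the table appears to describe JavaScript-style syntax. As far as I know, Python's re does not implement the \cx form at all, so the sketch only probes for it rather than relying on it.

```python
import re

# Each pair: (escape from the table, hexadecimal form of the same character)
pairs = [
    (r"\f", r"\x0c"),
    (r"\n", r"\x0a"),
    (r"\r", r"\x0d"),
    (r"\t", r"\x09"),
    (r"\v", r"\x0b"),
]

text = "\f\n\r\t\v"

# Both spellings should find exactly the same single character.
same = all(re.findall(a, text) == re.findall(b, text) for a, b in pairs)

# \s covers all five of these characters.
all_whitespace = re.fullmatch(r"\s+", text) is not None

# Probe whether this engine accepts the \cx control-character escape.
try:
    re.compile(r"\cM")
    supports_control_escape = True
except re.error:
    supports_control_escape = False
```

In engines that reject \cM, the \x0d spelling from the same table row is the portable alternative.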

Special-character basic expressions (a special character is one that has been given another function, so it cannot stand for itself unless it is escaped):

Special character   Description
$                   Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$.
( )                 Mark the start and end of a subexpression. Subexpressions can be captured for later use. To match these characters, use \( and \).
*                   Matches the preceding subexpression zero or more times. To match the * character, use \*.
+                   Matches the preceding subexpression one or more times. To match the + character, use \+.
.                   Matches any single character except the newline character \n. To match the . character, use \. (backslash followed by a dot).
[                   Marks the beginning of a bracket expression. To match [, use \[.
?                   Matches the preceding subexpression zero or one time, or marks a qualifier as non-greedy. To match the ? character, use \?.
\                   Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character 'n', '\n' matches a line break, the sequence '\\' matches '\', and '\(' matches '('.
^                   Matches the start position of the input string, unless it is used inside a bracket expression, where it negates the character set. To match the ^ character itself, use \^.
{                   Marks the beginning of a qualifier expression. To match {, use \{.
|                   Indicates a choice between two alternatives. To match |, use \|.
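A short sketch of escaping in practice, again in Python's re module as an assumed stand-in for whichever engine the reader uses. The sample strings are invented. Python's re.MULTILINE flag plays the role of the Multiline property mentioned in the table, and re.escape is a standard-library helper that adds the backslashes automatically.

```python
import re

price = "cost: $5.00 (approx.)"

# An unescaped "." matches any character; "\." matches only a literal dot.
any_char = re.findall(r"5.00", "5x00 5.00")
literal_dot = re.findall(r"5\.00", "5x00 5.00")

# "\$", "\(" and "\)" match the literal characters.
dollar = re.search(r"\$\d+\.\d+", price).group()
parens = re.search(r"\(.*\)", price).group()

# "^" and "$" match positions, not characters.
starts_with_cost = re.search(r"^cost", price) is not None
ends_with_cost = re.search(r"cost$", price) is not None

# With the multiline flag, "$" also matches just before each line break.
last_only = re.findall(r"\d$", "1\n2\n3")
every_line = re.findall(r"\d$", "1\n2\n3", flags=re.MULTILINE)

# re.escape builds a pattern that matches its argument literally.
literal_plus = re.fullmatch(re.escape("a+b"), "a+b") is not None
```

When the text to match comes from user input, building the pattern with re.escape is usually safer than escaping by hand.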

2.2 Qualifiers
Qualifier           Description
*                   Matches the preceding subexpression zero or more times. For example, zo* can match "z" and "zoo". * is equivalent to {0,}.
+                   Matches the preceding subexpression one or more times. For example, 'zo+' can match "zo" and "zoo", but not "z". + is equivalent to {1,}.
?                   Matches the preceding subexpression zero or one time. For example, do(es)? can match "do", the "does" in "does", and the "do" in "doxy". ? is equivalent to {0,1}.
{n}                 n is a non-negative integer. Matches exactly n times. For example, 'o{2}' cannot match the 'o' in 'Bob', but can match the two o's in 'food'.
{n,}                n is a non-negative integer. Matches at least n times. For example, 'o{2,}' cannot match the 'o' in 'Bob', but can match all the o's in 'foooood'. 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}               m and n are both non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there can be no spaces between the comma and the two numbers.
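The examples in the qualifier table can be reproduced as follows. This is a sketch in Python's re module with variable names of my own choosing. The last line reflects how Python, to my understanding, handles a malformed qualifier (it treats the braces as literal text), and other engines may behave differently.

```python
import re

words = ("z", "zo", "zoo")

# zo* accepts zero or more "o"; zo+ needs at least one.
star = [re.fullmatch(r"zo*", w) is not None for w in words]
plus = [re.fullmatch(r"zo+", w) is not None for w in words]

# do(es)? : the group "es" is optional.
in_doxy = re.search(r"do(es)?", "doxy").group()
in_does = re.search(r"do(es)?", "does").group()

# o{2} : exactly two.
in_bob = re.search(r"o{2}", "Bob")
in_food = re.search(r"o{2}", "food").group()

# o{2,} : two or more, taking as many as possible.
at_least_two = re.search(r"o{2,}", "foooood").group()

# o{1,3} : between one and three, so only the first three are taken.
up_to_three = re.search(r"o{1,3}", "fooooood").group()

# A space after the comma means this is no longer a qualifier.
with_space = re.search(r"o{1, 3}", "fooooood")
```

Qualifiers are greedy by default, which is why o{2,} takes all five o's and o{1,3} takes three rather than one.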

III. Regular expression examples explained

Being able to read other people's regular expressions is more telling than writing your own, so here we dissect a few expressions generated by an online tool. With the three key points above in hand, they are basically understandable.

3.1 URL regular expression

Expression: [a-zA-z]+://[^\s]*

[a-zA-z]+ ---- [a-zA-z] is a basic expression intended to match any lowercase or uppercase letter (the lowercase z in A-z is presumably a typo; an uppercase Z was meant). + is its qualifier, so it no longer matches exactly once but one or more times.

:// ---- These are three basic expressions (:, / and /). Because they have no qualifiers, each matches exactly once.

[^\s]* ---- [^\s] is a basic expression that stands for any non-whitespace character. * is its qualifier, so it no longer matches exactly once but any number of times, including zero.

Overall, this is a very loose URL expression.
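A quick sketch shows both what the expression finds and why it counts as loose. It uses Python's re module. The names URL_AS_PUBLISHED and URL_CORRECTED and the sample text are assumptions introduced here, and the "corrected" form simply applies the Z fix suggested above.

```python
import re

URL_AS_PUBLISHED = re.compile(r"[a-zA-z]+://[^\s]*")
URL_CORRECTED = re.compile(r"[a-zA-Z]+://[^\s]*")

text = "see https://example.com/path?q=1 and ftp://host now"
found = URL_CORRECTED.findall(text)

# In ASCII, the characters [ \ ] ^ _ ` sit between "Z" and "a",
# so the range A-z accepts them as if they were letters.
typo_accepts_symbols = URL_AS_PUBLISHED.fullmatch("_^://x") is not None
fixed_rejects_symbols = URL_CORRECTED.fullmatch("_^://x") is None

# Because of the "*", nothing at all is required after "://".
accepts_empty_host = URL_CORRECTED.fullmatch("abc://") is not None
```

The expression only checks for "some letters, then ://, then non-whitespace", so it is suited to pulling URL-like tokens out of text rather than validating them.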

3.2 ID number regular expression

Expression: ^(\d{6})(\d{4})(\d{2})(\d{2})(\d{3})([0-9]|X)$

^(\d{6}) ---- ^ means that (\d{6}) must match at the beginning. \d stands for a digit, and {6} means \d is matched 6 times. (Province and county)

(\d{4}) ---- \d stands for a digit, and {4} means \d is matched 4 times. (Year)

(\d{2}) ---- \d stands for a digit, and {2} means \d is matched 2 times. (Month)

(\d{2}) ---- \d stands for a digit, and {2} means \d is matched 2 times. (Day)

(\d{3}) ---- \d stands for a digit, and {3} means \d is matched 3 times. (Serial number)

([0-9]|X)$ ---- $ means that ([0-9]|X) must match at the end. There is no qualifier, so ([0-9]|X) matches one and only one character, which is either a digit from 0 to 9 or the letter "X". (Check code)
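The six parenthesized subexpressions can be read back as capture groups. Below is a minimal sketch in Python's re module. The function split_id, the dictionary keys and the sample numbers are all hypothetical and synthetic, chosen only to show the shape of the match.

```python
import re

ID_PATTERN = re.compile(r"^(\d{6})(\d{4})(\d{2})(\d{2})(\d{3})([0-9]|X)$")


def split_id(number):
    """Return the six captured fields as a dict, or None if the shape is wrong."""
    match = ID_PATTERN.match(number)
    if match is None:
        return None
    region, year, month, day, serial, check = match.groups()
    return {
        "region": region,
        "year": year,
        "month": month,
        "day": day,
        "serial": serial,
        "check": check,
    }


# Synthetic 18-character samples, not real ID numbers.
fields = split_id("12345620000101001X")
too_short = split_id("1234562000010100")
lowercase_x = split_id("12345620000101001x")

# The expression checks the shape only: month 13 and day 45 still pass.
impossible_date = split_id("123456200013450019")
```

Since the pattern validates length and character classes but not the calendar or the check code, a real validator would need extra logic on top of the captured fields.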

IV. Online tools for generating and testing regular expressions

Webmaster's Home (Chinaz): http://tool.chinaz.com/regex/

Rookie Tutorial (Runoob): https://c.runoob.com/front-end/854

Open Source China (OSChina): http://tool.oschina.net/regex/#

Reference:

http://www.runoob.com/regexp/regexp-syntax.html
