Regular Expression Writing Rules

Last Update:2018-10-19 Source: Internet

Author: User

I. Description

I've read countless regular expression articles, yet whenever I need to write one I still can't, and as always I end up searching Baidu for "IP regular expression" or "URL regular expression".

Reflecting on why this happens, the first reason is that many articles simply hand you a fish: they tell you outright that the URL regular expression is "[a-zA-Z]+://[^\s]*". Worse, different articles often give different expressions for the same thing (a URL, for example), partly because some filter more strictly than others and partly because regular expressions really do have many equivalent notations. So none of it sticks.

The second reason is that most tutorials that do try to teach you to fish just copy the official documentation: they are long-winded and do not highlight the key points, so that does not stick either.

II. Key points of writing regular expressions

There are only three key points in writing regular expressions. First, every basic expression matches exactly one character. Second, it matches exactly once unless a qualifier specifies a different number of matches. Third, a qualifier applies only to the expression immediately before it (the ^ anchor applies only to the expression that follows it; the $ anchor applies only to the expression before it).
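The three points can be checked directly. Below is a minimal sketch using Python's re module; Python is chosen only for illustration (the article itself is not tied to a language), and the helper name matches is made up here.

```python
import re

def matches(pattern, text):
    """Return True if the pattern matches the whole of text (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Point 1: a basic expression with no qualifier consumes exactly one character.
print(matches(r"a", "a"))        # one 'a' fits
print(matches(r"a", "aa"))       # two do not

# Point 2: a qualifier changes how many times the expression may match.
print(matches(r"a+", "aaa"))
print(matches(r"a{2}", "aa"))

# Point 3: a qualifier applies only to the expression right before it.
print(matches(r"ab+", "abbb"))   # + repeats only the 'b'
print(matches(r"ab+", "abab"))   # so "ab" repeated is not accepted
print(matches(r"(ab)+", "abab")) # grouping makes "ab" the repeated unit
```

Grouping with parentheses is the usual way to make a qualifier cover more than one character.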

2.1 Basic Expressions

Basic expressions for printable characters:

Basic expression	Description
a	A single character. a is just a placeholder; it can be any non-special character.
a|b	Matches a or b. a and b are just placeholders; they can be any non-special characters.
[abc]	Matches a or b or c. a, b and c are just placeholders; they can be any non-special characters.
[^abc]	Matches any character other than a, b or c.
[a-z]	Matches any one character from a to z.
\d	Matches a digit. Equivalent to [0-9].
\D	Matches a non-digit character. Equivalent to [^0-9].
\w	Matches a letter, digit or underscore. Equivalent to [A-Za-z0-9_].
\W	Matches any character that is not a letter, digit or underscore. Equivalent to [^A-Za-z0-9_].
\s	Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S	Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
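To see what each basic expression picks out, here is a small sketch in Python's re module. The sample string and the names sample, PATTERNS and found are invented for this illustration. Note that in Python, \d, \w and \s are Unicode-aware by default, so the bracket equivalents in the table hold exactly only with the re.ASCII flag.

```python
import re

# Made-up sample: two letters, a space, two digits, an underscore,
# an uppercase letter, a tab and an exclamation mark.
sample = "ab 12_X\t!"

PATTERNS = [r"a|b", r"[abc]", r"[^abc]", r"[a-z]",
            r"\d", r"\D", r"\w", r"\W", r"\s", r"\S"]

# Each basic expression matches one character at a time, so findall
# returns a list of single characters.
found = {p: re.findall(p, sample) for p in PATTERNS}
for pattern, chars in found.items():
    print(pattern, chars)

# \d is wider than [0-9] in Python unless re.ASCII is given:
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE
print(re.fullmatch(r"\d", arabic_three) is not None)
print(re.fullmatch(r"\d", arabic_three, re.ASCII) is not None)
```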

Basic expressions for non-printable characters:

Basic expression	Description
\cx	Matches the control character indicated by x. For example, \cM matches Control-M, i.e. a carriage return. The value of x must be one of A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
\f	Matches a form feed. Equivalent to \x0c and \cL.
\n	Matches a line feed (newline). Equivalent to \x0a and \cJ.
\r	Matches a carriage return. Equivalent to \x0d and \cM.
\s	Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v]. Note that Unicode regular expressions also match full-width whitespace characters.
\S	Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t	Matches a tab character. Equivalent to \x09 and \cI.
\v	Matches a vertical tab. Equivalent to \x0b and \cK.
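The equivalences in this table can be spot-checked with a short sketch in Python's re module. The names PAIRS and same_single_char are made up for this illustration. To my knowledge Python's re does not accept the \cx form, so only the named and hex escapes are compared here.

```python
import re

# (named escape, hex escape the table lists as equivalent)
PAIRS = [
    (r"\f", r"\x0c"),
    (r"\n", r"\x0a"),
    (r"\r", r"\x0d"),
    (r"\t", r"\x09"),
    (r"\v", r"\x0b"),
]

def same_single_char(named, hexed):
    """True if both patterns match the one character the hex escape denotes."""
    char = chr(int(hexed[2:], 16))  # e.g. r"\x09" -> "\t"
    return (re.fullmatch(named, char) is not None
            and re.fullmatch(hexed, char) is not None)

for named, hexed in PAIRS:
    print(named, hexed, same_single_char(named, hexed))

# Unicode-aware \s also matches the full-width (ideographic) space U+3000;
# with re.ASCII it is limited to the ASCII whitespace set.
full_width_space = "\u3000"
print(re.fullmatch(r"\s", full_width_space) is not None)
print(re.fullmatch(r"\s", full_width_space, re.ASCII) is not None)
```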

Basic expressions for special characters (a special character is one that is reserved for another function, so it cannot stand for itself unless escaped):

Special character	Description
$	Matches the end position of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before '\n' or '\r'. To match the $ character itself, use \$.
()	Mark the start and end positions of a subexpression. Subexpressions can be captured for later use. To match these characters, use \( and \).
*	Matches the preceding subexpression zero or more times. To match the * character, use \*.
+	Matches the preceding subexpression one or more times. To match the + character, use \+.
.	Matches any single character except the newline character \n. To match . itself, use \. instead.
[	Marks the beginning of a bracket expression. To match [, use \[.
?	Matches the preceding subexpression zero or one time, or marks a non-greedy qualifier. To match the ? character, use \?.
\	Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, 'n' matches the character 'n', '\n' matches a newline, the sequence '\\' matches '\', and '\(' matches '('.
^	Matches the start position of the input string, unless it is used inside a bracket expression, where it negates the character set. To match the ^ character itself, use \^.
{	Marks the beginning of a qualifier expression. To match {, use \{.
|	Indicates a choice between two items. To match |, use \|.
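The effect of escaping is easiest to see side by side. The sketch below uses Python's re module; the sample text and the variable names are invented for this illustration. re.escape is a standard-library helper that adds the backslashes for you.

```python
import re

price = "Total: $5.00 (incl. tax)"  # made-up sample text

# Unescaped, '.' stands for any character, so it also accepts "5x00".
dot_is_wildcard = re.search(r"5.00", "5x00") is not None

# Unescaped, '$' means end of input, so nothing can follow it.
dollar_unescaped = re.search(r"$5.00", price)

# Escaped, both characters stand for themselves.
dollar_escaped = re.search(r"\$5\.00", price)

# Unescaped parentheses only group; they are not part of the match.
group_only = re.search(r"(incl. tax)", price).group()

# re.escape() backslash-escapes every special character in a literal string.
literal = "(incl. tax)"
escaped_match = re.search(re.escape(literal), price).group()

print(dot_is_wildcard, dollar_unescaped, dollar_escaped.group())
print(group_only, "|", escaped_match)
```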

2.2 Qualifiers

Qualifier	Description
*	Matches the preceding subexpression zero or more times. For example, zo* can match "z" and "zoo". * is equivalent to {0,}.
+	Matches the preceding subexpression one or more times. For example, 'zo+' can match "zo" and "zoo", but not "z". + is equivalent to {1,}.
?	Matches the preceding subexpression zero or one time. For example, do(es)? can match "do", "does" in "does", and "do" in "doxy". ? is equivalent to {0,1}.
{n}	n is a non-negative integer. Matches exactly n times. For example, 'o{2}' cannot match the 'o' in "Bob", but can match the two o's in "food".
{n,}	n is a non-negative integer. Matches at least n times. For example, 'o{2,}' cannot match the 'o' in "Bob", but can match all the o's in "foooood". 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.
{n,m}	Both m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, 'o{1,3}' will match the first three o's in "fooooood". 'o{0,1}' is equivalent to 'o?'. Note that there can be no spaces between the comma and the two numbers.
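The examples in this table can be reproduced with a short sketch in Python's re module. The helper name first is made up for this illustration; it returns the first matched substring, or None when nothing matches.

```python
import re

def first(pattern, text):
    """Return the first substring matched by pattern, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group() if m else None

print(first(r"zo*", "z"), first(r"zo*", "zoo"))         # * allows zero o's
print(first(r"zo+", "z"), first(r"zo+", "zoo"))         # + needs at least one
print(first(r"do(es)?", "doxy"), first(r"do(es)?", "does"))
print(first(r"o{2}", "Bob"), first(r"o{2}", "food"))    # exactly two
print(first(r"o{2,}", "foooood"))                       # two or more: all five
print(first(r"o{1,3}", "fooooood"))                     # at most three

# A trailing ? after a qualifier makes it non-greedy: take as few as possible.
print(first(r"o+?", "foooood"))
```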

III. Regular expression examples explained

Being able to read other people's regular expressions is more telling than being able to write your own, so we will parse a few expressions generated by an online tool. If you keep the three key points above in mind, they are basically all understandable.

3.1 URL regular expression

Expression: [a-zA-Z]+://[^\s]*

[a-zA-Z]+----[a-zA-Z] is a basic expression that matches any one lowercase or uppercase letter (the generated expression wrote the second range as A-z, which is presumably a mistake; it is capitalized here as A-Z to match the intended meaning). + is its qualifier, so it matches not just once but one or more times.

://----These are three basic expressions (:, / and /). Because they have no qualifiers, each matches exactly once.

[^\s]*----[^\s] is a basic expression that matches any one non-whitespace character. * is its qualifier, so it matches not just once but any number of times, including zero.

Overall, this is a very loose URL expression.
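How loose the expression is can be seen by running it. The sketch below uses Python's re module with a made-up sentence; the names URL_RE, text and hits are invented for this illustration.

```python
import re

URL_RE = re.compile(r"[a-zA-Z]+://[^\s]*")

# Made-up sentence: one real-looking URL followed by a comma,
# and one string that merely has the "letters://" shape.
text = "Try http://tool.chinaz.com/regex/, or even oops:// here."

hits = URL_RE.findall(text)
print(hits)
# The first hit keeps the trailing comma, because ',' is a non-whitespace
# character; the second hit has nothing after "://", because * allows zero.
```

A stricter expression would need to constrain the scheme, host and trailing punctuation, which is where the different articles start to disagree.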

3.2 ID number regular expression

Expression: ^(\d{6})(\d{4})(\d{2})(\d{2})(\d{3})([0-9]|X)$

^(\d{6})----^ means that (\d{6}) must match at the beginning, \d represents a digit, and {6} means that \d is matched 6 times. (Province and county)

(\d{4})----\d represents a digit, and {4} means that \d is matched 4 times. (Year)

(\d{2})----\d represents a digit, and {2} means that \d is matched 2 times. (Month)

(\d{2})----\d represents a digit, and {2} means that \d is matched 2 times. (Day)

(\d{3})----\d represents a digit, and {3} means that \d is matched 3 times. (ID)

([0-9]|X)$----$ means that ([0-9]|X) must match at the end. It has no qualifier, so ([0-9]|X) matches exactly one character, which is either a digit from 0 to 9 or the letter "X". (Check code)
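Each parenthesized part is a capture group, so the pieces can be pulled out by position. Below is a minimal sketch in Python's re module; the function name split_id, the dictionary keys and the sample numbers are all made up for this illustration and are not real ID numbers.

```python
import re

ID_RE = re.compile(r"^(\d{6})(\d{4})(\d{2})(\d{2})(\d{3})([0-9]|X)$")

def split_id(text):
    """Split an 18-character ID string into its parts, or return None (hypothetical helper)."""
    m = ID_RE.match(text)
    if m is None:
        return None
    region, year, month, day, ident, check = m.groups()
    return {"region": region, "year": year, "month": month,
            "day": day, "id": ident, "check": check}

print(split_id("11010519491231002X"))  # fabricated sample value
print(split_id("11010519491231002x"))  # lowercase x is not accepted
print(split_id("1101051949123100"))    # too short
# The expression checks only the shape, not the calendar:
print(split_id("110105194913990021"))  # month "13", day "99" still match
```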

IV. Online regular expression generation and testing tools

Chinaz (Webmaster's Home): http://tool.chinaz.com/regex/

Runoob (Rookie Tutorial): https://c.runoob.com/front-end/854

OSChina (Open Source China): http://tool.oschina.net/regex/#

Reference:

http://www.runoob.com/regexp/regexp-syntax.html
