Basic Analytic Regular Expressions

Last Update:2015-05-07 Source: Internet

Author: User


First, "\". This is commonly known as the escape character: it marks the next character either as a special character or as a literal. For example, "n" matches the letter "n", while "\n" matches a newline character. Someone will ask: what if I just want to match the backslash itself? That is very simple too: just write "\\". Why two? The doubled backslash distinguishes a literal backslash from the start of an escape.
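The escape rules can be tried out with a minimal sketch. Python and its standard `re` module are assumptions made for illustration (the article does not name a language), and the sample strings are invented. Raw strings (`r"..."`) keep the backslashes exactly as the regex engine should see them.

```python
import re

# "n" matches the letter n; "\n" matches a newline character.
letter_n = re.findall(r"n", "one\ntwo")
newline = re.findall(r"\n", "one\ntwo")

# To match a literal backslash, the pattern must contain two: \\
backslashes = re.findall(r"\\", r"C:\temp\file")

print(letter_n, newline, len(backslashes))
```

Here `letter_n` holds the single "n" of "one", `newline` holds the one line break, and two backslashes are found in the path.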

Second, "^". This is commonly known as the start anchor: it matches the position at the beginning of the input string. If the multiline property of the RegExp object is set, ^ also matches the position after "\n" or "\r".

Third, "$". This is commonly known as the end anchor: it matches the position at the end of the input string, so you could call it the finishing touch of a pattern (a very unprofessional explanation!). If the multiline property of the RegExp object is set, $ also matches the position before "\n" or "\r".
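As a rough illustration of the two anchors, here is a sketch in Python. This is an assumption, since the RegExp object mentioned above belongs to a different language; in Python the `re.MULTILINE` flag plays the role of the multiline property, and it treats only "\n" as a line break. The sample text and variable names are made up.

```python
import re

text = "first line\nsecond line"

# Without the multiline flag, ^ matches only at the very start of the input.
starts_default = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each "\n" ...
starts_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# ... and $ also matches right before each "\n".
ends_multiline = re.findall(r"\w+$", text, re.MULTILINE)

print(starts_default, starts_multiline, ends_multiline)
```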

Fourth, "*". This matches the preceding subexpression zero or more times. For example, "zo*" matches "z", "zo", and "zoo". "*" is equivalent to {0,}.

Fifth, "+". This matches the preceding subexpression one or more times. For example, "zo+" matches "zo", "zoo", and "zooo", but not "z". "*" and "+" are almost the same: one starts counting from zero, the other from one. "+" is equivalent to {1,}.

Sixth, "?". This matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" or "does". In other words, the question mark means the subexpression is matched either not at all or exactly once.
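The three quantifiers can be compared side by side. The sketch below assumes Python's `re` module and uses `re.fullmatch`, which succeeds only when the entire word fits the pattern; the word lists are invented for the example.

```python
import re

words = ["z", "zo", "zoo", "zooo"]

# * allows zero o's, so the bare "z" is accepted.
star = [w for w in words if re.fullmatch(r"zo*", w)]

# + requires at least one o, so the bare "z" is rejected.
plus = [w for w in words if re.fullmatch(r"zo+", w)]

# ? makes the group "(es)" optional: present once or not at all.
optional = [w for w in ["do", "does", "doing"] if re.fullmatch(r"do(es)?", w)]

print(star, plus, optional)
```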

Seventh, "{}". This specifies the number of matches, and it has three forms.

1. {n} matches exactly n times, where n is a non-negative integer. For example, "o{2}" matches the two o's in "good" and "food", but it does not match "body", which has only one "o".

2. {n,} matches at least n times, where n is a non-negative integer. For example, "o{2,}" matches two or more o's in a row, as in "good", "goood", and "gooood". "o{1,}" is equivalent to "o+", and "o{0,}" is equivalent to "o*".

3. {n,m} matches at least n times and at most m times, where n and m are non-negative integers and n<=m. For example, "o{1,3}" matches the o's in "body", "food", and "foood". In "fooood" it matches at most three o's at a time, never all four in one match. "o{0,1}" is equivalent to "o?".

When writing these, note that there can be no space between the comma and the two numbers.
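A short sketch of the counted forms, assuming Python's `re` module. The helper `runs` is a hypothetical name introduced only for this example. The last line shows what happens in Python specifically when a space slips in after the comma: the braces stop being a quantifier and are read as literal text.

```python
import re

def runs(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)

exactly_two = {w: runs(r"o{2}", w) for w in ["body", "good", "food"]}
at_least_two = runs(r"o{2,}", "good goood gooood")
one_to_three = runs(r"o{1,3}", "fooood")   # four o's in a row

# In Python's re module, "o{1, 3}" is treated as literal text, not a quantifier.
space_inside = runs(r"o{1, 3}", "fooood")

print(exactly_two, at_least_two, one_to_three, space_inside)
```

The run of four o's in "fooood" is split into a match of three followed by a match of one, because a single match may not exceed the upper bound.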

Eighth, a special use of "?". When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern becomes non-greedy. Non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much of it as possible. For example, for the string "oooo", "o+?" matches a single "o", and "o+" matches all four.
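The difference between greedy and non-greedy matching shows up clearly in a small experiment. This is a sketch assuming Python's `re` module; the HTML-like sample string is invented to show a more practical case than a run of o's.

```python
import re

text = "oooo"
greedy = re.search(r"o+", text).group()    # takes as many o's as it can
lazy = re.search(r"o+?", text).group()     # takes as few o's as it can

# A more practical case: stop at the first closing tag instead of the last.
html = "<b>one</b> and <b>two</b>"
greedy_tags = re.findall(r"<b>.*</b>", html)
lazy_tags = re.findall(r"<b>.*?</b>", html)

print(greedy, lazy, greedy_tags, lazy_tags)
```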

Ninth, ".". This matches any single character except the line break "\n". If you want to match any character including the newline "\n", use a pattern such as "(.|\n)".
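To see the dot stop at a line break, here is a minimal sketch assuming Python's `re` module. The `re.DOTALL` flag on the last line is a Python-specific alternative to the "(.|\n)" pattern and is not something the article itself describes.

```python
import re

text = "ab\ncd"

# The dot refuses to cross the newline, so two separate runs are found.
dot_only = re.findall(r".+", text)

# "(.|\n)" accepts either any non-newline character or the newline itself.
dot_or_newline = re.search(r"(.|\n)+", text).group()

# Python-specific shortcut: re.DOTALL lets the dot match "\n" as well.
dotall = re.search(r".+", text, re.DOTALL).group()

print(dot_only, repr(dot_or_newline), repr(dotall))
```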

Tenth, the "(?...)" forms built around a pattern. These are not easy to understand and can make you dizzy at first glance, but I hope my reading of them is useful to you.

1. (?:pattern) matches the pattern but does not capture the matched result. For example, "k(?:1|2|3)" matches "k" followed by one of 1, 2, or 3, such as "k1" or "k2".

2. (?=pattern) is a positive lookahead. For example, "k(?=1|2|3)" selects "k" only when it is followed by 1, 2, or 3: the "k" in "k1" or the "k" in "k2". The digit itself is not part of the match.

3. (?!pattern) is a negative lookahead. For example, "k(?!1|2|3)" selects "k" only when it is not followed by 1, 2, or 3: it does not match the "k" in "k1", but it matches the "k" in "k4" or "k5".

4. (?<=pattern) is a positive lookbehind. For example, "(?<=1|2|3)k" selects "k" only when it is preceded by 1, 2, or 3: the "k" in "1k" or the "k" in "2k".

5. (?<!pattern) is a negative lookbehind. For example, "(?<!1|2|3)k" selects "k" only when it is not preceded by 1, 2, or 3: it does not match the "k" in "1k", but it matches the "k" in "4k" or "5k".
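The five forms can be checked with a small sketch, assuming Python's `re` module (which accepts these lookbehinds because every alternative has the same width). The helper `starts` and the sample strings are hypothetical names made up for this example; it reports where each match begins, which makes it visible that a lookaround selects only the "k" and never the digit.

```python
import re

ahead = "k1 k2 k4"     # k at indexes 0, 3, 6
behind = "1k 2k 4k"    # k at indexes 1, 4, 7

def starts(pattern, text):
    """Return the start index of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

non_capturing = re.findall(r"k(?:1|2|3)", ahead)    # digit is part of the match
lookahead_text = re.findall(r"k(?=1|2|3)", ahead)   # digit is only checked
lookahead = starts(r"k(?=1|2|3)", ahead)
neg_lookahead = starts(r"k(?!1|2|3)", ahead)
lookbehind = starts(r"(?<=1|2|3)k", behind)
neg_lookbehind = starts(r"(?<!1|2|3)k", behind)

print(non_capturing, lookahead_text, lookahead,
      neg_lookahead, lookbehind, neg_lookbehind)
```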

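Eleventh, "|". This symbol means "or". For example, "f|good" matches "f" or "good", while "(f|g)ood" matches "food" or "good".

Twelfth, "[]". This is a character set: it matches any one of the characters listed inside it. It looks similar to "{}", but the meaning is very different.

Thirteenth, "()". This is a group: it collects part of a pattern into a single subexpression, so that "|" or a quantifier applies to the whole group, and it captures what the group matched (my explanation may not be precise, hehe).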
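The effect of grouping on "|" is easy to check. The sketch below assumes Python's `re` module; the word list is invented. It also shows that a parenthesized group remembers what it matched.

```python
import re

words = ["f", "good", "food"]

# Without a group, | splits the whole pattern: either "f" or "good".
loose = [w for w in words if re.fullmatch(r"f|good", w)]

# With a group, only the first letter is a choice; "ood" is always required.
grouped = [w for w in words if re.fullmatch(r"(f|g)ood", w)]

# The group also captures the text it matched.
first_letter = re.fullmatch(r"(f|g)ood", "food").group(1)

print(loose, grouped, first_letter)
```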

The character set "[]" comes in the following forms.

1. [xyz] matches any one of the characters it contains, that is, one choice out of three. For example, "[abc]" matches the "a" in "company". It matches only one character at a time, so in "beautiful" it matches the "b" and the "a" as two separate single-character matches, not as a pair.

2. [^xyz] is a negated character set, or you could call it "not". It matches any one character that is not listed. For example, "[^abc]" matches every letter of "drop", because none of them is "a", "b", or "c".

3. [a-z] is a character range. It matches any character within the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z". You can also write "[0-9]", which matches any digit from 0 to 9.

4. [^a-z] is probably what you think it is: it matches any character that is not in the range "a" to "z". At first I read it as "any letter that is not a to z" and wondered what letter that could possibly be. Look carefully, everybody: it says character, not letter, so digits, punctuation, uppercase letters, and Chinese characters all qualify.
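A quick way to see that a character set consumes exactly one character per match is to list the matches. This sketch assumes Python's `re` module, and the sample strings are invented.

```python
import re

# Each match is a single character taken from the set.
in_set = re.findall(r"[abc]", "beautiful")

# A negated set matches any single character that is NOT listed.
not_in_set = re.findall(r"[^abc]", "drop")

# Ranges: lowercase letters and digits.
lowercase = re.findall(r"[a-z]", "Hi5!")
digits = re.findall(r"[0-9]", "Hi5!")

# A negated range matches any other character, not just other letters.
not_lowercase = re.findall(r"[^a-z]", "Hi5!中")

print(in_set, not_in_set, lowercase, digits, not_lowercase)
```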

Now let's look together at the special meanings of "\" combined with a letter.

"\b" matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb". This one is easy to remember: b is for boundary. "\B" is the opposite of "\b" and matches a non-word boundary: "er\B" matches the "er" in "verb" but not the "er" in "never".

"\d" is used a lot, so I suggest you memorize it: it matches a digit character and is equivalent to [0-9]. "\D" is just as easy to understand and means the opposite: it matches a non-digit character and is equivalent to [^0-9].

"\f" matches a form feed (page break). It and the next four need little explanation; just remember them so you can use them in a project. "\n" matches a line break, "\r" matches a carriage return, "\t" matches a tab, and "\v" matches a vertical tab.

"\s" matches any whitespace character, including spaces, tabs, form feeds, and so on. It is equivalent to [ \f\n\r\t\v], which is the five escapes above plus the space. "\S" matches any non-whitespace character and is equivalent to [^ \f\n\r\t\v].

By now you may feel that regular expressions really are just these characters. Some can be worked out by logical reasoning and some are repetitive; what matters is using them flexibly.

OK, let's continue. "\w" matches any word character, including the underscore, and is equivalent to "[A-Za-z0-9_]". It is used a lot in practice, so I suggest you memorize it too. "\W" matches any non-word character and is equivalent to "[^A-Za-z0-9_]".
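The lowercase and uppercase pairs can be compared directly. This is a sketch assuming Python's `re` module, with invented sample strings. One caveat that applies to Python only: for ordinary text patterns, "\d", "\s", and "\w" also accept non-ASCII digits, spaces, and letters unless the `re.ASCII` flag is passed, so the bracket equivalents given above hold exactly only under that flag.

```python
import re

# \b needs a word boundary after "er"; \B needs the opposite.
boundary = [w for w in ["never", "verb"] if re.search(r"er\b", w)]
non_boundary = [w for w in ["never", "verb"] if re.search(r"er\B", w)]

digits = re.findall(r"\d", "a1 b22")
non_digits = re.findall(r"\D", "a1")

whitespace = re.findall(r"\s", "a b\tc\n")

words = re.findall(r"\w+", "user_1, ok!")
non_word = re.findall(r"\W", "a-b c")

print(boundary, non_boundary, digits, non_digits, whitespace, words, non_word)
```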

Good! That is basically all you need to remember. Some regex masters may say, "But this is not everything!" Let me explain in advance: I have written up only the basic, common, practical parts, and these are enough to work freely in most projects. Next, let's do something substantial and parse a few regular expressions together. Take this one: ^([0-1]?[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])$ A regex master will know what it is at a glance, and anyone with strong logical thinking will get it after a second look: yes, it is a time of day.

OK, let's break it down. "^" is the start anchor. "([0-1]?[0-9]|2[0-3])" is a group: in "[0-1]?" the question mark means zero or one occurrence of 0 or 1, "[0-9]" is any digit from 0 to 9, and "|" means "or", so the hour is either "[0-1]?[0-9]" or "2[0-3]". In "2[0-3]" the leading 2 is a literal 2, and the digit after it is any digit from 0 to 3. ":" stands for a literal ":". "([0-5][0-9])" is also a group: "[0-5]" is any digit from 0 to 5, and "[0-9]" is any digit from 0 to 9. The second ":" is again literal, the second "([0-5][0-9])" is the same kind of group, and "$" is the end anchor.

Next we can parse a decimal number, for example: ^[1-9]+\d*(\.[0-9]{1,2})?|0(\.[0-9]{1,2})?$ "^" is the start anchor. In "[1-9]+" the "+" means one or more digits from 1 to 9. In "\d*", "\d" is a digit and "*" means zero or more of them. In the group "(\.[0-9]{1,2})?", "\." is a literal dot, "[0-9]{1,2}" is one or two digits from 0 to 9, and the trailing "?" means the whole group "(\.[0-9]{1,2})" occurs zero or one time. "|" means either "[1-9]+\d*(\.[0-9]{1,2})?" or "0(\.[0-9]{1,2})?". In "0(\.[0-9]{1,2})?" the 0 is a literal zero, and the group reads exactly as before.
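Both patterns can be exercised with a short sketch, assuming Python's `re` module; `parse_time`, `AS_WRITTEN`, and `GROUPED` are hypothetical names introduced here. The second half illustrates a point the walkthrough does not cover: "|" has the lowest precedence, so in the decimal pattern as written "^" applies only to the first alternative and "$" only to the second. Wrapping both alternatives in one group, as in `GROUPED`, is one possible way to anchor them together; it is a variation, not the article's pattern.

```python
import re

TIME = re.compile(r"^([0-1]?[0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])$")

def parse_time(text):
    """Return (hours, minutes, seconds) as integers, or None if invalid."""
    m = TIME.match(text)
    return tuple(int(part) for part in m.groups()) if m else None

# The decimal pattern exactly as given, and a grouped variant (an assumption).
AS_WRITTEN = re.compile(r"^[1-9]+\d*(\.[0-9]{1,2})?|0(\.[0-9]{1,2})?$")
GROUPED = re.compile(r"^(?:[1-9]+\d*(\.[0-9]{1,2})?|0(\.[0-9]{1,2})?)$")

# "12.345" has three decimals: the ungrouped pattern still finds "12.34".
loose_hit = AS_WRITTEN.search("12.345").group()
strict_hit = GROUPED.search("12.345")

print(parse_time("23:59:59"), parse_time("24:00:00"), loose_hit, strict_hit)
```

With the grouped variant, "12.34", "0.5", and "0" are accepted, while "12.345" and "007" are rejected.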

Well, I will not analyze every pattern one by one; if I did, you would all take me for the long-winded Tang Monk. That is all I have to share today. As the old saying goes, criticism from experts is welcome, and if you have a different view, please leave a comment to discuss. To say goodbye, here are some common regular expressions:

^[1-9]\d*$ // matches a positive integer

^-[1-9]\d*$ // matches a negative integer

^-?[1-9]\d*$ // matches an integer

^[1-9]\d*|0$ // matches a non-negative integer (positive integer + 0)

^-[1-9]\d*|0$ // matches a non-positive integer (negative integer + 0)

^[1-9]\d*\.\d*|0\.\d*[1-9]\d*$ // matches a positive floating-point number

^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$ // matches a negative floating-point number

^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$ // matches a floating-point number

^[1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0$ // matches a non-negative floating-point number (positive floating-point number + 0)

^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*))|0?\.0+|0$ // matches a non-positive floating-point number (negative floating-point number + 0)

^[a-zA-Z][a-zA-Z0-9_]{4,15}$ // checks whether an account name is legal (starts with a letter, 5-16 characters, letters, digits, and underscores allowed)

^\s*|\s*$ // matches leading and trailing whitespace

\n\s*\r // matches a blank line

[^\x00-\xff] // matches double-byte characters (including Chinese characters)

[\u4e00-\u9fa5] // matches Chinese characters

User name ^[a-z0-9_-]{3,16}$

Password ^[a-z0-9_-]{6,18}$

Hexadecimal value ^#?([a-f0-9]{6}|[a-f0-9]{3})$

E-mail ^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$ or ^[a-z\d]+(\.[a-z\d]+)*@([\da-z](-[\da-z])?)+(\.{1,2}[a-z]+)+$

URL ^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$

IP address ((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)

or ^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$

HTML tag ^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$
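As a final check, here is a minimal validation sketch for three of the patterns above, assuming Python's `re` module. The names `USER`, `HEX`, `IPV4`, and `is_valid` are made up for the example. The patterns list only lowercase letters, so uppercase input is rejected unless a case-insensitive flag such as Python's `re.IGNORECASE` is added.

```python
import re

USER = re.compile(r"^[a-z0-9_-]{3,16}$")
HEX = re.compile(r"^#?([a-f0-9]{6}|[a-f0-9]{3})$")
IPV4 = re.compile(
    r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)

def is_valid(pattern, text):
    """Return True when the compiled pattern accepts the text."""
    return pattern.match(text) is not None

checks = {
    "user ok": is_valid(USER, "my-us3r_n4m3"),
    "user too short": is_valid(USER, "ab"),
    "hex long": is_valid(HEX, "#a3c113"),
    "hex short": is_valid(HEX, "fff"),
    "hex uppercase": is_valid(HEX, "#ABCDEF"),
    "ip ok": is_valid(IPV4, "192.168.0.1"),
    "ip out of range": is_valid(IPV4, "256.1.1.1"),
}
print(checks)
```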


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
