Proficient in JS regular expressions


This article covers JavaScript regular expressions in some detail; readers who are learning regular expressions may find it a useful reference. Regular expressions can:
• Test a string for a pattern. For example, you can test an input string to see whether it contains a phone number pattern or a credit card number pattern. This is called data validation.
• Replace text. You can use a regular expression to identify specific text in a document, and then either delete it or replace it with other text.
• Extract a substring from a string based on a pattern match. This can be used to find specific text in a document or an input field.
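
The three uses listed above can be sketched in a few lines. This is only an illustration: the sample text and the 3-digit/4-digit phone format are assumptions made for the example, and console.log is used instead of alert so the snippet also runs outside a browser.

```javascript
// Hypothetical sample text; the phone-number shape is an assumption of this sketch.
const text = "Call 555-1234 or 555-9876 today";
const phone = /\d{3}-\d{4}/;

// 1. Validate: does the text contain something shaped like a phone number?
const hasPhone = phone.test(text);

// 2. Replace: mask every phone number (the g flag replaces all matches).
const masked = text.replace(/\d{3}-\d{4}/g, "XXX-XXXX");

// 3. Extract: pull out the first matching substring.
const firstPhone = text.match(phone)[0];

console.log(hasPhone);   // true
console.log(masked);     // Call XXX-XXXX or XXX-XXXX today
console.log(firstPhone); // 555-1234
```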
Regular expression syntax
A regular expression is a text pattern made up of ordinary characters (such as the letters a through z) and special characters (called metacharacters). The pattern describes one or more strings to match when searching a body of text. A regular expression acts as a template that is matched against the string being searched.
To create a regular expression
var re = new RegExp(); // RegExp is an object, like Array
// On its own this is not useful; the pattern must be passed in as a string
re = new RegExp("a"); // the simplest regular expression; matches the letter a
re = new RegExp("a", "i"); // the second argument makes the match case-insensitive

The first argument of the RegExp constructor is the text of the regular expression; the second argument is an optional string of flags. The flags can be combined:

• g (global search)
• i (ignore case)
• m (multiline search)
var re = new RegExp("a", "gi"); // matches every a or A

There is another way to declare a regular expression: the regular expression literal.

var re = /a/gi;
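
As a rough check that the two forms are interchangeable, the sketch below compares a literal with its constructor equivalent. The variable `word` is a hypothetical run-time value, included to show the one case where the constructor is the natural choice.

```javascript
const fromLiteral = /a/gi;
const fromConstructor = new RegExp("a", "gi");

// Both objects carry the same pattern text and the same flags.
const samePattern = fromLiteral.source === fromConstructor.source;
const sameFlags =
  fromLiteral.global === fromConstructor.global &&
  fromLiteral.ignoreCase === fromConstructor.ignoreCase;

// `word` stands in for a value known only at run time (hypothetical input).
const word = "cat";
const dynamic = new RegExp(word, "i");
const found = dynamic.test("Cats and dogs");

console.log(samePattern, sameFlags, found); // true true true
```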

Regular expression-related methods and properties

Methods of regular expression objects
test: returns a Boolean value indicating whether the pattern exists in the searched string. Returns true if it exists, false otherwise.
exec: runs a search on a string using the regular expression pattern and returns an array containing the results of that search.
compile: compiles the regular expression into an internal format so that it executes faster. (This method is deprecated in current JavaScript.)
Properties of regular expression objects
source: returns a copy of the text of the regular expression pattern. Read-only.
lastIndex: returns the character position at which the next match in the searched string starts.
$1...$9: return the nine most recently saved portions found during pattern matching. Read-only.
input ($_): returns the string against which the regular expression search was performed. Read-only.
lastMatch ($&): returns the last matched characters from any regular expression search. Read-only.
lastParen ($+): returns the last sub-match, if any, from any regular expression search. Read-only.
leftContext ($`): returns the characters from the beginning of the searched string up to the position before the last match. Read-only.
rightContext ($'): returns the characters from the position after the last match to the end of the searched string. Read-only.
String object methods related to regular expressions
match: finds one or more matches of a regular expression.
replace: replaces the substrings that match a regular expression.
search: returns the position of the first match of a regular expression.
split: splits a string into an array of strings.
Test how regular expressions work!
The code is as follows:
// The test method checks a string: it returns true when the string matches the pattern, otherwise false
var re = /he/; // the simplest regular expression; matches the word "he"
var str = "he";
alert(re.test(str)); // true
str = "we";
alert(re.test(str)); // false
str = "HE";
alert(re.test(str)); // false: uppercase. To ignore case, specify the i flag (i stands for ignoreCase, i.e. case-insensitive)
re = /he/i;
alert(re.test(str)); // true
str = "Certainly! He loves her!";
alert(re.test(str)); // true: the string only needs to contain "he" (or "HE"). To require exactly "he" with no other characters, use ^ and $
re = /^he/i; // the caret (^) marks the start of the string
alert(re.test(str)); // false: "he" is not at the start of str
str = "He is a good boy!";
alert(re.test(str)); // true: "He" is at the start; for an exact match we also need $
re = /^he$/i; // $ marks the end of the string
alert(re.test(str)); // false
str = "He";
alert(re.test(str)); // true
// The examples above do not show how powerful regular expressions are, because == or indexOf would do the same job
re = /\s/; // \s matches any whitespace character, including spaces, tabs, form feeds, and so on
str = "user name"; // the user name contains a space
alert(re.test(str)); // true
str = "user\tname"; // the user name contains a tab
alert(re.test(str)); // true
re = /^[a-z]/i; // [] matches any one character in the given range; here, an English letter of either case
str = "variableName"; // a variable name must start with a letter
alert(re.test(str)); // true
str = "123abc";
alert(re.test(str)); // false

Of course, knowing whether a string matches the pattern is often not enough; we also need to know which characters matched.
The code is as follows:

var osVersion = "Ubuntu 8"; // 8 is the major version number
var re = /^[a-z]+\s+\d+$/i; // + means one or more occurrences, \s is a whitespace character, \d is a digit
alert(re.test(osVersion)); // true, but we want to know the major version number
// Another method, exec, returns an array whose first element is the complete match
re = /^[a-z]+\s+\d+$/i;
var arr = re.exec(osVersion);
alert(arr[0]); // prints all of osVersion, because the whole string matches re
// I only need the number
re = /\d+/;
arr = re.exec(osVersion);
alert(arr[0]); // 8

More complex usage, using sub-matching
The code is as follows:

// In the array that exec returns, elements 1 through n hold the sub-matches found in the match
re = /^[a-z]+\s+(\d+)$/i; // () creates a sub-match
arr = re.exec(osVersion);
alert(arr[0]); // the whole of osVersion, i.e. the complete match
alert(arr[1]); // 8, the first sub-match; this also extracts the major version number
alert(arr.length); // 2
osVersion = "Ubuntu 8.10"; // extract the major and minor version numbers
re = /^[a-z]+\s+(\d+)\.(\d+)$/i; // . is a metacharacter and must be escaped to be matched literally
arr = re.exec(osVersion);
alert(arr[0]); // the complete osVersion
alert(arr[1]); // 8
alert(arr[2]); // 10

Note that when the string does not match re, the exec method returns null.
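
Because exec returns null on failure, indexing the result without a check throws a TypeError. The following sketch shows one way to guard against that; `parseVersion` is a hypothetical helper name introduced only for this example.

```javascript
// parseVersion is a hypothetical helper, not part of any library.
function parseVersion(osVersion) {
  const m = /^[a-z]+\s+(\d+)\.(\d+)$/i.exec(osVersion);
  if (m === null) {
    return null; // no match: do not touch m[1] or m[2]
  }
  return { major: Number(m[1]), minor: Number(m[2]) };
}

console.log(parseVersion("Ubuntu 8.10"));     // { major: 8, minor: 10 }
console.log(parseVersion("no version here")); // null
```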
Some methods of the string object related to regular expressions

The code is as follows:

// The replace method replaces part of a string
var str = "some money";
alert(str.replace("some", "much")); // much money
// The first argument to replace can be a regular expression
var re = /\s/; // a whitespace character
alert(str.replace(re, "%")); // some%money
// Regular expressions are very convenient when you do not know how many whitespace characters the string contains
str = "some some \tsome\t\f";
re = /\s+/;
alert(str.replace(re, "#")); // but this replaces only the first run of whitespace characters
// Without the g flag a regular expression matches only once: \s+ stops after the first run of whitespace
re = /\s+/g; // g, the global flag, makes the regular expression match throughout the string
alert(str.replace(re, "@")); // some@some@some@

var str = "adf9df9df9", // for example, a string read from a text file
    re = /9/gi, // matches 9
    counter = 0; // counter
str = str.replace(re, function () {
  counter++; // the function runs once per match, and its return value replaces the matched text
  return "#";
});
alert("Number of replacements: " + counter);
alert(str); // str is now adf#df#df#

var str = "He is 22 years old, she is 20 years old, his father is 45 years old, her father is 44 years old: 4 people in total";
function test(match) {
  // getFullYear replaces the deprecated getYear; parseInt reads the leading digits of the match
  var birthYear = new Date().getFullYear() - parseInt(match, 10) + 1;
  return match + " (born in " + birthYear + ")";
}
var reg = new RegExp("(\\d+) years old", "g");
var reg = /(\d+) years old/gi; // the equivalent literal form
var newstr = str.replace(reg, test);
alert(str);
alert(newstr);

// split works in a similar way
var str = "a-bd-c";
var arr = str.split("-"); // returns ["a", "bd", "c"]
// If str is user input, the user might type a-bd-c, a bd c, or a_bd_c, but not abdc (that would be an input error)
str = "a_bd-c"; // the user adds separators however they like
re = /[^a-z]/i; // earlier we said ^ marks the start of the string, but inside [] it denotes a negated character set
// It matches any character not in the given range; here, every character except letters
arr = str.split(re); // still returns ["a", "bd", "c"]

// To search within a string we often use indexOf; the regular-expression counterpart is search
str = "My age is 18.Golden age!"; // the age is not fixed, so indexOf cannot find its position
re = /\d+/;
alert(str.search(re)); // 10, the index at which the match starts
// search returns as soon as it finds the first match, so the g flag is unnecessary
// The code below raises no error, but the g flag is redundant
re = /\d+/g;
alert(str.search(re)); // still 10

Similar to the exec method, the match method of a string object matches a string against a regular expression and returns an array of results.

The code is as follows:

var str = "My name is CJ. Hello everyone!";
var re = /[A-Z]/; // matches an uppercase letter
var arr = str.match(re); // returns an array
alert(arr); // the array contains only M, because we did not use a global match
re = /[A-Z]/g;
arr = str.match(re);
alert(arr); // M,C,J,H
// Extract the words from a string
re = /\b[a-z]+\b/gi; // \b denotes a word boundary; + (rather than *) avoids empty matches between words
str = "one two three four";
alert(str.match(re)); // one,two,three,four

Every RegExp instance has a lastIndex property: the position in the searched string at which the next match starts. Its default value is 0. The exec and test methods of the RegExp object modify lastIndex when the g flag is set, and the property is writable.

The code is as follows:

var re = /[A-Z]/;
// After exec runs, the lastIndex property of re may be updated
var str = "Hello,World!!!";
var arr = re.exec(str);
alert(re.lastIndex); // 0, because the global flag is not set
re = /[A-Z]/g;
arr = re.exec(str);
alert(re.lastIndex); // 1
arr = re.exec(str);
alert(re.lastIndex); // 7

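When a match fails (no further match exists), or when lastIndex is greater than the length of the string, exec resets lastIndex to 0 (the start position). This applies to regular expressions with the g flag; without it, exec always searches from the start of the string and ignores lastIndex.

The code is as follows:
var re = /[A-Z]/g; // the g flag is needed for exec to consult lastIndex
var str = "Hello,World!!!";
re.lastIndex = 120;
var arr = re.exec(str); // null: 120 is past the end of str
alert(re.lastIndex); // 0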
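
The way exec advances lastIndex on a global regular expression is what makes the common "loop until null" idiom work. The sketch below collects every match together with the lastIndex value seen after it; `findAll` is a hypothetical helper name, not a built-in.

```javascript
// findAll is a hypothetical helper that relies on exec updating lastIndex.
function findAll(re, str) {
  if (!re.global) {
    throw new Error("findAll needs a regular expression with the g flag");
  }
  const results = [];
  re.lastIndex = 0; // start from the beginning of the string
  let m;
  while ((m = re.exec(str)) !== null) {
    results.push({ text: m[0], index: m.index, next: re.lastIndex });
    if (m[0] === "") {
      re.lastIndex++; // step past empty matches so the loop always terminates
    }
  }
  return results; // exec has reset lastIndex to 0 by the time the loop ends
}

const hits = findAll(/[A-Z]/g, "Hello,World!!!");
console.log(hits);
```

For the sample string this yields two entries, one for H (index 0, next search at 1) and one for W (index 6, next search at 7).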
Static properties of the RegExp object
The code is as follows:
// input: the string most recently matched against (the string passed to test or exec)
var re = /[A-Z]/;
var str = "Hello,World!!!";
var arr = re.exec(str);
alert(RegExp.input); // Hello,World!!!
re.exec("tempstr");
alert(RegExp.input); // still Hello,World!!!, because tempstr does not match
// lastMatch: the last matched characters
re = /[a-z]/g;
str = "hi";
re.test(str);
alert(RegExp.lastMatch); // h
re.test(str);
alert(RegExp["$&"]); // i; $& is the short name of lastMatch, but it is not a valid identifier, so it must be accessed as RegExp["$&"]
// lastParen: the last matched group
re = /[a-z](\d+)/gi;
str = "Class1 Class2 Class3";
re.test(str);
alert(RegExp.lastParen); // 1
re.test(str);
alert(RegExp["$+"]); // 2
// leftContext: the characters from the start of the searched string up to the position before the last match
// rightContext: the characters from the position after the last match to the end of the searched string
re = /[a-z]/g;
str = "123abc456";
re.test(str);
alert(RegExp.leftContext); // 123
alert(RegExp.rightContext); // bc456
re.test(str);
alert(RegExp["$`"]); // 123a
alert(RegExp["$'"]); // c456

The multiline property reports whether multiline mode is in use. As a static property of RegExp it applies not to a single regular expression instance but to all regular expressions, and it is writable. (IE and Opera do not support the static property; it is non-standard, whereas the multiline property of an individual instance is standard.)

The code is as follows:
alert(RegExp.multiline);
// Because IE and Opera do not support this property, it is best to specify the flag on each instance
var re = /\w+/m;
alert(re.multiline);
alert(RegExp["$*"]); // the static property does not change just because the m flag was set on one instance
RegExp.multiline = true; // this turns on multiline matching for all regular expression instances
alert(RegExp.multiline);

Notes on using metacharacters: metacharacters are part of regular expression syntax, so they must be escaped when we want to match the metacharacter itself. These are the metacharacters used by regular expressions:

( [ { \ ^ $ | ) ? * + .
The code is as follows:
var str = "?";
// var re = /?/; // syntax error: ? is a metacharacter and must be escaped
var re = /\?/;
alert(re.test(str)); // true

Points to note when creating a regular expression with the RegExp constructor rather than a regular expression literal

The code is as follows:
var str = "\?";
alert(str); // outputs only ?
var re = /\?/; // matches ?
alert(re.test(str)); // true
// re = new RegExp("\?"); // error, because this is equivalent to re = /?/
re = new RegExp("\\?"); // correct; matches ?
alert(re.test(str)); // true

Since double escaping is so unfriendly, it is better to declare regular expressions as literals whenever possible.
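
When the pattern must be built from a string (for example, from user input), a small escaping helper avoids most of this trouble. The sketch below is one common approach; `escapeRegExp` is a hypothetical helper written for this example, not a function provided by the language.

```javascript
// escapeRegExp is a hypothetical helper: it prefixes every metacharacter with a backslash.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const literal = /\?/;              // literal form: one backslash
const built = new RegExp("\\?");   // string form: the backslash itself must be escaped

// Hypothetical user-supplied text that happens to contain metacharacters.
const userText = "1+1=2?";
const unescaped = new RegExp(userText);             // + and ? act as quantifiers here
const escaped = new RegExp(escapeRegExp(userText)); // matches the text literally

console.log(literal.test("?"), built.test("?")); // true true
console.log(unescaped.test(userText));           // false
console.log(escaped.test(userText));             // true
```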

How do I use special characters in regular expressions?
The code is as follows:
// ASCII: use hexadecimal numbers to represent special characters
var re = /^\x43\x4A$/; // matches CJ
alert(re.test("CJ")); // true
// Octal can also be used
re = /^\103\112$/; // matches CJ
alert(re.test("CJ")); // true
// Unicode escapes can also be used
re = /^\u0043\u004A$/; // a Unicode escape starts with \u, followed by the four-digit hexadecimal code of the character
alert(re.test("CJ")); // true

In addition, there are some other predefined special characters, as shown in the following table:

Character   Description
\n          Line feed
\r          Carriage return
\t          Tab
\f          Form feed
\cX         The control character corresponding to X
\b          Backspace (when written inside a character class, as in [\b])
\v          Vertical tab
\0          Null character ("")
Character classes: simple class, negated class, range class, combined class, predefined class
The code is as follows:
// Simple class
var re = /[abc123]/; // matches one of the 6 characters a, b, c, 1, 2, 3
// Negated class
re = /[^abc]/; // matches one character other than a, b, or c
// Range class
re = /[a-z]/; // matches one of the 26 lowercase letters a-z
re = /[^0-9]/; // matches one character other than the 10 digits 0-9
// Combined class
re = /[a-z0-9A-Z_]/; // matches letters, digits, and the underscore
The following are the predefined classes in regular expressions:
Code   Equivalent to                          Matches
.      IE: [^\n], other browsers: [^\n\r]     Any character except a line break
\d     [0-9]                                  A digit
\D     [^0-9]                                 A non-digit character
\s     [ \n\r\t\f\x0B]                        A whitespace character
\S     [^ \n\r\t\f\x0B]                       A non-whitespace character
\w     [a-zA-Z0-9_]                           A letter, digit, or underscore
\W     [^a-zA-Z0-9_]                          Any character other than a letter, digit, or underscore
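
A quick way to get a feel for these classes is to run each one globally over the same string. The sample string below is made up for this sketch.

```javascript
// Hypothetical sample: letters, digits, a space, a tab, an underscore, and punctuation.
const sample = "Room 42\tB_7!";

const digits = sample.match(/\d/g);      // every digit
const spaces = sample.match(/\s/g);      // every whitespace character
const nonWord = sample.match(/\W/g);     // everything that is not a letter, digit, or underscore
const words = sample.match(/\w+/g);      // runs of word characters
const onlyDigits = sample.replace(/\D/g, ""); // delete every non-digit

console.log(digits);     // [ '4', '2', '7' ]
console.log(spaces);     // [ ' ', '\t' ]
console.log(nonWord);    // [ ' ', '\t', '!' ]
console.log(words);      // [ 'Room', '42', 'B_7' ]
console.log(onlyDigits); // 427
```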
Quantifiers (each quantifier in the table below is greedy when used on its own)
Code    Description
*       Matches the preceding subexpression zero or more times. For example, "zo*" can match "z" and "zoo". * is equivalent to {0,}.
+       Matches the preceding subexpression one or more times. For example, "zo+" can match "zo" and "zoo", but not "z". + is equivalent to {1,}.
?       Matches the preceding subexpression zero or one time. For example, "do(es)?" can match "do" or the "do" in "does". ? is equivalent to {0,1}.
{n}     n is a non-negative integer. Matches exactly n times. For example, "o{2}" cannot match the "o" in "Bob", but can match the two o's in "food".
{n,}    n is a non-negative integer. Matches at least n times. For example, "o{2,}" cannot match the "o" in "Bob", but can match all the o's in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{n,m}   m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". "o{0,1}" is equivalent to "o?". Note that there must be no spaces between the comma and the two numbers.
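
The examples from the quantifier table can be checked directly. The sketch below simply evaluates each one; the anchors ^ and $ are added in the first three so that the whole string has to match.

```javascript
const checks = {
  starMatchesZ: /^zo*$/.test("z"),          // * allows zero o's
  plusRejectsZ: /^zo+$/.test("z"),          // + needs at least one o
  optionalGroup: /^do(es)?$/.test("does"),  // (es) may appear zero or one time
  exactTwoInBob: /o{2}/.test("Bob"),        // only one o in "Bob"
  exactTwoInFood: "food".match(/o{2}/)[0],
  atLeastTwo: "foooood".match(/o{2,}/)[0],
  oneToThree: "fooooood".match(/o{1,3}/)[0],
  // With a space inside the braces, {1, 3} is no longer a quantifier.
  spaceBreaksIt: /^o{1, 3}$/.test("ooo")
};

console.log(checks);
```

In this sketch the first and third checks come out true, the second, fourth, and last come out false, and the three match results are "oo", "ooooo", and "ooo".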
Greedy quantifiers and lazy quantifiers
• A greedy quantifier first tries to match the whole string. If that succeeds, it stops; if not, it drops the last character and tries again, repeating until a match is found or no characters remain. All the quantifiers we have seen so far are greedy.
• A lazy quantifier first tries to match only the first character. If that succeeds, it stops; if not, it tries the first two characters, adding one character at a time until a match is found.
The lazy form of a quantifier is simply the greedy quantifier followed by "?". For example, "a+" is greedy and "a+?" is lazy.
The code is as follows:
var str = "abc";
var re = /\w+/; // matches abc
re = /\w+?/; // matches a
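
The difference matters most when a pattern has a closing delimiter. The following sketch, using a made-up HTML-like string, shows the same pattern in its greedy and lazy forms.

```javascript
// Hypothetical input containing two pairs of tags.
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ runs to the end and backs up only as far as the LAST ">".
const greedy = html.match(/<.+>/)[0];

// Lazy: .+? grows one character at a time and stops at the FIRST ">".
const lazy = html.match(/<.+?>/)[0];

console.log(greedy); // <b>bold</b> and <i>italic</i>
console.log(lazy);   // <b>
```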

Multi-line mode

The code is as follows:
var re = /[a-z]$/;
var str = "ab\ncdef";
alert(str.replace(re, "#")); // ab\ncde#
re = /[a-z]$/gm; // m lets $ match at the end of every line; g replaces every match, not just the first
alert(str.replace(re, "#")); // a#\ncde#
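
The m flag affects ^ as well as $. As a small sketch, the pattern below requires a whole line of lowercase letters; without m it finds nothing, because the string as a whole contains a line break.

```javascript
const lines = "ab\ncdef";

// Without m, ^ and $ refer only to the start and end of the whole string.
const withoutM = lines.match(/^[a-z]+$/g);

// With m, ^ and $ also match at the start and end of each line.
const withM = lines.match(/^[a-z]+$/gm);

console.log(withoutM); // null
console.log(withM);    // [ 'ab', 'cdef' ]
```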

Grouping and non-capturing groups

The code is as follows:
re = /abc{2}/; // matches abcc
re = /(abc){2}/; // matches abcabc
// The group above is a capturing group
str = "abcabc ###";
arr = re.exec(str);
alert(arr[1]); // abc
// Non-capturing group (?:)
re = /(?:abc){2}/;
arr = re.exec(str);
alert(arr[1]); // undefined

Alternation (that is, "or")

The code is as follows:
re = /^a|bc$/; // matches a at the start of the string, or bc at the end
str = "add";
alert(re.test(str)); // true
re = /^(a|bc)$/; // matches exactly a or bc
str = "bc";
alert(re.test(str)); // true

When a regular expression containing groups is run through test, match, search, and similar methods, the text matched by each group is stored in a special place for later use. These stored values are called backreferences.
The code is as follows:

var re = /(A?(B?(C?)))/;
/* The regular expression above produces three groups, in order:
   (A?(B?(C?)))  the outermost group
   (B?(C?))
   (C?) */
str = "ABC";
re.test(str); // the backreferences are stored in the static properties $1-$9 of the RegExp object
alert(RegExp.$1 + "\n" + RegExp.$2 + "\n" + RegExp.$3);
// Backreferences can also be used inside a regular expression, in the form \1, \2, ...
re = /\d+(\D)\d+\1\d+/;
str = "2008-1-1";
alert(re.test(str)); // true
str = "2008-4_3";
alert(re.test(str)); // false

A backreference can be used to require that certain characters in a string be identical. In addition, methods such as replace use special character sequences to refer to backreferences.

The code is as follows:
re = /(\d+)\s(\d+)/;
str = "1234 5678";
alert(str.replace(re, "$2 $1")); // 5678 1234: here $1 stands for the first group, 1234, and $2 stands for 5678
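
The same mechanism is handy for rearranging structured text. The sketch below swaps two numbers, reorders a date, and uses \1 inside a pattern to detect a repeated word; the sample strings are invented for the example.

```javascript
// Swap two groups with $1 and $2 in the replacement string.
const swapped = "1234 5678".replace(/(\d+)\s(\d+)/, "$2 $1");

// Reorder the parts of a date: year-month-day becomes month/day/year.
const usDate = "2008-12-31".replace(/(\d{4})-(\d{2})-(\d{2})/, "$2/$3/$1");

// A replacement function receives the groups as arguments after the full match.
const total = "1234 5678".replace(/(\d+)\s(\d+)/, function (match, a, b) {
  return String(Number(a) + Number(b));
});

// Inside the pattern, \1 must repeat exactly what group 1 matched.
const hasRepeatedWord = /\b(\w+) \1\b/.test("it is is fine");

console.log(swapped);         // 5678 1234
console.log(usDate);          // 12/31/2008
console.log(total);           // 6912
console.log(hasRepeatedWord); // true
```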

Other: lookahead. A positive lookahead captures characters that appear before specific characters: the text is captured only when it is followed by a particular character sequence. Its counterpart, negative lookahead, matches text only when it is not followed by particular characters. While performing lookahead or negative lookahead, the regular expression engine examines the text that follows, but does not advance the index.

The code is as follows:
// Positive lookahead
re = /([a-z]+(?=\d))/gi;
// We want to match a word that is followed by a digit, and return the word without the digit
str = "abc every1 abc";
alert(re.test(str)); // true
alert(RegExp.$1); // every
alert(re.lastIndex); // 9, the index of the digit: the lookahead content (?=\d) is not part of the match, so the next search starts from it
// Negative lookahead (?!)
re = /([a-z](?!\d))/i; // matches a letter that is not followed by a digit; the content of (?!\d) is not returned
str = "abc1 one";
alert(re.test(str)); // true
alert(RegExp.$1); // a, the first letter not followed by a digit (c is rejected because 1 follows it)
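
Two further lookahead sketches follow, using invented sample strings. The first picks out numbers by the unit that follows them without including the unit in the match; the second inserts thousands separators, which works only because a lookahead consumes no characters.

```javascript
// Hypothetical CSS-like input.
const css = "10px 20em 30px";

// Positive lookahead: numbers followed by "px"; the "px" itself is not part of the match.
const pxValues = css.match(/\d+(?=px)/g);

// Negative lookahead: numbers NOT followed by "px".
// The extra |\d stops the engine from settling for a shorter number such as "1" out of "10".
const otherValues = css.match(/\d+(?!px|\d)/g);

// Insert a comma at every position followed by a multiple of three digits up to the end.
const grouped = "1234567".replace(/\B(?=(\d{3})+(?!\d))/g, ",");

console.log(pxValues);    // [ '10', '30' ]
console.log(otherValues); // [ '20' ]
console.log(grouped);     // 1,234,567
```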

Build a regular expression that validates an e-mail address. The validity requirements (defined by us for this exercise) are: the user name may contain only letters, digits, and underscores, with at least 1 and at most 25 characters; the user name is followed immediately by @; then comes the domain name, which may contain only letters, digits, and the minus sign (-), and may not start or end with a minus sign; finally comes the domain suffix (there may be more than one level), joined by a dot and consisting of 2-4 letters.

The code is as follows:
var re = /^\w{1,25}(?:@(?!-))(?:(?:[a-z0-9-]*)(?:[a-z0-9](?!-))(?:\.(?!-)))+[a-z]{2,4}$/;
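
A pattern like this is easiest to trust after running it against a few addresses that should pass and a few that should fail. The sketch below does that; the sample addresses are hypothetical, and the user-name limit is written as 25 to follow the requirement stated above.

```javascript
// The article's pattern, with the user-name limit set to the 25 characters named in the requirements.
const emailRe = /^\w{1,25}(?:@(?!-))(?:(?:[a-z0-9-]*)(?:[a-z0-9](?!-))(?:\.(?!-)))+[a-z]{2,4}$/;

// Hypothetical addresses, each chosen to exercise one rule.
const samples = [
  "user_1@example.com",             // valid
  "user@mail.example.com",          // valid: more than one domain level
  "user@-example.com",              // invalid: domain starts with a minus sign
  "user@example-.com",              // invalid: domain ends with a minus sign
  "user@example.c",                 // invalid: suffix shorter than 2 letters
  "a".repeat(26) + "@example.com"   // invalid: user name longer than 25 characters
];

const results = samples.map(function (address) {
  return emailRe.test(address);
});

console.log(results); // [ true, true, false, false, false, false ]
```

Note that the pattern has no i flag, so uppercase letters in the domain are rejected; adding i would relax that.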
