Regular expressions can:
• Test a string for a pattern. For example, you can test an input string to see whether it contains a phone number pattern or a credit card number pattern. This is called data validation.
• Replace text. You can use a regular expression to identify specific text in a document, and then delete it or replace it with other text.
• Extract a substring from a string based on pattern matching. This can be used to find specific text in a document or an input field.
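As a rough illustration of the three uses listed above, the sketch below tests, replaces, and extracts with a single pattern. The phone format (three digits, a hyphen, four digits) and the sample text are assumptions made up for this example, and console.log stands in for alert so the code can also run outside a browser.

```javascript
// Hypothetical phone format for illustration: 3 digits, a hyphen, 4 digits.
var phone = /\d{3}-\d{4}/;
var text = "Call 555-1234 before noon.";

// 1. Test: does the text contain the pattern?
var hasPhone = phone.test(text);

// 2. Replace: mask the first occurrence.
var masked = text.replace(phone, "###-####");

// 3. Extract: pull out the matching substring (match returns null when nothing matches).
var found = text.match(phone);
var extracted = found ? found[0] : null;

console.log(hasPhone, masked, extracted);
```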
Regular expression syntax
A regular expression is a text pattern consisting of ordinary characters (such as the letters a through z) and special characters (called metacharacters). The pattern describes one or more strings to match when searching a body of text. A regular expression acts as a template that is matched against the string being searched.
Creating regular expressions
The code is as follows:
var re = new RegExp(); // RegExp is an object, just like Array
// On its own this has no effect; the content of the regular expression must be passed in as a string.
re = new RegExp("a"); // the simplest regular expression; it matches the letter a
re = new RegExp("a", "i"); // the second parameter makes the match case-insensitive
The first parameter of the RegExp constructor is the text of the regular expression, and the second parameter is an optional string of flags. The flags, which can be combined, are:
g (global search)
i (ignore case)
m (multiline search)
The code is as follows:
var re = new RegExp("a", "gi"); // matches every a or A
A regular expression can also be declared another way, as a regular expression literal.
The code is as follows:
var re = /a/gi; // literal form, equivalent to new RegExp("a", "gi")
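The two declaration styles can be compared directly. The sketch below checks that a literal and a constructor call describe the same pattern, and shows one case where the constructor is the more natural choice: a pattern assembled at run time. The variable word is a hypothetical input chosen for this example.

```javascript
// Literal and constructor forms produce equivalent patterns.
var literal = /a/gi;
var constructed = new RegExp("a", "gi");

// The constructor is handy when part of the pattern is only known at run time.
// "word" is a hypothetical input; it contains no metacharacters here.
var word = "cat";
var dynamic = new RegExp("^" + word + "$", "i");

var sameSource = literal.source === constructed.source;
var sameFlags = literal.global === constructed.global &&
                literal.ignoreCase === constructed.ignoreCase;
var dynamicMatches = dynamic.test("CAT");
var dynamicRejects = dynamic.test("cats");

console.log(sameSource, sameFlags, dynamicMatches, dynamicRejects);
```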
Regular expression-related methods and properties
Methods of regular expression objects
test, returns a Boolean value that indicates whether the pattern exists in the string being searched: true if it does, false otherwise.
exec, runs a search on the string using the regular expression pattern and returns an array containing the results of the search.
compile, compiles the regular expression into an internal format so that it executes faster. (This method is deprecated in current engines.)
Properties of regular expression objects (source and lastIndex belong to each instance; the others are static properties of the RegExp constructor)
source, returns a copy of the text of the regular expression pattern. Read-only.
lastIndex, returns the character position at which the next match will begin in the searched string.
$1...$9, return the nine most recently saved submatches found during pattern matching. Read-only.
input ($_), returns the string against which the most recent search was run. Read-only.
lastMatch ($&), returns the last matched characters from any regular expression search. Read-only.
lastParen ($+), returns the last parenthesized submatch, if any, from any regular expression search. Read-only.
leftContext ($`), returns the characters from the beginning of the searched string up to the position before the last match. Read-only.
rightContext ($'), returns the characters from the position after the last match to the end of the searched string. Read-only.
Methods of String objects that work with regular expressions
match, finds one or more matches of a regular expression.
replace, replaces the substrings that match a regular expression.
search, returns the position of the first match of a regular expression.
split, splits a string into an array of strings.
Testing how regular expressions work
The code is as follows:
// The test method tests a string; it returns true when the string conforms to the pattern and false otherwise
var re = /he/; // the simplest regular expression; it matches the word he
var str = "he";
alert(re.test(str)); // true
str = "we";
alert(re.test(str)); // false
str = "HE";
alert(re.test(str)); // false, because the letters are uppercase; to ignore case, specify the i flag (i stands for ignoreCase, or case-insensitive)
re = /he/i;
alert(re.test(str)); // true
str = "Certainly! He loves her!";
alert(re.test(str)); // true, because the string only has to contain he (or HE); to accept only he or HE with no other characters, use ^ and $
re = /^he/i; // the caret (^) represents the start of the string
alert(re.test(str)); // false, because he is not at the beginning of str
str = "He is a good boy!";
alert(re.test(str)); // true, He is at the start of the string; to anchor the end as well, use $
re = /^he$/i; // $ represents the end of the string
alert(re.test(str)); // false
str = "He";
alert(re.test(str)); // true
// Of course, this does not yet show how powerful regular expressions are, because == or indexOf would have done the job in the examples above
re = /\s/; // \s matches any whitespace character, including spaces, tabs, form feeds, and so on
str = "user name"; // the user name contains a space
alert(re.test(str)); // true
str = "user\tname"; // the user name contains a tab
alert(re.test(str)); // true
re = /^[a-z]/i; // [] matches any one character in the specified range; here it matches the English letters, ignoring case
str = "variableName"; // a variable name must begin with a letter
alert(re.test(str)); // true
str = "123abc";
alert(re.test(str)); // false
Of course, it is not enough just to know whether a string matches the pattern; we also need to know which characters matched.
The code is as follows:
var osVersion = "Ubuntu 8"; // the 8 is the major version number of the system
var re = /^[a-z]+\s+\d+$/i; // + means the preceding item must appear at least once, \s is a whitespace character, and \d is a digit
alert(re.test(osVersion)); // true, but we would like to know the major version number
// Another method, exec, returns an array whose first element is the complete match
re = /^[a-z]+\s+\d+$/i;
var arr = re.exec(osVersion);
alert(arr[0]); // outputs all of osVersion, because the entire string matches re
// We only need to take out the digits
re = /\d+/;
arr = re.exec(osVersion);
alert(arr[0]); // 8
A more complex usage: submatches
The code is as follows:
// Elements 1 through n of the array returned by exec contain the submatches that appear in the match
re = /^[a-z]+\s+(\d+)$/i; // use () to create a submatch
arr = re.exec(osVersion);
alert(arr[0]); // the whole of osVersion, that is, the full match of the regular expression
alert(arr[1]); // 8, the first submatch; this is another way to take out the major version number
alert(arr.length); // 2
osVersion = "Ubuntu 8.10"; // take out the major and minor version numbers
re = /^[a-z]+\s+(\d+)\.(\d+)$/i; // . is a regular expression metacharacter and must be escaped to get its literal meaning
arr = re.exec(osVersion);
alert(arr[0]); // the full osVersion
alert(arr[1]); // 8
alert(arr[2]); // 10
Note that when the string does not match re, the exec method returns null.
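Because exec returns null on failure, indexing the result without a check throws an error. A minimal sketch of a guard follows; parseVersion is a hypothetical helper written for this example, not part of any library.

```javascript
// Hypothetical helper: returns the major and minor version, or null when
// the input does not have the expected "name major.minor" shape.
function parseVersion(osVersion) {
  var re = /^[a-z]+\s+(\d+)\.(\d+)$/i;
  var arr = re.exec(osVersion);
  if (arr === null) {
    return null; // no match: indexing arr[1] here would throw a TypeError
  }
  return { major: parseInt(arr[1], 10), minor: parseInt(arr[2], 10) };
}

var ok = parseVersion("Ubuntu 8.10");
var bad = parseVersion("Ubuntu eight");
console.log(ok, bad);
```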
Some methods of String objects that work with regular expressions
The code is as follows:
// The replace method, for replacing strings
var str = "some money";
alert(str.replace("some", "much")); // much money
// The first parameter of replace can be a regular expression
var re = /\s/; // a whitespace character
alert(str.replace(re, "%")); // some%money
// Regular expressions are extremely handy when you do not know how many whitespace characters a string contains
str = "some some \tsome\t\f";
re = /\s+/;
alert(str.replace(re, "#")); // but this replaces only the first run of whitespace characters
// Without the g flag a regular expression matches only once: \s+ matches the first space and then stops
re = /\s+/g; // g, the global flag, makes the regular expression match throughout the whole string
alert(str.replace(re, "@")); // some@some@some@
// The second parameter of replace can also be a function
var str = "adf9df9df9", // the string to process
  re = /9/gi, // matches 9
  counter = 0; // counter
str = str.replace(re, function () {
  counter++; // the function runs once for every match, and its return value replaces the matched text
  return "#";
});
alert("Number of replacements: " + counter);
alert(str);
// In the end str becomes adf#df#df#
var str = "He is 22 years old this year, she is 20 years old, his father is 45 years old, her father is 44 years old this year, a total of 4 people";
function test($) { // $ receives the whole match, for example "22 years old"; parseInt reads its leading digits
  var gYear = (new Date()).getFullYear() - parseInt($, 10) + 1;
  return "(born in " + gYear + ")";
}
// var reg = new RegExp("(\\d+) years old", "g"); // the equivalent constructor form
var reg = /(\d+) years old/gi;
var newStr = str.replace(reg, test);
alert(str);
alert(newStr);
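The replacement function receives more than the matched text: after the full match come the captured groups, then the offset of the match in the string. The sketch below uses those extra arguments; the sentence and the "next year" wording are invented for illustration.

```javascript
// Sample sentence invented for this sketch.
var sentence = "He is 22 years old and she is 20 years old";
var offsets = [];

var result = sentence.replace(/(\d+) years old/g, function (match, age, offset) {
  offsets.push(offset);            // position where this match starts
  return (parseInt(age, 10) + 1) + " next year";
});

console.log(result);
console.log(offsets);
```

Using the captured group (age) instead of parsing the whole match keeps the callback independent of the text around the number.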
A similar method is split.
var str = "a-bd-c";
var arr = str.split("-"); // returns ["a", "bd", "c"]
// If str is entered by the user, he may type a-bd-c, a bd c, or a_bd_c, but not abdc (that would simply be wrong input)
str = "a_bd-c"; // the user separates the parts in whatever way he likes
re = /[^a-z]/i; // earlier we said ^ marks the start of the string, but inside [] it denotes a negated character set,
// which matches any character not in the specified range; here it matches every character except the letters
arr = str.split(re); // still returns ["a", "bd", "c"]
// When looking for text in a string we normally use indexOf; the corresponding method for regular expressions is search
str = "My age is 18.Golden age!"; // the age is not fixed, so we cannot find its position with indexOf
re = /\d+/;
alert(str.search(re)); // returns the starting index of the match, 10
// Note that search always returns as soon as it finds the first match, so there is no need to use the g flag with it
// The following code is not wrong, but the g flag is redundant
re = /\d+/g;
alert(str.search(re)); // still 10
Similar to the exec method, the match method of String objects matches a string against a regular expression and returns an array of results.
The code is as follows:
var str = "My name is CJ.Hello everyone!";
var re = /[A-Z]/; // matches an uppercase letter
var arr = str.match(re); // returns an array
alert(arr); // the array contains only one M, because we are not using a global match
re = /[A-Z]/g;
arr = str.match(re);
alert(arr); // M,C,J,H
// Extract the words from a string
re = /\b[a-z]+\b/gi; // \b denotes a word boundary; + (rather than *) avoids empty matches between words
str = "one two three four";
alert(str.match(re)); // one,two,three,four
Some properties of RegExp object instances
The code is as follows:
var re = /[a-z]/i;
alert(re.source); // outputs the string [a-z]
// Note that alert(re) outputs the regular expression together with its slashes and flags (/[a-z]/i); that format is defined by the re.toString method
Every RegExp instance has a lastIndex property, which is the position in the searched string where the next match will start; its default value is 0. The lastIndex property is modified by the exec and test methods of the RegExp object (when the g flag is set), and it is writable.
The code is as follows:
var re = /[A-Z]/;
// After the exec method runs, the lastIndex property of re is updated only if the g flag is set
var str = "Hello,World!!!";
var arr = re.exec(str);
alert(re.lastIndex); // 0, because the global flag is not set
re = /[A-Z]/g;
arr = re.exec(str);
alert(re.lastIndex); // 1
arr = re.exec(str);
alert(re.lastIndex); // 7
When a match fails (no further match is found), or when lastIndex is greater than the length of the string, the exec method sets lastIndex back to 0 (the start position).
The code is as follows:
var re = /[A-Z]/g; // lastIndex is only used when the g flag is set
var str = "Hello,World!!!";
re.lastIndex = 120;
var arr = re.exec(str); // null, because 120 is past the end of the string
alert(re.lastIndex); // 0
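The way exec advances lastIndex and finally resets it to 0 is what makes the common "loop until null" idiom work. A minimal sketch, reusing the sample string from above:

```javascript
var re = /[A-Z]/g;          // the g flag makes exec advance lastIndex
var str = "Hello,World!!!";
var positions = [];
var m;

while ((m = re.exec(str)) !== null) {
  // m.index is where the match starts; re.lastIndex is where the next search begins.
  positions.push(m[0] + "@" + m.index);
}

// After the final failed exec, lastIndex is back at 0, so the regex can be reused.
var resetTo = re.lastIndex;
console.log(positions, resetTo);
```

Without the g flag this loop would never end, because every call would start again from position 0 and find the same first match.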
Static properties of the RegExp object
The code is as follows:
// input: the last string used for matching (the string passed to the test or exec method)
var re = /[A-Z]/;
var str = "Hello,World!!!";
var arr = re.exec(str);
alert(RegExp.input); // Hello,World!!!
re.exec("tempstr");
alert(RegExp.input); // still Hello,World!!!, because tempstr does not match
// lastMatch: the last matched characters
re = /[a-z]/g;
str = "hi";
re.test(str);
alert(RegExp.lastMatch); // h
re.test(str);
alert(RegExp["$&"]); // i, $& is the short name for lastMatch, but because it is not a valid variable name, you should ...
// lastParen: the last matched group
re = /[a-z](\d+)/gi;
str = "Class1 Class2 Class3";
re.test(str);
alert(RegExp.lastParen); // 1
re.test(str);
alert(RegExp["$+"]); // 2
// leftContext: the characters from the beginning of the searched string up to the position before the last match
// rightContext: the characters from the position after the last match to the end of the searched string
re = /[A-Z]/g;
str = "123ABC456";
re.test(str);
alert(RegExp.leftContext); // 123
alert(RegExp.rightContext); // BC456
re.test(str);
alert(RegExp["$`"]); // 123A
alert(RegExp["$'"]); // C456
The multiline property reports whether regular expressions use multiline mode. Unlike the other flags, this static property applies to all regular expressions rather than to one instance, and it is writable. (It is non-standard, and IE and Opera do not support it.)
The code is as follows:
alert(RegExp.multiline);
// Because IE and Opera do not support this property, it is best to specify the m flag on the instance instead
var re = /\w+/m;
alert(re.multiline);
alert(RegExp["$*"]); // the static property of the RegExp object does not change just because the m flag was specified on one instance
RegExp.multiline = true; // this turns on multiline matching for all regular expression instances
alert(RegExp.multiline);
A note on using metacharacters: metacharacters are part of the regular expression syntax, so they must be escaped when we want to match the characters themselves. The following are all the metacharacters used in regular expressions:
( [ { \ ^ $ | ) ? * + .
The code is as follows:
var str = "?";
// var re = /?/; // error: ? is a metacharacter and must be escaped, so this line would not even parse
var re = /\?/;
alert(re.test(str)); // true
Points to note when creating a regular expression with the RegExp constructor instead of a regular expression literal
The code is as follows:
var str = "\?";
alert(str); // outputs only ?
var re = /\?/; // matches ?
alert(re.test(str)); // true
// re = new RegExp("\?"); // error, because the string "\?" is just "?", so this is equivalent to re = /?/
re = new RegExp("\\?"); // correct, matches ?
alert(re.test(str)); // true
Because double escaping is so unfriendly, it is usually better to declare regular expressions with the literal syntax.
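When the constructor cannot be avoided, for instance because the text to match comes from user input, one common approach is to escape every metacharacter before building the pattern. The helper escapeRegExp below is a hypothetical function written for this sketch, and the sample input is made up.

```javascript
// Hypothetical helper: prefixes every regular expression metacharacter
// in the text with a backslash so it is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var userInput = "1+1=2?";                      // sample input for illustration
var re = new RegExp("^" + escapeRegExp(userInput) + "$");

var escaped = escapeRegExp(userInput);
var matchesLiteral = re.test("1+1=2?");
var matchesOther = re.test("11=2");            // would match if + and ? were left unescaped
console.log(escaped, matchesLiteral, matchesOther);
```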
How do I use special characters in regular expressions?
The code is as follows:
// Represent characters by their ASCII code as hexadecimal numbers
var re = /^\x43\x4A$/; // matches CJ
alert(re.test("CJ")); // true
// You can also use octal notation
re = /^\103\112$/; // matches CJ
alert(re.test("CJ")); // true
// You can also use Unicode code points
re = /^\u0043\u004A$/; // a Unicode escape starts with \u, followed by a four-digit hexadecimal character code
alert(re.test("CJ")); // true
In addition, there are other predefined special characters, as shown in the following table:
Character    Description
\n           Line feed
\r           Carriage return
\t           Tab
\f           Form feed
\cX          The control character corresponding to X
\b           Backspace (inside a character class, as in [\b])
\v           Vertical tab
\0           Null character
Character classes: simple classes, negated classes, range classes, combination classes, and predefined classes
The code is as follows:
// Simple class
var re = /[abc123]/; // matches any one of the 6 characters a, b, c, 1, 2, 3
// Negated class
re = /[^abc]/; // matches any one character other than a, b, and c
// Range class
re = /[a-z]/; // matches any one of the 26 lowercase letters a through z
re = /[^0-9]/; // matches any one character other than the 10 digits 0 through 9
// Combination class
re = /[a-zA-Z0-9_]/; // matches letters, digits, and the underscore
The following are the predefined classes in regular expressions:
Code    Equivalent to                  Matches
.       IE: [^\n], others: [^\n\r]     Any character other than a line break
\d      [0-9]                          A digit
\D      [^0-9]                         A non-digit character
\s      [ \n\r\t\f\x0B]                A whitespace character
\S      [^ \n\r\t\f\x0B]               A non-whitespace character
\w      [a-zA-Z0-9_]                   A letter, digit, or underscore
\W      [^a-zA-Z0-9_]                  A character other than a letter, digit, or underscore
Quantifiers (the quantifiers below, when they appear on their own, are greedy)
Code     Description
*        Matches the preceding subexpression zero or more times. For example, zo* matches "z" and "zoo". * is equivalent to {0,}.
+        Matches the preceding subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?        Matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" or "does". ? is equivalent to {0,1}.
{n}      n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the two o's in "food".
{n,}     n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but matches all the o's in "foooood". "o{1,}" is equivalent to "o+", and "o{0,}" is equivalent to "o*".
{n,m}    m and n are non-negative integers with n <= m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". "o{0,1}" is equivalent to "o?". Note that there must be no space between the comma and the two numbers.
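The examples in the quantifier table can be checked directly. The sketch below runs a handful of them; the anchors ^ and $ are added here (an assumption of this example, not part of the table) so that each test covers the whole string.

```javascript
var checks = {
  star: /^zo*$/.test("z") && /^zo*$/.test("zoo"),       // * allows zero or more o
  plus: /^zo+$/.test("zo") && !/^zo+$/.test("z"),       // + needs at least one o
  optional: /^do(es)?$/.test("do") && /^do(es)?$/.test("does"),
  exact: /o{2}/.test("food") && !/o{2}/.test("Bob"),    // {2} needs two in a row
  range: "fooooood".match(/o{1,3}/)[0]                  // greedy: takes up to three
};
console.log(checks);
```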
Greedy quantifiers and lazy quantifiers
• With a greedy quantifier, the engine first tries to match the whole string. If that succeeds it stops; if not, it drops the last character and tries again, and keeps dropping the last character until it finds a match. All the quantifiers we have met so far are greedy.
• With a lazy quantifier, the engine first tries to match the first character only. If that succeeds it stops; if not, it tries the first two characters, and so on, until it finds a suitable match.
A lazy quantifier is simply a greedy quantifier followed by "?". For example, "a+" is a greedy match and "a+?" is a lazy one.
The code is as follows:
var str = "abc";
var re = /\w+/; // matches abc
re = /\w+?/; // matches a
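The difference matters most when a pattern has a closing delimiter. In the sketch below, which uses a made-up sentence with two quoted words, the greedy form runs to the last quotation mark while the lazy form stops at the first one it can.

```javascript
var text = 'say "one" and "two"';   // sample input for illustration

var greedy = text.match(/".+"/)[0];   // .+ grabs as much as it can
var lazy = text.match(/".+?"/)[0];    // .+? stops at the first possible closing quote

console.log(greedy);
console.log(lazy);
```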
Multiline mode
The code is as follows:
var re = /[a-z]$/;
var str = "ab\ncdef";
alert(str.replace(re, "#")); // ab\ncde#
re = /[a-z]$/gm; // with m, $ also matches at the end of each line; g replaces every match
alert(str.replace(re, "#")); // a#\ncde#
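The m flag affects ^ in the same way as $. A small sketch with an invented three-line string:

```javascript
var text = "first line\nsecond line\nthird line"; // sample input

// Without m, ^ only matches at the very start of the string.
var single = text.match(/^\w+/g);
// With m, ^ also matches right after each line break.
var multi = text.match(/^\w+/gm);

console.log(single, multi);
```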
Grouping and non-capturing groups
The code is as follows:
re = /abc{2}/; // matches abcc
re = /(abc){2}/; // matches abcabc
// The group above is a capturing group
str = "abcabc ###";
arr = re.exec(str);
alert(arr[1]); // abc
// Non-capturing group (?:)
re = /(?:abc){2}/;
arr = re.exec(str);
alert(arr[1]); // undefined
Alternation (that is, "or")
The code is as follows:
re = /^a|bc$/; // matches a at the start of the string or bc at the end
str = "add";
alert(re.test(str)); // true
re = /^(a|bc)$/; // matches exactly a or bc
str = "bc";
alert(re.test(str)); // true
After a regular expression containing groups has been run through methods such as test, match, or search, the text matched by each group is stored in a special place for later use. These stored values are called back-references.
The code is as follows:
var re = /(A?(B?(C?)))/;
/* The regular expression above produces three groups, in this order:
(A?(B?(C?))) the outermost
(B?(C?))
(C?) */
str = "ABC";
re.test(str); // the back-references are stored in the static properties $1 through $9 of the RegExp object
alert(RegExp.$1 + "\n" + RegExp.$2 + "\n" + RegExp.$3);
// Back-references can also be used inside the regular expression itself, in the form \1, \2, ...
re = /\d+(\D)\d+\1\d+/;
str = "2008-1-1";
alert(re.test(str)); // true
str = "2008-4_3";
alert(re.test(str)); // false
A back-reference lets you require that the same characters appear in several places in a string. In addition, special character sequences can be used to refer to back-references in methods such as replace.
The code is as follows:
re = /(\d+)\s(\d+)/;
str = "1234 5678";
alert(str.replace(re, "$2 $1")); // 5678 1234; here $1 stands for the first group, 1234, and $2 for 5678
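Both kinds of back-reference can be seen side by side in the sketch below: \1 inside the pattern and $1, $2, $3 inside the replacement string. The sample sentence and the date format are assumptions chosen for illustration.

```javascript
// \1 inside the pattern: find a word that is immediately repeated.
var doubled = /\b(\w+)\s+\1\b/gi;
var fixed = "this is is a test test".replace(doubled, "$1");

// $1..$3 inside the replacement: reorder the captured groups.
var iso = "25/12/2008".replace(/(\d+)\/(\d+)\/(\d+)/, "$3-$2-$1");

console.log(fixed, iso);
```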
Other: lookahead. A positive lookahead is used to capture characters that appear before some specific characters: they are captured only when those specific characters follow. The corresponding negative lookahead matches characters only when they are not followed by the specific characters. While evaluating a positive or negative lookahead, the regular expression engine inspects the part of the string that follows, but does not move the index.
The code is as follows:
// Positive lookahead (?=)
re = /([a-z]+(?=\d))/gi;
// We want to match a word that is followed by a digit, and return the word without the digit
str = "abc every1 abc";
alert(re.test(str)); // true
alert(RegExp.$1); // every
alert(re.lastIndex); // 9; the advantage of lookahead is that the lookahead content (?=\d) is not counted as part of the match, so the next match starts from it
// Negative lookahead (?!)
re = /([a-z](?!\d))/i;
// Matches a letter that is not followed by a digit; the content of (?!\d) is not returned
str = "abc1 one";
alert(re.test(str)); // true
alert(RegExp.$1); // a, the first letter that is not followed by a digit
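Two further uses of lookahead, sketched under made-up inputs: picking out a number by the text that follows it, and inserting thousands separators at positions found purely by lookahead, where nothing is consumed by the match.

```javascript
// Positive lookahead: take the digits only when "USD" follows;
// "USD" itself is not part of the match.
var usd = "100EUR 250USD".match(/\d+(?=USD)/)[0];

// Negative lookahead: take digits that are followed by neither "USD" nor another digit.
var notUsd = "100EUR 250USD".match(/\d+(?!USD|\d)/)[0];

// Zero-width match: \B plus lookaheads find the gaps where a comma belongs,
// that is, gaps followed by complete groups of three digits up to the end.
var grouped = "1234567".replace(/\B(?=(\d{3})+(?!\d))/g, ",");

console.log(usd, notUsd, grouped);
```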
Build a regular expression that verifies the validity of an e-mail address. The validity requirements (as we choose to define them) are: the user name may contain only letters, digits, and underscores, with at least 1 and at most 25 characters; it is followed by @ and then the domain name; the domain name may contain only letters, digits, and minus signs (-), and must not begin or end with a minus sign; then comes the domain suffix (there can be more than one), which must be a dot followed by 2 to 4 English letters.
The code is as follows:
var re = /^\w{1,25}(?:@(?!-))(?:(?:[a-z0-9-]*)(?:[a-z0-9](?!-))(?:\.(?!-)))+[a-z]{2,4}$/;
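A quick way to gain confidence in a validation pattern is to run it against addresses that should pass and addresses that should fail. The sketch below assumes the pattern reads as follows once stray spacing is removed, with the user-name limit taken from the stated requirement (1 to 25 characters); the sample addresses are invented.

```javascript
// Assumed reading of the pattern, with the user-name limit taken from the
// stated requirement (1 to 25 characters).
var re = /^\w{1,25}(?:@(?!-))(?:(?:[a-z0-9-]*)(?:[a-z0-9](?!-))(?:\.(?!-)))+[a-z]{2,4}$/;

// Made-up addresses for illustration.
var samples = [
  "user_01@example.com",     // plain address
  "user@mail.example.org",   // more than one domain label
  "user@-example.com",       // domain starts with a minus sign
  "user@example-.com",       // domain label ends with a minus sign
  "user@example.c"           // suffix shorter than two letters
];
var results = samples.map(function (s) { return re.test(s); });
console.log(results);
```

Note that the pattern has no i flag, so under this reading uppercase letters in the domain are rejected.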