JavaScript's RegExp objects and string objects define methods that use regular expressions to perform powerful pattern matching and text retrieval and substitution functions.
In JavaScript, regular expressions are represented by RegExp objects. You can create a RegExp object with the RegExp() constructor, or with the special literal syntax added in JavaScript 1.2. Just as a string literal is written as characters enclosed in quotation marks, a regular expression literal is written as characters enclosed between a pair of slashes (/). A JavaScript program may therefore contain code like this:
var pattern = /s$/;
This line creates a new RegExp object and assigns it to the variable pattern. This particular RegExp object matches any string that ends with the letter "s". You can define an equivalent regular expression with the RegExp() constructor, as follows:
var pattern = new RegExp("s$");
Whether you use a regular expression literal or the RegExp() constructor, creating a RegExp object is easy. The harder task is describing the desired pattern of characters in regular expression syntax. JavaScript's regular expression syntax is a fairly complete subset of the syntax used in Perl.
A regular expression pattern is made up of a series of characters. Most characters, including all alphanumeric characters, simply match themselves literally. For example, the regular expression /java/ matches every string that contains the substring "java". Other characters are not matched literally but have special meanings. The regular expression /s$/ contains two characters.
The first character, "s", matches itself literally. The second character, "$", is a special character that matches the end of a string. So the regular expression /s$/ matches any string that ends with the letter "s".
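As a small sketch of the two creation styles described above, the following compares a literal with the constructor form. The variable names and sample strings are illustrative and do not come from the original text.

```javascript
// Two ways to build the same pattern: a literal and the RegExp() constructor.
const literal = /s$/;
const constructed = new RegExp("s$");

// Both match only strings that end with the letter "s".
console.log(literal.test("regular expressions"));    // true
console.log(constructed.test("regular expression")); // false

// A pattern made of plain letters matches any string containing that substring.
console.log(/java/.test("I like javascript"));       // true
```

The literal form is usually more convenient when the pattern is known in advance; the constructor is useful when the pattern text is built at run time.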
1. Literal characters
We have seen that all alphabetic characters and digits in a regular expression match themselves literally. JavaScript regular expressions also support certain non-alphabetic characters through escape sequences that begin with a backslash (\). For example, the sequence \n matches a literal newline character in a string. Many punctuation characters have special meanings in regular expressions. The following table lists these characters and what they match:
Literal characters in regular expressions
Character                  Matches
________________________________
Alphanumeric characters    Themselves
\f                         Form feed
\n                         Newline (line feed)
\r                         Carriage return
\t                         Tab
\v                         Vertical tab
\/                         A literal /
\\                         A literal \
\.                         A literal .
\*                         A literal *
\+                         A literal +
\?                         A literal ?
\|                         A literal |
\(                         A literal (
\)                         A literal )
\[                         A literal [
\]                         A literal ]
\{                         A literal {
\}                         A literal }
\XXX                       The ASCII character specified by the octal number XXX
\xnn                       The ASCII character specified by the hexadecimal number nn
\cX                        The control character ^X. For example, \cI is equivalent to \t, and \cJ is equivalent to \n
___________________________________________________
If you want to use one of these special punctuation characters literally in a regular expression, you must precede it with a backslash (\).
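The effect of escaping can be sketched as follows. The sample strings are made up for illustration.

```javascript
// Without a backslash, "." is special and matches any character except a newline.
const anyChar = /1.5/;
// With a backslash, "\." matches only a literal period.
const literalDot = /1\.5/;

console.log(anyChar.test("1x5"));    // true
console.log(literalDot.test("1x5")); // false
console.log(literalDot.test("1.5")); // true

// Escape sequences for characters that are hard to type directly.
console.log(/a\tb/.test("a\tb"));    // true: \t matches a tab
console.log(/\x41/.test("A"));       // true: hexadecimal 41 is "A"
console.log(/\//.test("a/b"));       // true: \/ matches a literal slash
```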
2. Character classes
Individual literal characters can be combined into a character class by placing them within square brackets. A character class matches any one of the characters it contains, so the regular expression /[abc]/ matches any one of the letters "a", "b", or "c". You can also define negated character classes, which match any character except those contained within the brackets. To define a negated character class, place a ^ as the first character inside the left bracket; /[^abc]/ matches any one character other than "a", "b", or "c". A hyphen indicates a range of characters, so any letter or digit of the Latin alphabet is matched by /[a-zA-Z0-9]/.
Because certain character classes are used so often, JavaScript's regular expression syntax includes special characters and escape sequences to represent them. For example, \s matches spaces, tabs, and other whitespace characters, and \S matches any character that is not whitespace.
Regular expression character classes
Character   Matches
____________________________________________________
[...]       Any one character within the brackets
[^...]      Any one character not within the brackets
.           Any character other than a newline; equivalent to [^\n]
\w          Any word character; equivalent to [a-zA-Z0-9_]
\W          Any non-word character; equivalent to [^a-zA-Z0-9_]
\s          Any whitespace character; equivalent to [ \t\n\r\f\v]
\S          Any non-whitespace character; equivalent to [^ \t\n\r\f\v]
\d          Any digit; equivalent to [0-9]
\D          Any character other than a digit; equivalent to [^0-9]
[\b]        A literal backspace (special case)
________________________________________________________________
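The classes in the table can be tried out with a short sketch like the one below. The sample string is invented for illustration.

```javascript
const text = "Order 66, aisle B";

// A bracketed class matches any one of the listed characters.
console.log(text.match(/[abc]/)[0]);       // "a"
// A negated class matches any one character that is NOT listed.
console.log(text.match(/[^A-Za-z ]/)[0]);  // "6"
// Shorthand classes: \d for digits, \w for word characters, \s for whitespace.
console.log(text.match(/\d\d/)[0]);        // "66"
console.log(text.match(/\w+/)[0]);         // "Order"
console.log(/\s/.test(text));              // true
// Inside brackets, \b means a backspace character, not a word boundary.
console.log(/[\b]/.test("no backspace"));  // false
```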
3. Repetition
With the syntax covered so far, a two-digit number can be described as /\d\d/ and a four-digit number as /\d\d\d\d/. But we have no way to describe a number with an arbitrary number of digits, or a string of three letters followed by an optional digit. These more complex patterns use regular expression syntax that specifies how many times an element of the expression may be repeated.
The characters that specify repetition always follow the pattern to which they apply. Because certain kinds of repetition are quite common, there are special characters that represent them. For example, + matches one or more occurrences of the previous pattern. The table below summarizes the repetition syntax. First, some examples:
/\d{2,4}/       // Match between two and four digits.
/\w{3}\d?/      // Match exactly three word characters and an optional digit.
/\s+java\s+/    // Match "java" with one or more whitespace characters before and after.
/[^"]*/         // Match zero or more non-quote characters.
Regular expression repetition characters
Character   Meaning
__________________________________________________________________
{n,m}       Match the previous item at least n times but no more than m times
{n,}        Match the previous item n or more times
{n}         Match the previous item exactly n times
?           Match the previous item 0 or 1 times; that is, the previous item is optional. Equivalent to {0,1}
+           Match the previous item 1 or more times. Equivalent to {1,}
*           Match the previous item 0 or more times. Equivalent to {0,}
___________________________________________________________________
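The repetition characters can be exercised with a sketch such as the following. The anchors ^ and $ are added here (an assumption beyond the examples above) so that each pattern must match the whole string rather than just part of it.

```javascript
// {n,m}: between two and four digits, and nothing else.
console.log(/^\d{2,4}$/.test("123"));    // true
console.log(/^\d{2,4}$/.test("12345"));  // false

// {n} followed by ?: three word characters, then an optional digit.
console.log(/^\w{3}\d?$/.test("abc"));   // true
console.log(/^\w{3}\d?$/.test("abc7"));  // true

// + requires at least one occurrence; * allows none.
console.log(/^a+$/.test(""));            // false
console.log(/^a*$/.test(""));            // true

// Zero or more non-quote characters between a pair of double quotes.
const quotedPart = 'say "hi" now'.match(/"[^"]*"/)[0];
console.log(quotedPart);                 // "hi" (including the quotes)
```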
4. Alternation, grouping, and references
Regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring back to previous subexpressions. The | character separates alternatives. For example, /ab|cd|ef/ matches the string "ab", the string "cd", or the string "ef", and /\d{3}|[a-z]{4}/ matches either three digits or four lowercase letters. Parentheses have several purposes in regular expressions. Their main purpose is to group separate items into a single subexpression, so that the items can be treated as a single unit by *, +, ?, |, and so on. For example, /java(script)?/ matches "java" followed by an optional "script", and /(ab|cd)+|ef/ matches either the string "ef" or one or more repetitions of the strings "ab" or "cd".
The second purpose of parentheses is to define subpatterns within the complete pattern. When a regular expression is successfully matched against a target string, you can extract the portions of the target string that matched any particular parenthesized subpattern. For example, suppose we are looking for one or more lowercase letters followed by one or more digits. We could use the pattern /[a-z]+\d+/. But if we really care only about the digits at the end of each match, we can put that part of the pattern in parentheses, /[a-z]+(\d+)/, and then extract the digits from any match we find and parse them.
A related use of parenthesized subexpressions is to refer back to a subexpression later in the same regular expression. This is done by following a backslash with one or more digits. The digits refer to the position of the parenthesized subexpression within the regular expression. For example, \1 refers back to the first parenthesized subexpression and \3 refers to the third. Note that because subexpressions can be nested within others, it is the position of the left parenthesis that is counted.
For example, in the following regular expression the nested subexpression ([Ss]cript) is referred to as \2:
/([Jj]ava([Ss]cript))\sis\s(fun\w*)/
A reference to a previous subexpression does not refer to the pattern of that subexpression but to the text that matched the pattern. So references are not just a shortcut for retyping a repeated part of a regular expression; they also enforce the constraint that separate parts of a string contain exactly the same characters. For example, the following regular expression matches zero or more characters within single or double quotes. However, it does not require the opening and closing quotes to match (that is, both single quotes or both double quotes):
/['"][^'"]*['"]/
To require the opening and closing quotes to match, we can use a reference:
/(['"])[^'"]*\1/
Here \1 matches whatever the first parenthesized subexpression matched. In this example, it enforces the constraint that the closing quote must match the opening quote. Note that if the number following the backslash is greater than the number of parenthesized subexpressions, it is parsed as an octal escape sequence rather than as a reference. You can avoid the confusion by always using the full three digits for such an escape sequence; for example, use \044 instead of \44. The following are the regular expression characters for alternation, grouping, and references:
Character   Meaning
________________ ______________________
|           Alternation. Matches either the subexpression to the left of the symbol or the subexpression to the right
(...)       Grouping. Groups items into a single unit that can be used with *, +, ?, |, and so on, and remembers the characters that match this group for use in later references
\n          Matches the same characters that were matched by group number n. Groups are subexpressions (possibly nested) within parentheses. Group numbers are assigned by counting left parentheses from left to right
_________________ _____________________
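The grouping and reference rules above can be checked with a sketch like this one. The sample strings and variable names are illustrative.

```javascript
// Extracting the text matched by a parenthesized subpattern.
const m = "item42".match(/[a-z]+(\d+)/);
console.log(m[0]); // "item42" (the whole match)
console.log(m[1]); // "42" (the first group)

// Groups are numbered by the position of their left parenthesis.
const nested = /([Jj]ava([Ss]cript))\sis\s(fun\w*)/.exec("JavaScript is fun");
console.log(nested[1], nested[2], nested[3]); // JavaScript Script fun

// \1 must repeat the exact text matched by group 1 (the opening quote).
const quoted = /(['"])[^'"]*\1/;
console.log(quoted.test("'single'"));   // true
console.log(quoted.test('"double"'));   // true
console.log(quoted.test("'mixed\""));   // false

// A group followed by ? makes the whole group optional.
console.log(/^java(script)?$/.test("java")); // true
```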
5. Specifying match position
We have seen that many elements of a regular expression match a single character in a string. For example, \s matches a single whitespace character. Other regular expression elements match the zero-width positions between characters rather than actual characters. For example, \b matches a word boundary, that is, the boundary between a \w (word) character and a \W (non-word) character. Elements like \b do not specify any characters to be used in the matched string; they specify legal positions at which a match can occur. These elements are sometimes called regular expression anchors, because they anchor the pattern to a specific position in the searched string. The most commonly used anchors are ^, which ties the pattern to the beginning of the string, and $, which anchors the pattern to the end of the string.
For example, to match the word "JavaScript" on a line by itself, we can use the regular expression /^JavaScript$/. If we want to search for "java" as a word by itself (not as a prefix, as it is in "javascript"), we might try the pattern /\sjava\s/, which requires a space before and after the word. But there are two problems with this solution. First, if "java" appears at the beginning or end of a string, the pattern does not match unless there is a space at that beginning or end. Second, when this pattern does find a match, the matched string it returns has leading and trailing spaces, which is not what we want. So instead of matching actual space characters with \s, we match word boundaries with \b. The resulting expression is /\bjava\b/.
The following are the regular expression anchor characters:
Character   Meaning
____________________________________________________________________
^           Matches the beginning of the string and, in multiline searches, the beginning of a line
$           Matches the end of the string and, in multiline searches, the end of a line
\b          Matches a word boundary, that is, the position between a \w character and a \W character (note: [\b] matches a backspace)
\B          Matches a position that is not a word boundary
_____________________________________________________________________
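The difference between matching spaces and matching word boundaries can be seen in a short sketch. The sample strings are invented for illustration.

```javascript
// ^ and $ tie the pattern to the start and end of the string.
console.log(/^JavaScript$/.test("JavaScript"));       // true
console.log(/^JavaScript$/.test("JavaScript rocks")); // false

// \s needs a real whitespace character on both sides, so this fails
// when "java" is at the very beginning of the string.
console.log(/\sjava\s/.test("java is fun"));          // false

// \b matches a position, not a character, so it works at the string edges
// and does not include any spaces in the matched text.
console.log(/\bjava\b/.test("java is fun"));          // true
console.log(/\bjava\b/.test("javascript is fun"));    // false
console.log("I like java".match(/\bjava\b/)[0]);      // "java"

// \B matches where there is NO word boundary, e.g. inside a word.
console.log(/\Bscript/.test("javascript"));           // true
console.log(/\Bscript/.test("script"));               // false
```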
6. Attributes
There is one final element of regular expression syntax: regular expression attributes, which specify high-level pattern-matching rules. Unlike the rest of regular expression syntax, attributes are specified outside the / characters. That is, they do not appear between the two slashes but follow the second slash. JavaScript 1.2 supports two attributes. The i attribute specifies that pattern matching should be case insensitive. The g attribute specifies that pattern matching should be global, that is, all matches within the searched string should be found. The two attributes can be combined to perform a global, case-insensitive match.
For example, to perform a case-insensitive search for the first occurrence of the word "java" (or "Java", "JAVA", and so on), we can use the case-insensitive regular expression /\bjava\b/i. To find all occurrences of "java" in a string, we can add the g attribute: /\bjava\b/gi.
The following are the regular expression attributes:
Character   Meaning
_________________________________________
i           Perform case-insensitive matching
g           Perform a global match; that is, find all matches rather than stopping after the first one is found
_________________________________________
Apart from the attributes g and i, regular expressions in JavaScript 1.2 have no other attributes of this kind. However, if the static property multiline of the RegExp constructor is set to true, pattern matching is performed in multiline mode. In this mode, the anchor characters ^ and $ match not only the beginning and end of the searched string but also the beginning and end of each line within it. For example, the pattern /java$/ matches "java" but does not match "java\nis fun". If we set the multiline property, the latter is matched as well:
RegExp.multiline = true;
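The attributes can be tried out with the sketch below. One assumption to note: instead of the static RegExp.multiline property described above, the sketch uses the m flag, which is how later versions of JavaScript express multiline mode. The sample text is illustrative.

```javascript
const text = "Java, JAVA and java";

// i: case-insensitive; without g, only the first occurrence is returned.
console.log(text.match(/\bjava\b/i)[0]);    // "Java"

// g and i combined: every occurrence, in any letter case.
const all = text.match(/\bjava\b/gi);
console.log(all);                           // [ 'Java', 'JAVA', 'java' ]

// Without multiline mode, $ matches only at the end of the whole string.
console.log(/java$/.test("java\nis fun"));  // false
// With the m flag, $ also matches at the end of each line.
console.log(/java$/m.test("java\nis fun")); // true
```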
A regular expression object contains a regular expression pattern. It has properties and methods that use the pattern to match or replace particular characters (or sets of characters) in a string. In addition to the properties of an individual regular expression object created with the RegExp constructor function, the predefined RegExp object has static properties that are set whenever any regular expression is used. Creating a regular expression:
Use literal format or the RegExp constructor function.
Literal format: /pattern/flags
Constructor function: new RegExp("pattern"[, "flags"]);
Parameters:
pattern -- the text of the regular expression
flags -- if present, one of the following values:
g: global match
i: ignore case
gi: both of the above
[Note] In literal format the parameters are not quoted, whereas the parameters passed to the constructor are. For example, /ab+c/i and new RegExp("ab+c", "i") do the same thing. In the constructor, special characters must be escaped (by adding a "\" before the special character), for example: re = new RegExp("\\w+")
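The escaping rule in the note can be seen in a short sketch. The variable names are illustrative; the point is that a string literal consumes one level of backslashes before the RegExp constructor sees the text.

```javascript
// The literal and the constructor form describe the same pattern.
const fromLiteral = /ab+c/i;
const fromConstructor = new RegExp("ab+c", "i");
console.log(fromLiteral.source === fromConstructor.source); // true
console.log(fromConstructor.test("xABBBCx"));               // true

// In a string, "\\" stands for one backslash, so this pattern is \w+.
const word = new RegExp("\\w+");
console.log(word.source);             // \w+
console.log("$5.98".match(word)[0]);  // "5"

// With a single backslash, the string "\w+" is just "w+",
// so the pattern matches the letter "w" rather than word characters.
const mistaken = new RegExp("\w+");
console.log(mistaken.source);         // w+
```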
Special characters in regular expressions
Character   Meaning
____________________________________________________
\           Escape character. A character that follows "\" is usually not interpreted literally: /b/ matches the character "b", but with a backslash in front, /\b/ matches a word boundary. Conversely, it turns a special character into a literal one: * normally matches the preceding item 0 or more times, so /a*/ matches "a", "aa", "aaa", whereas /a\*/ matches only "a*".
^           Matches the beginning of input or of a line. /^A/ matches the "A" in "An A" but not the "A" in "an A".
$           Matches the end of input or of a line. /a$/ matches the "a" in "An a" but not the "A" in "an A".
*           Matches the preceding item 0 or more times. /ba*/ matches "b", "ba", "baa", "baaa".
+           Matches the preceding item 1 or more times. /ba+/ matches "ba", "baa", "baaa".
?           Matches the preceding item 0 or 1 times. /ba?/ matches "b", "ba".
(x)         Matches x and saves the match in the variables $1...$9.
x|y         Matches x or y.
{n}         Matches exactly n times.
{n,}        Matches n or more times.
{n,m}       Matches between n and m times.
[xyz]       Character set. Matches any one of the characters (or metacharacters) in the set.
[^xyz]      Negated character set. Matches any one character that is not in the set.
[\b]        Matches a backspace.
\b          Matches a word boundary.
\B          Matches a position that is not a word boundary.
\cX         X is a control character. /\cM/ matches Ctrl-M.
\d          Matches a digit character. /\d/ = /[0-9]/
\D          Matches a non-digit character. /\D/ = /[^0-9]/
\n          Matches a line feed.
\r          Matches a carriage return.
\s          Matches a whitespace character, including \n, \r, \f, \t, \v, and so on.
\S          Matches a non-whitespace character; equal to /[^ \n\f\r\t\v]/
\t          Matches a tab.
\v          Matches a vertical tab.
\w          Matches a character that can be part of a word (a letter, a digit, or an underscore). /\w/ matches the "5" in "$5.98"; equal to [a-zA-Z0-9_].
\W          Matches a character that cannot be part of a word. /\W/ matches the "$" in "$5.98"; equal to [^a-zA-Z0-9_].
____________________________________________________
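Several of the rows above can be verified with a sketch like the following. The sample strings mirror the ones in the table where possible; the rest are illustrative.

```javascript
// A backslash turns the special character * into a literal asterisk.
console.log("aaa".match(/a*/)[0]);   // "aaa"
console.log(/a\*/.test("aaa"));      // false
console.log(/a\*/.test("a*"));       // true

// ^ is case sensitive like everything else unless the i flag is used.
console.log(/^A/.test("An A"));      // true
console.log(/^A/.test("an A"));      // false

// + needs at least one "a" after the "b"; ? accepts zero or one.
console.log(/ba+/.test("b"));        // false
console.log(/ba?/.test("b"));        // true

// \w and \W applied to the "$5.98" example.
console.log("$5.98".match(/\w/)[0]); // "5"
console.log("$5.98".match(/\W/)[0]); // "$"

// \cM is Ctrl-M, which is the carriage return character.
console.log(/\cM/.test("\r"));       // true
```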
With the syntax covered, here are some practical examples of regular expressions:
E-mail address validation:
function test_email(strEmail) {
// One or more name characters, "@", one or more dot-terminated labels, then a 2-3 character suffix.
var myReg = /^[_a-z0-9]+@([_a-z0-9]+\.)+[a-z0-9]{2,3}$/;
if (myReg.test(strEmail)) return true;
return false;
}
Masking HTML code:
function mask_HTMLCode(strInput) {
// Replace the angle brackets of a simple tag with HTML entities so it displays as text.
var myReg = /<(\w+)>/;
return strInput.replace(myReg, "&lt;$1&gt;");
}
Properties and methods of regular expression objects
The predefined RegExp object has the following static properties: input, multiline, lastMatch, lastParen, leftContext, rightContext, and $1 through $9. Of these, input and multiline can be preset. The values of the other properties are set after the exec or test method runs, depending on the outcome. Many properties have both a long and a short (Perl-like) name, and both names refer to the same value. (JavaScript models its regular expressions on Perl's.)
Properties of the regular expression object
Property       Meaning
____________________________________________________
$1...$9        Parenthesized substring matches, if any
$_             See input
$*             See multiline
$&             See lastMatch
$+             See lastParen
$`             See leftContext
$'             See rightContext
constructor    Specifies the function that creates an object's prototype
global         Whether to match against the entire string (boolean)
ignoreCase     Whether to ignore case when matching (boolean)
input          The string being matched
lastIndex      The index at which to start the next match
lastMatch      The last matched characters
lastParen      The last parenthesized substring match
leftContext    The substring to the left of the most recent match
multiline      Whether to match across multiple lines (boolean)
prototype      Allows properties to be added to all objects
rightContext   The substring to the right of the most recent match
source         The text of the regular expression pattern
____________________________________________________
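The instance properties in the table can be inspected with a sketch like the one below. The static properties such as RegExp.$1 are legacy features that may not be available in every environment, so this sketch sticks to properties of the individual object; the sample text and variable names are illustrative.

```javascript
const re = /(\w+)\s(\w+)/g;
console.log(re.source);     // (\w+)\s(\w+)
console.log(re.global);     // true
console.log(re.ignoreCase); // false

// With the g flag, exec() resumes from lastIndex on each call.
const text = "John Smith, Jane Doe";
const found = [];
let m;
while ((m = re.exec(text)) !== null) {
  found.push(m[0] + " ends at " + re.lastIndex);
}
console.log(found); // [ 'John Smith ends at 10', 'Jane Doe ends at 20' ]

// After a failed match, lastIndex is reset to 0.
console.log(re.lastIndex);  // 0
```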
Methods of the regular expression object
Method      Meaning
____________________________________________________
compile     Compiles a regular expression
exec        Executes a search for a match
test        Tests for a match
toSource    Returns an object literal representing the specified object; this value can be used to create a new object. Overrides the Object.toSource method.
toString    Returns a string representing the specified object. Overrides the Object.toString method.
valueOf     Returns the primitive value of the specified object. Overrides the Object.valueOf method.
____________________________________________________
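The two most commonly used methods, test and exec, can be compared in a short sketch. The pattern and sample strings are made up for illustration.

```javascript
const range = /(\d+)-(\d+)/;

// test() only reports whether there is a match.
console.log(range.test("pages 10-20")); // true

// exec() returns an array with the match, its groups, and its position,
// or null when nothing matches.
const result = range.exec("pages 10-20");
console.log(result[0]);    // "10-20"
console.log(result[1]);    // "10"
console.log(result[2]);    // "20"
console.log(result.index); // 6
console.log(range.exec("no range here")); // null

// toString() gives back the pattern in literal form.
console.log(range.toString()); // /(\d+)-(\d+)/
```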
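Example
<script language="JavaScript">
// Capture two words separated by a whitespace character.
var myReg = /(\w+)\s(\w+)/;
var str = "John Smith";
// $1 and $2 refer to the first and second captured words.
var newstr = str.replace(myReg, "$2, $1");
document.write(newstr);
</script>
This outputs "Smith, John".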
Check that a string consists only of digits:
function isDigit(s)
{
// Between 1 and 20 digits, and nothing else.
var patrn = /^[0-9]{1,20}$/;
if (!patrn.exec(s)) return false;
return true;
}
Validate a login name: only a string of 5 to 20 characters that begins with a letter and may contain digits, "_", and "." is accepted.