JavaScript Regular Expression notes

Source: Internet
Author: User

A regular expression is an object that describes a pattern of characters.
JavaScript's RegExp objects and String objects define methods that use regular expressions to perform powerful pattern matching, text retrieval, and text substitution.


In JavaScript, a regular expression is represented by a RegExp object. You can, of course, create a RegExp object with the RegExp() constructor.
You can also create one with a special literal syntax that was added in JavaScript 1.2. Just as a string literal is written as characters enclosed in quotation marks,
a regular expression literal is written as characters enclosed between a pair of slashes (/). So JavaScript code may contain a line like this:

var pattern = /s$/;

This line creates a new RegExp object and assigns it to the variable pattern. This particular RegExp object matches any string that ends with the letter "s". The RegExp() constructor can define
an equivalent regular expression with the following code:

var pattern = new RegExp("s$");

Creating a RegExp object is the easy part, whether you use a regular expression literal or the RegExp() constructor. The harder task is describing the pattern of characters with regular expression syntax.
JavaScript uses a fairly complete subset of Perl's regular expression syntax.

A regular expression pattern is made up of a series of characters. Most characters, including all alphanumeric characters, simply describe a character to be matched literally. So the regular expression /java/
matches every string that contains the substring "java". Other characters in a regular expression are not matched literally but have special meanings. The regular expression /s$/ contains two characters.
The first, "s", matches itself literally. The second, "$", is a special character that matches the end of a string. So the regular expression /s$/ matches any string that ends with the letter "s".
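The two ways of creating a pattern described above can be compared with a minimal sketch. The variable names and sample strings are illustrative and not part of the original notes.

```javascript
// A literal and a constructor call that describe the same pattern.
var literalPattern = /s$/;
var constructedPattern = new RegExp("s$");

// Sample strings chosen only for illustration.
var endsWithS = literalPattern.test("regular expressions");
var noFinalS = constructedPattern.test("regular expression");

// /java/ matches any string that contains the substring "java".
var containsJava = /java/.test("I like javascript");

console.log(endsWithS, noFinalS, containsJava); // true false true
```

Both objects carry the same pattern text in their source property, so either form can be used wherever a RegExp is expected.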


1. Literal characters

As we have seen, all alphabetic characters and digits in a regular expression match themselves literally. JavaScript regular expressions also support certain non-alphabetic characters through escape sequences that begin with a backslash (\).

For example, the sequence "\n" matches a literal newline character in a string. Many punctuation characters have special meanings in regular expressions. The following table lists these characters and what they match:

Literal characters in regular expressions

Character                Matches
___________________________________________________
Alphanumeric character   Itself
\f                       Form feed
\n                       Newline
\r                       Carriage return
\t                       Tab
\v                       Vertical tab
\/                       A literal /
\\                       A literal \
\.                       A literal .
\*                       A literal *
\+                       A literal +
\?                       A literal ?
\|                       A literal |
\(                       A literal (
\)                       A literal )
\[                       A literal [
\]                       A literal ]
\{                       A literal {
\}                       A literal }
\XXX                     The ASCII character specified by the octal number XXX
\xnn                     The ASCII character specified by the hexadecimal number nn
\cX                      The control character ^X. For example, \cI is equivalent to \t, and \cJ is equivalent to \n
___________________________________________________

To use one of these special punctuation characters literally in a regular expression, you must precede it with a backslash (\).
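As a small sketch of why the backslash matters, the following compares an unescaped and an escaped period. The helper escapeForRegExp is hypothetical (it is not a built-in function and does not appear in the original notes); it simply puts a backslash before each special character.

```javascript
// An unescaped "." matches any character except a line break,
// while "\." matches only a literal period.
var anyChar = /a.c/;
var literalDot = /a\.c/;

var anyCharMatchesX = anyChar.test("axc");         // true
var literalDotMatchesX = literalDot.test("axc");   // false
var literalDotMatchesDot = literalDot.test("a.c"); // true

// Hypothetical helper: put a backslash before every character
// that has a special meaning in a pattern.
function escapeForRegExp(text) {
  return text.replace(/[\\^$.*+?()[\]{}|\/]/g, "\\$&");
}

var escaped = escapeForRegExp("1+1=2?");
var exact = new RegExp("^" + escaped + "$");

console.log(escaped);              // 1\+1=2\?
console.log(exact.test("1+1=2?")); // true
```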


2. Character classes

You can combine individual literal characters into a character class by placing them inside square brackets. A character class matches any one of the characters it contains, so the regular expression /[abc]/ matches any one of the letters "a", "b", or "c".
You can also define negated character classes, which match any character except those inside the brackets. A negated character class is written with a ^ as the first
character inside the left bracket. A character class can also use a hyphen to indicate a range of characters: the class of all Latin letters and digits is /[a-zA-Z0-9]/.

Because some character classes are so common, JavaScript's regular expression syntax includes special characters and escape sequences to represent them. For example, \s matches a space, a tab, or any other whitespace character, and \S
matches any character that is not whitespace.

Character classes in regular expressions

Character   Matches
____________________________________________________
[...]       Any one character between the brackets
[^...]      Any one character not between the brackets
.           Any character except a line break; equivalent to [^\n]
\w          Any word character; equivalent to [a-zA-Z0-9_]
\W          Any non-word character; equivalent to [^a-zA-Z0-9_]
\s          Any whitespace character; equivalent to [ \t\n\r\f\v]
\S          Any non-whitespace character; equivalent to [^ \t\n\r\f\v]
\d          Any digit; equivalent to [0-9]
\D          Any character other than a digit; equivalent to [^0-9]
[\b]        A literal backspace (special case)
____________________________________________________
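The classes in the table can be tried out with a few one-line checks. This is only a sketch; the sample strings and variable names are made up for illustration.

```javascript
// A bracketed class matches any one of the listed characters.
var hasVowel = /[aeiou]/.test("rhythm");          // false: no a, e, i, o or u

// A negated class matches any character NOT listed.
var hasNonDigit = /[^0-9]/.test("2024");          // false: every character is a digit

// \w matches letters, digits and the underscore.
var wordCharCount = "user_42!".match(/\w/g).length; // 7 ("!" is not a word character)

// \D matches anything that is not a digit, so removing it keeps only digits.
var digitsOnly = "a1b22c333".replace(/\D/g, "");  // "122333"

// \s matches spaces and tabs alike.
var pieces = "one  two\tthree".split(/\s+/);      // ["one", "two", "three"]

console.log(hasVowel, hasNonDigit, wordCharCount, digitsOnly, pieces);
```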

3. Repetition

With the syntax covered so far, you can describe a two-digit number as /\d\d/ and a four-digit number as /\d\d\d\d/. But we do not yet have a way to describe a number with an arbitrary number of digits, or a
string of three letters followed by an optional digit. These more complex patterns use regular expression syntax that specifies how many times an element of the expression may be repeated.

The characters that specify repetition always follow the pattern they apply to. Because some kinds of repetition are so common, there are special characters just for them. For example, + matches one or more occurrences of the previous pattern.
The table after these examples summarizes the repetition syntax. First, some examples:

/\d{2,4}/       // Matches between two and four digits.

/\w{3}\d?/      // Matches exactly three word characters followed by an optional digit.

/\s+java\s+/    // Matches the string "java" with one or more whitespace characters before and after it.

/[^"]*/         // Matches zero or more characters that are not quotation marks.


Repetition characters in regular expressions

Character   Meaning
__________________________________________________________________
{n,m}       Matches the previous item at least n times but no more than m times
{n,}        Matches the previous item n or more times
{n}         Matches the previous item exactly n times
?           Matches the previous item 0 or 1 times, that is, the previous item is optional. Equivalent to {0,1}
+           Matches the previous item 1 or more times. Equivalent to {1,}
*           Matches the previous item 0 or more times. Equivalent to {0,}
___________________________________________________________________
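The repetition characters above can be checked against a few sample strings. In this sketch the ^ and $ anchors are added (they are not in the examples above) so that each pattern has to account for the whole string; the names and inputs are illustrative.

```javascript
// {2,4}: between two and four digits, anchored to the whole string.
var twoToFour = /^\d{2,4}$/;
var oneDigit = twoToFour.test("7");       // false: too few
var twoDigits = twoToFour.test("42");     // true
var fiveDigits = twoToFour.test("12345"); // false: too many

// {3} followed by ?: three word characters, then an optional digit.
var threePlusDigit = /^\w{3}\d?$/;
var withoutDigit = threePlusDigit.test("abc");   // true
var withDigit = threePlusDigit.test("abc7");     // true
var twoExtra = threePlusDigit.test("abc77");     // false

// *: zero or more non-quote characters, taken from the start of the string.
var beforeQuote = /[^"]*/.exec('say "hi"')[0];   // "say "

console.log(oneDigit, twoDigits, fiveDigits, withoutDigit, withDigit, twoExtra, beforeQuote);
```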


4. Alternation, grouping, and references

Regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring back to earlier subexpressions. The | character separates alternatives. For example, /ab|cd|ef/ matches the string "ab", the string "cd", or the string "ef", and /\d{3}|[a-z]{4}/ matches either three digits or four lowercase letters.

Parentheses have several functions in regular expressions. Their main function is to group separate items into a single subexpression, so that the items can be treated as one unit by *, +, or ?. For example, /java(script)?/ matches the string "java" followed by an optional "script", and /(ab|cd)+|ef/ matches either the string "ef" or one or more repetitions of the string "ab" or "cd".

The second use of parentheses is to define subpatterns within the complete pattern. When a regular expression successfully matches a target string, you can extract the portions of the target string that matched each parenthesized subpattern. For example, suppose we are looking for one or more lowercase letters followed by one or more digits. We can use the pattern /[a-z]+\d+/. But suppose we only really care about the digits at the end of each match. If we put that part of the pattern in parentheses, as in /[a-z]+(\d+)/, we can extract the digits from any match we find and process them later.

Another use of parentheses is to let us refer back to a subexpression later in the same regular expression. This is done by following a backslash (\) with one or more digits. The digits refer to the position of the parenthesized subexpression within the regular expression. For example, \1 refers to the first parenthesized subexpression and \3 refers to the third. Note that because subexpressions can be nested inside one another, it is the position of the left parenthesis that is counted.
For example, in the following regular expression the nested subexpression ([Ss]cript) is referred to as \2:
/([Jj]ava([Ss]cript))\sis\s(fun\w*)/
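To see how groups are numbered by their left parenthesis, the pattern can be run against a sample sentence. This is a sketch: the sentence "JavaScript is funny" and the identifier "item42" are made-up inputs.

```javascript
// Groups are numbered by the position of their left parenthesis.
var groupPattern = /([Jj]ava([Ss]cript))\sis\s(fun\w*)/;
var groupMatch = groupPattern.exec("JavaScript is funny");

var wholeMatch = groupMatch[0];  // "JavaScript is funny"
var firstGroup = groupMatch[1];  // "JavaScript" (outer group)
var secondGroup = groupMatch[2]; // "Script"     (nested group)
var thirdGroup = groupMatch[3];  // "funny"

// Extracting only the trailing digits with a single group.
var trailingDigits = /[a-z]+(\d+)/.exec("item42")[1]; // "42"

console.log(wholeMatch, firstGroup, secondGroup, thirdGroup, trailingDigits);
```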


A reference to an earlier subexpression does not refer to the pattern of that subexpression but to the text that matched it. So references are more than a shortcut for typing a repeated part of a regular expression: they also enforce the constraint that separate parts of a string contain exactly the same characters. For example, the following regular expression matches zero or more characters inside single or double quotation marks. However, it does not require the opening and closing quotation marks to match (that is, to be both single or both double):
/['"][^'"]*['"]/


To require the opening and closing quotation marks to match, we can use a reference:
/(['"])[^'"]*\1/
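The difference between the two quotation-mark patterns shows up on a string whose quotes are mismatched. The following is a small sketch with made-up sample strings.

```javascript
// Without a reference, any quote may close any other quote.
var looseQuotes = /['"][^'"]*['"]/;
// With \1, the closing quote must be the same character as the opening one.
var strictQuotes = /(['"])[^'"]*\1/;

var mismatched = "'hello\""; // opens with ' but closes with "

var looseAccepts = looseQuotes.test(mismatched);   // true
var strictAccepts = strictQuotes.test(mismatched); // false
var strictDouble = strictQuotes.test('"hello"');   // true

// The reference also works when the quoted text is inside a longer string.
var quoted = strictQuotes.exec("say 'hi' now")[0]; // "'hi'"

console.log(looseAccepts, strictAccepts, strictDouble, quoted);
```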


\1 matches whatever the first parenthesized subexpression matched. In this example, it enforces the constraint that the closing quotation mark must match the opening one. Note that if the number following the backslash is greater than the number of parenthesized subexpressions, it is parsed as an octal escape sequence rather than as a reference. You can avoid the confusion by always writing an escape sequence with its full three characters, for example \044 instead of \44. The following table lists the alternation, grouping, and reference characters of regular expressions:

Character   Meaning
____________________________________________________________________
|           Alternation. Matches either the subexpression to its left or the subexpression to its right
(...)       Grouping. Groups several items into a single unit that can be used with *, +, ?, and |, and remembers the characters that match the group for use in later references
\n          (n is a number) Matches the same characters that were matched by group number n. Groups are subexpressions in (possibly nested) parentheses; group numbers are assigned by counting left parentheses from left to right
____________________________________________________________________

5. Specifying match position

As we have seen, many elements of a regular expression match a single character in a string. For example, \s matches a single whitespace character. Other regular expression elements match the zero-width positions between characters rather than actual characters. For example, \b matches a word boundary, that is, the boundary between a \w character and a \W character. Elements such as \b do not specify any characters of the matched string; they specify legal positions at which a match can occur. These elements are sometimes called regular expression anchors, because they anchor the pattern to a specific position in the searched string. The most commonly used anchors are ^, which ties the pattern to the beginning of the string, and $, which ties the pattern to the end of the string.

For example, to match the word "javascript" on a line by itself, we can use the regular expression /^javascript$/. If we want to search for the word "java" by itself (not as a prefix, as it is in "javascript"), we could try the pattern /\sjava\s/, which requires whitespace before and after the word. But there are two problems with this. First, if "java" appears at the beginning or end of the string, the pattern does not match unless there is whitespace on that side. Second, when the pattern does find a match, the matched string it returns includes the leading and trailing whitespace, which is not what we want. So instead of matching actual whitespace with \s, we match word boundaries with \b. The resulting expression is /\bjava\b/.
The following table lists the regular expression anchor characters:


Character   Meaning
____________________________________________________________________
^           Matches the beginning of the string and, in multiline searches, the beginning of a line
$           Matches the end of the string and, in multiline searches, the end of a line
\b          Matches a word boundary, that is, the position between a \w character and a \W character (note: [\b] matches backspace)
\B          Matches a position that is not a word boundary
_____________________________________________________________________
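The two problems with /\sjava\s/ described above, and the way \b avoids them, can be seen in a short sketch. The sample sentences are made up for illustration.

```javascript
var spaced = /\sjava\s/;   // requires whitespace on both sides
var bounded = /\bjava\b/;  // requires only word boundaries

// Problem 1: "java" at the start of the string has no whitespace before it.
var spacedAtStart = spaced.test("java is fun");   // false
var boundedAtStart = bounded.test("java is fun"); // true

// Problem 2: the whitespace becomes part of the matched text.
var spacedText = "I like java a lot".match(spaced)[0];   // " java "
var boundedText = "I like java a lot".match(bounded)[0]; // "java"

// \b still rejects "java" used as a prefix.
var prefixOnly = bounded.test("javascript is fun"); // false

// ^ and $ tie the pattern to the whole string.
var wholeString = /^javascript$/.test("javascript");      // true
var longerString = /^javascript$/.test("javascript rocks"); // false

// \B matches only where there is NO word boundary.
var insideWord = /\Bscript/.test("javascript"); // true
var atWordStart = /\Bscript/.test("script");    // false

console.log(spacedAtStart, boundedAtStart, spacedText, boundedText, prefixOnly);
```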

6. Flags

There is one final element of regular expression syntax: the flags (attributes) of a regular expression, which specify high-level pattern-matching rules. Unlike the rest of regular expression syntax, flags are specified outside the / characters. That is, they appear not between the two slashes but after the second one. JavaScript 1.2 supports two flags. The i flag specifies that pattern matching should be case-insensitive. The g flag specifies that pattern matching should be global, that is, all matches within the searched string should be found. The two flags can be combined to perform a global, case-insensitive match.

For example, to do a case-insensitive search for the first occurrence of the word "java" (or "Java", "JAVA", and so on), we can use the case-insensitive regular expression /\bjava\b/i. To find all occurrences of "java" in a string, we add the g flag: /\bjava\b/gi.

The following table lists the regular expression flags:


Character   Meaning
_________________________________________
i           Performs a case-insensitive match
g           Performs a global match, that is, finds all matches rather than stopping after the first
_________________________________________
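The effect of the i and g flags can be sketched with the String match method on a made-up sample string:

```javascript
var flagText = "Java, JAVA and java";

// i only: case-insensitive, stops at the first match.
var firstOnly = flagText.match(/\bjava\b/i);
var firstText = firstOnly[0];      // "Java"
var firstIndex = firstOnly.index;  // 0

// g and i together: every match, regardless of case.
var allMatches = flagText.match(/\bjava\b/gi);  // ["Java", "JAVA", "java"]

// g only: every match, but case still matters.
var exactCase = flagText.match(/\bjava\b/g);    // ["java"]

console.log(firstText, firstIndex, allMatches, exactCase);
```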

Apart from g and i, regular expressions in JavaScript 1.2 have no other flags. However, if you set the static property multiline of the RegExp constructor to true, pattern matching is done in multiline mode. In this mode, the anchors ^ and $ match not only the beginning and end of the searched string but also the beginning and end of each line within it. For example, the pattern /Java$/ matches "Java" but does not match "Java\nis fun". If we set the multiline property, the latter is matched as well. Later versions of the language provide the m flag for the same purpose (it appears in the JScript flag list later in these notes).

RegExp.multiline = true;
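As a sketch of multiline mode, the example below uses the m flag rather than the static RegExp.multiline property, on the assumption that the code runs in an engine that supports the m flag (listed for JScript 5.5 later in these notes). The variable names are illustrative.

```javascript
var twoLines = "Java\nis fun";

// Without m, $ matches only at the very end of the string.
var singleLineMode = /Java$/.test(twoLines);   // false

// With m, $ also matches at the end of each line.
var multiLineMode = /Java$/m.test(twoLines);   // true

// With m, ^ matches at the start of each line as well.
var lineStarts = twoLines.match(/^\w+/gm);     // ["Java", "is"]

console.log(singleLineMode, multiLineMode, lineStarts);
```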


Regular expressions in JScript: description and application


Regular expressions are a powerful tool for replacing, searching, and extracting text by pattern matching. Their use has gradually spread from the UNIX platform into web development, and JScript, as a server-side and client-side scripting language, has incorporated more and more regular expression features to make up for its weakness in text processing. Here we take JScript 5.5 as an example and give an overview of how regular expressions are used.
First we need to distinguish between the two regular expression objects in JScript: the Regular Expression object and the RegExp object.
The former holds only the information for one particular regular expression instance, while the latter reflects the characteristics of the most recent pattern match through the properties of a global variable.
The former requires you to specify a matching pattern before matching, that is, to create an instance of the Regular Expression object, and then either pass it to a string method or pass a string as an argument to a method of the instance. The latter does not need to be created: it is an intrinsic global object, and the result of each successful match is saved in its properties.

I. Properties of the RegExp object: result information for the most recent successful match

input: the string the match was performed against (the target string being searched) (>= IE4)
index: the position of the first character of the match (>= IE4)
lastIndex: the position of the character following the matched string (>= IE4)
lastMatch ($&): the matched string (>= IE5.5)
lastParen ($+): the last submatch of the match result (the match of the last parenthesized group) (>= IE5.5)
leftContext ($`): all characters in the target string before the matched substring (>= IE5.5)
rightContext ($'): all characters in the target string after the matched substring (>= IE5.5)
$1-$9: the first nine submatches of the match (that is, the matches of the first nine parenthesized groups) (>= IE4)

II. The Regular Expression object
1. Defining a Regular Expression object
To use regular expression pattern matching in a script, you must first set up the matching pattern. There are two ways to do this:
(1) rgExp = /pattern/[flags]
(2) rgExp = new RegExp("pattern", ["flags"])
Note:
a. In the second form, each escape character "\" in the pattern must be written as "\\" to cancel the meaning that "\" has inside a JS string; otherwise JS first interprets the "\" as one of its own string escapes.
b. The following flags are available (as of JScript 5.5):
g: global match mode
i: ignore case when matching
m: multiline search mode
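Note a. above is easy to get wrong, so here is a small sketch of what happens when the backslashes are and are not doubled. The pattern (digits, a period, digits) and the variable names are chosen only for illustration.

```javascript
// The same pattern written as a literal and as a constructor string.
var fromLiteral = /\d+\.\d+/;
var fromString = new RegExp("\\d+\\.\\d+");

// With single backslashes, the string itself drops them before RegExp
// ever sees the pattern, leaving "d+.d+".
var wrongString = new RegExp("\d+\.\d+");

var sameSource = fromLiteral.source === fromString.source; // true
var wrongSource = wrongString.source;                      // "d+.d+"

var rightMatches = fromString.test("3.14");  // true
var wrongMatches = wrongString.test("3.14"); // false: it now looks for the letter "d"

// Flags are passed as a second string argument.
var withFlags = new RegExp("java", "gi");

console.log(sameSource, wrongSource, rightMatches, wrongMatches, withFlags.global, withFlags.ignoreCase);
```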
2. Properties of a Regular Expression object
(1) rgExp.lastIndex: the position of the character following the last match; same as RegExp.lastIndex
(2) rgExp.source: the text of the regular expression pattern of the rgExp object
3. Methods of a Regular Expression object
(1) rgExp.compile(pattern, [flags])
Converts rgExp to an internal format to speed up matching, which is more efficient when the same pattern is matched many times.
(2) rgExp.exec(str)
Searches the string str according to the rgExp pattern. When the global flag (g) is set on rgExp, the search starts at the position in the target string given by the lastIndex property; if the global flag is not set, the search starts at the first character of the target string. If no match is found, null is returned.
Note that the method returns the match result as an array that has three properties:
input: the target string; same as RegExp.input
index: the position of the matched substring in the target string; same as RegExp.index
lastIndex: the position of the character following the matched substring; same as RegExp.lastIndex
(3) rgExp.test(str)
Returns a Boolean value that indicates whether the pattern exists in the target string str. The method does not change the properties of the global RegExp object.
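The way exec walks through a string when the g flag is set can be sketched as a loop. Note one assumption: this sketch reads lastIndex from the pattern object, which is where standard engines keep it; the lastIndex property on the result array described above is specific to JScript. The names and the sample text are illustrative.

```javascript
var scanner = /\d+/g;          // g flag: each exec resumes at scanner.lastIndex
var scanText = "a1 b22 c333";
var found = [];
var hit;

// exec returns null when there are no more matches, which ends the loop.
while ((hit = scanner.exec(scanText)) !== null) {
  found.push({
    text: hit[0],              // the matched substring
    index: hit.index,          // where the match starts
    next: scanner.lastIndex    // where the next search will start
  });
}

// After the failed final exec, lastIndex is reset to 0.
var resetIndex = scanner.lastIndex;

// test only reports whether a match exists.
var hasDigits = /\d+/.test("abc"); // false

console.log(found, resetIndex, hasDigits);
```

For the sample text the loop records three matches: "1" at index 1, "22" at index 4, and "333" at index 8.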
4. String methods that use regular expressions
These are the methods of the String object that apply pattern matching.
(1) stringObj.match(rgExp)
Searches the string stringObj for matches of the pattern in rgExp and returns the result as an array. The array has three properties, the same as those of the array returned by the exec method. If there is no match, null is returned.
Note that if the global flag is not set on rgExp, element 0 of the array holds the whole match and elements 1 to 9 hold the submatches. If the global flag is set, the array holds every whole match found by the search.
(2) stringObj.replace(rgExp, replaceText)
Returns a copy of stringObj in which the text that matches the rgExp pattern has been replaced with replaceText. Note that stringObj itself is not changed by the replacement. To replace every substring of stringObj that matches the pattern, set the global flag when you create the regular expression.
(3) stringObj.search(rgExp)
Returns the position of the first substring that matches the pattern, or -1 if there is no match.
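The three String methods above can be compared side by side in a short sketch. The sentence and the variable names are made up for illustration.

```javascript
var sentence = "The rain in Spain stays mainly in the plain";

// match without g: element 0 is the whole match, element 1 the submatch.
var firstAin = sentence.match(/(\w+)ain/);
var firstWhole = firstAin[0];   // "rain"
var firstSub = firstAin[1];     // "r"
var firstAt = firstAin.index;   // 4

// match with g: every whole match, no submatches.
var everyAin = sentence.match(/\w+ain/g); // ["rain", "Spain", "main", "plain"]

// replace without g changes only the first match; with g, all of them.
var replacedOnce = sentence.replace(/ain/, "AIN");
var replacedAll = sentence.replace(/ain/g, "AIN");

// search returns a position, or -1 when nothing matches.
var spainAt = sentence.search(/Spain/); // 12
var snowAt = sentence.search(/snow/);   // -1

console.log(everyAin, replacedOnce, replacedAll, spainAt, snowAt);
```

The original string in sentence is left untouched by both replace calls.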

Glossary of terms:
position: the offset of a substring from the first character of the target string
rgExp: an instance of a Regular Expression object
stringObj: a String object
pattern: a regular expression pattern
flags: the flags that control the matching operation

In real-world web development, we can use regular expressions to meet our string-processing needs.
The following four JScript routines use regular expressions and are intended mainly to build familiarity with them.

1. Email address validity check

<script language='JScript'>
// Returns true if emailStr looks like an email address, false otherwise.
function validateEmail(emailStr) {
  var re = /^[\w.-]+@([0-9a-z][\w-]+\.)+[a-z]{2,3}$/i;
  // or: var re = new RegExp("^[\\w.-]+@([0-9a-z][\\w-]+\\.)+[a-z]{2,3}$", "i");
  if (re.test(emailStr)) {
    alert("Valid email address!");
    return true;
  } else {
    alert("Invalid email address!");
    return false;
  }
}
</script>

2. String replacement

<script language='JScript'>
var r, re;
var s = "The rain in Spain falls mainly in the plain falls.";
re = /falls/;                  // No g flag, so only the first "falls" is replaced.
r = s.replace(re, "falling");  // s itself is left unchanged.
alert("s = " + s + "\n" + "r = " + r);
</script>


3. Searching a string for a pattern

<script language='JScript'>
var index, pattern;
var str = "four for fall fell fallen fallsing fall falls waterfalls";
pattern = /\bfalls\b/;         // "falls" as a whole word only
index = str.search(pattern);   // Position of the first match, or -1 if none.
alert("The position of match is at " + index);
</script>

4. Regular expression property routine

<script language='JScript'>
function matchAttrib() {
  var s = "";
  var arr;
  var re = new RegExp("d(b+)(d)", "ig");
  var str = "CDBBBDBSBDBDZ";
  // With the g flag set, each call to exec resumes from re.lastIndex.
  while ((arr = re.exec(str)) != null) {
    s += "=======================================<br>";
    s += "$1 returns: " + RegExp.$1 + "<br>";
    s += "$2 returns: " + RegExp.$2 + "<br>";
    s += "$3 returns: " + RegExp.$3 + "<br>";
    s += "input returns: " + RegExp.input + "<br>";
    s += "index returns: " + RegExp.index + "<br>";
    s += "lastIndex returns: " + RegExp.lastIndex + "<br>";
    s += "lastMatch returns: " + RegExp.lastMatch + "<br>";
    s += "leftContext returns: " + RegExp.leftContext + "<br>";
    s += "rightContext returns: " + RegExp.rightContext + "<br>";
    s += "lastParen returns: " + RegExp.lastParen + "<br>";
    s += "arr.index returns: " + arr.index + "<br>";
    s += "arr.lastIndex returns: " + arr.lastIndex + "<br>";
    s += "arr.input returns: " + arr.input + "<br>";
    s += "re.lastIndex returns: " + re.lastIndex + "<br>";
    s += "re.source returns: " + re.source + "<br>";
  }
  return s; // Return results.
}
document.write(matchAttrib());
</script>
