JavaScript's RegExp Object

Source: Internet
Author: User
Tags character classes alphanumeric characters

A regular expression is an object that describes a pattern of characters.
JavaScript's RegExp and String objects define methods that use regular expressions to perform powerful pattern matching and text search-and-replace operations.
In JavaScript, a regular expression is represented by a RegExp object. You can create a RegExp object with the RegExp() constructor.
You can also create one with a special literal syntax that was added in JavaScript 1.2. Just as a string literal is written as characters enclosed in quotation marks, a regular expression literal is written as characters enclosed between a pair of slashes (/). So JavaScript code may contain a line like this:
var pattern = /s$/;
This line creates a new RegExp object and assigns it to the variable pattern. This particular RegExp object matches any string that ends with the letter "s". The RegExp() constructor can define an equivalent regular expression with the following code:
var pattern = new RegExp("s$");
Creating a RegExp object is easy, whether you use a regular expression literal or the RegExp() constructor. The harder task is describing the desired pattern of characters in regular expression syntax.
JavaScript uses a fairly complete subset of the regular expression syntax of the Perl language.
A regular expression pattern is made up of a series of characters. Most characters, including all alphanumeric characters, simply match themselves literally. So the regular expression /java/ matches every string that contains the substring "java". Other characters in a regular expression are not matched literally but have special meanings. The regular expression /s$/ contains two characters. The first, "s", matches itself literally. The second, "$", is a special character that matches the end of a string. So the regular expression /s$/ matches any string whose last character is the letter "s".
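As a small sketch of the two creation styles described above (the variable names and sample strings are illustrative and do not come from the original text):

```javascript
// Two equivalent ways to build the pattern "an s at the end of the string".
const literalPattern = /s$/;
const constructedPattern = new RegExp("s$");

// test() returns true when the pattern matches somewhere in the string.
const endsWithS = literalPattern.test("regular expressions");      // true
const noTrailingS = constructedPattern.test("regular expression"); // false

// Both objects carry the same pattern text.
const sameSource = literalPattern.source === constructedPattern.source; // true

console.log(endsWithS, noTrailingS, sameSource);
```

The literal form is usually more convenient for fixed patterns, while the constructor is useful when the pattern text is only known at run time.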

1. Literal characters
As we have seen, all alphabetic characters and digits in a regular expression match themselves literally. JavaScript regular expressions also support certain non-alphabetic characters through escape sequences that begin with a backslash (\). For example, the sequence "\n" matches a literal line break in a string. Many punctuation characters have special meanings in regular expressions. Here are these characters and what they match:
Literal characters in regular expressions
Character    Matches
________________________________
Alphanumeric character    Itself
\f      Form feed
\n      Line break (newline)
\r      Carriage return
\t      Tab
\v      Vertical tab
\/      A literal /
\\      A literal \
\.      A literal .
\*      A literal *
\+      A literal +
\?      A literal ?
\|      A literal |
\(      A literal (
\)      A literal )
\[      A literal [
\]      A literal ]
\{      A literal {
\}      A literal }
\XXX    The ASCII character specified by the octal number XXX
\xnn    The ASCII character specified by the hexadecimal number nn
\cX     The control character ^X. For example, \cI is equivalent to \t, and \cJ is equivalent to \n
___________________________________________________
To use any of these special punctuation characters literally in a regular expression, you must precede it with a "\".
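The effect of escaping can be sketched as follows. The sample patterns and strings are illustrative assumptions, not taken from the original:

```javascript
// An unescaped "." matches any character; "\." matches only a literal dot.
const anyChar = /a.c/;
const literalDot = /a\.c/;

const looseMatch = anyChar.test("abc");     // true: "." matches "b"
const strictMiss = literalDot.test("abc");  // false: there is no literal dot
const strictMatch = literalDot.test("a.c"); // true

// In a string passed to RegExp(), the backslash itself must be doubled.
const fromString = new RegExp("a\\.c");
const sameAsLiteral = fromString.source === literalDot.source; // true

// \x41 is the hexadecimal escape for "A"; \cJ is the control character ^J (line feed).
const hexMatch = /\x41/.test("A");      // true
const controlMatch = /\cJ/.test("\n");  // true

console.log(looseMatch, strictMiss, strictMatch, sameAsLiteral, hexMatch, controlMatch);
```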

2. Character classes
You can combine individual literal characters into a character class by placing them inside square brackets. A character class matches any one of the characters it contains, so the regular expression /[abc]/ matches any one of the letters "a", "b", or "c". You can also define negated character classes, which match any character except those inside the brackets. A negated character class is written with a ^ as the first character inside the brackets: /[^abc]/ matches any one character other than "a", "b", or "c". A character class can also use a hyphen to denote a range of characters. For example, /[a-zA-Z0-9]/ matches any single Latin letter or digit.
Because some character classes are very common, JavaScript's regular expression syntax includes special characters and escape sequences to represent them. For example, \s matches spaces, tabs, and other whitespace characters, and \S matches any character that is not whitespace.
Character classes in regular expressions
Character    Matches
____________________________________________________
[...]     Any one character inside the brackets
[^...]    Any one character not inside the brackets
.         Any character except a line terminator such as \n
\w        Any word character; equivalent to [a-zA-Z0-9_]
\W        Any non-word character; equivalent to [^a-zA-Z0-9_]
\s        Any whitespace character, such as those in [ \t\n\r\f\v]
\S        Any character that is not whitespace
\d        Any digit; equivalent to [0-9]
\D        Any character other than a digit; equivalent to [^0-9]
[\b]      A literal backspace (special case)
________________________________________________________________
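A brief sketch of the character classes in the table above. The test strings are illustrative assumptions:

```javascript
// A class matches any ONE of the characters it lists.
const anyOfAbc = /[abc]/.test("cat");       // true: "c" and "a" are in the class

// A negated class needs at least one character outside the list.
const noneOutside = /[^abc]/.test("abc");   // false: every character is a, b, or c

// Ranges cover letters and digits but not the underscore...
const rangeMiss = /[a-zA-Z0-9]/.test("_");  // false
// ...whereas \w also includes the underscore.
const wordChar = /\w/.test("_");            // true

// Uppercase escapes are the complements of the lowercase ones.
const nonDigit = /\D/.test("2024");         // false: the string contains only digits
const hasSpace = /\s/.test("a\tb");         // true: a tab is whitespace

// Inside brackets, \b means backspace rather than a word boundary.
const backspace = /[\b]/.test("\b");        // true

console.log(anyOfAbc, noneOutside, rangeMiss, wordChar, nonDigit, hasSpace, backspace);
```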
3. Repetition
With the syntax covered so far, you can describe a two-digit number as /\d\d/ and a four-digit number as /\d\d\d\d/. But there is not yet a way to describe a number with an arbitrary number of digits, or a string of three letters followed by an optional digit. These more complex patterns use regular expression syntax that specifies how many times an element of the expression may be repeated.
The characters that specify repetition always follow the pattern to which they apply. Because certain kinds of repetition are so common, there are special characters for them. For example, + matches one or more occurrences of the preceding pattern. The table below lists the repetition syntax. First, some examples:
/\d{2,4}/      matches between two and four digits.
/\w{3}\d?/     matches exactly three word characters followed by an optional digit.
/\s+java\s+/   matches the string "java" with one or more whitespace characters before and after it.
/[^"]*/        matches zero or more characters that are not quotation marks.
Note that the braces must not contain spaces: write {2,4}, not {2, 4}.

Repetition characters in regular expressions
Character    Meaning
__________________________________________________________________
{n,m}    Matches the previous item at least n times but no more than m times
{n,}     Matches the previous item n or more times
{n}      Matches the previous item exactly n times
?        Matches the previous item zero or one time, which means the previous item is optional. Equivalent to {0,1}
+        Matches the previous item one or more times. Equivalent to {1,}
*        Matches the previous item zero or more times. Equivalent to {0,}
___________________________________________________________________
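The repetition examples above can be tried out as follows. This is a minimal sketch; the input strings are assumptions chosen for illustration:

```javascript
// {2,4}: runs of two to four digits. A single digit is too short, and a
// six-digit run is split into a four-digit match and a two-digit match.
const digitRuns = "7 42 123456".match(/\d{2,4}/g); // ["42", "1234", "56"]

// Three word characters followed by an optional digit.
const withDigit = /\w{3}\d?/.exec("abc1")[0];    // "abc1"
const withoutDigit = /\w{3}\d?/.exec("abcd")[0]; // "abc"

// "java" must be surrounded by at least one whitespace character on each side.
const spacedJava = /\s+java\s+/.test("I use  java  daily"); // true
const gluedJava = /\s+java\s+/.test("javascript");          // false

// Zero or more non-quote characters, starting at the beginning of the string.
const beforeQuote = 'say "hi"'.match(/[^"]*/)[0]; // "say "

console.log(digitRuns, withDigit, withoutDigit, spacedJava, gluedJava, beforeQuote);
```

Because * can match zero characters, a pattern such as /[^"]*/ always succeeds, possibly with an empty match.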

4. Alternation, grouping, and references
Regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring back to earlier subexpressions. The | character separates alternatives. For example, /ab|cd|ef/ matches the string "ab", the string "cd", or the string "ef", and /\d{3}|[a-z]{4}/ matches either three digits or four lowercase letters. Parentheses have several functions in regular expressions. Their main function is to group separate items into a single subexpression so that the items can be treated as one unit by *, +, ?, and |. For example, /java(script)?/ matches the string "java" followed by an optional "script", and /(ab|cd)+|ef/ matches either the string "ef" or one or more repetitions of the string "ab" or "cd".
The second use of parentheses is to define subpatterns within the complete pattern. When a regular expression successfully matches a target string, the portions of the string that matched each parenthesized subpattern can be extracted. For example, suppose we are looking for one or more lowercase letters followed by one or more digits. We can use the pattern /[a-z]+\d+/. But if what we really care about is the digits at the end of each match, we can put that part of the pattern in parentheses, as in /[a-z]+(\d+)/, and then extract the digits from any match we find for later processing.
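Extracting the text matched by a parenthesized subpattern might look like this. The input string and variable names are assumptions for illustration:

```javascript
// exec() returns an array: index 0 holds the whole match and
// index 1 holds the text matched by the first parenthesized group.
const match = /[a-z]+(\d+)/.exec("order abc123 shipped");

const wholeMatch = match ? match[0] : null; // "abc123"
const digits = match ? match[1] : null;     // "123"

// The captured digits are still a string; convert them before doing arithmetic.
const asNumber = digits === null ? NaN : Number(digits); // 123

console.log(wholeMatch, digits, asNumber);
```

exec() returns null when nothing matches, which is why the sketch checks the result before indexing into it.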
Another use of parentheses is to allow us to refer back to an earlier subexpression later in the same regular expression. This is done by following a \ with one or more digits. The number refers to the position of the parenthesized subexpression within the regular expression. For example, \1 refers to the first parenthesized subexpression, and \3 refers to the third. Because subexpressions can be nested inside one another, it is the position of the left parenthesis that is counted.
In the following regular expression, for example, the nested subexpression ([Ss]cript) is referred to as \2:
/([Jj]ava([Ss]cript))\sis\s(fun\w*)/

A reference to an earlier subexpression does not refer to the pattern of that subexpression but to the text that matched it. So references are more than a shortcut for typing a repeated part of a regular expression: they can enforce the constraint that separate parts of a string contain exactly the same characters. For example, the following regular expression matches zero or more characters within single or double quotation marks. However, it does not require the opening and closing quotation marks to match (that is, both single or both double):
/['"][^'"]*['"]/

To require the opening and closing quotation marks to match, we can use a reference:
/(['"])[^'"]*\1/

\1 matches whatever the first parenthesized subexpression matched. In this example, it enforces the constraint that the closing quotation mark must be the same as the opening one. Note that if the number following the backslash is greater than the number of parenthesized subexpressions, it is parsed as an octal escape sequence rather than as a reference. You can avoid the ambiguity by always writing an escape sequence with its full three digits, for example \044 instead of \44. The following are the alternation, grouping, and reference characters of regular expressions:
Character    Meaning
____________________________________________________________________
|        Alternation. Matches either the subexpression to its left or the subexpression to its right
(...)    Grouping. Groups items into a single unit that can be used with *, +, ?, and |, and remembers the characters that match the group for use in later references
\n       Matches the same characters that were matched by group number n. Groups are subexpressions in (possibly nested) parentheses. Group numbers are assigned by counting left parentheses from left to right
____________________________________________________________________
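The difference between the two quotation-mark patterns discussed in this section, and the numbering of nested groups, can be sketched as follows. The sample strings are illustrative assumptions:

```javascript
// Without a reference, any quote character may close the string.
const loose = /['"][^'"]*['"]/;
// With \1, the closing quote must equal the text captured by group 1.
const strict = /(['"])[^'"]*\1/;

const looseMismatch = loose.test(`'mismatched"`);   // true
const strictMismatch = strict.test(`'mismatched"`); // false
const strictMatched = strict.test(`"matched"`);     // true

// Groups are numbered by the position of their left parenthesis,
// so the nested ([Ss]cript) is group 2.
const nested = /([Jj]ava([Ss]cript))\sis\s(fun\w*)/.exec("JavaScript is fun");
const groups = nested ? [nested[1], nested[2], nested[3]] : [];
// groups is ["JavaScript", "Script", "fun"]

console.log(looseMismatch, strictMismatch, strictMatched, groups);
```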

5. Specifying the match position
As we have seen, many elements of a regular expression match a single character in a string. For example, \s matches a single whitespace character. Other elements match the zero-width positions between characters rather than actual characters. For example, \b matches a word boundary, that is, the boundary between a \w character and a \W character. Elements such as \b do not specify any characters to be used in the match; they specify the legal positions at which a match can occur. These elements are sometimes called regular expression anchors because they anchor the pattern to a specific position in the searched string. The most commonly used anchors are ^, which ties the pattern to the beginning of the string, and $, which anchors the pattern to the end of the string.
For example, to match the word "javascript" on a line by itself, we can use the regular expression /^javascript$/. If we want to search for "java" as a word by itself (not as a prefix, as it is in "javascript"), we could try the pattern /\sjava\s/, which requires a whitespace character before and after the word. But this approach has two problems. First, if "java" appears at the beginning or end of the string, the pattern does not match unless there is a whitespace character at the beginning or end. Second, when the pattern does find a match, the matched string it returns has whitespace at both ends, which is not what we want. So instead of matching real whitespace with \s, we match word boundaries with \b. The resulting expression is /\bjava\b/.
The following are the anchor characters of regular expressions:

Character    Meaning
____________________________________________________________________
^     Matches the beginning of the string and, in multiline searches, the beginning of a line
$     Matches the end of the string and, in multiline searches, the end of a line
\b    Matches a word boundary, that is, the position between a \w character and a \W character, or between a \w character and the beginning or end of the string (note: [\b] matches a backspace)
\B    Matches a position that is not a word boundary
_____________________________________________________________________
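The two problems with /\sjava\s/ and the word-boundary fix can be sketched like this. The sample strings are assumptions for illustration:

```javascript
// ^ and $ pin the pattern to the whole string.
const exact = /^javascript$/.test("javascript");           // true
const notExact = /^javascript$/.test("javascript rocks");  // false

// Problem 1: \s needs real whitespace, so a leading "java" is missed.
const missedAtStart = /\sjava\s/.test("java is fun");       // false
// Problem 2: when it does match, the whitespace is part of the match.
const padded = /\sjava\s/.exec("I like java a lot")[0];     // " java "

// \b matches a position, not a character, so both problems go away.
const bounded = /\bjava\b/.exec("java is fun")[0];          // "java"
const prefixOnly = /\bjava\b/.test("javascript");           // false

// \B requires a position that is NOT a word boundary.
const insideWord = /\Bscript/.test("javascript");           // true
const atWordStart = /\Bscript/.test("script");              // false

console.log(exact, notExact, missedAtStart, padded, bounded, prefixOnly, insideWord, atWordStart);
```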

6. Flags
There is one final element of regular expression syntax: the flags (attributes) of a regular expression, which specify high-level pattern-matching rules. Unlike the rest of regular expression syntax, flags are specified outside the / characters: they do not appear between the two slashes but after the second slash. JavaScript 1.2 supports two flags. The i flag specifies that pattern matching should be case-insensitive. The g flag specifies that pattern matching should be global, that is, all matches within the searched string should be found. The two flags can be combined to perform a global, case-insensitive match.
For example, to do a case-insensitive search for the first occurrence of the word "java" (or "Java", "JAVA", and so on), we can use the case-insensitive regular expression /\bjava\b/i. To find all occurrences of "java" in a string, we can add the g flag: /\bjava\b/gi.
The following are the flags of regular expressions:

Character    Meaning
_________________________________________
i    Performs a case-insensitive match
g    Performs a global match, that is, finds all matches rather than stopping after the first one
_________________________________________
In JavaScript 1.2, regular expressions have no flags other than g and i. Instead, setting the static multiline property of the RegExp constructor to true puts pattern matching into multiline mode. In this mode, the anchors ^ and $ match not only the beginning and end of the searched string but also the beginning and end of each line within it. For example, the pattern /java$/ matches "java" but does not match "java\nis fun". If the multiline property is set, the latter is matched as well:
RegExp.multiline = true;
Note that RegExp.multiline is a nonstandard legacy property. In ECMAScript 3 and later, multiline mode is selected with the m flag instead, as in /java$/m.
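A short sketch of the i and g flags, plus multiline matching. It assumes a modern engine and therefore uses the standard m flag rather than the legacy RegExp.multiline static property shown above; the sample text is illustrative:

```javascript
const text = "Java, JAVA and java";

// With i only, match() stops at the first occurrence.
const first = text.match(/\bjava\b/i)[0];  // "Java"

// With g added, match() returns every occurrence.
const all = text.match(/\bjava\b/gi);      // ["Java", "JAVA", "java"]

// Without multiline mode, $ matches only at the end of the whole string.
const singleLine = /java$/.test("java\nis fun");  // false

// With the m flag, $ also matches at the end of each line.
const multiLine = /java$/m.test("java\nis fun");  // true

console.log(first, all, singleLine, multiLine);
```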
