JavaScript RegExp Object

A regular expression is an object that describes a pattern of characters.
The RegExp object and the String object in JavaScript define methods that use regular expressions to perform powerful pattern matching and text search-and-replace operations.
In JavaScript, regular expressions are represented by RegExp objects. You can create a RegExp object with the RegExp() constructor, or with a special literal syntax that was added in JavaScript 1.2. Just as a string literal is written as characters enclosed in quotation marks, a regular expression literal is written as characters enclosed between a pair of slashes (/). JavaScript code may therefore contain a line like this:
var pattern = /s$/;
This line creates a new RegExp object and assigns it to the variable pattern. This particular RegExp object matches any string that ends with the letter "s". You can also define an equivalent regular expression with the RegExp() constructor, as follows:
var pattern = new RegExp("s$");
Creating a RegExp object is easy, whether you use a regular expression literal or the RegExp() constructor. The harder task is describing the desired pattern of characters in regular expression syntax.
JavaScript adopts a fairly complete subset of the regular expression syntax used by Perl.
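The two creation styles can be compared side by side. The following is a minimal sketch; the variable names and test strings are made up for illustration, and only the standard test() method and source property are used.

```javascript
// Two ways to build the same pattern: a literal and the RegExp() constructor.
const literalPattern = /s$/;
const constructedPattern = new RegExp("s$");

// Both match any string that ends with the letter "s".
const endsWithS = literalPattern.test("regular expressions");         // true
const alsoEndsWithS = constructedPattern.test("regular expressions"); // true
const noMatch = literalPattern.test("pattern");                       // false

// The constructor takes an ordinary string, so a backslash that belongs
// to the pattern has to be doubled inside the string literal.
const digitLiteral = /\d+/;
const digitConstructed = new RegExp("\\d+");
const sameSource = digitLiteral.source === digitConstructed.source;   // true
```

The constructor form is mainly useful when the pattern text is only known at run time; for fixed patterns the literal form avoids the double escaping.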
The pattern specification of a regular expression consists of a series of characters. Most characters, including all letters and digits, simply describe characters to be matched literally. Thus the regular expression /Java/ matches every string that contains the substring "Java". Other characters in a regular expression are not matched literally but have a special meaning. For example, the regular expression /s$/ contains two characters. The first, "s", matches itself literally. The second, "$", is a special character that matches the end of a string. The regular expression /s$/ therefore matches any string that ends with the letter "s".

1. Literal characters
We have seen that all letters and digits in a regular expression match themselves literally. JavaScript regular expressions also support certain non-alphabetic characters through escape sequences that begin with a backslash (\). For example, the sequence \n matches a literal newline character in a string. Many punctuation characters have special meanings in regular expressions. The characters and their meanings are listed below:
Literal characters in regular expressions
Character         Matches
________________________________
Letter or digit   Itself
\f                Form feed
\n                Newline
\r                Carriage return
\t                Tab
\v                Vertical tab
\/                A literal /
\\                A literal \
\.                A literal .
\*                A literal *
\+                A literal +
\?                A literal ?
\|                A literal |
\(                A literal (
\)                A literal )
\[                A literal [
\]                A literal ]
\{                A literal {
\}                A literal }
\XXX              The ASCII character specified by the octal number XXX
\xnn              The ASCII character specified by the hexadecimal number nn
\cX               The control character ^X. For example, \cI is equivalent to \t, and \cJ is equivalent to \n
___________________________________________________
To use one of these special punctuation characters literally in a regular expression, you must precede it with a backslash (\).
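The effect of the backslash can be checked directly. This is a small sketch with made-up test strings; it only uses the standard test() method.

```javascript
// Without the backslash, "." is a metacharacter; with it, a literal dot.
const unescapedDot = /1.5/.test("125");  // true: "." matched the "2"
const escapedDot = /1\.5/.test("125");   // false: a real dot is required
const literalDot = /1\.5/.test("1.5");   // true

// Other punctuation is escaped the same way.
const slashPath = /\/usr\/bin/.test("/usr/bin/node");  // true
const questionMark = /why\?/.test("why?");             // true

// Escape sequences for characters that are hard to type.
const hasTab = /\t/.test("a\tb");    // true
const hexA = /\x41/.test("A");       // true: hexadecimal 41 is "A"
const controlI = /\cI/.test("\t");   // true: \cI is Ctrl-I, a tab
```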

2. Character classes
Placing individual literal characters inside square brackets forms a character class. A character class matches any one of the characters it contains, so the regular expression /[abc]/ matches any one of the letters "a", "b", or "c". You can also define a negated character class, which matches any character except those contained in the brackets. A negated character class is written with a caret (^) as the first character inside the left bracket. A character class can also use a hyphen to indicate a range of characters; for example, the class of all letters and digits is /[a-zA-Z0-9]/.
Because some character classes are used very often, JavaScript's regular expression syntax includes special characters and escape sequences to represent them. For example, \s matches a space, a tab, or any other whitespace character, and \S matches any character that is not whitespace.
Regular expression character classes
Character   Matches
____________________________________________________
[...]       Any one character between the brackets
[^...]      Any one character not between the brackets
.           Any character except a newline or other line terminator; roughly [^\n\r]
\w          Any word character, equivalent to [a-zA-Z0-9_]
\W          Any non-word character, equivalent to [^a-zA-Z0-9_]
\s          Any whitespace character, such as [ \t\n\r\f\v]
\S          Any non-whitespace character, such as [^ \t\n\r\f\v]
\d          Any digit, equivalent to [0-9]
\D          Any character other than a digit, equivalent to [^0-9]
[\b]        A literal backspace (special case)
________________________________________________________________
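A few of the classes in the table can be exercised as follows. This is an illustrative sketch; the sample strings are arbitrary.

```javascript
// Bracketed classes match exactly one character from (or not from) the set.
const hasVowel = /[aeiou]/.test("sky");         // false: no vowel in "sky"
const hasNonDigit = /[^0-9]/.test("2024");      // false: digits only
const hasAlnum = /[a-zA-Z0-9]/.test("?!x");     // true: "x" is in the class

// \w covers letters, digits, and the underscore.
const underscoreIsWord = /\w/.test("_");        // true

// \S needs at least one character that is not whitespace.
const hasNonSpace = /\S/.test(" \t\n");         // false

// "." does not match a newline.
const dotMatchesNewline = /./.test("\n");       // false

// Inside brackets, \b means backspace rather than a word boundary.
const matchesBackspace = /[\b]/.test("\b");     // true
```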
3. Repetition
With the regular expression syntax described so far, you can describe a two-digit number as /\d\d/ and a four-digit number as /\d\d\d\d/. However, there is no way yet to describe a number with an arbitrary number of digits, or a string of three letters followed by an optional digit. These more complex patterns use regular expression syntax that specifies how many times an element of the expression may be repeated.
The characters that specify repetition always follow the pattern to which they apply. Some kinds of repetition are so common that special characters are reserved for them. For example, + matches one or more occurrences of the previous pattern. The table below summarizes the repetition syntax. First, some examples:
/\d{2,4}/      // Match between two and four digits
/\w{3}\d?/     // Match exactly three word characters and an optional digit
/\s+Java\s+/   // Match "Java" with one or more whitespace characters before and after
/[^"]*/        // Match zero or more characters that are not quotation marks

Regular expression repetition characters
Character   Meaning
__________________________________________________________________
{n,m}       Match the previous item at least n times but no more than m times
{n,}        Match the previous item n or more times
{n}         Match the previous item exactly n times
?           Match the previous item zero or one time; that is, the item is optional. Equivalent to {0,1}
+           Match the previous item one or more times. Equivalent to {1,}
*           Match the previous item zero or more times. Equivalent to {0,}
___________________________________________________________________
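The four example patterns from this section can be run against sample text. This is a sketch only; the input strings are invented, and match() is the standard String method.

```javascript
// {2,4} is greedy: it takes as many digits as it can, up to four.
const year = "Released in 1999".match(/\d{2,4}/)[0];       // "1999"

// Three word characters, then an optional digit.
const code = "abc7".match(/\w{3}\d?/)[0];                  // "abc7"

// "Java" must have whitespace on both sides.
const spaced = /\s+Java\s+/.test("I like Java a lot");     // true
const unspaced = /\s+Java\s+/.test("JavaScript");          // false

// Zero or more non-quote characters: stops at the first quotation mark.
const beforeQuote = 'say "hi"'.match(/[^"]*/)[0];          // "say "
```

Note that * and ? can match zero occurrences, so a pattern such as /[^"]*/ succeeds even on an empty string.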

4. Alternation, grouping, and references
The regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring to previous subexpressions. The | character separates alternatives. For example, /ab|cd|ef/ matches the string "ab", the string "cd", or the string "ef", and /\d{3}|[a-z]{4}/ matches either three digits or four lowercase letters. Parentheses have several purposes in regular expressions. The first is to group separate items into a single subexpression, so that the items can be treated as one unit by *, +, ?, and |. For example, /Java(Script)?/ matches the string "Java" followed by an optional "Script", and /(ab|cd)+|ef/ matches either the string "ef" or one or more repetitions of the strings "ab" or "cd".
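Alternation and grouping can be tried out as below. This is a minimal sketch with arbitrary inputs; the ^ and $ anchors in the last pattern are an addition for the demonstration, so that the whole string has to match.

```javascript
// Alternation: any one of the alternatives may match.
const hasPair = /ab|cd|ef/.test("xxcdxx");                    // true

// Three digits or four lowercase letters.
const digitsOrLetters = /\d{3}|[a-z]{4}/;
const matchedDigits = digitsOrLetters.test("101");            // true
const matchedLetters = digitsOrLetters.test("ABCD wxyz");     // true
const matchedNeither = digitsOrLetters.test("AB 12 cd");      // false

// Grouping lets ? apply to the whole of "Script".
const optionalScript = /Java(Script)?/;
const longForm = "JavaScript".match(optionalScript)[0];       // "JavaScript"
const shortForm = "Java Beans".match(optionalScript)[0];      // "Java"

// Anchored variant of /(ab|cd)+|ef/ (anchors added for this demo).
const repeated = /^((ab|cd)+|ef)$/;
const mixedRepeats = repeated.test("abcdab");                 // true
const singleEf = repeated.test("ef");                         // true
const doubleEf = repeated.test("efef");                       // false
```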
The second purpose of parentheses in a regular expression is to define subpatterns within the complete pattern. When a regular expression is matched against a target string, you can extract the portions of the target string that matched any particular parenthesized subpattern. For example, suppose the pattern we are looking for is one or more letters followed by one or more digits. We can use the pattern /[a-z]+\d+/. But suppose we only really care about the digits at the end of each match. If we put that part of the pattern in parentheses, /[a-z]+(\d+)/, we can extract the digits from any match we find and then process them.
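Extracting the parenthesized part might look like the following sketch. The input string "item42" is made up; match() and parseInt() are standard.

```javascript
// match() returns an array: index 0 is the whole match,
// index 1 is the text matched by the first parenthesized subpattern.
const partMatch = "item42".match(/[a-z]+(\d+)/);

const wholeMatch = partMatch[0];         // "item42"
const digits = partMatch[1];             // "42"

// The captured text is a string; convert it before doing arithmetic.
const asNumber = parseInt(digits, 10);   // 42
```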
Another purpose of parenthesized subexpressions is to let us refer back to a previous subexpression later in the same regular expression. This is done by following a backslash (\) with one or more digits. The digits refer to the position of the parenthesized subexpression within the regular expression. For example, \1 refers to the first parenthesized subexpression and \3 refers to the third. Note that because subexpressions can be nested within others, it is the position of the left parenthesis that is counted.
For example, in the following regular expression, the nested subexpression ([Ss]cript) is referred to as \2:
/([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/
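The numbering by left parenthesis can be seen in the array returned by match(). This is a sketch with an invented sample sentence.

```javascript
const groupPattern = /([Jj]ava([Ss]cript)?)\sis\s(fun\w*)/;
const groups = "JavaScript is funny".match(groupPattern);

// Groups are numbered by the position of their left parenthesis.
const outerGroup = groups[1];   // "JavaScript": first "(" opens the outer group
const nestedGroup = groups[2];  // "Script": second "(" is nested in the first
const thirdGroup = groups[3];   // "funny"

// An optional group that takes no part in the match yields undefined.
const skipped = "Java is fun".match(groupPattern)[2];  // undefined
```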

A reference to a previous subexpression of a regular expression does not refer to the pattern of that subexpression but to the text that matched the pattern. References are therefore more than a shortcut for typing a repeated portion of a regular expression: they can enforce a constraint that separate portions of a string contain exactly the same characters. For example, the following regular expression matches zero or more characters within single or double quotation marks. However, it does not require the opening and closing quotation marks to match (that is, both single or both double):
/['"][^'"]*['"]/

To require the opening and closing quotation marks to match, we can use a reference:
/(['"])[^'"]*\1/
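The difference between the two quote patterns shows up on a string whose quotation marks do not agree. The following sketch uses made-up sample strings.

```javascript
const looseQuotes = /['"][^'"]*['"]/;   // any opening and any closing quote
const strictQuotes = /(['"])[^'"]*\1/;  // closing quote must equal the opening one

const mismatched = `'oops"`;
const matched = `"fine"`;

const looseOnMismatch = looseQuotes.test(mismatched);    // true
const strictOnMismatch = strictQuotes.test(mismatched);  // false
const strictOnMatch = strictQuotes.test(matched);        // true

// The reference repeats the matched text, so group 1 holds the quote used.
const quoteUsed = matched.match(strictQuotes)[1];        // '"'
```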

\1 matches whatever the first parenthesized subexpression matched. In this example, it enforces the constraint that the closing quotation mark match the opening one. Note that if the number following the backslash is greater than the number of parenthesized subexpressions, it is parsed as an octal escape sequence rather than a reference. You can avoid confusion by always writing the escape sequence with a full three digits, for example \044 instead of \44. The alternation, grouping, and reference characters of regular expressions are as follows:
Character   Meaning
____________________________________________________________________
|           Alternation. Match either the subexpression to the left or the subexpression to the right.
(...)       Grouping. Group items into a single unit that can be used with *, +, ?, and |.
\n          Match the same characters that were matched by group number n. Groups are subexpressions within (possibly nested) parentheses. Group numbers are assigned by counting left parentheses from left to right.
____________________________________________________________________

5. Specifying match position
As we have seen, many elements of a regular expression match a single character in a string. For example, \s matches a single whitespace character. Other elements match the zero-width positions between characters rather than actual characters. For example, \b matches a word boundary, that is, the boundary between a \w (word) character and a \W (non-word) character. Elements such as \b do not specify any characters to be used in the matched string; they specify the legal positions at which a match can occur. These elements are sometimes called regular expression anchors, because they anchor the pattern to a specific position in the searched string. The most commonly used anchors are ^, which ties the pattern to the beginning of the string, and $, which anchors the pattern to the end of the string.
For example, to match the word "JavaScript" on a line by itself, we can use the regular expression /^JavaScript$/. If we want to search for "Java" as a word by itself (not as a prefix, as it is in "JavaScript"), we can try the pattern /\sJava\s/, which requires a space before and after the word. But there are two problems with this solution. First, it does not match "Java" at the beginning or end of a string, only when it appears with a space on either side. Second, when this pattern does find a match, the matched string has a leading and a trailing space, which is not what we want. So instead of matching actual space characters with \s, we match word boundaries with \b. The resulting expression is /\bJava\b/.
The regular expression anchor characters are as follows:

Character   Meaning
____________________________________________________________________
^           Match the beginning of the string and, in multiline searches, the beginning of a line
$           Match the end of the string and, in multiline searches, the end of a line
\b          Match a word boundary, that is, the position between a \w character and a \W character, or between a \w character and the beginning or end of the string (note: [\b] matches a backspace)
\B          Match a position that is not a word boundary
_____________________________________________________________________
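The two problems with /\sJava\s/ and the word-boundary fix can be reproduced as follows. This is a sketch; the sample sentences are invented.

```javascript
// ^ and $ force the pattern to span the whole string.
const exactWord = /^JavaScript$/.test("JavaScript");         // true
const withExtra = /^JavaScript$/.test("JavaScript rocks");   // false

// Problem 1: \s needs a real character, so the start of the string fails.
const spaceAtStart = /\sJava\s/.test("Java is fun");         // false
// Problem 2: the matched text includes the surrounding spaces.
const spaceMatch = "I like Java a lot".match(/\sJava\s/)[0]; // " Java "

// \b matches a position, so neither problem occurs.
const boundaryAtStart = /\bJava\b/.test("Java is fun");          // true
const boundaryMatch = "I like Java a lot".match(/\bJava\b/)[0];  // "Java"
const prefixOnly = /\bJava\b/.test("JavaScript");                // false

// \B is the opposite: "Script" must not start at a word boundary.
const insideWord = /\BScript/.test("JavaScript");            // true
const wordStart = /\BScript/.test("Script");                 // false
```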

6. Attributes
There is one final element of regular expression syntax: the attributes of a regular expression, which specify high-level pattern-matching rules. Unlike the rest of the syntax, attributes are specified outside the / characters. That is, they do not appear between the two slashes but follow the second slash. JavaScript 1.2 supports two attributes. The i attribute specifies that pattern matching should be case-insensitive. The g attribute specifies that pattern matching should be global, that is, all matches within the searched string should be found. The two attributes can be combined to perform a global, case-insensitive match.
For example, to do a case-insensitive search for the first occurrence of the word "Java" (or "java", "JAVA", and so on), we can use the case-insensitive regular expression /\bJava\b/i. To find all occurrences of the word in a string, we can add the g attribute: /\bJava\b/gi.
The regular expression attributes are as follows:

Character   Meaning
_________________________________________
i           Perform case-insensitive matching
g           Perform a global match, that is, find all matches rather than stopping after the first one
_________________________________________
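The effect of each attribute is easiest to see with the String methods match() and replace(). This is a short sketch; the sample text and the replacement string "JS" are arbitrary.

```javascript
const sample = "java Java JAVA javascript";

// With only i, match() stops at the first occurrence.
const firstOnly = sample.match(/\bJava\b/i);
const firstText = firstOnly[0];      // "java"
const firstIndex = firstOnly.index;  // 0

// With g added, match() returns every occurrence.
const allMatches = sample.match(/\bJava\b/gi);  // ["java", "Java", "JAVA"]

// replace() also honours g: every whole-word occurrence is replaced.
const replaced = sample.replace(/\bJava\b/gi, "JS");  // "JS JS JS javascript"
```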
Apart from g and i, JavaScript 1.2 regular expressions have no other attribute-like features. However, if the static property multiline of the RegExp constructor is set to true, pattern matching is performed in multiline mode. In this mode, the anchors ^ and $ match not only the beginning and end of the searched string but also the beginning and end of any line within it. For example, the pattern /Java$/ matches "Java" but does not match "Java\nis fun". (Later versions of JavaScript standardized this behavior as a third attribute, m, written /Java$/m; the static RegExp.multiline property is non-standard and may be unavailable in current engines.) If the multiline property is set, the latter string is also matched:
RegExp.multiline = true;
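In engines where the static multiline property has no effect, the same behavior can be sketched with the standard m attribute. The sample string comes from the example above; the variable names are illustrative.

```javascript
const twoLines = "Java\nis fun";

// Without m, $ matches only at the very end of the string.
const endOfString = /Java$/.test(twoLines);    // false
// With m, $ also matches just before a line break.
const endOfLine = /Java$/m.test(twoLines);     // true

// The same applies to ^ at the start of the second line.
const startOfString = /^is/.test(twoLines);    // false
const startOfLine = /^is/m.test(twoLines);     // true
```

Because the m attribute belongs to a single pattern, it does not change how other regular expressions in the program behave, unlike a global switch.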
