[Reprinted] JavaScript Regular Expressions

Source: Internet
Author: User

A regular expression is an object that describes a pattern of characters.
The RegExp and String objects in JavaScript define methods that use regular expressions to perform powerful pattern matching and text search-and-replace operations.
In JavaScript, regular expressions are represented by RegExp objects. You can create a RegExp object with the RegExp() constructor, or you can use JavaScript's regular expression literal syntax.
A regular expression literal is written as a series of characters between a pair of slashes (/). JavaScript code may therefore contain a line like this:

var pattern = /S$/;

This line of code creates a new RegExp object and assigns it to the variable pattern. This particular RegExp object matches any string that ends with the letter "S". The RegExp() constructor can also be used to define an equivalent regular expression, as follows:

var pattern = new RegExp("S$");
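As a minimal sketch of the two forms above, the following compares a literal with a constructor call. The variable names and test strings are made up for illustration.

```javascript
// Two ways to build the same pattern: a literal and the RegExp() constructor.
var literal = /S$/;
var constructed = new RegExp("S$");

// Both objects carry the same pattern text.
console.log(literal.source === constructed.source); // true

// Both give the same answers on the same input.
console.log(literal.test("JAVAS"), constructed.test("JAVAS")); // true true
console.log(literal.test("Java"), constructed.test("Java"));   // false false
```

The literal form is usually more convenient when the pattern is fixed; the constructor is useful when the pattern text is only known at run time.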

Creating a RegExp object is easy, whether with a regular expression literal or with the RegExp() constructor. The harder task is describing the desired pattern of characters in regular expression syntax.
JavaScript uses a fairly complete subset of Perl's regular expression syntax.
A regular expression pattern is composed of a series of characters. Most characters, including all letters and digits, simply match themselves literally. Thus the regular expression /Java/ matches any string that contains the substring "Java". Other characters in a regular expression do not match literally but have special meanings. The regular expression /S$/ contains two characters. The first, "S", matches itself literally. The second, "$", is a special character that matches the end of a string. The regular expression /S$/ therefore matches any string that ends with the letter "S".
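The difference between a literal character and a special character can be checked with the test() method. This is a small sketch; the sample strings are assumptions chosen for illustration.

```javascript
// /Java/ contains only literal characters, so it matches anywhere in the string.
var containsJava = /Java/;
console.log(containsJava.test("I like JavaScript")); // true
console.log(containsJava.test("I like coffee"));     // false

// In /S$/ the "S" is literal and the "$" anchors the match to the end.
var endsWithS = /S$/;
console.log(endsWithS.test("CLASS"));  // true
console.log(endsWithS.test("CLASS!")); // false, the string ends with "!"
```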

The details are described below.
As we have seen, all letters and digits in a regular expression match themselves literally. JavaScript regular expressions also support certain non-alphabetic characters through escape sequences that begin with a backslash (\). For example, the sequence "\n" matches a literal newline character in a string. Many punctuation characters have special meanings in regular expressions. The following table lists these characters and their meanings:
Literal characters in regular expressions
Character            Matches
________________________________
Letters and digits   Themselves
\f                   Form feed
\n                   Line feed (newline)
\r                   Carriage return
\t                   Tab
\v                   Vertical tab
\/                   A literal /
\\                   A literal \
\.                   A literal .
\*                   A literal *
\+                   A literal +
\?                   A literal ?
\|                   A literal |
\(                   A literal (
\)                   A literal )
\[                   A literal [
\]                   A literal ]
\{                   A literal {
\}                   A literal }
\XXX                 The ASCII character specified by the octal number XXX
\xnn                 The ASCII character specified by the hexadecimal number nn
\cX                  The control character ^X. For example, \cI is equivalent to \t, and \cJ is equivalent to \n
___________________________________________________
To use any of these special punctuation characters literally in a regular expression, you must precede it with a backslash (\).
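The effect of the backslash can be seen by comparing an escaped and an unescaped special character. The patterns and strings below are illustrative assumptions, not taken from the original text.

```javascript
// An unescaped "." is special and matches any character here.
var anyChar = /a.c/;
console.log(anyChar.test("abc")); // true

// An escaped "\." matches only a literal period.
var literalDot = /a\.c/;
console.log(literalDot.test("abc")); // false
console.log(literalDot.test("a.c")); // true

// An escaped "\+" matches a literal plus sign.
var onePlusOne = /1\+1/;
console.log(onePlusOne.test("1+1")); // true
console.log(onePlusOne.test("11"));  // false

// Hexadecimal and control-character escapes from the table.
console.log(/\x41/.test("A"));  // true, 0x41 is "A"
console.log(/\cI/.test("\t"));  // true, ^I is a tab
console.log(/\cJ/.test("\n"));  // true, ^J is a newline
```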

2. Character classes
Individual literal characters can be combined into a character class by placing them within square brackets. A character class matches any one character that it contains, so the regular expression /[abc]/ matches any one of the letters "a", "b", or "c". You can also define a negated character class, which matches any character except those contained within the brackets. To define a negated character class, place a ^ as the first character inside the left bracket; for example, /[^abc]/ matches any one character other than "a", "b", or "c". A character class can use a hyphen to indicate a range of characters, so the class of all letters and digits is /[a-zA-Z0-9]/. Because some character classes are very commonly used, JavaScript's regular expression syntax includes special characters and escape sequences to represent them. For example, \s matches the space character, the tab character, and other whitespace characters, and \S matches any character that is not whitespace.
Character classes in regular expressions

Character    Matches
____________________________________________________
[...]        Any one character between the brackets
[^...]       Any one character not between the brackets
.            Any character except a line terminator (\n, \r, and the Unicode line and paragraph separators)
\w           Any word character, equivalent to [a-zA-Z0-9_]
\W           Any non-word character, equivalent to [^a-zA-Z0-9_]
\s           Any whitespace character, including those in [ \t\n\r\f\v]
\S           Any non-whitespace character, the opposite of \s
\d           Any digit, equivalent to [0-9]
\D           Any character other than a digit, equivalent to [^0-9]
[\b]         A literal backspace (special case)
________________________________________________________________
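A few of the classes in the table can be tried directly. This is a sketch with made-up sample strings.

```javascript
// A bracketed class matches any one of the listed characters.
console.log(/[abc]/.test("cat"));  // true, "c" is in the class
console.log(/[abc]/.test("dog"));  // false

// A negated class matches any one character that is NOT listed.
console.log(/[^abc]/.test("cab"));  // false, every character is a, b, or c
console.log(/[^abc]/.test("cabs")); // true, "s" is not in the class

// Escape-sequence shorthands.
console.log(/\d/.test("room 101"));  // true, contains a digit
console.log(/\s/.test("no_spaces")); // false, contains no whitespace
console.log(/\w/.test("_"));         // true, underscore is a word character

// Inside brackets, \b means a backspace character, not a word boundary.
console.log(/[\b]/.test("\b")); // true
```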

3. Repetition
With the regular expression syntax described so far, we can describe a two-digit number as /\d\d/ and a four-digit number as /\d\d\d\d/. However, we have no way to describe a number with an arbitrary number of digits, or a string of three letters followed by an optional digit. Such patterns use regular expression syntax that specifies how many times an element of the expression may be repeated. The characters that specify repetition always follow the pattern to which they apply. Because some kinds of repetition are very common, special characters are used to represent them. For example, + matches one or more occurrences of the preceding pattern. The table that follows lists the repetition syntax. First, some examples:
/\d{2,4}/       // Matches between two and four digits.
/\w{3}\d?/      // Matches exactly three word characters followed by an optional digit.
/\s+Java\s+/    // Matches the string "Java" with one or more whitespace characters before and after it.
/[^"]*/         // Matches zero or more characters that are not quotation marks.

Repetition characters in regular expressions
Character    Meaning
__________________________________________________________________
{n,m}        Match the previous item at least n times but no more than m times.
{n,}         Match the previous item n or more times.
{n}          Match the previous item exactly n times.
?            Match zero or one occurrence of the previous item; that is, the previous item is optional. Equivalent to {0,1}.
+            Match one or more occurrences of the previous item. Equivalent to {1,}.
*            Match zero or more occurrences of the previous item. Equivalent to {0,}.
___________________________________________________________________
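The repetition examples from this section can be exercised with String.prototype.match() and test(). The input strings are assumptions chosen to show where each match stops.

```javascript
// {2,4} takes as many digits as it can, up to four.
var digits = "phone 123456".match(/\d{2,4}/)[0];
console.log(digits); // "1234"

// Three word characters, then an optional digit.
var withDigit = "abc1".match(/\w{3}\d?/)[0];
var withoutDigit = "abcd".match(/\w{3}\d?/)[0];
console.log(withDigit);    // "abc1"
console.log(withoutDigit); // "abc", "d" is not a digit so \d? matches nothing

// "Java" must have whitespace on both sides.
console.log(/\s+Java\s+/.test("I like  Java  best")); // true
console.log(/\s+Java\s+/.test("Java"));               // false

// Zero or more non-quote characters: the match stops at the first quote.
var beforeQuote = 'say "hi"'.match(/[^"]*/)[0];
console.log(beforeQuote); // "say "
```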

4. Alternation, grouping, and references
Regular expression syntax also includes special characters for specifying alternatives, grouping subexpressions, and referring to previous subexpressions. The | character separates alternatives. For example, /ab|cd|ef/ matches the string "ab", the string "cd", or the string "ef", and /\d{3}|[a-z]{4}/ matches either three digits or four lowercase letters.
Parentheses have several purposes in regular expressions. Their main purpose is to group separate items into a single subexpression, so that the items can be treated as a single unit by *, +, ?, and |. For example, /Java(Script)?/ matches the string "Java" followed by an optional "Script", and /(ab|cd)+|ef/ matches either the string "ef" or one or more repetitions of the strings "ab" or "cd".
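Alternation and grouping can be sketched as follows; the sample strings are illustrative assumptions.

```javascript
// Alternation: any one of the alternatives may match.
console.log(/ab|cd|ef/.test("xxcdxx"));      // true
console.log(/\d{3}|[a-z]{4}/.test("abcd"));  // true, four lowercase letters
console.log(/\d{3}|[a-z]{4}/.test("12"));    // false, too short for either branch

// Grouping: "?" applies to the whole group "(Script)".
var full = "JavaScript".match(/Java(Script)?/)[0];
var short = "Java Beans".match(/Java(Script)?/)[0];
console.log(full);  // "JavaScript"
console.log(short); // "Java"

// Grouping: "+" repeats the whole group "(ab|cd)".
var repeated = "abcdab!".match(/(ab|cd)+|ef/)[0];
console.log(repeated); // "abcdab"
```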
The second purpose of parentheses in regular expressions is to define subpatterns within the complete pattern. When a regular expression is successfully matched against a target string, it is possible to extract the portions of the target string that matched any particular parenthesized subpattern.
For example, suppose we are looking for one or more lowercase letters followed by one or more digits. We might use the pattern /[a-z]+\d+/. But suppose we only really care about the digits at the end of each match. If we put that part of the pattern in parentheses, as in /[a-z]+(\d+)/, we can extract the digits from any match we find and then parse them.
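One way to extract the parenthesized digits is RegExp.prototype.exec(), whose result array holds the full match at index 0 and each group after it. The input string below is a made-up example.

```javascript
// Letters followed by digits; the parentheses capture only the digits.
var lettersThenDigits = /[a-z]+(\d+)/;
var result = lettersThenDigits.exec("order abc123 shipped");

console.log(result[0]); // "abc123", the whole match
console.log(result[1]); // "123", the text matched by the first group

// The captured text is a string, so convert it before doing arithmetic.
var number = Number(result[1]);
console.log(number + 1); // 124
```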
Another purpose of parenthesized subexpressions is to allow us to refer back to a subexpression later in the same regular expression. This is done by following a backslash (\) with one or more digits. The digits refer to the position of the parenthesized subexpression within the regular expression. For example, \1 refers back to the first parenthesized subexpression, and \3 refers to the third. Note that because subexpressions can be nested within other subexpressions, it is the position of the left parenthesis that is counted.
In the following regular expression, for example, the nested subexpression ([Ss]cript) is referred to as \2:

/([Jj]ava([Ss]cript))\sis\s(fun\w*)/
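The numbering of nested groups can be confirmed by inspecting the array returned by exec(). This sketch assumes the sample sentence shown; group numbers follow the order of the left parentheses.

```javascript
var nested = /([Jj]ava([Ss]cript))\sis\s(fun\w*)/;
var groups = nested.exec("JavaScript is funny");

console.log(groups[0]); // "JavaScript is funny", the whole match
console.log(groups[1]); // "JavaScript", the first left parenthesis
console.log(groups[2]); // "Script", the second (nested) left parenthesis
console.log(groups[3]); // "funny", the third left parenthesis
```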

A reference to a previous subexpression of a regular expression does not refer to the pattern of that subexpression but to the text that matched the pattern. Thus, references are not simply a shortcut for typing a repeated portion of a regular expression. They also enforce a constraint that separate portions of a string contain exactly the same characters. For example, the following regular expression matches zero or more characters within single or double quotation marks. However, it does not require the opening and closing quotation marks to match (that is, both single quotes or both double quotes):

/['"][^'"]*['"]/

To require the opening and closing quotation marks to match, we can use a reference:

/(['"])[^'"]*\1/
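The two quotation-mark patterns can be compared on a string whose quotes do not agree. The variable names and inputs are hypothetical.

```javascript
// Without a reference, any quote may close any quote.
var loose = /['"][^'"]*['"]/;
// With \1, the closing quote must be the same character the group matched.
var strict = /(['"])[^'"]*\1/;

var mismatched = "'mixed\""; // opens with ' and closes with "

console.log(loose.test(mismatched));  // true
console.log(strict.test(mismatched)); // false

console.log(strict.test("'single'"));  // true
console.log(strict.test('"double"'));  // true
```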

The \1 matches whatever the first parenthesized subexpression matched. In this example, it enforces the constraint that the closing quotation mark match the opening quotation mark. Note that if the number following the backslash is greater than the number of parenthesized subexpressions, it is parsed as an octal escape sequence instead of a reference. You can avoid this confusion by always writing the escape sequence with its full three digits, for example \044 instead of \44. The following table summarizes the alternation, grouping, and reference characters of regular expressions:
Character    Meaning
____________________________________________________________________
|            Alternation. Match either the subexpression to the left or the subexpression to the right.
(...)        Grouping. Group items into a single unit that can be used with *, +, ?, and |.
\n           Match the same characters that were matched by group number n. Groups are subexpressions within (possibly nested) parentheses. Group numbers are assigned by counting left parentheses from left to right.
____________________________________________________________________

5. Specifying the match position
As we have seen, many elements of a regular expression match a single character in a string. For example, \s matches a single whitespace character. Other regular expression elements match the zero-width positions between characters rather than actual characters. For example, \b matches a word boundary, that is, the boundary between a \w (word) character and a \W (non-word) character. Elements such as \b do not specify any characters to be used in a matched string; they specify the legal positions at which a match can occur. These elements are sometimes called regular expression anchors, because they anchor the pattern to a specific position in the search string. The most commonly used anchor elements are ^, which ties the pattern to the beginning of the string, and $, which anchors the pattern to the end of the string.
For example, to match the word "JavaScript" on a line by itself, we can use the regular expression /^JavaScript$/. To search for the word "Java" itself (not as a prefix, as it is in "JavaScript"), we might try the pattern /\sJava\s/, which requires a space before and after the word. However, there are two problems with this solution. First, it does not match "Java" if that word appears at the beginning or the end of a string, because there is no space on one side of it. Second, when this pattern does find a match, the matched string it returns has leading and trailing spaces, which is not what we want. So instead of matching actual space characters with \s, we match (or anchor to) word boundaries with \b. The resulting expression is /\bJava\b/.
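Both problems with the whitespace-based pattern, and the way \b avoids them, can be seen in a short sketch. The sample sentences are assumptions.

```javascript
var spaced = /\sJava\s/;   // requires real whitespace characters
var bounded = /\bJava\b/;  // requires only zero-width word boundaries

// Problem 1: no space before "Java" at the start of the string.
console.log(spaced.test("Java is fun"));  // false
console.log(bounded.test("Java is fun")); // true

// Problem 2: the whitespace becomes part of the matched text.
var spacedMatch = "I use Java daily".match(spaced)[0];
var boundedMatch = "I use Java daily".match(bounded)[0];
console.log(JSON.stringify(spacedMatch));  // " Java "
console.log(JSON.stringify(boundedMatch)); // "Java"

// "Java" as a prefix of "JavaScript" is not a whole word.
console.log(bounded.test("JavaScript is fun")); // false
```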
The following table lists the regular expression anchor characters:

Character    Meaning
____________________________________________________________________
^            Match the beginning of the string and, in multiline searches, the beginning of a line.
$            Match the end of the string and, in multiline searches, the end of a line.
\b           Match a word boundary, that is, the position between a \w character and a \W character. (Note: [\b] matches a backspace.)
\B           Match a position that is not a word boundary.
_____________________________________________________________________
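The remaining anchors in the table can be tried the same way. This is a small sketch with illustrative strings.

```javascript
// ^ ties the pattern to the start of the string, $ to the end.
console.log(/^Java/.test("Java is fun")); // true
console.log(/^Java/.test("I like Java")); // false
console.log(/fun$/.test("Java is fun"));  // true

// Both anchors together require the whole string to match.
console.log(/^JavaScript$/.test("JavaScript"));       // true
console.log(/^JavaScript$/.test("JavaScript rocks")); // false

// \B is the opposite of \b: "Script" must NOT start at a word boundary.
console.log(/\BScript/.test("JavaScript")); // true, "Script" starts mid-word
console.log(/\BScript/.test("Script"));     // false, "Script" starts the word
```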

6. Attributes
There is one final element of regular expression grammar: regular expression attributes (also called flags), which specify high-level pattern-matching rules. Unlike the rest of regular expression syntax, attributes are specified outside the / characters. That is, they do not appear between the two slashes but after the second slash. JavaScript 1.2 supports two attributes. The i attribute specifies that pattern matching should be case-insensitive. The g attribute specifies that pattern matching should be global, that is, all matches within the searched string should be found. The two attributes can be combined to perform a global, case-insensitive match.
For example, to do a case-insensitive search for the first occurrence of the word "java" (or "Java", "JAVA", and so on), we can use the case-insensitive regular expression /\bjava\b/i. To find all occurrences of the word in a string, we add the g attribute: /\bjava\b/gi.
The following table lists the regular expression attributes:

Character    Meaning
_________________________________________
i            Perform case-insensitive matching.
g            Perform a global match, that is, find all matches rather than stopping after the first one.
_________________________________________
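The effect of each attribute can be sketched with String.prototype.match(), which returns every match when the pattern has the g attribute. The sample text is an assumption.

```javascript
var text = "java, Java and JAVA";

// No attributes: case-sensitive, so only the exact spelling is found.
var exact = text.match(/\bJava\b/g);
console.log(exact); // [ "Java" ]

// i only: case-insensitive, but the search stops at the first match.
var first = text.match(/\bjava\b/i)[0];
console.log(first); // "java"

// g and i combined: every occurrence, in any letter case.
var all = text.match(/\bjava\b/gi);
console.log(all); // [ "java", "Java", "JAVA" ]
```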
In JavaScript 1.2, regular expressions have no attribute-like features other than g and i. However, if the static multiline property of the RegExp constructor is set to true, pattern matching is performed in multiline mode. In this mode, the anchor characters ^ and $ match not only the beginning and end of the search string but also the beginning and end of each line within it. For example, the pattern /Java$/ matches "Java" but does not match "Java\nis fun". If the multiline property is set, the latter string is matched as well:
RegExp.multiline = true;
Note that the static RegExp.multiline property is a legacy, non-standard feature; in standard JavaScript the same mode is selected per pattern with the m attribute, as in /Java$/m.
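The multiline behavior described here can be sketched with the standard m attribute, which is an assumption about the target engine rather than something the JavaScript 1.2 text above uses. The static RegExp.multiline property itself is not relied on in this sketch.

```javascript
var twoLines = "Java\nis fun";

// Without multiline mode, $ matches only at the very end of the string.
console.log(/Java$/.test("Java"));   // true
console.log(/Java$/.test(twoLines)); // false

// With the m attribute, $ also matches at the end of each line.
console.log(/Java$/m.test(twoLines)); // true

// The same applies to ^ at the beginning of a line.
console.log(/^is/.test(twoLines));  // false
console.log(/^is/m.test(twoLines)); // true
```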
