JS Regular expression

Last Update:2014-07-29 Source: Internet

Author: User

A detailed guide to using regular expressions

Brief introduction

Simply put, a regular expression is a powerful tool for pattern matching and substitution. Its main uses are as follows:

Testing a string for a pattern. For example, you can test an input string to see whether it contains a phone number pattern or a credit card number pattern. This is called data validation.

Replacing text. You can use a regular expression to identify specific text in a document, and then either delete it or replace it with other text.

Extracting a substring from a string based on a pattern match. This can be used to find specific text in a document or an input field.
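The three uses above can be sketched in JavaScript as follows. This is only an illustration: the sample text and the simplified phone pattern are assumptions made for the example, not a real validation rule, and the \d and {n} notation it uses is explained later in this article.

```javascript
// Hypothetical sample text; the phone pattern is deliberately simplified.
const sampleText = "Call 555-1234 or 555-9876 today.";
const phonePattern = /\d{3}-\d{4}/;

// 1. Test: does the pattern occur anywhere in the string?
const hasPhone = phonePattern.test(sampleText);

// 2. Replace: swap every match for other text (the g flag means "all matches").
const masked = sampleText.replace(/\d{3}-\d{4}/g, "XXX-XXXX");

// 3. Extract: pull the first matching substring out of the string.
const firstMatch = sampleText.match(phonePattern)[0];

console.log(hasPhone);   // true
console.log(masked);     // Call XXX-XXXX or XXX-XXXX today.
console.log(firstMatch); // 555-1234
```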

Basic syntax

Now that we have a preliminary understanding of what regular expressions are for, let's take a closer look at their syntax.

Regular expressions are generally in the following form:

/love/

The part between the "/" delimiters is the pattern to be matched in the target object. You simply put the pattern you want to find between the two "/" delimiters. To let you define patterns more flexibly, regular expressions provide special "metacharacters". A metacharacter is a character with a special meaning in a regular expression. The ones introduced first here specify how many times the preceding character (that is, the character immediately in front of the metacharacter) may appear in the target object.

The most commonly used metacharacters are "+", "*", and "?".

The "+" metacharacter specifies that its preceding character must appear one or more times in a row in the target object.

The "*" metacharacter specifies that its preceding character must appear zero or more times in a row in the target object.

The "?" metacharacter specifies that its preceding character must appear zero times or one time in the target object.

Below, let's look at how these metacharacters are used in practice.

/fo+/ contains the "+" metacharacter, so it matches strings in the target object such as "fool", "fo", or "football", in which the letter f is followed by one or more consecutive letters o.

/eg*/ contains the "*" metacharacter, so it matches strings in the target object such as "easy", "ego", or "egg", in which the letter e is followed by zero or more consecutive letters g.

/wil?/ contains the "?" metacharacter, so it matches strings in the target object such as "win" or "wilson", in which the letters wi are followed by zero or one letter l.
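To see exactly which part of each word these three patterns pick up, the examples can be run in JavaScript. This is a minimal sketch; firstHit is a hypothetical helper written for this example, not a built-in function.

```javascript
// Hypothetical helper: returns the matched text, or null when nothing matches.
const firstHit = (pattern, text) => {
  const m = text.match(pattern);
  return m ? m[0] : null;
};

// "+": f followed by one or more o. A lone "f" does not match.
const plusHits = ["fool", "fo", "football", "f"].map((s) => firstHit(/fo+/, s));

// "*": e followed by zero or more g.
const starHits = ["easy", "ego", "egg"].map((s) => firstHit(/eg*/, s));

// "?": wi followed by zero or one l.
const optionalHits = ["win", "wilson"].map((s) => firstHit(/wil?/, s));

console.log(plusHits);     // [ 'foo', 'fo', 'foo', null ]
console.log(starHits);     // [ 'e', 'eg', 'egg' ]
console.log(optionalHits); // [ 'wi', 'wil' ]
```

Note that a match covers only the part of the word that fits the pattern, so "football" yields "foo" rather than the whole word.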

Sometimes you don't know how many characters to match. To be able to adapt to this uncertainty, the regular expression supports the concept of qualifiers. These qualifiers can specify how many times a given component of a regular expression must appear to satisfy a match.

{n}: n is a non-negative integer. Matches exactly n times. For example, 'o{2}' cannot match the single 'o' in 'Bob', but matches the two o's in 'food'.

{n,}: n is a non-negative integer. Matches at least n times. For example, 'o{2,}' cannot match the 'o' in 'Bob', but matches all the o's in 'Foooood'. 'o{1,}' is equivalent to 'o+'. 'o{0,}' is equivalent to 'o*'.

{n,m}: m and n are non-negative integers, where n <= m. Matches at least n times and at most m times. For example, 'o{1,3}' matches the first three o's in 'Fooooood'. 'o{0,1}' is equivalent to 'o?'. Note that there must be no spaces between the comma and the two numbers.

In other words, beyond the "+", "*", and "?" metacharacters, you can specify exactly how many times a pattern may appear in the matching object. For example, /jim{2,6}/ specifies that the character m must appear 2 to 6 times in a row, so it matches strings such as "jimmy" or "jimmmmmy".
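The qualifier examples can be checked with a short JavaScript sketch. The test string manyOs is built with repeat() purely for illustration, so the number of o's is explicit.

```javascript
// "F", then five o's, then "d" (constructed so the count is unambiguous).
const manyOs = "F" + "o".repeat(5) + "d";

const exactlyTwo = "food".match(/o{2}/)[0];   // the two o's in "food"
const bobHasTwo = /o{2}/.test("Bob");         // "Bob" has only one o
const atLeastTwo = manyOs.match(/o{2,}/)[0];  // greedy: takes all five o's
const oneToThree = manyOs.match(/o{1,3}/)[0]; // stops after three o's
const jimmy = "jimmy".match(/jim{2,6}/)[0];   // "ji" plus two m's
const jimAlone = /jim{2,6}/.test("jim");      // only one m, so no match

console.log(exactlyTwo, bobHasTwo, atLeastTwo, oneToThree, jimmy, jimAlone);
// oo false ooooo ooo jimm false
```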

Now that we have a preliminary understanding of how to use regular expressions, let's take a look at some other important metacharacters.

\s: matches a single whitespace character, including spaces, tabs, and line breaks;

\S: matches any single character that is not a whitespace character;

\d: matches a digit from 0 to 9;

\w: matches a letter, a digit, or the underscore character;

\W: matches any character that \w does not match;

. : matches any single character except a line break.

(Note: \s and \S, and likewise \w and \W, can be thought of as inverses of each other.)

Below, we'll look at examples of how to use these metacharacters in regular expressions.

/\s+/ matches one or more consecutive whitespace characters in the target object.

/\d000/ matches any digit followed by three zeros. If we have a complex financial statement in hand, this pattern makes it easy to find amounts in the thousands, such as 3000 or 8000.
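The character-class metacharacters can be tried out in JavaScript as follows. This is a rough sketch with made-up input strings. It also shows that /\d000/ is a loose pattern: it finds "2000" inside a larger number such as 42000.

```javascript
// \s+ : split on any run of spaces, tabs, or line breaks.
const words = "one \t two\nthree".split(/\s+/);

// \d000 : a digit followed by three zeros (also matches inside 42000).
const thousands = "Rent 3000, tax 450, bonus 42000".match(/\d000/g);

// \W : remove everything that is not a letter, digit, or underscore.
const wordCharsOnly = "user_01!?".replace(/\W/g, "");

// \S : collect every non-whitespace character.
const nonSpace = "a b".match(/\S/g);

// . : does not match a line break.
const dotOnNewline = /./.test("\n");

console.log(words);         // [ 'one', 'two', 'three' ]
console.log(thousands);     // [ '3000', '2000' ]
console.log(wordCharsOnly); // user_01
console.log(nonSpace);      // [ 'a', 'b' ]
console.log(dotOnNewline);  // false
```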

In addition to the metacharacters described above, regular expressions have another kind of special character: the locator (also called an anchor). A locator specifies where the matching pattern must appear in the target object. The most commonly used locators are "^", "$", "\b", and "\B".

The "^" locator specifies that the matching pattern must appear at the beginning of the target string.

The "$" locator specifies that the matching pattern must appear at the end of the target string.

The "\b" locator specifies that the matching pattern must sit at a word boundary, that is, at the beginning or the end of a word in the target string.

The "\B" locator specifies that the matching pattern must not sit at a word boundary.

That is, the position where "\B" appears cannot be the beginning or the end of a word.

Similarly, we can think of "^" and "$" as a complementary pair of locators, and of "\b" and "\B" as a pair of opposites. For example:

/^hell/ contains the "^" locator, so it matches target strings that begin with "hell", such as "hell", "hello", or "hellhound".

/ar$/ contains the "$" locator, so it matches target strings that end with "ar", such as "car", "bar", or "ar".

/\bbom/ begins with the "\b" locator, so it matches words in the target object that start with "bom", such as "bomb" or "bom".

/man\b/ ends with the "\b" locator, so it matches words in the target object that end with "man", such as "human", "woman", or "man".
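The locator examples can be verified with RegExp test() in JavaScript. The sketch below pairs each matching string with a near miss chosen for illustration, and adds one "\B" case.

```javascript
const anchorChecks = {
  helloStart: /^hell/.test("hello"),   // begins with "hell"
  shellStart: /^hell/.test("shell"),   // "hell" is not at the start
  carEnd: /ar$/.test("car"),           // ends with "ar"
  cartEnd: /ar$/.test("cart"),         // "ar" is not at the end
  bombWord: /\bbom/.test("a bomb"),    // "bom" starts a word
  abombWord: /\bbom/.test("abomb"),    // "bom" is inside a word
  womanEnd: /man\b/.test("woman"),     // "man" ends a word
  manualEnd: /man\b/.test("manual"),   // "man" is followed by more letters
  insideWord: /\Bman/.test("woman"),   // "man" does not start the word
  wordStart: /\Bman/.test("man"),      // "man" starts the word, so \B fails
};

console.log(anchorChecks);
```

Each matching case is followed by its near miss, so the values alternate between true and false from top to bottom.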

In order to make it easier for users to set the matching pattern, the regular expression allows the user to specify a range in the matching pattern rather than a specific character. For example:

/[A-Z]/ matches any single uppercase letter in the range A to Z.

/[a-z]/ matches any single lowercase letter in the range a to z.

/[0-9]/ matches any single digit in the range 0 to 9.

/([a-z][A-Z][0-9])+/ matches one or more repetitions of a lowercase letter, an uppercase letter, and a digit, in that order, such as "aB0".

Note that you can use "()" in a regular expression to group parts of a pattern together. Everything inside "()" must appear together, in order, in the target object. Therefore, the preceding regular expression will not match a string such as "aBc", because the last character in "aBc" is a letter rather than a digit.
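Ranges and grouping can be sketched in JavaScript as follows. The pattern here assumes the grouped example is meant to be lowercase letter, uppercase letter, digit, which is what the sample string "aB0" suggests. The input strings are made up for the example.

```javascript
// One or more repetitions of: lowercase letter, uppercase letter, digit.
const triple = /([a-z][A-Z][0-9])+/;

const single = "aB0".match(triple)[0];  // one repetition
const repeated = "aB0cD1".match(triple);
const wholeMatch = repeated[0];         // both repetitions together
const lastGroup = repeated[1];          // the group keeps its last repetition
const rejected = triple.test("aBc");    // third character is not a digit

// Plain ranges, collected with the g flag.
const upperOnly = "aBcD".match(/[A-Z]/g);
const digitsOnly = "x7y9".match(/[0-9]/g);

console.log(single, wholeMatch, lastGroup, rejected);
// aB0 aB0cD1 cD1 false
console.log(upperOnly, digitsOnly);
// [ 'B', 'D' ] [ '7', '9' ]
```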

To implement an "or" operation similar to logical OR in programming, use the pipe symbol "|" to match any one of several alternative patterns. For example, /to|too|2/ matches "to", "too", or "2" in the target object. Alternatives are tried from left to right, so in the word "too" this particular pattern stops at "to"; put the longer alternative first (/too|to|2/) if the whole word should be matched.
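The effect of the order of alternatives can be seen in a short JavaScript sketch. The sentence used as input is made up for the example.

```javascript
const sentence = "I have 2 tickets too";

// Alternatives are tried left to right: "to" wins before "too" is tried.
const shortFirst = sentence.match(/to|too|2/g);

// With the longer alternative first, the whole word "too" is matched.
const longFirst = sentence.match(/too|to|2/g);

console.log(shortFirst); // [ '2', 'to' ]
console.log(longFirst);  // [ '2', 'too' ]
```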

Another commonly used operator in regular expressions is the negation "[^]". Unlike the "^" locator described earlier, the negation "[^]" matches any single character that is not listed inside the brackets. For example, /[^a-c]/ matches any character in the target object except a, b, and c. In general, "^" is a negation operator when it is the first character inside "[]"; when it appears outside "[]", it should be treated as a locator.
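The two roles of "^" can be compared side by side in JavaScript. This is a minimal sketch with sample strings chosen for illustration.

```javascript
// Inside brackets: "^" negates the set, so this matches anything but a, b, c.
const notAtoC = "abcdef".match(/[^a-c]/g);
const onlyAtoC = /[^a-c]/.test("abc"); // every character is a, b, or c

// Outside brackets: "^" anchors the match to the start of the string.
const startsInRange = /^[a-c]/.test("cab");
const startsOutOfRange = /^[a-c]/.test("dab");

console.log(notAtoC);          // [ 'd', 'e', 'f' ]
console.log(onlyAtoC);         // false
console.log(startsInRange);    // true
console.log(startsOutOfRange); // false
```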

Finally, when you need a metacharacter to be matched literally, put the escape character "\" in front of it. For example, /th\*/ matches "th*" in the target object, but not "the".
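The difference between an escaped and an unescaped "*" can be checked in JavaScript. The last line is an extra illustration not covered in the text above: when a pattern is built from a string with the RegExp constructor, the backslash itself has to be doubled inside the string literal.

```javascript
// Escaped: "*" is a literal asterisk.
const literalStar = /th\*/.test("th*");
const literalOnThe = /th\*/.test("the");

// Unescaped: "*" means "zero or more h", so "the" matches (as "th").
const quantifierOnThe = "the".match(/th*/)[0];

// Same escaped pattern built from a string: the backslash is doubled.
const fromString = new RegExp("th\\*").test("th*");

console.log(literalStar, literalOnThe, quantifierOnThe, fromString);
// true false th true
```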

Once a regular expression is constructed, it is evaluated much like an arithmetic expression: from left to right and according to an order of precedence. The precedence levels, from highest to lowest, are as follows:

1. \ : escape character

2. (), (?:), (?=), [] : parentheses and square brackets

3. *, +, ?, {n}, {n,}, {n,m} : qualifiers

4. ^, $, \anymetacharacter : positions and sequences

5. | : the "or" operation
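Two consequences of this ordering can be sketched in JavaScript: a qualifier binds more tightly than a sequence of characters, and "|" binds most loosely of all. The patterns and inputs below are made up for illustration.

```javascript
// A qualifier applies only to the item right before it, unless grouped.
const withoutGroup = "abab".match(/ab+/)[0];  // "+" applies to b only
const withGroup = "abab".match(/(ab)+/)[0];   // "+" applies to the group "ab"

// "|" has the lowest priority: /^a|b$/ means (^a) or (b$).
const looseAlt = /^a|b$/.test("axx");         // starts with a, so it matches
const strictAlt = /^(a|b)$/.test("axx");      // whole string must be a or b
const strictAltOk = /^(a|b)$/.test("b");

console.log(withoutGroup, withGroup);         // ab abab
console.log(looseAlt, strictAlt, strictAltOk); // true false true
```

When in doubt about priority, adding parentheses makes the intended grouping explicit.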

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
