JS Regular Expressions

Source: Internet
Author: User

Transferred from: http://www.liaoxuefeng.com/wiki/001434446689867b27157e896e74d51a89c25cc8b43bdb3000/ 001434499503920bb7b42ff6627420da2ceae4babf6c4f2000

Strings are the most commonly used data structure in programming, and the need to manipulate them is almost ubiquitous. For example, to determine whether a string is a valid email address, you could programmatically extract the substrings before and after the @ and check whether they are a valid name and a valid domain, but this is cumbersome and the code is hard to reuse.

A regular expression is a powerful tool for matching strings. The idea is to use a descriptive language to define a rule for strings: any string that conforms to the rule is said to "match"; otherwise the string is invalid.

So the way we check whether a string is a valid email address is:

    1. Create a regular expression that matches email addresses;

    2. Use the regular expression to match the user's input and determine whether it is valid.
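
As a rough illustration of these two steps, the sketch below defines a pattern and then tests input against it. The pattern is deliberately simplified, and the names EMAIL_PATTERN and isEmail are made up for this example; real email addresses allow many forms that this pattern rejects.

```javascript
// Step 1: create a (deliberately simplified) pattern for an email address.
// Assumption: a name made of letters, digits, dots, or underscores, then @,
// then a domain of dot-separated labels. Real addresses allow far more.
const EMAIL_PATTERN = /^[0-9a-zA-Z_.]+@[0-9a-zA-Z-]+(\.[0-9a-zA-Z-]+)+$/;

// Step 2: match the user's input against the pattern.
function isEmail(input) {
    return EMAIL_PATTERN.test(input);
}

console.log(isEmail('someone@example.com')); // true
console.log(isEmail('not an email'));        // false
```

The syntax used in the pattern is explained piece by piece in the rest of the article.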

Because a regular expression is itself written as a string, we first need to know how to describe characters with characters.

In a regular expression, a character given literally matches exactly that character. \d matches a digit and \w matches a letter, a digit, or an underscore, so:

    • '00\d' matches '007', but not '00A';

    • '\d\d\d' matches '010';

    • '\w\w' matches 'js'.

. matches any single character, so:

    • 'js.' matches 'jsp', 'jss', 'js!', and so on.
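
The matches listed above can be checked directly with the test() method of a regular expression, which the article introduces later. This is only a quick sketch of the same examples:

```javascript
// \d matches one digit, so 00\d needs a digit in the third position.
console.log(/00\d/.test('007')); // true
console.log(/00\d/.test('00A')); // false

// \w matches a letter, a digit, or an underscore.
console.log(/\w\w/.test('js'));  // true

// . matches any single character except a line terminator.
console.log(/js./.test('jsp'));  // true
console.log(/js./.test('js!'));  // true
console.log(/js./.test('js'));   // false: there is no third character for . to match
```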

To match a variable number of characters, use * for any number of characters (including 0), + for at least one character, ? for 0 or 1 characters, {n} for exactly n characters, and {n,m} for n to m characters.
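
A small sketch of each quantifier follows. The patterns are made up for illustration and are anchored with ^ and $ (explained later in the article) so that the whole string has to fit the pattern:

```javascript
console.log(/^ab*$/.test('a'));           // true: * allows zero b's
console.log(/^ab*$/.test('abbb'));        // true: * allows many b's
console.log(/^ab+$/.test('a'));           // false: + needs at least one b
console.log(/^ab?$/.test('ab'));          // true: ? allows one b
console.log(/^ab?$/.test('abb'));         // false: ? allows at most one b
console.log(/^\d{3}$/.test('010'));       // true: exactly 3 digits
console.log(/^\d{3,8}$/.test('12'));      // false: fewer than 3 digits
console.log(/^\d{3,8}$/.test('1234567')); // true: between 3 and 8 digits
```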

Take a look at a more complex example: \d{3}\s+\d{3,8}.

Let's read it from left to right:

    1. \d{3} matches 3 digits, for example '010';

    2. \s matches a space (as well as tabs and other whitespace characters), so \s+ means at least one whitespace character, matching for example ' ', '\t\t', etc.;

    3. \d{3,8} matches 3 to 8 digits, for example '1234567'.

Taken together, this regular expression matches a telephone number with an area code, separated by any amount of whitespace.

What if you want to match a number like '010-12345'? The hyphen '-' only has a special meaning inside a [] range, so \d{3}-\d{3,8} is enough; escaping it with '\', as in \d{3}\-\d{3,8}, is also accepted in regular expressions without the u flag and is the form used in some of the examples below.

However, this still does not match '010 - 12345' because of the spaces. So we need more sophisticated ways of matching.
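
The sketch below tries the two telephone patterns and then one possible way to accept spaces around the hyphen. The \s*-\s* variant and the variable names are assumptions added for illustration; the article itself does not give this pattern.

```javascript
const phoneSpaced = /^\d{3}\s+\d{3,8}$/; // area code, whitespace, number
const phoneDashed = /^\d{3}-\d{3,8}$/;   // area code, hyphen, number

console.log(phoneSpaced.test('010 12345'));    // true
console.log(phoneSpaced.test('010\t\t12345')); // true
console.log(phoneDashed.test('010-12345'));    // true
console.log(phoneDashed.test('010 - 12345'));  // false

// Assumed extension: allow optional whitespace on both sides of the hyphen.
const phoneFlexible = /^\d{3}\s*-\s*\d{3,8}$/;
console.log(phoneFlexible.test('010 - 12345')); // true
console.log(phoneFlexible.test('010-12345'));   // true
```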

Advanced

For more precise matching, you can use [] to express a range, for example:

    • [0-9a-zA-Z\_] matches one digit, letter, or underscore;

    • [0-9a-zA-Z\_]+ matches a string of at least one digit, letter, or underscore, such as 'a100', '0_Z', 'js2015', and so on;

    • [a-zA-Z\_\$][0-9a-zA-Z\_\$]* matches a string that starts with a letter, underscore, or $, followed by any number of digits, letters, underscores, or $, which is the form of variable names allowed in JavaScript;

    • [a-zA-Z\_\$][0-9a-zA-Z\_\$]{0,19} limits the length of the variable name more precisely to 1-20 characters (1 leading character plus up to 19 following characters).
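
As a sketch, the last pattern can be wrapped in a small checker. The function name isIdentifier is hypothetical, and the pattern is anchored with ^ and $ (introduced just below) so the whole string must be a name. Note that the quantifier is written {0,19} without a space.

```javascript
// Assumption: ASCII-only names of 1 to 20 characters, as in the pattern above.
const IDENTIFIER_PATTERN = /^[a-zA-Z_$][0-9a-zA-Z_$]{0,19}$/;

function isIdentifier(name) {
    return IDENTIFIER_PATTERN.test(name);
}

console.log(isIdentifier('js2015'));       // true
console.log(isIdentifier('_private'));     // true
console.log(isIdentifier('$el'));          // true
console.log(isIdentifier('2fast'));        // false: starts with a digit
console.log(isIdentifier('a'.repeat(21))); // false: longer than 20 characters
```

This only checks the shape of the name. Reserved words such as var fit the pattern but are still not usable as variable names.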

A|B matches A or B, so (J|j)ava(S|s)cript matches 'JavaScript', 'Javascript', 'javaScript', or 'javascript'.

^ marks the beginning of a line, so ^\d means the string must start with a digit.

$ marks the end of a line, so \d$ means the string must end with a digit.

You may have noticed that js also matches 'jsp', but with ^js$ it becomes a whole-line match that only matches 'js'.
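
A short sketch of alternation and the two anchors, using the patterns from this section (the variable name is made up for the example):

```javascript
const scriptName = /(J|j)ava(S|s)cript/;
console.log(scriptName.test('JavaScript')); // true
console.log(scriptName.test('javascript')); // true
console.log(scriptName.test('JAVASCRIPT')); // false: only J/j and S/s may vary

console.log(/^\d/.test('1abc')); // true: starts with a digit
console.log(/\d$/.test('abc'));  // false: does not end with a digit

console.log(/js/.test('jsp'));   // true: js is found inside jsp
console.log(/^js$/.test('jsp')); // false: the whole string must be js
console.log(/^js$/.test('js'));  // true
```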

RegExp

With this groundwork in place, we can use regular expressions in JavaScript.

JavaScript has two ways of creating a regular expression:

The first way is to write it directly as /regular expression/, and the second way is to create a RegExp object with new RegExp('regular expression').

The two forms are equivalent:

    var re1 = /ABC\-001/;
    var re2 = new RegExp('ABC\\-001');

    re1; // /ABC\-001/
    re2; // /ABC\-001/

Note that with the second form, because of string escaping, the two characters \\ in the string literal actually stand for a single \ .
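
The following sketch shows why the doubled backslash matters. The variable names are made up for the example; the behavior shown is that of ordinary JavaScript string literals, where an unrecognized escape such as \d simply drops the backslash.

```javascript
const fromLiteral = /ABC\-001/;
const fromString = new RegExp('ABC\\-001');
console.log(fromLiteral.source === fromString.source); // true
console.log('ABC\\-001'.length); // 8: the string holds a single backslash

// With only one backslash, the string '\d' is just 'd',
// so the pattern below looks for three letter d's, not three digits.
const singleBackslash = new RegExp('^\d{3}$');
console.log(singleBackslash.test('010')); // false
console.log(singleBackslash.test('ddd')); // true

const doubledBackslash = new RegExp('^\\d{3}$');
console.log(doubledBackslash.test('010')); // true
```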

Let's look at how to tell if a regular expression matches:

    var re = /^\d{3}\-\d{3,8}$/;
    re.test('010-12345'); // true
    re.test('010-1234x'); // false
    re.test('010 12345'); // false

The test() method of a RegExp object tests whether a given string matches the pattern.

Slicing a string

Splitting a string with a regular expression is more flexible than splitting on a fixed character. Here is the ordinary splitting code:

    'a b   c'.split(' '); // ['a', 'b', '', '', 'c']

Hmm, it cannot handle consecutive spaces. Try a regular expression instead:

    'a b   c'.split(/\s+/); // ['a', 'b', 'c']

Any number of spaces is now split correctly. Try adding , :

    'a,b, c  d'.split(/[\s\,]+/); // ['a', 'b', 'c', 'd']

Then add ; as well:

    'a,b;; c  d'.split(/[\s\,\;]+/); // ['a', 'b', 'c', 'd']

If users enter a set of tags, remember next time to use a regular expression to turn the irregular input into a proper array.
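
A minimal sketch of such a tag parser follows. The function name parseTags is hypothetical. Separators at the very start or end of the input leave empty strings in the result of split(), so the sketch filters them out.

```javascript
// Split on any run of whitespace, commas, or semicolons,
// then drop the empty strings left by leading or trailing separators.
function parseTags(input) {
    return input
        .split(/[\s,;]+/)
        .filter((tag) => tag !== '');
}

console.log(parseTags('a,b;; c  d'));        // ['a', 'b', 'c', 'd']
console.log(parseTags(' js, regex;;  web ')); // ['js', 'regex', 'web']
```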

Group

Besides simply checking whether a string matches, regular expressions can also extract substrings. The parts to extract are marked as groups with (). For example:

^(\d{3})-(\d{3,8})$ defines two groups, so the area code and the local number can be extracted directly from the matched string:

    var re = /^(\d{3})-(\d{3,8})$/;
    re.exec('010-12345'); // ['010-12345', '010', '12345']
    re.exec('010 12345'); // null

If the regular expression defines groups, you can extract the substrings with the exec() method of the RegExp object.

When the match succeeds, exec() returns an Array: the first element is the whole string matched by the regular expression, and the following elements are the substrings matched by each group.

When the match fails, exec() returns null.
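
Because exec() returns either an Array or null, code that uses it has to check for null before reading the groups. A small sketch, with a hypothetical function name and hypothetical property names:

```javascript
function parsePhone(text) {
    const match = /^(\d{3})-(\d{3,8})$/.exec(text);
    if (match === null) {
        return null; // no match: there are no groups to read
    }
    // match[0] is the whole match; match[1] and match[2] are the two groups.
    return { areaCode: match[1], localNumber: match[2] };
}

console.log(parsePhone('010-12345')); // { areaCode: '010', localNumber: '12345' }
console.log(parsePhone('010 12345')); // null
```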

Extracting substrings is very useful. Here is a more brutal example:

    var re = /^(0[0-9]|1[0-9]|2[0-3]|[0-9])\:(0[0-9]|1[0-9]|2[0-9]|3[0-9]|4[0-9]|5[0-9]|[0-9])\:(0[0-9]|1[0-9]|2[0-9]|3[0-9]|4[0-9]|5[0-9]|[0-9])$/;
    re.exec('19:05:30'); // ['19:05:30', '19', '05', '30']

This regular expression can recognize a valid time directly. Sometimes, however, a regular expression alone cannot fully validate the input, for example when recognizing dates:

    var re = /^(0[1-9]|1[0-2]|[1-9])-(0[1-9]|1[0-9]|2[0-9]|3[0-1]|[1-9])$/;

Invalid dates such as '2-30' and '4-31' still cannot be rejected by the regular expression, or doing so would be very hard to write; at that point the program has to help with the check.
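
One way to combine the two, sketched under stated assumptions: a pattern modeled on the one above extracts the month and the day, and ordinary code checks the day against the length of the month. The names DAYS_IN_MONTH and isValidMonthDay are hypothetical, and because the input carries no year, February is assumed to allow 29 days.

```javascript
// Assumption: no year is given, so February is allowed up to 29 days.
const DAYS_IN_MONTH = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31];

function isValidMonthDay(text) {
    // The regular expression checks the shape and extracts month and day.
    const match = /^(0[1-9]|1[0-2]|[1-9])-(0[1-9]|[12][0-9]|3[01]|[1-9])$/.exec(text);
    if (match === null) {
        return false;
    }
    // The program checks what the pattern cannot: the day fits the month.
    const month = parseInt(match[1], 10);
    const day = parseInt(match[2], 10);
    return day <= DAYS_IN_MONTH[month - 1];
}

console.log(isValidMonthDay('2-29'));  // true
console.log(isValidMonthDay('2-30'));  // false
console.log(isValidMonthDay('4-31'));  // false
console.log(isValidMonthDay('12-31')); // true
```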

Greedy match

Note in particular that regular expression matching is greedy by default, that is, it matches as many characters as possible. For example, to match the trailing 0s of a number:

    var re = /^(\d+)(0*)$/;
    re.exec('102300'); // ['102300', '102300', '']

Because \d+ is greedy, it swallows all the trailing 0s, so 0* can only match the empty string.

To let 0* match the trailing 0s, \d+ must use a non-greedy match (that is, match as little as possible). Adding a ? after \d+ makes it non-greedy:

    var re = /^(\d+?)(0*)$/;
    re.exec('102300'); // ['102300', '1023', '00']
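
The same difference shows up with any quantifier. In the sketch below, which uses a made-up sentence, the greedy .+ runs to the last quotation mark, while the non-greedy .+? stops at the first one that lets the pattern succeed:

```javascript
const quotedText = 'say "yes" or "no"';

// Greedy: .+ takes as much as possible, so the match ends at the last ".
console.log(/".+"/.exec(quotedText)[0]);  // "yes" or "no"

// Non-greedy: .+? takes as little as possible, so the match ends at the next ".
console.log(/".+?"/.exec(quotedText)[0]); // "yes"
```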

Global Search

JavaScript regular expressions also have several special flags. The most commonly used is g, which indicates a global match:

    var r1 = /test/g;
    // equivalent to:
    var r2 = new RegExp('test', 'g');

With a global match, the exec() method can be called multiple times to search a string for successive matches. When the g flag is specified, each call to exec() updates the lastIndex property of the regular expression itself, which holds the index just past the end of the last match:

    var s = 'JavaScript, VBScript, JScript and ECMAScript';
    var re = /[a-zA-Z]+Script/g;

    // use the global match:
    re.exec(s); // ['JavaScript']
    re.lastIndex; // 10

    re.exec(s); // ['VBScript']
    re.lastIndex; // 20

    re.exec(s); // ['JScript']
    re.lastIndex; // 29

    re.exec(s); // ['ECMAScript']
    re.lastIndex; // 44

    re.exec(s); // null, no further match before the end of the string

A global match works like a search, so it should not be combined with /^...$/, which would match at most once.
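
Repeated exec() calls are usually written as a loop that runs until exec() returns null. The helper below is a sketch with a hypothetical name, findAll; it insists on the g flag, because without it exec() would restart from index 0 on every call and the loop would never end.

```javascript
function findAll(pattern, text) {
    if (!pattern.global) {
        throw new Error('findAll needs a pattern with the g flag');
    }
    pattern.lastIndex = 0; // start from the beginning of the text
    const results = [];
    let match;
    while ((match = pattern.exec(text)) !== null) {
        results.push(match[0]);
        // An empty match does not advance lastIndex, so step forward manually.
        if (match[0] === '') {
            pattern.lastIndex += 1;
        }
    }
    return results;
}

const scripts = 'JavaScript, VBScript, JScript and ECMAScript';
console.log(findAll(/[a-zA-Z]+Script/g, scripts));
// ['JavaScript', 'VBScript', 'JScript', 'ECMAScript']
```

When only the matched strings are needed, scripts.match(/[a-zA-Z]+Script/g) returns the same list directly.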

A regular expression can also specify the i flag, which ignores case, and the m flag, which performs multiline matching.
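
A brief sketch of both flags, using a made-up three-line string:

```javascript
// i: ignore case.
console.log(/javascript/.test('JavaScript'));  // false
console.log(/javascript/i.test('JavaScript')); // true

// m: ^ and $ also match at the start and end of each line.
const threeLines = 'first\njs\nlast';
console.log(/^js$/.test(threeLines));  // false: ^ and $ only match at the ends of the whole string
console.log(/^js$/m.test(threeLines)); // true: the second line is exactly js
```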

Summary

Regular expressions are very powerful, and it is impossible to cover them fully in a short section. Covering everything about them would fill a thick book. If you often run into regular expression problems, you may need a reference book on regular expressions.
