JavaScript Advanced Programming (3rd Edition) study notes: JS regular expressions (basics)

Note that this is only a summary of the common, simpler regular expression syntax rather than the full syntax; in my view it is enough for everyday use. Regular expressions are not unique to ECMAScript: they are also used in Java, .NET, UNIX tools, and so on. This article is based on regular expressions in ECMAScript.

First, the basics of regular expressions

1. Ordinary characters: letters, digits, underscores, Chinese characters, and every other character with no special meaning, such as ABC123. An ordinary character matches itself.

2. Special characters (to match one of them literally, escape it with a backslash "\")

Character     Meaning
\a            Bell character = \x07 (other engines; not supported in ECMAScript)
\f            Form feed = \x0C
\n            Line feed = \x0A
\r            Carriage return = \x0D
\t            Tab = \x09
\v            Vertical tab = \x0B
\e            ESC = \x1B (other engines; not supported in ECMAScript)
\xXX          Matches the character with the given two-digit hexadecimal code
\uXXXX        Matches the character with the given four-digit hexadecimal code
\x{XXXXXX}    Matches the character with a hexadecimal code of any length (other engines; ECMAScript uses \u{XXXXXX} with the u flag)
^             Matches the start of the string
$             Matches the end of the string
()            Marks the start and end of a subexpression
[]            Defines a custom character class
{}            Marks a repetition count
.             Matches any character except a line break
?             Matches 0 or 1 times
+             Matches 1 or more times
*             Matches 0 or more times
|             "Or" between the expressions on its left and right
\b            Matches the start or end of a word
\B            Matches a position that is not the start or end of a word
\d            Matches a digit
\D            Matches any character that is not a digit
\s            Matches any whitespace character
\S            Matches any non-whitespace character
\w            Matches a letter, digit, or underscore (ECMAScript: [A-Za-z0-9_] only; some other engines also match Chinese characters)
\W            Matches any character that \w does not match
[^x]          Matches any character except x
[^aeiou]      Matches any character except a, e, i, o, u
{n}           Matches exactly n times
{n,}          Matches at least n times
{n,m}         Matches n to m times
[0-9]         Matches any digit from 0 to 9
[f-m]         Matches any letter from f to m

The special characters listed above can be roughly divided into:

(1) Characters that are awkward to type: such as bell (\a), form feed (\f), line feed (\n), carriage return (\r), tab (\t), ESC (\e)

(2) Hexadecimal characters: two-digit (\x02), four-digit (\u012B), arbitrary length (\x{A34D1})

(3) Position characters: such as start of string (^), end of string ($), start or end of a word (\b), inside a word (\B)

(4) Quantifier characters: such as 0 or 1 times (?), 1 or more times (+), 0 or more times (*), n times ({n}), at least n times ({n,}), n to m times ({n,m})

(5) Modifier characters: such as repetition count ({}), custom character class ([]), subexpression (())

(6) Negation characters:

(A) Negation by capitalization: such as \b and \B, \d and \D, \s and \S, \w and \W

(B) Negation with [^]: such as [^x], [^aeiou]

(C) Other cases: \n and . are also complements of each other

(7) Range characters: such as a digit range ([0-9]) or a letter range ([f-m])

(8) Logical characters: such as "or" (|)
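
As a rough illustration of several of these categories, the following sketch runs a few patterns against short sample strings. The variable names and sample strings are made up for this example.

```javascript
var sample = 'abc 123_x';

// Predefined classes and their uppercase negations
var digits = sample.match(/\d+/g);     // runs of digits
var nonDigits = sample.match(/\D+/g);  // runs of anything that is not a digit
var wordRuns = sample.match(/\w+/g);   // runs of letters, digits, underscores

// Custom classes: a negated set and a range
var vowelsOnly = 'regular'.replace(/[^aeiou]/g, ''); // drop every non-vowel
var inRange = /[f-m]/.test('g');                      // 'g' lies between f and m

// Quantifier: two to four digits and nothing else
var twoToFour = /^\d{2,4}$/;

// Position characters: \b is a word boundary, \B is a non-boundary
var wholeWord = 'cat concat'.match(/\bcat\b/g);    // only the standalone word
var innerIndex = /\Bcat/.exec('cat concat').index; // the "cat" inside "concat"

console.log(digits, nonDigits, wordRuns);      // [ '123' ] [ 'abc ', '_x' ] [ 'abc', '123_x' ]
console.log(vowelsOnly, inRange);              // eua true
console.log(twoToFour.test('123'), twoToFour.test('12345')); // true false
console.log(wholeWord, innerIndex);            // [ 'cat' ] 7
```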

3. Escaping

(1) Use a backslash "\" to escape a single character.

(2) Use "\Q...\E": every character between the two markers is treated as an ordinary character.

(3) Use "\U...\E": every character between the two markers is treated as an ordinary character, and lowercase letters are converted to uppercase before matching.

(4) Use "\L...\E": every character between the two markers is treated as an ordinary character, and uppercase letters are converted to lowercase before matching.

Note: forms (2) to (4) belong to other regular expression engines and are not supported in ECMAScript, where only the backslash escape is available.
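
A minimal sketch of backslash escaping in ECMAScript follows. Because ECMAScript has no block escape of its own, the example also includes escapeRegExp, a hypothetical helper written for this illustration (it is not a built-in), which backslash-escapes every special character in a string so that the whole string can be matched literally.

```javascript
// Without escaping, + is a quantifier: "one or more 1, then 1=2"
var unescaped = /1+1=2/;
// With escaping, \+ matches a literal plus sign
var escapedPlus = /1\+1=2/;

// Hypothetical helper: prefix every regex special character with a backslash
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a pattern that matches the string "a.b*c" exactly, character by character
var literalPattern = new RegExp('^' + escapeRegExp('a.b*c') + '$');

console.log(unescaped.test('1+1=2'), unescaped.test('11=2')); // false true
console.log(escapedPlus.test('1+1=2'));                       // true
console.log(escapeRegExp('a.b*c'));                           // a\.b\*c
console.log(literalPattern.test('a.b*c'), literalPattern.test('axbbc')); // true false
```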

4. Greedy mode and lazy mode

If a regular expression contains a quantifier, it normally matches as many characters as possible. For example, using l.*n against the word linjisong matches linjison rather than lin. This behavior is the greedy mode of regular expressions. Correspondingly, appending the character "?" to a quantifier switches it to lazy mode, which matches as few characters as possible. For example, *? repeats 0 or more times, but as few times as possible, so l.*?n matches only lin.
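
The difference can be checked directly. The sketch below applies a greedy and a lazy version of the same pattern to linjisong, plus one extra pair on a made-up string to show that "?" works on any quantifier.

```javascript
var word = 'linjisong';

// Greedy: .* grabs as much as it can, then backs off to the last n
var greedyMatch = /l.*n/.exec(word)[0];
// Lazy: .*? grows one character at a time and stops at the first n
var lazyMatch = /l.*?n/.exec(word)[0];

// The same applies to {n,m}: greedy takes the upper bound, lazy the lower bound
var greedyCount = /a{2,3}/.exec('aaaa')[0];
var lazyCount = /a{2,3}?/.exec('aaaa')[0];

console.log(greedyMatch, lazyMatch);   // linjison lin
console.log(greedyCount, lazyCount);   // aaa aa
```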

5. Grouping and backreferences

(1) Enclosing an expression in parentheses (()) lets it be treated as a single unit, which is how grouping is achieved.

(2) By default, each group automatically gets a group number, counting up from 1 in the order of the opening parentheses.

(3) While processing, the engine saves the text matched by the expression inside the parentheses so that it can be used later in the match or after the match completes. This text can be referenced with a backslash followed by the group number, such as \1, which stands for the text of the first group.

(4) A group can also be given a custom name with the syntax (?<name>exp); a backreference to it can then also be written as \k<name>. In ECMAScript this requires ECMAScript 2018 or later.

(5) A group can also skip saving the matched text and not be assigned a group number, with the syntax (?:exp).

(6) Parentheses have some other special syntax. A few forms are listed here without further discussion:

Category               Syntax          Description
Capture                (exp)           Matches exp and captures the text into an automatically numbered group
                       (?<name>exp)    Matches exp and captures the text into a group named name; some other engines also accept (?'name'exp)
                       (?:exp)         Matches exp, does not capture the matched text, and does not assign a group number
Zero-width assertion   (?=exp)         Matches a position that is followed by exp
                       (?<=exp)        Matches a position that is preceded by exp
                       (?!exp)         Matches a position that is not followed by exp
                       (?<!exp)        Matches a position that is not preceded by exp
Comment                (?#comment)     Has no effect on how the regular expression is processed; it only provides a comment for human readers (not supported in ECMAScript)
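
The grouping forms above can be sketched in JavaScript as follows. The sample strings are invented for illustration, and the named group and lookbehind parts assume an engine that implements ECMAScript 2018 or later.

```javascript
// Numbered groups: item 0 is the whole match, items 1..n are the groups
var pair = /(\w+) (\w+)/.exec('hello world');

// Backreference: \1 must repeat whatever group 1 matched (a doubled letter)
var doubled = /(\w)\1/.exec('hello')[0];

// Non-capturing group: (?:ab) groups without taking a group number
var nonCapturing = /(?:ab)+(c)/.exec('ababc');

// Named group and named backreference (ES2018+)
var date = /(?<year>\d{4})-(?<month>\d{2})/.exec('2013-05');
var quoted = /(?<q>['"]).*?\k<q>/.exec('say "hi" now')[0];

// Zero-width assertions match positions and consume no characters
var beforeYuan = /\d+(?= yuan)/.exec('100 dollars, 200 yuan')[0]; // followed by " yuan"
var afterDollar = /(?<=\$)\d+/.exec('cost: $42')[0];              // preceded by "$" (ES2018+)
var qNoU = /q(?!u)/;                                              // q not followed by u
var bNotAfterA = /(?<!a)b/.exec('ab cb').index;                   // b not preceded by a (ES2018+)

console.log(pair[1], pair[2], doubled);            // hello world ll
console.log(nonCapturing.length, nonCapturing[1]); // 2 c
console.log(date.groups.year, date.groups.month, quoted); // 2013 05 "hi"
console.log(beforeYuan, afterDollar);              // 200 42
console.log(qNoU.test('Iraq'), qNoU.test('quit'), bNotAfterA); // true false 4
```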

This much is enough to understand common regular expressions. If you want to keep learning, refer to the 30-minute introductory tutorial for regular expressions. Next, let's get familiar with how regular expressions are implemented in JavaScript.

Second, the regular expression object RegExp in JavaScript

1. Creating Regular Expressions

(1) Using a literal: syntax var exp = /pattern/flags;

A. pattern is any regular expression

B. There are three flags: g means global mode, i means ignore case, m means multiline mode

(2) Using the built-in RegExp constructor: syntax var exp = new RegExp(pattern, flags);

A. With the constructor, both pattern and flags are strings, so every backslash in the pattern has to be escaped a second time, for example:

Literal               Equivalent constructor string
/\[bc\]at/            "\\[bc\\]at"
/\.at/                "\\.at"
/name\/age/           "name\\/age"
/\d.\d{1,2}/          "\\d.\\d{1,2}"
/\w\\hello\\123/      "\\w\\\\hello\\\\123"

Note: in ECMAScript 3, a regular expression literal always shares the same RegExp instance, whereas new RegExp(pattern, flags) creates a new instance every time it is called. ECMAScript 5 specifies that a literal also creates a new instance each time it is evaluated.
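
As a small sketch of the two creation styles, the following builds the same pattern both ways and compares the results. The names ext, byExtension, and makeLiteral are invented for this example.

```javascript
// The same pattern written as a literal and as a constructor string
var fromLiteral = /\[bc\]at/i;
var fromConstructor = new RegExp('\\[bc\\]at', 'i'); // each backslash is doubled

// The constructor is the usual choice when part of the pattern is only known at run time
var ext = 'at';
var byExtension = new RegExp('\\.' + ext + '$');

// Under ECMAScript 5 rules, evaluating a literal twice gives two separate instances
function makeLiteral() {
  return /a/g;
}

console.log(fromLiteral.source === fromConstructor.source);             // true
console.log(fromLiteral.test('[BC]at'), fromConstructor.test('[BC]at')); // true true
console.log(byExtension.test('file.at'), byExtension.test('fileat'));    // true false
console.log(makeLiteral() !== makeLiteral());                            // true
```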

2. Instance Properties

(1) global: Boolean value indicating whether the g flag is set.

(2) ignoreCase: Boolean value indicating whether the i flag is set.

(3) multiline: Boolean value indicating whether the m flag is set.

(4) lastIndex: integer giving the character position at which the search for the next match starts, counting from 0.

(5) source: string representation of the pattern. It is always returned in literal form (without the delimiters and flags), even if the instance was created with the constructor.
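
These properties can be read directly from an instance. A minimal sketch, using an arbitrary pattern and input string:

```javascript
var propsDemo = new RegExp('\\[bc\\]at', 'gi');

// Flag properties reflect the flags passed at creation time
var flags = [propsDemo.global, propsDemo.ignoreCase, propsDemo.multiline];

// source is the literal form of the pattern, even for a constructor-made instance
var sourceText = propsDemo.source;

// lastIndex starts at 0 and, with the g flag, moves past each successful match
var before = propsDemo.lastIndex;
propsDemo.test('x[bc]at');        // match spans positions 1 to 6
var after = propsDemo.lastIndex;

console.log(flags);         // [ true, true, false ]
console.log(sourceText);    // \[bc\]at
console.log(before, after); // 0 7
```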

3. Instance methods

(1) exec() method

A. It takes one argument, the string to apply the pattern to, and returns an array with information about the first match, or null if there is no match.

B. The returned array is an Array instance, but it also has two extra properties, input and index, which give the string the regular expression was applied to and the position of the match within that string.

C. On a successful match, the first item of the returned array is the string that matches the whole pattern, and the remaining items are the strings matched by the groups in the pattern (if the pattern has no groups, the array contains only one item).

D. Even when g is set, exec() returns only one match per call. The difference is that with g set, successive calls to exec() start searching from different positions, while without g every call searches from the beginning.

(2) test() method

It accepts a string argument and returns true if the pattern matches and false if it does not.
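
The points above about exec() can be sketched as follows, using a made-up input string. The loop relies on the g flag advancing the search position between calls, as described in D.

```javascript
var text = 'cat, bat, sat';

// Without g: every call starts from the beginning and finds the same match
var first = /(.)at/;
var m1 = first.exec(text);
var m2 = first.exec(text);

// With g: each call continues after the previous match, until null is returned
var every = /(.)at/g;
var found = [];
var m;
while ((m = every.exec(text)) !== null) {
  found.push(m[0] + '@' + m.index); // whole match and its position
}

console.log(m1[0], m1[1], m1.index, m1.input === text); // cat c 0 true
console.log(m2.index);                                  // 0
console.log(found);                                     // [ 'cat@0', 'bat@5', 'sat@10' ]
console.log(/(.)at/.exec('dog'));                       // null
```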
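
A short sketch of test() follows. The last part shows a consequence of the g flag: test() advances the search position in the same way exec() does, so repeated calls on one global instance can alternate between true and false.

```javascript
var hasDigit = /\d/;
var yes = hasDigit.test('abc123'); // contains a digit
var no = hasDigit.test('abc');     // contains no digit

// With g, the search position carries over between calls on the same instance
var globalA = /a/g;
var results = [globalA.test('a'), globalA.test('a'), globalA.test('a')];

console.log(yes, no);  // true false
console.log(results);  // [ true, false, true ]
```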

Third, case analysis

Here is a formatting regular expression taken from the PhoneGap source code:


// Group 1 is lazy: it stops at the first % that lets the rest match
var pattern = /(.*?)%(.)(.*)/;
var str = 'lin%%jisong';
var match = pattern.exec(str);
console.info(match.join(',')); // lin%%jisong,lin,%,jisong

// Group 1 is greedy: it extends up to the last % that lets the rest match
var pattern2 = /(.*)%(.)(.*)/;
var match2 = pattern2.exec(str);
console.info(match2.join(',')); // lin%%jisong,lin%,j,isong


Analysis: pattern and pattern2 each contain three groups, and groups 2 and 3 are the same in both. Group 2, (.), matches any single character other than a line break, and group 3, (.*), matches as many such characters as possible (greedy mode). Group 1 in pattern, (.*?), matches as few characters as possible (lazy mode), while group 1 in pattern2, (.*), matches as many as possible (greedy mode). Because the pattern as a whole must still match (so one % has to be left over for the literal % in the expression), group 1 in pattern matches lin and group 1 in pattern2 matches lin%, which explains the output of the example above.
