A summary of regular expression learning

Source: Internet
Author: User

A while ago I learned a bit about using regular expressions from the book "Secrets of the JavaScript Ninja".

Regular expressions are widely used in popular JavaScript libraries for tasks such as:

    • Manipulating strings in HTML nodes
    • Locating partial selectors within a CSS selector expression
    • Determining whether an element has a specific class name
    • And more

Terminology and operators

Exact match

  

    var pattern = /test/;

If a character is not a special character or an operator, it must appear literally in the string for the pattern to match. In the expression above, every character of the pattern must appear in the candidate string, in the same order.

/test/ means one character directly after another: "t" followed by "e", "e" followed by "s", and "s" followed by "t".
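As a minimal sketch of exact matching, the snippet below runs the pattern against a few sample strings with `RegExp.prototype.test`. The sample strings and variable names are illustrative assumptions, not taken from the book.

```javascript
// Literal pattern: each character must appear, contiguously and in this order.
const pattern = /test/;

// true: "test" appears as a substring of the candidate string
const matchesInside = pattern.test("a test string");

// false: matching is case-sensitive unless the i flag is used
const matchesUpperCase = pattern.test("TEST");

// false: the four characters must be adjacent
const matchesSplit = pattern.test("te st");

console.log(matchesInside, matchesUpperCase, matchesSplit);
```

Note that an unanchored pattern only needs to be found somewhere inside the string; it does not have to cover the whole string.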

  Match a class of characters

  Often we do not want to match one particular character, but rather any one character from a finite set. We can specify such a set with the character set operator, by placing the characters in square brackets: [asd].

The example above matches any one of the characters "a", "s", or "d". Note that although this expression is five characters long, it matches only a single character in the candidate string.

Sometimes we want to match any character except those in a finite set. This is done by adding a caret (^) immediately after the opening bracket: [^asd] matches any character other than "a", "s", or "d".

Character sets also support ranges. For example, in [a-m] the hyphen indicates that every character from "a" through "m" is in the set.
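The three forms of character set described above can be sketched as follows. The candidate strings are made up for illustration.

```javascript
const anyOf = /[asd]/;   // any one of "a", "s", "d"
const noneOf = /[^asd]/; // any one character except "a", "s", "d"
const range = /[a-m]/;   // any one character from "a" through "m"

// match() without the g flag returns the first match at index 0 of the result
const firstListed = "xyzs".match(anyOf)[0];   // "s"
const firstOther = "sad!".match(noneOf)[0];   // "!"

const inRange = range.test("k");    // true
const outOfRange = range.test("z"); // false

console.log(firstListed, firstOther, inRange, outOfRange);
```

In each case the bracket expression consumes exactly one character of the candidate string, regardless of how many characters are listed inside the brackets.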

  Escape

 Regular expressions have some special characters, such as [, ], -, and ^, along with others covered later in this article. What if we want to match one of these characters literally? In a regular expression, a backslash escapes the character that follows it, so that the character is matched as itself. So \[ matches a literal "[" character, and two backslashes (\\) match a single backslash.
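A short sketch of escaping in practice follows. The last example, which builds a pattern from a string with the `RegExp` constructor, goes slightly beyond what the text covers and is included as an assumption about a common pitfall: inside a string literal the backslash itself has to be doubled.

```javascript
const openBracket = /\[/; // matches a literal "["
const backslash = /\\/;   // matches one literal backslash

const hasBracket = openBracket.test("items[0]"); // true

// The string literal "C:\\temp" contains a single backslash character.
const hasBackslash = backslash.test("C:\\temp"); // true
const noBackslash = backslash.test("C:/temp");   // false

// Built from a string: "\\[" is the two-character pattern \[
const fromString = new RegExp("\\[");
const fromStringHit = fromString.test("[x]");    // true

console.log(hasBracket, hasBackslash, noBackslash, fromStringHit);
```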

  Match start and match end

  We often need to make sure that a pattern matches at the beginning or at the end of a string (for example, when trimming leading and trailing whitespace). A caret (^) as the first character of the regular expression anchors the match to the start of the string: /^test/ only matches if "test" appears at the beginning of the string. Note that the caret is overloaded, since it is also used to negate a character set. Similarly, a dollar sign ($) indicates that the pattern must appear at the end of the string: /test$/.

Using both ^ and $ indicates that the specified pattern must span the entire candidate string: /^test$/.
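The anchors can be tried out as in the sketch below. The final line shows one way the whitespace-trimming idea might be written; it is an assumption rather than the book's code, and it relies on \s, + and | which are described in later sections.

```javascript
const atStart = /^test/;
const atEnd = /test$/;
const whole = /^test$/;

const startHit = atStart.test("testing"); // true: "test" begins the string
const startMiss = atStart.test("a test"); // false
const endHit = atEnd.test("a test");      // true: "test" ends the string
const endMiss = atEnd.test("testing");    // false
const wholeHit = whole.test("test");      // true
const wholeMiss = whole.test("test ");    // false: trailing space is not covered

// Hypothetical trim helper: remove whitespace anchored at either end.
const trimmed = "  hello world  ".replace(/^\s+|\s+$/g, "");

console.log(startHit, startMiss, endHit, endMiss, wholeHit, wholeMiss, trimmed);
```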

  Repeated occurrences

  Regular expressions provide several ways to specify repetition:

    • Adding a question mark (?) after a character makes it optional, that is, it may appear once or not at all. For example, /t?est/ matches both "test" and "est".
    • If a character should appear one or more times, use the plus sign (+). For example, /t+est/ matches "test", "ttest", and "tttest", but not "est".
    • If a character should appear zero or more times, use an asterisk (*). For example, /t*est/ matches "test", "ttest", "tttest", and "est".
    • A number in curly braces following the character specifies an exact number of repetitions. For example, /a{4}/ matches a string containing four consecutive "a" characters.
    • Two numbers separated by a comma in curly braces specify a repetition interval. For example, /a{4,10}/ matches any string containing 4 to 10 consecutive "a" characters.
    • The second number of the interval is optional (but the comma must be kept), which leaves the interval open-ended. For example, /a{4,}/ matches any string containing four or more consecutive "a" characters.
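The repetition operators listed above can be checked with a small sketch. The ^ and $ anchors are added here as an assumption, so that each test reflects the repetition count itself rather than a shorter run found inside a longer string.

```javascript
const optional = /^t?est$/;    // zero or one "t"
const oneOrMore = /^t+est$/;   // one or more "t"
const zeroOrMore = /^t*est$/;  // zero or more "t"
const exactlyFour = /^a{4}$/;  // exactly four "a"
const fourToTen = /^a{4,10}$/; // between four and ten "a"
const fourOrMore = /^a{4,}$/;  // at least four "a"

const results = {
  optional: [optional.test("test"), optional.test("est"), optional.test("ttest")],
  oneOrMore: [oneOrMore.test("test"), oneOrMore.test("tttest"), oneOrMore.test("est")],
  zeroOrMore: [zeroOrMore.test("est"), zeroOrMore.test("ttest")],
  exactlyFour: [exactlyFour.test("aaaa"), exactlyFour.test("aaaaa")],
  fourToTen: [fourToTen.test("a".repeat(10)), fourToTen.test("a".repeat(11))],
  fourOrMore: [fourOrMore.test("a".repeat(50)), fourOrMore.test("aaa")],
};

console.log(results);
```

Without the anchors, /a{4}/ would also succeed on "aaaaa", because four consecutive "a" characters can be found inside it.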

These repetition operators can be greedy or non-greedy. By default they are greedy: they consume as many characters as possible while still allowing the match to succeed. Adding a question mark (?) after the operator, as in a+?, makes it non-greedy: it consumes as few characters as possible.

For example, against the string "aaa", the regular expression /a+/ matches all three characters, whereas the non-greedy expression /a+?/ matches only one "a", because a single "a" is enough to satisfy the a+ term.
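The difference is easy to see by comparing what each form actually captures. The second pair of examples, using a made-up HTML fragment, is an illustrative assumption showing where non-greedy matching tends to matter in practice.

```javascript
const greedy = "aaa".match(/a+/)[0]; // "aaa"
const lazy = "aaa".match(/a+?/)[0];  // "a"

const html = "<b>one</b><b>two</b>";

// Greedy .+ runs to the last closing tag in the string.
const greedyTag = html.match(/<b>.+<\/b>/)[0];

// Non-greedy .+? stops at the first closing tag.
const lazyTag = html.match(/<b>.+?<\/b>/)[0];

console.log(greedy, lazy, greedyTag, lazyTag);
```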

  Predefined character classes

  Common predefined character classes and character terms

  Term               Matches
  \t                 Horizontal tab
  \b                 Backspace (only inside a character class, as in [\b])
  \v                 Vertical tab
  \f                 Form feed
  \r                 Carriage return
  \n                 Newline
  \cA : \cZ          Control characters; for example, \cM matches Control+M
  \u0000 : \uFFFF    Unicode hexadecimal
  \x00 : \xFF        ASCII hexadecimal
  .                  Any character except line terminators such as \n
  \d                 Any digit, equivalent to [0-9]
  \D                 Any non-digit, equivalent to [^0-9]
  \w                 Any word character, including the underscore, equivalent to [A-Za-z0-9_]
  \W                 Any non-word character, equivalent to [^A-Za-z0-9_]
  \s                 Any whitespace character, including spaces, tabs, form feeds, etc.
  \S                 Any non-whitespace character
  \b                 A word boundary
  \B                 A non-word boundary
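A few of these predefined classes can be exercised as in the sketch below. The input strings are invented for illustration.

```javascript
// \d+ with the g flag collects every run of digits
const digits = "order 66, room 7".match(/\d+/g);

// \w includes the underscore but not the hyphen
const words = "snake_case and kebab-case".match(/\w+/g);

// \s covers spaces, tabs and newlines alike
const parts = "a \t b\nc".split(/\s+/);

// \b requires a boundary between a word character and a non-word character
const wholeWord = /\bcat\b/.test("a cat sat");    // true
const insideWord = /\bcat\b/.test("concatenate"); // false

// . does not match a newline (without the s flag)
const dotOnNewline = /a.b/.test("a\nb");          // false

console.log(digits, words, parts, wholeWord, insideWord, dotOnNewline);
```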

  

  

  Group

  So far, the operators we've seen (such as + and *) affect only the term immediately before them. To apply an operator to a group of terms, wrap the group in parentheses, just as in a mathematical expression. For example, /(ab)+/ matches one or more consecutive occurrences of the substring "ab".
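The effect of the parentheses can be seen by comparing a grouped and an ungrouped pattern. The anchors and sample strings are assumptions added to make the contrast visible.

```javascript
const grouped = /^(ab)+$/; // + applies to the whole group "ab"
const ungrouped = /^ab+$/; // + applies only to "b"

const groupedRepeats = grouped.test("ababab");   // true
const groupedRunOfB = grouped.test("abbb");      // false
const ungroupedRunOfB = ungrouped.test("abbb");  // true
const ungroupedRepeats = ungrouped.test("abab"); // false

// Parentheses also capture: index 0 is the full match,
// index 1 is the text from the last repetition of the group.
const captured = "ababab".match(/(ab)+/);

console.log(groupedRepeats, groupedRunOfB, ungroupedRunOfB, ungroupedRepeats);
console.log(captured[0], captured[1]);
```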

  Or operator (or)

  A vertical bar (|) expresses alternation (an "or" relationship). For example, /a|b/ matches either "a" or "b", and /(ab)+|(cd)+/ matches one or more occurrences of either "ab" or "cd".
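Alternation can be sketched as follows. In the second pattern an extra pair of parentheses and the ^ and $ anchors are added as an assumption, so that the anchors apply to both alternatives rather than to one side only.

```javascript
const either = /a|b/;

const hasAOrB = either.test("xbx"); // true: contains "b"
const hasNeither = either.test("xyz"); // false

// Repetitions of "ab" only, or repetitions of "cd" only.
const runs = /^((ab)+|(cd)+)$/;

const abRun = runs.test("abab");   // true
const cdRun = runs.test("cdcdcd"); // true
const mixed = runs.test("abcd");   // false: the two alternatives cannot be mixed

console.log(hasAOrB, hasNeither, abRun, cdRun, mixed);
```

The | operator has the lowest precedence, so /^ab|cd$/ would be read as "starts with ab" or "ends with cd"; grouping makes the intended scope explicit.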

  Backreferences

  A backreference is written as a backslash followed by the number of the capture to be referenced, starting with 1: \1, \2, and so on.

For example, in /<(\w+)>(.+)<\/\1>/, \1 matches the same text that the first capture group, (\w+), matched. This regular expression can be used to match simple HTML elements such as "<strong>whatever</strong>". Without a backreference this would be impossible, because we could not know in advance which closing tag corresponds to the opening tag.
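A minimal sketch of the backreference pattern in use is shown below. The mismatched input is a made-up counterexample.

```javascript
// \1 must repeat exactly what (\w+) captured as the tag name.
const element = /<(\w+)>(.+)<\/\1>/;

const found = "<strong>whatever</strong>".match(element);
const tagName = found[1]; // "strong"
const content = found[2]; // "whatever"

// The closing tag name differs from the opening one, so \1 cannot match.
const mismatched = element.test("<strong>whatever</em>"); // false

console.log(tagName, content, mismatched);
```

This only handles simple, non-nested elements without attributes; it is not a general HTML parser.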

The above covers the basic usage of regular expressions. It is a personal summary; although it closely follows the book, writing it down helped me review the material.

Further reading on regular expressions: "30 minutes to learn regular expressions"
