What is single-line mode? A detailed explanation of single-line mode in JavaScript regular expressions

Source: Internet
Author: User
This article introduces JavaScript regular expressions and their single-line mode.

Regular expressions were first implemented by Ken Thompson in his improved version of the QED editor in 1970. The simplest metacharacter in regular expressions, ".", already matched every character except the line break at that time:

"." is a regular expression which matches any character except <nl>.

The sentence above comes from the official QED documentation of 1970, which may be the first regular expression documentation in history.

Why exclude the line break? Because QED edits files line by line, and the line break at the end of a line counts as part of that line's content. For example, to delete all the single-line comments in a piece of code, you could use the following command in QED:

1,$s#//.*##

If "." could match a line break, the line break would be deleted as well, merging each of these lines with the line after it. That is generally not the result we want, so "." was designed from the very beginning not to match line breaks. Although today's operating systems no longer ship a QED command for us to experiment with, we still have Vim, whose "." cannot match line breaks for the same reason.
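The same effect can be reproduced in JavaScript, whose "." also skips line breaks by default. The following is only a rough sketch: the sample text and variable names are made up for illustration, and the s flag (discussed later in this article) is used here purely to simulate a "." that matches line breaks.

```javascript
// Hypothetical sample: three lines of code, two of them ending in a comment.
const source = "let a = 1; // first\nlet b = 2;\nlet c = 3; // third\n";

// Default behaviour: "." stops at the line break, so each comment is removed
// and the line structure is preserved.
const stripped = source.replace(/\/\/.*/g, "");

// With the s flag "." also matches "\n", so the first match runs from the
// first "//" to the end of the string and swallows the following lines.
const swallowed = source.replace(/\/\/.*/gs, "");

console.log(JSON.stringify(stripped));  // "let a = 1; \nlet b = 2;\nlet c = 3; \n"
console.log(JSON.stringify(swallowed)); // "let a = 1; "
```

In the second case the code on the later lines is lost along with the comments, which is exactly the outcome a line-oriented editor wants to avoid.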

In Node, reading a file usually means reading the whole file in one go. Perl, by contrast, inherits the tradition of many Linux commands and reads files line by line, like this:

while (<>) {print $_}

Each $_ also ends with a line break, so Perl naturally inherited the "." of QED, which does not match line breaks. However, Perl is a programming language rather than an editor, and the text its regular expressions are matched against is not limited to single lines; it may span multiple lines. Its "." therefore sometimes needs to match across lines, so Perl invented the single-line mode /s, which lets "." match line breaks as well.

The official Perl description of the /s modifier, which turns on single-line mode, is "Treat the string as single line". This "single line" should be understood as follows: in normal mode "." can only match characters within a line and cannot cross lines; in single-line mode Perl pretends that a multi-line string is one line and regards line breaks as ordinary characters within that line, so "." can match them. More vividly, it takes the following three lines of text

1
2
3

and treats them as the single line of text "1\n2\n3\n"; this is what single-line mode means.

For the same reason (string variables can contain multiple lines of text), Perl also invented the /m modifier, that is, multiline mode, officially described as "Treat the string as multiple lines". This mode has also existed in JavaScript regular expressions for a very long time. Here "multiline" means the following: by default the ^ and $ metacharacters do not match the positions just after and just before the line breaks in the middle of a string, which is to say the string is always treated as having only one line. Once multiline mode is enabled, they can match there.
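A minimal JavaScript sketch of what the m flag changes; the sample string and variable names are chosen only for illustration:

```javascript
// Three lines of text held in one string.
const text = "1\n2\n3";

// Without m, ^ and $ only match at the very start and end of the string,
// so no single digit can satisfy both anchors.
const withoutM = text.match(/^\d$/g);

// With m, ^ also matches right after a line break and $ right before one,
// so every line is matched on its own.
const withM = text.match(/^\d$/gm);

console.log(withoutM); // null
console.log(withM);    // [ '1', '2', '3' ]
```

Note that "." plays no part in this example; only the behaviour of the two anchors changes.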

In other words, single-line mode and multiline mode apply to different metacharacters. People who are new to regular expressions are easily confused by "single-line mode" and "multiline mode", which look like opposites but are not.

Later, Ruby's author may have felt that "single-line mode" was a poor name, and went his own way by calling the mode that lets "." match line breaks "multiline mode", meaning that a regular expression such as .* can match across multiple lines, which does make sense on its own terms. The modifier is also /m (what Perl calls multiline mode is enabled by default in Ruby, so /m was still free). This only made matters worse and messier.

Later still, Python's author may also have felt that the name "single-line mode" should be avoided, so a new name, "dotall", was coined, meaning that the dot matches all characters. It is a good name, and Java later adopted it.

Having reviewed the history above, I have explained where single-line mode comes from and why its name is a poor one. V8 recently implemented a stage 3 ES proposal, github.com/mathiasbynens/es-regexp-dotall-flag, which introduces the /s modifier and the dotAll property for JavaScript regular expressions. The name dotAll is borrowed from Python and Java, while the /s modifier is inherited from Perl; there was no need to invent a new modifier such as /d and make things even more complicated. The concrete effect of /s in JavaScript is to let "." match the four line terminators it previously could not match: \n (line feed), \r (carriage return), \u2028 (line separator) and \u2029 (paragraph separator):


/foo/s.dotAll // true
/^.{4}$/s.test("\n\r\u2028\u2029") // true

This is actually a very simple matter, but people who have never used regular expressions outside JavaScript may be confused when they first meet this new mode. To be clear: multiline mode controls the behaviour of ^ and $, while single-line mode controls ".", and the two are not directly related.
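Because the two modes affect different metacharacters, they can be switched on separately or together. The sketch below applies the same pattern with each flag combination to an arbitrary two-line sample string; the variable names are only illustrative:

```javascript
// Two lines of text held in one string.
const text = "ab\ncd";

// No flags: "." cannot cross "\n" and $ only matches at the end of the string.
const plain = text.match(/^.+$/g);

// s only: "." now crosses the line break, so the whole string is one match.
const onlyS = text.match(/^.+$/gs);

// m only: ^ and $ match at every line boundary, "." still stops at "\n".
const onlyM = text.match(/^.+$/gm);

// Both: the greedy "." crosses the line break again and consumes everything.
const both = text.match(/^.+$/gms);

console.log(plain); // null
console.log(onlyS); // [ 'ab\ncd' ]
console.log(onlyM); // [ 'ab', 'cd' ]
console.log(both);  // [ 'ab\ncd' ]
```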

However, Perl, the language that introduced the two confusing concepts of single-line and multiline mode, removed both modes entirely in Perl 6: "." matches line breaks by default, and \N matches any character except a line break; ^ and $ always match the start and end of the string, and two new metacharacters, ^^ and $$, were introduced to match the start and end of a line.

[^] and [\s\S], which we used in the past, are not completely obsolete. For example, editors that use JavaScript regular expressions (VS Code and Atom) are unlikely to offer an interface for enabling single-line mode. Speaking of regular expression support in editors, the support in editors implemented in JavaScript is still too weak; for example, you cannot enable a mode from inside the regular expression itself. In Sublime, whose regular expression engine supports inline modifiers, you can write (?s) in the regular expression to enable dotall mode. For example, (?s)/\*.+?\*/ matches all multi-line comments.
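As a rough JavaScript illustration of these alternatives, the sketch below matches one multi-line comment in four ways. The sample code string and variable names are invented for the example:

```javascript
// Hypothetical snippet containing one comment that spans two lines.
const code = "a = 1; /* first\n   second */ b = 2;";

// Plain "." cannot cross the line break, so nothing matches.
const withDot = code.match(/\/\*.+?\*\//);

// [\s\S] (whitespace or non-whitespace) matches any character at all.
const withClass = code.match(/\/\*[\s\S]+?\*\//);

// [^] is an empty negated class, which JavaScript treats as "any character".
const withEmptyNegation = code.match(/\/\*[^]+?\*\//);

// The s flag gives the same result while keeping the pattern readable.
const withSFlag = code.match(/\/\*.+?\*\//s);

console.log(withDot);              // null
console.log(withClass[0]);         // the whole comment, including the line break
console.log(withEmptyNegation[0] === withClass[0]); // true
console.log(withSFlag[0] === withClass[0]);         // true
```

The character-class forms remain useful wherever only the pattern text can be entered and flags cannot be set.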
