Correcting an error in the continuation-line matching example in "Mastering Regular Expressions, 3rd Edition"

Source: Internet
Author: User

Original text:

Matching continuation lines

Continuing with the continuation-line example from the previous chapter (see page 178), we found that 「^\w+=.*(\\\n.*)*」 does not match both of the following lines of text:

src=array.c builtin.c eval.c field.c gawkmisc.c io.c main.c \
        missing.c msg.c node.c re.c version.c

The problem is that the first 「.*」 matches past the backslash, so the 「(\\\n.*)*」 that is supposed to match the backslash never gets the chance to do so. Hence the first lesson of this chapter: if we do not want the dot to match the backslash, we should say so in the regular expression. We can replace each dot with 「[^\n\\]」 (note that \n is included in the negated character class: one of the assumptions of the original expression was that the dot does not match a line break, and we do not want its replacement to match one either). This gives:

「^\w+=[^\n\\]*(\\\n[^\n\\]*)*」

This does match continuation lines, but it introduces a new problem: a backslash can no longer appear anywhere except at the end of a line. If the text to be matched contains other backslashes, this expression will run into trouble. We will assume that it can, so the expression needs further improvement.

So far, our approach has been "match a line, then keep matching if a continuation line follows". Now let's try another approach, one that I find often works: focus on which characters are actually allowed to match at any given point. When matching a line of text, we expect either normal characters (anything other than a backslash or a line feed) or a backslash followed by any other character. In dot-matches-all mode, 「\\.」 can also match the combination of a backslash and a line break. The regular expression therefore becomes 「^\w+=([^\n\\]|\\.)*」, used in dot-matches-all mode. Because it begins with 「^」, the enhanced line-anchor match mode (see page 112) may be needed as well, depending on how the expression is used. This answer is still not perfect, though: we will come back to it in the next chapter, which deals with efficiency (see page 270).

The passage above is taken from Mastering Regular Expressions, 3rd Edition, by Jeffrey E. F. Friedl, p. 186. The original text was slightly modified for ease of understanding.
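As a minimal sketch, the three expressions from the quoted passage can be tried with Python's re module. The choice of Python, the helper name `matched`, the LF-only line ending, and the extra `path=C:\dir` input are assumptions made for illustration; they are not part of the book.

```python
import re

# Sample input from the passage; an LF-only line ending is assumed here.
LINE1 = "src=array.c builtin.c eval.c field.c gawkmisc.c io.c main.c \\"
LINE2 = "        missing.c msg.c node.c re.c version.c"
TEXT = LINE1 + "\n" + LINE2

FIRST = r"^\w+=.*(\\\n.*)*"               # the dot swallows the trailing backslash
SECOND = r"^\w+=[^\n\\]*(\\\n[^\n\\]*)*"  # backslash allowed only before a line feed
THIRD = r"^\w+=([^\n\\]|\\.)*"            # normal character, or backslash + any character


def matched(pattern, text, flags=0):
    """Return the text matched at the start of `text`, or None (hypothetical helper)."""
    m = re.match(pattern, text, flags)
    return m.group(0) if m else None


# FIRST stops at the end of the first line; SECOND and THIRD (with DOTALL)
# continue across the backslash-newline pair.
print(matched(FIRST, TEXT) == LINE1)
print(matched(SECOND, TEXT) == TEXT)
print(matched(THIRD, TEXT, re.DOTALL) == TEXT)

# A backslash in the middle of a line stops SECOND but not THIRD.
print(repr(matched(SECOND, "path=C:\\dir")))
print(repr(matched(THIRD, "path=C:\\dir", re.DOTALL)))
```

Note that without re.DOTALL the dot in THIRD does not match a line feed, so the match would stop just before the trailing backslash.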

Unfortunately, I found that 「^\w+=([^\n\\]|\\.)*」 does not match the continuation lines as they are formatted in this article. The analysis is as follows:

「^\w+=」 matches the string src=

「[^\n\\]」, as the author says, matches any character except a backslash or a line feed. Because the * in 「([^\n\\])*」 is greedy, it keeps matching all the way through array.c builtin.c eval.c field.c gawkmisc.c io.c main.c and stops just before the backslash.

「\\.」 then matches the backslash together with the carriage return character [CR] that follows it.

So far, everything has gone according to plan. Unfortunately, the regular expression cannot match any further from here, regardless of whether "single-line mode" (dot-matches-all) is used.

The reason is that the line feed following the carriage return cannot be matched by any part of this expression: 「[^\n\\]」 excludes line feeds, so it cannot match it, and 「\\.」 must first match a backslash, so it cannot match the line feed either. Therefore, the match ends there.
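The behavior analyzed above can be reproduced with a short sketch in Python's re module. It assumes the two lines are separated by a carriage return followed by a line feed (CRLF); the variable names are hypothetical.

```python
import re

LINE1 = "src=array.c builtin.c eval.c field.c gawkmisc.c io.c main.c \\"
LINE2 = "        missing.c msg.c node.c re.c version.c"
TEXT_CRLF = LINE1 + "\r\n" + LINE2  # assumed CRLF line ending

THIRD = r"^\w+=([^\n\\]|\\.)*"

# Try the expression with and without dot-matches-all ("single-line") mode.
with_dotall = re.match(THIRD, TEXT_CRLF, re.DOTALL).group(0)
without_dotall = re.match(THIRD, TEXT_CRLF).group(0)

# In both modes the backslash pairs with the carriage return, and the line
# feed that follows can be matched by neither alternative.
print(with_dotall == LINE1 + "\r")
print(without_dotall == LINE1 + "\r")
print(LINE2 in with_dotall)
```

In Python the dot matches a carriage return even without re.DOTALL, which is why the two modes give the same result on this input.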

Follow-up:

The expression fails because 「\\.」 cannot match the line feed on its own. It might seem that we only need to change "a backslash followed by a dot" into "a backslash or a dot". An "or" relationship that matches a single character brings character classes to mind, so the expression is changed to:

「^\w+=([^\n\\]|[\\.])*」

Unfortunately, this does not match either. Think about why.
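One way to explore the question is to probe the character class on its own. The sketch below uses Python's re module, where (as in most regex flavors) a dot inside a character class is a literal period rather than a wildcard. The variable names and the choice of probe characters are assumptions for illustration.

```python
import re

LINE1 = "src=array.c builtin.c eval.c field.c gawkmisc.c io.c main.c \\"
LINE2 = "        missing.c msg.c node.c re.c version.c"
TEXT_LF = LINE1 + "\n" + LINE2
TEXT_CRLF = LINE1 + "\r\n" + LINE2

CLASS_VARIANT = r"^\w+=([^\n\\]|[\\.])*"

# Which single characters does the class [\\.] accept, even in DOTALL mode?
probe = {
    ch: bool(re.fullmatch(r"[\\.]", ch, re.DOTALL))
    for ch in ("\\", ".", "\n", "x")
}
print(probe)

# The full expression still stops at the line feed in both inputs.
lf_match = re.match(CLASS_VARIANT, TEXT_LF, re.DOTALL).group(0)
crlf_match = re.match(CLASS_VARIANT, TEXT_CRLF, re.DOTALL).group(0)
print(lf_match == LINE1)
print(crlf_match == LINE1 + "\r")
```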
