Reading Notes: Mastering Regular Expressions (1)

Source: Internet
Author: User
Tags: egrep
Mastering Regular Expressions

This book by Jeffrey E. F. Friedl has long been famous. I bought a copy from Dangdang in December but never found the time to read it. The main reason, of course, is that my English is poor and I have not developed the habit or the ability to read in English.

I regret not having spent more time on English.

The nightmarish doubled-word search task that opens the book is described very vividly. If I ran into such a requirement, it would certainly be a real headache.

  • Check n files, find doubled words (such as "this this"), report the line of each file in which they occur, and highlight them using standard ANSI escape sequences;
  • Work across lines as well: a doubled word must still be found when the last word of one line is repeated as the first word of the next (non-empty) line;
  • Ignore differences in capitalization and treat any amount of whitespace between words as a single space. Most importantly, find doubled words even when one or both of them are wrapped in HTML tags, for example "... it is <B>very</B> very important ...".

Such a requirement sounds annoying, but with regular expressions, everything becomes easy.
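To make the requirement concrete, here is a minimal sketch in Python rather than egrep. The pattern, the names DOUBLED, find_doubled_words and highlight, and the choice of reverse video for highlighting are all assumptions made for illustration; this is not the solution given in the book.

```python
import re

# Hypothetical pattern: a word, then whitespace and/or HTML tags,
# then the same word again (compared case-insensitively).
DOUBLED = re.compile(
    r"\b([a-z]+)"            # group 1: a word
    r"((?:\s|<[^>]+>)+)"     # group 2: whitespace (including newlines) and/or tags
    r"(\1)\b",               # group 3: the same word again, via a backreference
    re.IGNORECASE,
)

def find_doubled_words(text):
    """Return (line_number, word) for each doubled word found in text."""
    results = []
    for m in DOUBLED.finditer(text):
        # The line number is 1 plus the number of newlines before the match.
        line_no = text.count("\n", 0, m.start()) + 1
        results.append((line_no, m.group(1).lower()))
    return results

def highlight(text):
    """Wrap both copies of each doubled word in ANSI reverse-video escapes."""
    return DOUBLED.sub("\x1b[7m\\1\x1b[m\\2\x1b[7m\\3\x1b[m", text)

print(find_doubled_words("it is <B>very</B> very important"))
print(find_doubled_words("This is the\nThe end"))
```

Because \s also matches newlines, the same pattern covers the cross-line case when a whole file is read into one string. A real tool would also need to loop over the n files and print the file name with each hit.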

The following table shows the basic metacharacters:

Metacharacter  Name                      Matches                                         Remarks
^              caret                     the position at the start of the line
$              dollar                    the position at the end of the line
\<             backslash less-than       the position at the start of a word             not supported by all versions of egrep
\>             backslash greater-than    the position at the end of a word               not supported by all versions of egrep
.              dot                       any single character
[...]          character class           any one character listed in the brackets
[^...]         negated character class   any one character not listed in the brackets
|              or (bar)                  either of the expressions it separates
(...)          parentheses               limits the scope of |
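The metacharacters in the table can be tried out with a short sketch. It uses Python's re module instead of egrep, so two things are assumptions: the helper matching is a made-up stand-in for running egrep over a list of lines, and \b replaces egrep's word-boundary metacharacters \< and \>, which Python does not have.

```python
import re

lines = ["cat", "concatenate", "the cat sat", "gray", "grey", "griy", "grab"]

def matching(pattern, candidates=lines):
    """Mimic egrep: return the candidates that contain a match for pattern."""
    return [s for s in candidates if re.search(pattern, s)]

starts_with_cat = matching(r"^cat")      # ^ anchors to the start of the line
ends_with_cat = matching(r"cat$")        # $ anchors to the end of the line
whole_word_cat = matching(r"\bcat\b")    # \b stands in for egrep's \< and \>
gr_any_y = matching(r"gr.y")             # . matches any single character
gr_ae_y = matching(r"gr[ae]y")           # [...] matches one listed character
gr_not_ae_y = matching(r"gr[^ae]y")      # [^...] matches one character not listed
grouped_or = matching(r"gr(a|e)y")       # (...) limits the scope of |
ungrouped_or = matching(r"gra|ey")       # without parentheses: "gra" or "ey"

print(whole_word_cat)   # ['cat', 'the cat sat']
print(grouped_or)       # ['gray', 'grey']
print(ungrouped_or)     # ['gray', 'grey', 'grab']
```

The last two results show why the parentheses matter: without them the alternation splits the whole expression, so "grab" matches through the "gra" branch.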

Note:

  • When a metacharacter appears inside a character class (a list of characters enclosed in square brackets), it is no longer a metacharacter. For example, outside square brackets the dot is a metacharacter that matches any character; inside square brackets it stands for the dot character itself.
  • In character classes and negated character classes, a dash in the first position (right after the ^ in a negated class) stands for the dash itself; anywhere else it indicates a range. In [-a-z0-9], for example, the first dash is a literal dash and the second indicates a range: all 26 lowercase letters from a to z, not just the characters a and z. The third dash works the same way as the second.
  • A negated character class still has to match a character. [^x] does not mean "match unless there is an x", but "match any character that is not x". The former would match an empty line; [^x] does not.
  • Some versions of egrep support the -i option to perform case-insensitive matching.
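The notes above can be checked with a few one-liners. This is again a sketch in Python's re module rather than egrep; the variable names are made up for illustration, and re.IGNORECASE is used here as the rough counterpart of egrep's case-insensitive option.

```python
import re

# Outside brackets the dot matches any character; inside, only a literal dot.
dot_outside = bool(re.search(r"3.14", "3x14"))      # True
dot_inside = bool(re.search(r"3[.]14", "3x14"))     # False

# In [-a-z0-9] the leading dash is literal; the other two dashes form ranges.
cls = re.compile(r"[-a-z0-9]")
dash_is_literal = bool(cls.fullmatch("-"))          # True
middle_letter = bool(cls.fullmatch("m"))            # True: a-z is the whole range
upper_letter = bool(cls.fullmatch("M"))             # False: uppercase is not listed

# [^x] must consume a character, so it cannot match an empty line.
negated_on_empty = bool(re.search(r"[^x]", ""))     # False
negated_on_y = bool(re.search(r"[^x]", "y"))        # True

# Case-insensitive matching, comparable to running egrep with -i.
ignore_case = bool(re.search(r"the", "THE END", re.IGNORECASE))  # True
```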

 
