Regular Expression Pattern Matching: String Basics

Source: Internet
Author: User
This article introduces the basics of matching strings with regular expressions. It is divided into two parts: the basic rules for writing match patterns, and regular expression match, search, and replace. Readers who need this material may find it a useful reference.

Introduction

A feature in a real project needed to parse strings that follow certain specific patterns. The parts of the existing code base that already did this worked by checking for specific characters one at a time. The disadvantages of that approach are:

    • It is easy to make logic errors.

    • It is easy to miss boundary-condition checks.

    • The code is complex and hard to understand and maintain.

    • Poor performance

One .cpp file in the code base runs to 2000 lines, and a single method in it spends more than 400 lines just parsing a string, comparing one character after another. It is painful to read. Many of the comments are out of date and the coding style is inconsistent, so it is fairly clear that the code has passed through many hands.

In this situation there was little point in continuing down the old road, and regular expressions were the natural alternative. I had no practical experience with regular expressions, though, and only a smattering of knowledge about writing match rules. My first step was to look for material online to get a general picture, but the Baidu results were disappointing. (Whenever I look for more specialized material on Baidu, the results tend to be copies of one another, although it is still adequate for everyday queries.) I gave up on Baidu, searched outside the firewall instead, and found some fairly basic videos there (a VPN is needed to reach them).

This article summarizes the basics of writing regular expressions to match strings. It is divided into two sections:

    1. Basic rules for matching strings

    2. Regular expression match, search, and replace

The regular expression grammar described in this article is ECMAScript, and the programming language is C++. Other grammars and languages are not covered.

Basic rules for matching strings

1. Match a fixed string

regex e("abc");

2. Match fixed string, case insensitive

regex e("abc", regex_constants::icase);

3. Match a fixed string followed by any one character, case-insensitive

regex e("abc.", regex_constants::icase); // . matches any single character except a newline

4. Match zero or one occurrence of the preceding character

regex e("abc?"); // ? zero or one of the preceding character (here, c)

5. Match zero or more occurrences of the preceding character

regex e("abc*"); // * zero or more of the preceding character (here, c)

6. Match one or more occurrences of the preceding character

regex e("abc+"); // + one or more of the preceding character (here, c)

7. Match any character in a set

regex e("ab[cd]*"); // [...] any one character listed inside the square brackets

8. Match any character not in a set

regex e("ab[^cd]*"); // [^...] any one character not listed inside the square brackets

9. Match characters from a set, a specific number of times

regex e("ab[cd]{3}"); // {n} exactly n (here 3) of the preceding element

10. Match characters from a set, with a range for the count


regex e("ab[cd]{3,}");  // {n,} 3 or more of the preceding element
regex e("ab[cd]{3,5}"); // {n,m} between 3 and 5 of the preceding element, inclusive


11. Match either of two alternatives

regex e("abc|de[fg]"); // | matches the pattern on either side of |

12. Grouping

regex e("(abc)de+"); // (...) defines a sub-group

13. Back references to sub-groups


regex e("(abc)de+\\1");       // \1 matches, at this position, the text captured by group 1
regex e("(abc)c(de+)\\2\\1"); // \2 matches, at this position, the text captured by group 2


14. Match the beginning of a string


regex e("^abc."); // ^ beginning of the string: matches only where abc plus one character starts the string


15. Match the end of a string


regex e("abc.$"); // $ end of the string: matches only where abc plus one character ends the string
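A quick way to get a feel for the rules above is to run a few patterns through regex_match. The following is only a minimal sketch: the helper name full_match and the sample inputs mentioned afterwards are assumptions made for illustration, not part of the original project.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole of `text` matches `pattern`.
// The ECMAScript grammar is used unless other flags are passed in.
bool full_match(const std::string& text, const std::string& pattern,
                std::regex_constants::syntax_option_type flags =
                    std::regex_constants::ECMAScript) {
    const std::regex e(pattern, flags);
    return std::regex_match(text, e);
}
```

With this helper, full_match("abc", "abc?") and full_match("ab", "abc?") should both hold, while full_match("abcc", "abc?") should not, because ? allows at most one trailing c. Likewise full_match("abcdeeabc", "(abc)de+\\1") should hold, since \1 repeats whatever group 1 captured.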


These are the most basic ways of writing a match pattern. To match a character that has a special meaning in a pattern, escape it with \. For example, to match a literal "." the pattern needs \. and, inside a C++ string literal, that is written "\\.". If the basic rules above do not cover a particular need, consult a fuller reference on the ECMAScript regular expression syntax. Once the basic patterns are understood, the next step is to use regular expressions to match, search, or replace.
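The effect of escaping can be seen by comparing an escaped and an unescaped dot. This is a small sketch under assumed names (has_literal_dot and has_any_char are hypothetical helpers, and the file-name-like inputs are only examples).

```cpp
#include <regex>
#include <string>

// Hypothetical helper: does `text` look like "<letters>.<letters>"?
// "\\." in the C++ literal becomes \. in the regex, i.e. a literal dot.
bool has_literal_dot(const std::string& text) {
    static const std::regex escaped("[a-z]+\\.[a-z]+");
    return std::regex_match(text, escaped);
}

// Same pattern with the dot left unescaped: '.' matches any character
// except a newline, so this version is more permissive.
bool has_any_char(const std::string& text) {
    static const std::regex unescaped("[a-z]+.[a-z]+");
    return std::regex_match(text, unescaped);
}
```

Both helpers accept "main.cpp", but only has_any_char accepts "main-cpp", which is usually not what was intended when the pattern was meant to match a dot.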

Regular expression match, search, and replace

Once the pattern string is written, the target string has to be processed against it. There are three ways to do so: match (regex_match), search (regex_search), and replace (regex_replace).

Matching is straightforward: pass the target string and the pattern to regex_match, which returns a bool indicating whether the target string satisfies the pattern. The entire string str must match.


bool match = regex_match(str, e); // match the entire string str


Searching looks for a substring, anywhere in the string, that satisfies the pattern. It returns true as long as some part of str satisfies the pattern.


bool match = regex_search(str, e); // find a substring of str that matches e
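The difference between the two calls shows up when the pattern occurs in the middle of a longer string. A minimal sketch, with hypothetical helper names and an arbitrarily chosen pattern:

```cpp
#include <regex>
#include <string>

// Hypothetical helpers contrasting the two calls on the same pattern.
bool matches_whole(const std::string& str) {
    static const std::regex e("abc+");
    return std::regex_match(str, e);   // the entire string must match
}

bool contains_match(const std::string& str) {
    static const std::regex e("abc+");
    return std::regex_search(str, e);  // any matching substring is enough
}
```

For the input "xxabccyy", contains_match should return true while matches_whole returns false, because the surrounding x and y characters are not covered by the pattern.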


In many cases, however, a bool is not enough: we need the matching substrings themselves. To get them, group the relevant parts of the pattern, as in rule 12 of "Basic rules for matching strings". When an smatch is passed to regex_search, it receives the string that satisfied each sub-group.


smatch m;
bool found = regex_search(str, m, e);
for (size_t n = 0; n < m.size(); ++n)
{
    cout << "m[" << n << "].str() = " << m[n].str() << endl;
}
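The same loop can be wrapped in a function that returns the captured strings instead of printing them. This is a sketch only; find_groups is a hypothetical name, and returning a vector is one design choice among several.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns m[0] (the whole match) followed by each
// sub-group, or an empty vector when nothing in `str` matches.
std::vector<std::string> find_groups(const std::string& str,
                                     const std::regex& e) {
    std::vector<std::string> out;
    std::smatch m;
    if (std::regex_search(str, m, e)) {
        for (std::size_t n = 0; n < m.size(); ++n) {
            out.push_back(m[n].str());
        }
    }
    return out;
}
```

Searching "xx abcdee yy" with the pattern "(abc)(de+)" should give three entries: the whole match "abcdee", then "abc" for group 1 and "dee" for group 2.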


Replacement also relies on the groups in the pattern string.


cout << regex_replace(str, e, "$1 is on $2");


Here the text " is on " is inserted between the substrings that matched groups 1 and 2.
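As an illustration of group references in the replacement text, here is a small sketch. The e-mail-like pattern, the helper name describe_addresses, and the replacement text are all assumptions chosen for the example; $1 and $2 stand for the text captured by groups 1 and 2.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites every "user@domain.com" in `str`
// as "user is on domain". $1 and $2 refer to the two sub-groups.
std::string describe_addresses(const std::string& str) {
    static const std::regex e("([a-zA-Z]+)@([a-zA-Z]+)\\.com");
    return std::regex_replace(str, e, "$1 is on $2");
}
```

Text outside the match is copied through unchanged, so "mail bob@example.com now" should become "mail bob is on example now", and a string with no match is returned as it was.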

All three functions have many overloads to cover different situations.
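Two of those overloads are sketched below: regex_search on a C string, which uses cmatch rather than smatch, and regex_search on an iterator range. The helper names and the digit pattern are assumptions for illustration only.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Overload taking a C string; the results type is std::cmatch.
// Hypothetical helper: first run of digits in `text`, or "" if none.
std::string first_number(const char* text) {
    static const std::regex e("[0-9]+");
    std::cmatch m;
    return std::regex_search(text, m, e) ? m[0].str() : std::string();
}

// Overload taking an iterator range; only the part of `str` starting
// at offset `from` is searched.
bool has_number_from(const std::string& str, std::size_t from) {
    static const std::regex e("[0-9]+");
    if (from > str.size()) return false;
    const auto first =
        str.begin() + static_cast<std::string::difference_type>(from);
    return std::regex_search(first, str.end(), e);
}
```

The iterator form is handy when part of a string has already been consumed: has_number_from("12 abc", 2) should be false because the digits lie before the starting offset.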

Practical example

Requirement: find strings of the form sectionA("sectionB") or sectionA ("sectionB") and extract sectionA and sectionB. Neither sectionA nor sectionB contains digits; letters may be upper or lower case; each has at least one character.

Analysis: the requirement splits into two parts, sectionA and sectionB, which calls for grouping.

Step one: write a pattern that matches a section name

[a-zA-Z]+

Step two: a space may appear between sectionA and sectionB. Assume there is at most one space

\\s?

Combining the two pieces gives a pattern for the names and the optional space. But how can it be organized into two groups?

[a-zA-Z]+\\s?[a-zA-Z]+

This alone is not enough. According to the grouping rules, each group must be delimited by ()

regex e("([a-zA-Z]+)\\s?\\(\"([a-zA-Z]+)\"\\)");

Here the \\( and \" before the second group, and the \" and \\) after it, are escapes that match the literal parentheses and quotation marks around sectionB.
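The doubled backslashes come from C++ string-literal escaping, not from the regular expression itself. As an aside, a raw string literal (C++11) avoids them. The sketch below is an alternative spelling, not the article's own code; the variable and helper names are hypothetical.

```cpp
#include <regex>
#include <string>

// The pattern from the text, written as an ordinary string literal.
const std::regex section_escaped(
    "([a-zA-Z]+)\\s?\\(\"([a-zA-Z]+)\"\\)");

// The same pattern as a raw string literal. The custom delimiter "re"
// is needed because the pattern contains a closing parenthesis
// followed by a double quote, which would end a plain R"(...)" literal.
const std::regex section_raw(
    R"re(([a-zA-Z]+)\s?\("([a-zA-Z]+)"\))re");

// Hypothetical helper: true when both spellings agree on `text`.
bool same_verdict(const std::string& text) {
    return std::regex_match(text, section_escaped) ==
           std::regex_match(text, section_raw);
}
```

Both objects should accept foo("bar") and foo ("bar") and reject foo1("bar"); which spelling to use is a matter of readability.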

Once the pattern is complete, use regex_match to check the string and, if it matches, use regex_search to extract the groups


if (regex_match(str, e))
{
    smatch m;
    auto found = regex_search(str, m, e);
    for (size_t n = 0; n < m.size(); ++n)
    {
        cout << "m[" << n << "].str() = " << m[n].str() << endl;
    }
}
else
{
    cout << "not matched" << endl;
}


The first element of m is the entire substring that satisfies the pattern, followed by the substrings that matched group 1 and group 2.
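The whole example can be collected into one function. This is a sketch under assumptions: parse_section and its output parameters are hypothetical names. It relies on the regex_match overload that also accepts an smatch, so a single call both validates the string and captures the groups; the two-step match-then-search version in the text should give the same groups for strings that match.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: splits sectionA("sectionB") into its two names.
// Returns false, leaving `a` and `b` untouched, when `str` does not
// have that shape.
bool parse_section(const std::string& str, std::string& a, std::string& b) {
    static const std::regex e("([a-zA-Z]+)\\s?\\(\"([a-zA-Z]+)\"\\)");
    std::smatch m;
    // regex_match fills m as well, so m[1] and m[2] hold the two groups.
    if (!std::regex_match(str, m, e)) {
        return false;
    }
    a = m[1].str();
    b = m[2].str();
    return true;
}
```

Calling it on Foo ("Bar") should yield "Foo" and "Bar", while inputs such as foo1("bar") or a name followed by two spaces are rejected.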

Summary

That concludes this introduction to the basics of matching strings with regular expressions. I hope it is of some help. If you have any questions, please leave a message and I will reply promptly.
