Basics of matching strings with regular expression patterns

Last Update:2017-12-05 Source: Internet

Author: User

Introduction

A feature in a real project needed to parse strings that follow certain patterns. Some existing functions in the code base do this by checking for specific characters one at a time. That approach has several disadvantages:

The logic is error-prone
Boundary conditions are easy to miss
The code is complex and hard to understand and maintain
Performance is poor

For example, the code base contains one .cpp file of more than two thousand lines, in which a single method spends more than 400 lines parsing strings. Comparing characters one by one like this is painful to read. In addition, many of the comments are out of date and the coding style varies from place to place, which suggests that many different people have worked on the file.

In this situation, continuing down the old road is not really an option, so I naturally thought of regular expressions. I had no practical experience with them, especially with writing matching rules, so my first step was to look online for a general overview. The results from Baidu ("Du Niang") were disappointing, however. (Whenever I look for specialist knowledge there, the results are disheartening: page after page of copies of the same article. For everyday queries it is still acceptable.) In the end I gave up on Baidu and found some introductory videos on sites outside the firewall (a proxy is required to reach them).

This article is a summary of the basics of writing regular expressions to match strings. It consists of the following two parts:

Basic rules for matching strings
Regular Expression matching, search, and substitution

The regular expression grammar described in this article is ECMAScript, and the programming language is C++ (the snippets assume #include <regex> and using namespace std). Other grammars and languages are not covered.

Basic rules for matching strings

1. Match fixed strings

regex e("abc");

2. Match fixed strings, case insensitive

regex e("abc", regex_constants::icase);

3. Match a fixed string followed by any one character, case-insensitive

regex e("abc.", regex_constants::icase); // . matches any single character except a newline

4. Match 0 or 1 occurrence of a character

regex e("abc?"); // ? matches zero or one of the preceding character

5. Match 0 or more occurrences of a character

regex e("abc*"); // * matches zero or more of the preceding character

6. Match 1 or more occurrences of a character

regex e("abc+"); // + matches one or more of the preceding character
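The three quantifiers above can be checked with a small helper. This is only a minimal sketch: full_match is a hypothetical name introduced here for illustration, and it relies on regex_match, so the whole input has to fit the pattern.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the original project): returns true only
// when the entire text matches the given ECMAScript pattern.
bool full_match(const std::string& text, const std::string& pattern)
{
    const std::regex e(pattern);
    return std::regex_match(text, e);
}
```

With this helper, "abc?" accepts "ab" and "abc" but rejects "abcc"; "abc*" accepts "ab" and "abccc"; "abc+" rejects "ab" but accepts "abccc".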

7. Match any character from a set

regex e("ab[cd]*"); // [...] matches any one character listed inside the square brackets

8. Match any character not in a set

regex e("ab[^cd]*"); // [^...] matches any one character not listed inside the square brackets

9. Match a set a specific number of times

regex e("ab[cd]{3}"); // {n} matches the preceding element exactly n times, here 3

10. Match a set a number of times within a range

regex e("ab[cd]{3,}"); // {n,} matches the preceding element n or more times, here 3 or more
regex e("ab[cd]{3,5}"); // {n,m} matches the preceding element between n and m times inclusive, here 3 to 5
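To see how the counted forms differ, the sketch below appends a count specifier to the "ab[cd]" pattern from rules 9 and 10. The function name ab_then_cd and its count parameter are assumptions made for this example only.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: builds "ab[cd]" + count (for example "{3}" or "{3,5}")
// and reports whether the whole text matches the resulting pattern.
bool ab_then_cd(const std::string& text, const std::string& count)
{
    const std::regex e("ab[cd]" + count);
    return std::regex_match(text, e);
}
```

For example, "abcdc" satisfies "{3}" while "abcd" does not, and "abcdcdcd" (six characters from the set) satisfies "{3,}" but not "{3,5}".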

11. Match either of two alternatives

regex e("abc|de[fg]"); // | matches the rule on either side of it

12. Match a group

regex e("(abc)de+"); // () defines a sub-group

13. Match the content of an earlier sub-group

regex e("(abc)de+\\1"); // \1 matches, at this position, the text captured by the first group
regex e("(abc)c(de+)\\2\\1"); // \2 matches, at this position, the text captured by the second group
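A back-reference is easier to read when the pattern is written as a C++11 raw string literal, because the backslash does not have to be doubled. The following is a sketch of the first pattern from rule 13; repeats_first_group is a hypothetical name used only here.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the text is "abc", then "d" and one or more
// "e", then the same text that group 1 captured ("abc") once more.
bool repeats_first_group(const std::string& text)
{
    // Raw string literal: \1 reaches the regex engine as written.
    static const std::regex e(R"((abc)de+\1)");
    return std::regex_match(text, e);
}
```

"abcdeabc" and "abcdeeeabc" match, while "abcdeabd" and "abcde" do not, because the trailing text must repeat group 1 exactly.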

14. Match the start of a string

regex e("^abc."); // ^ anchors the match at the beginning of the string: finds abc plus one character at the very start

15. Match the end of a string

regex e("abc.$"); // $ anchors the match at the end of the string: finds abc plus one character at the very end
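Anchors are most useful together with regex_search, which otherwise accepts a match anywhere in the string. A minimal sketch, with both function names invented for this example:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the text begins with "abc" plus one more character.
bool starts_with_abc_plus_one(const std::string& text)
{
    static const std::regex e("^abc.");
    return std::regex_search(text, e);
}

// Hypothetical helper: true when the text ends with "abc" plus one more character.
bool ends_with_abc_plus_one(const std::string& text)
{
    static const std::regex e("abc.$");
    return std::regex_search(text, e);
}
```

"abcd and more" satisfies the first function but not the second, "xx abcd" does the opposite, and a bare "abc" satisfies neither because the dot still needs one character to consume.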

These are the most basic matching patterns. To match a character that has a special meaning in a pattern literally, escape it with \. For example, to match a literal "." write \. in the pattern (written as "\\." inside an ordinary C++ string literal). If the basic rules above are not enough, consult a full reference for the ECMAScript regular expression grammar. Once the pattern is written, it is passed to a regular expression function for matching, searching, or replacing.
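The effect of escaping can be seen by comparing an escaped dot with an unescaped one. This is a small sketch; both function names are made up for the comparison.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: the dot is escaped, so only the literal text "abc." matches.
bool is_abc_literal_dot(const std::string& text)
{
    static const std::regex e("abc\\.");  // the regex engine sees abc\.
    return std::regex_match(text, e);
}

// Hypothetical helper: the dot is not escaped, so any single character
// (other than a newline) may follow "abc".
bool is_abc_any_char(const std::string& text)
{
    static const std::regex e("abc.");
    return std::regex_match(text, e);
}
```

Both accept "abc.", but only the unescaped version accepts "abcd".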

Regular Expression matching, search, and substitution

After writing a pattern string, you apply it to the target string in one of three ways: match (regex_match), search (regex_search), and replace (regex_replace).

Matching is simple: pass the target string and the pattern to regex_match, which returns a bool indicating whether the target string satisfies the pattern. The whole of str must match.

bool match = regex_match(str, e); // match against the entire string str

Searching looks for a substring that satisfies the pattern anywhere in the string. That is, if str contains a substring that matches the pattern, true is returned.

bool match = regex_search(str, e); // search str for a substring that matches e
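The difference between the two calls shows up when the pattern only covers part of the input. The sketch below runs both on the same string; the MatchKinds struct and the classify function are names assumed for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical result type: how the same pattern fares under each call.
struct MatchKinds
{
    bool whole;  // result of regex_match: the entire string fits the pattern
    bool part;   // result of regex_search: some substring fits the pattern
};

MatchKinds classify(const std::string& str, const std::regex& e)
{
    return MatchKinds{std::regex_match(str, e), std::regex_search(str, e)};
}
```

For the pattern "abc", the input "xxabcxx" gives whole == false and part == true, while the input "abc" gives true for both.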

In many cases, however, a bool result is not enough and we need the matched substrings themselves. To get them, define groups in the pattern string (see "Basic rules for matching strings"), then pass an smatch to regex_search to obtain the substring matched by each sub-group.

smatch m;
bool found = regex_search(str, m, e);
for (size_t n = 0; n < m.size(); ++n)
{
    cout << "m[" << n << "].str()=" << m[n].str() << endl;
}
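The same loop can collect the groups into a container instead of printing them. This is a sketch under the assumption that only the first match in the string is of interest; capture_groups is a hypothetical name.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the whole match followed by each sub-group
// for the first match found in str, or an empty vector when nothing matches.
std::vector<std::string> capture_groups(const std::string& str, const std::regex& e)
{
    std::vector<std::string> groups;
    std::smatch m;
    if (std::regex_search(str, m, e))
    {
        for (std::size_t n = 0; n < m.size(); ++n)
        {
            groups.push_back(m[n].str());  // m[0] is the whole match
        }
    }
    return groups;
}
```

With the illustrative pattern "([a-z]+)=([0-9]+)" and the input "set width=42 now", the result holds "width=42", "width", and "42" in that order.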

Replacement also relies on the groups defined in the pattern string.

cout << regex_replace(str, e, "$1 is on $2");

Here, each match is replaced by the text of group 1, followed by " is on ", followed by the text of group 2.
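As a concrete illustration, the sketch below applies that format string to a two-group pattern. The pattern "([a-z]+)@([a-z]+)" and the name describe_location are assumptions chosen for the example, not taken from the original project.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites every "name@host" in str as "name is on host".
// Text that does not match the pattern is copied through unchanged.
std::string describe_location(const std::string& str)
{
    static const std::regex e("([a-z]+)@([a-z]+)");
    return std::regex_replace(str, e, "$1 is on $2");
}
```

The input "user@server" becomes "user is on server", and "mail: alice@host!" becomes "mail: alice is on host!".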

Each of these three functions has many overloads to cover different needs.

Practice

Requirement: find strings of the form sectionA("sectionB") or sectionA ("sectionB"), and extract sectionA and sectionB separately. sectionA and sectionB contain no digits, may use uppercase or lowercase letters, and each contain at least one character.

Analysis: the requirement splits naturally into two parts, sectionA and sectionB, which calls for grouping.

Step 1: Write the pattern that each section must satisfy

[a-zA-Z]+

Step 2: A space may appear between sectionA and sectionB. For the moment, assume there is at most one.

\\s?

Combining the two gives a pattern close to what we need. But how do we split it into two groups?

[a-zA-Z]+\\s?[a-zA-Z]+

The pattern above is not yet correct: it defines no groups and does not account for the parentheses and quotation marks. Following the grouping rules, wrap each section in () to mark it as a group.

regex e("([a-zA-Z]+)\\s?\\(\"([a-zA-Z]+)\"\\)");

Here, the \\( and \" that follow \\s? escape the opening parenthesis and quotation mark around sectionB, and the closing \" and \\) do the same at the end.

With the pattern complete, use regex_match to check whether the whole string matches. If it does, use regex_search to extract the groups.

if (regex_match(str, e))
{
    smatch m;
    auto found = regex_search(str, m, e);
    for (size_t n = 0; n < m.size(); ++n)
    {
        cout << "m[" << n << "].str()=" << m[n].str() << endl;
    }
}
else
{
    cout << "Not matched" << endl;
}

The first element of m is the entire matched substring, followed by the substrings captured by group 1 and group 2.
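As a variation on the code above, regex_match also has an overload that fills an smatch, so the check and the extraction can happen in one call. The sketch below uses that overload with the same pattern; parse_section and its output parameters are hypothetical names, not part of the original project.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true and fills section_a / section_b when str
// has the form sectionA("sectionB") or sectionA ("sectionB"); otherwise
// returns false and leaves both outputs untouched.
bool parse_section(const std::string& str, std::string& section_a, std::string& section_b)
{
    static const std::regex e("([a-zA-Z]+)\\s?\\(\"([a-zA-Z]+)\"\\)");
    std::smatch m;
    if (!std::regex_match(str, m, e))
    {
        return false;
    }
    section_a = m[1].str();  // group 1: sectionA
    section_b = m[2].str();  // group 2: sectionB
    return true;
}
```

Both foo("bar") and foo ("bar") yield "foo" and "bar", while foo1("bar") is rejected because a section may not contain digits.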

Summary

This article covered the basics of matching strings with regular expression patterns. I hope it is helpful. If you have any questions, please leave a message and I will reply promptly. Thank you for supporting the Help House website!

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
