Regular expressions are a commonly used tool. The best-known C++ library for them is Boost, but Boost is heavyweight, so I kept looking for a lighter-weight library.
I later found that the quasi-standard library TR1 is very convenient. It is supported by Microsoft Visual Studio 2008 SP1 and later versions, so it can be used directly.
It also supports Unicode strings conveniently.
Examples:
#include <iostream>
#include <locale>
#include <string>
#include <regex>
#include <tchar.h>

// Note: std::wstring is initialized from _T() literals, so this builds only with UNICODE/_UNICODE defined.
int _tmain(int argc, _TCHAR* argv[])
{
    std::locale loc("");
    std::wcout.imbue(loc);
    std::wstring text(_T("My IP address is: 109.168.0.1."));
    std::wstring newip(_T("127.0.0.1"));
    std::wstring regstring(_T("(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)"));
    // Expression options: ignore case
    std::regex_constants::syntax_option_type fl = std::regex_constants::icase;
    // Compile the regular expression
    std::wregex regexpress(regstring, fl);
    // Holds the results of a match or search
    std::wsmatch ms;
    // Check whether the whole string matches
    if (std::regex_match(text, ms, regexpress))
    {
        std::wcout << _T("Regular expression: ") << regstring << _T(" match: ") << text << _T(" succeeded.") << std::endl;
    }
    else
    {
        std::wcout << _T("Regular expression: ") << regstring << _T(" match: ") << text << _T(" failed.") << std::endl;
    }
    // Search
    if (std::regex_search(text, ms, regexpress))
    {
        std::wcout << _T("Regular expression: ") << regstring << _T(" search: ") << text << _T(" succeeded.") << std::endl;
        for (size_t i = 0; i < ms.size(); ++i)
        {
            std::wcout << _T("Result ") << i << _T(": \"") << ms.str(i) << _T("\" - ");
            std::wcout << _T("start position: ") << ms.position(i) << _T(", length: ") << ms.length(i) << std::endl;
        }
        std::wcout << std::endl;
        // Replacement 1: replace the matched range through its iterator pair
        text.replace(ms[0].first, ms[0].second, newip);
        std::wcout << _T("Text after replacement 1: ") << text << std::endl;
    }
    else
    {
        std::wcout << _T("Regular expression: ") << regstring << _T(" search: ") << text << _T(" failed.") << std::endl;
    }
    // Replacement 2: replace every match with regex_replace
    newip = _T("255.255.0.0");
    std::wstring newText = std::regex_replace(text, regexpress, newip);
    std::wcout << _T("Text after replacement 2: ") << newText << std::endl;
    // End
    std::wcout << _T("Press ENTER to end ...");
    std::wcin.get();
    return 0;
}
Fetching all matches in a loop:
std::regex_constants::syntax_option_type fl = std::regex_constants::icase;
const std::tr1::regex pattern("http://[^\\\"\\>\\<]+?\\.(png|jpg|bmp)", fl);
std::tr1::smatch result;
std::string::const_iterator itS = strHtml.begin();
std::string::const_iterator itE = strHtml.end();
while (std::tr1::regex_search(itS, itE, result, pattern)) // loop as long as a match is found
{
    m_clbRegex.AddString((CString)result[0].str().c_str());
    // Equivalent alternative built from the iterator pair (use one of the two calls, not both)
    m_clbRegex.AddString((CString)(std::string(result[0].first, result[0].second)).c_str());
    itS = result[0].second; // start the next search right after this match
}
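The same loop can also be written with std::sregex_iterator, which moves past each match automatically. The following is only a minimal sketch in standard C++11 rather than TR1/MFC: the function name find_image_urls is hypothetical, the results go into a std::vector instead of the list box, and the character class is written as [^\"<>], which is assumed to be equivalent to the escaped form used above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every image URL found in `html`.
std::vector<std::string> find_image_urls(const std::string& html)
{
    // Same idea as the pattern above: lazily consume characters that are not
    // quotes or angle brackets until an image extension is reached.
    static const std::regex pattern("http://[^\"<>]+?\\.(png|jpg|bmp)",
                                    std::regex_constants::icase);

    std::vector<std::string> urls;
    // A default-constructed sregex_iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(html.begin(), html.end(), pattern), end;
         it != end; ++it)
    {
        urls.push_back(it->str()); // it->str() is the whole match, like result[0].str()
    }
    return urls;
}
```

Each dereferenced iterator is a std::smatch, so sub-matches such as the extension are still available through (*it)[1].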
[Code description]
1. Create a regular expression object. There are three ways:
(1) Use the constructor.
std::regex_constants::syntax_option_type fl = std::regex_constants::icase; // syntax options: select the regular expression grammar, case sensitivity, and so on
std::wregex regexpress(regstring, fl);
(2) Use the assignment operator. The drawback is that syntax options cannot be specified, and it is also less efficient.
std::wregex regexpress;
regexpress = regstring;
(3) Use the assign method.
std::wregex regexpress;
regexpress.assign(regstring, fl);
Constructing a regular expression object is called "compiling" it.
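The three construction routes can be compared side by side. This is a minimal sketch in standard C++11 using std::regex on narrow strings; the helper name build_three_ways is hypothetical and exists only for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: builds the same pattern in the three ways described above.
std::vector<std::regex> build_three_ways(const std::string& pattern)
{
    const std::regex_constants::syntax_option_type fl = std::regex_constants::icase;

    // (1) Constructor: pattern and syntax options in one step.
    std::regex by_ctor(pattern, fl);

    // (2) Assignment operator: there is no way to pass options, so the
    //     default ECMAScript grammar (case-sensitive) is used.
    std::regex by_assign;
    by_assign = pattern;

    // (3) assign(): same arguments as the constructor, on an existing object.
    std::regex by_method;
    by_method.assign(pattern, fl);

    return {by_ctor, by_assign, by_method};
}
```

With the pattern "abc", the first and third objects would match "ABC" because of icase, while the second would only match the lowercase form.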
2. regex_match() and regex_search()
regex_match() returns true only if the entire string matches the regular expression, whereas regex_search() returns true if any substring matches.
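The difference is easiest to see by running both calls against the same input. The sketch below assumes standard C++11 and narrow strings; the helper names whole_string_is_ip and contains_ip are hypothetical, and the pattern is the IP expression from the example.

```cpp
#include <regex>
#include <string>

// Shared pattern: four dot-separated groups of digits.
inline const std::regex& ip_pattern()
{
    static const std::regex re("(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)");
    return re;
}

// True only when the entire string is an IP-like token.
bool whole_string_is_ip(const std::string& s)
{
    return std::regex_match(s, ip_pattern());
}

// True when an IP-like token appears anywhere inside the string.
bool contains_ip(const std::string& s)
{
    return std::regex_search(s, ip_pattern());
}
```

For the sentence "My IP address is: 109.168.0.1." the first helper returns false and the second returns true, which is why the sample program reports a failed match but a successful search.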
3. The match result object std::wsmatch.
Anyone familiar with Perl knows that after a successful match the substrings are available as $1 ... $N. The TR1 regex library stores the match results in a std::wsmatch (Unicode) or std::smatch (ANSI) object.
std::wsmatch is a container of std::wssub_match objects, and std::wssub_match derives from std::pair.
std::wssub_match::first holds the start position of the substring (strictly speaking it is an iterator rather than a pointer).
std::wssub_match::second holds the position one past the end of the substring (the usual STL convention of a half-open interval).
So [std::wssub_match::first, std::wssub_match::second) covers the whole substring.
Of course, std::wsmatch (a predefined specialization of the match_results template) provides a few convenient ways to access substrings:
(1) str(idx) returns the corresponding substring as a std::string/std::wstring object. This is the one used most often.
(2) position(idx) returns the starting offset of the corresponding substring (not a pointer, but the offset relative to begin()).
(3) length(idx) returns the length of the substring.
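The three accessors can be exercised together. The following is a minimal sketch in standard C++11 with std::smatch (the narrow-string counterpart of std::wsmatch); the struct SubMatchInfo and the function describe_matches are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical record describing one sub-match.
struct SubMatchInfo {
    std::string text;        // from str(idx)
    std::ptrdiff_t position; // from position(idx), offset relative to begin()
    std::ptrdiff_t length;   // from length(idx)
};

// Searches `s` for the first IP-like token and lists the whole match
// (index 0) followed by each captured group (index 1..4).
std::vector<SubMatchInfo> describe_matches(const std::string& s)
{
    static const std::regex re("(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)");
    std::vector<SubMatchInfo> out;
    std::smatch ms;
    if (std::regex_search(s, ms, re)) {
        for (std::size_t i = 0; i < ms.size(); ++i) {
            out.push_back({ms.str(i), ms.position(i), ms.length(i)});
        }
    }
    return out;
}
```

Index 0 always refers to the entire match, so a pattern with four capture groups yields five entries.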
4. Replacing a substring.
As mentioned above, std::wssub_match::first/second hold the start and end positions of a substring, so the text can be replaced through these iterators (see "Replacement 1" in the code).
Alternatively, std::regex_replace() achieves the same goal (see "Replacement 2" in the code).
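The two approaches behave differently when the text contains more than one match: the iterator route touches only the match that was found, while regex_replace() rewrites every match by default. Here is a minimal sketch assuming standard C++11 and narrow strings; the function names replace_first_ip and replace_all_ips are hypothetical.

```cpp
#include <regex>
#include <string>

inline const std::regex& ip_regex()
{
    static const std::regex re("(\\d+)\\.(\\d+)\\.(\\d+)\\.(\\d+)");
    return re;
}

// Iterator route: replace only the first match, using the iterator pair in ms[0].
std::string replace_first_ip(std::string text, const std::string& new_ip)
{
    std::smatch ms;
    if (std::regex_search(text, ms, ip_regex())) {
        // ms[0].first and ms[0].second are iterators into `text`.
        // The match object must not be used after the string is modified.
        text.replace(ms[0].first, ms[0].second, new_ip);
    }
    return text;
}

// regex_replace route: replaces every match and returns a new string.
std::string replace_all_ips(const std::string& text, const std::string& new_ip)
{
    return std::regex_replace(text, ip_regex(), new_ip);
}
```

Note that the replacement string passed to regex_replace() is a format string, so a literal dollar sign in it would need to be written as $$.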
Several commonly used expressions:
"\\b1[35][0-9]\\d{8}|147\\d{8}|1[8][01236789]\\d{8}\\b"; // mobile phone number
"\\b0\\d{2,3}\\-?\\d{7,8}\\b"; // landline number
"\\b[1-9]\\d{5}(?:19|20)\\d{2}(?:0[1-9]|[1][012])(?:0[1-9]|[12][0-9]|[3][01])\\d{3}[\\dxX]\\b"; // 18-digit ID number (the two groups after the year are month, then day)
"\\b[1-9]\\d{7}(?:0[1-9]|[1][012])(?:0[1-9]|[12][0-9]|[3][01])\\d{3}\\b"; // 15-digit ID number (the two groups are month, then day)
"\\b(?:(?:2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\.){3}(?:2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\b"; // IPv4 address
"\\b(?:[a-zA-Z0-9_-])+@(?:[a-zA-Z0-9_-])+(?:\\.[a-zA-Z0-9_-]{2,3}){1,2}\\b"; // email address
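Expressions like these are worth checking against a few known inputs before relying on them. As a rough sketch, the IPv4 expression can be wrapped in a validator; the function name is_ipv4 is hypothetical, and the \b anchors are left out because regex_match() already requires the whole string to match.

```cpp
#include <regex>
#include <string>

// Hypothetical validator built on the IPv4 expression from the list above.
bool is_ipv4(const std::string& s)
{
    // Each octet is 200-249, 250-255, or one to three digits starting with
    // an optional 0 or 1; three octets are followed by a dot, the last is not.
    static const std::regex re(
        "(?:(?:2[0-4]\\d|25[0-5]|[01]?\\d\\d?)\\.){3}"
        "(?:2[0-4]\\d|25[0-5]|[01]?\\d\\d?)");
    return std::regex_match(s, re);
}
```

Values such as "192.168.0.1" and "255.255.255.255" are accepted, while "256.1.1.1" and "1.2.3" are rejected. Octets with leading zeros, such as "01", are also accepted by this expression.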