Use the ATL Regular Expression Library

Source: Internet
Author: User

Reprinted from www.csdn.net.

ATL Server needs to decode addresses, commands, and other complex text fields sent by the client, and regular expressions are widely recognized as the most powerful tool for parsing text. ATL therefore provides a small regular expression library to make this work easier.

1. CAtlRegExp class

Declaration:

    template <class CharTraits = CAtlRECharTraits> class CAtlRegExp;

Initialization:
Unlike Microsoft's GRETA class library (another regular expression library, released by Microsoft Research), CAtlRegExp does not take the pattern in its constructor. Instead, you pass the regular expression string to its Parse() method, which builds the object needed for matching. For example, to match a time written as h:mm or hh:mm, we can set up a CAtlRegExp object like this:

    CAtlRegExp<> re;
    re.Parse("{[0-9]?[0-9]}:{[0-9][0-9]}");

ATL's regular expression syntax is similar to Perl's, with one notable difference: ATL uses braces ({}) to mark a match group in the string. The expression above declares two groups, [0-9]?[0-9] and [0-9][0-9].
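For readers who want to try the time pattern without atlrx.h at hand, here is a rough equivalent written with the standard library's std::regex instead of ATL. It is only a sketch under a few assumptions: std::regex marks capture groups with parentheses rather than braces, the helper name SplitTime is made up for this illustration, and std::regex_match requires the whole input to fit the pattern, which may be stricter than what CAtlRegExp::Match does.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of ATL): splits "h:mm" or "hh:mm" into its
// two groups. Returns false when the input does not have that shape.
bool SplitTime(const std::string& text, std::string& hours, std::string& minutes)
{
    // Same pattern as the ATL one, with ( ) in place of { } for the groups.
    static const std::regex re("([0-9]?[0-9]):([0-9][0-9])");

    std::smatch m;
    if (!std::regex_match(text, m, re))
        return false;

    hours = m[1].str();    // first group: one or two digits
    minutes = m[2].str();  // second group: exactly two digits
    return true;
}
```

Calling SplitTime("9:45", h, m) fills h and m with the two captured groups, which corresponds to what GetMatch() reports for groups 0 and 1 in the ATL version.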

Matching:
Call CAtlRegExp's Match() method to perform a match. Its prototype is:

    BOOL Match(const RECHAR *szIn, CAtlREMatchContext<CharTraits> *pContext, const RECHAR **ppszEnd = NULL)

The parameters are largely self-explanatory. Note, however, that the first parameter has type const RECHAR *szIn, a pointer to const characters, so we can pass it the result of std::string's c_str() method directly.

The match result is returned through the CAtlREMatchContext<> object pointed to by the second parameter, pContext. The result and related information are stored in that object, and you read them through its methods and members.

2. CAtlREMatchContext class

Declaration:

    template <class CharTraits = CAtlRECharTraits> class CAtlREMatchContext;

Usage:
CAtlREMatchContext gives the caller the match results through its m_uNumGroups member and its GetMatch() method. m_uNumGroups holds the number of groups matched. Given a group index, GetMatch() returns the pStart and pEnd pointers that delimit the matched text for that group, so the caller can easily retrieve each result.

3. A small example

The following example, adapted from MSDN, demonstrates typical use of the CAtlRegExp and CAtlREMatchContext classes:

    #include "stdafx.h"
    #include <stdio.h>
    #include <atlrx.h>

    int main(int argc, char* argv[])
    {
        CAtlRegExp<> reUrl;
        // five match groups: scheme, authority, path, query, fragment
        REParseError status = reUrl.Parse(
            "({[^:/?#]+}:)?(//{[^/?#]*})?{[^?#]*}(?{[^#]*})?(#{.*})?");
        if (REPARSE_ERROR_OK != status)
        {
            // Unexpected error.
            return 0;
        }

        CAtlREMatchContext<> mcUrl;
        if (!reUrl.Match(
                "http://search.microsoft.com/us/Search.asp?qu=atl&boolean=ALL#results",
                &mcUrl))
        {
            // Unexpected error.
            return 0;
        }

        for (UINT nGroupIndex = 0; nGroupIndex < mcUrl.m_uNumGroups; ++nGroupIndex)
        {
            const CAtlREMatchContext<>::RECHAR* szStart = 0;
            const CAtlREMatchContext<>::RECHAR* szEnd = 0;
            mcUrl.GetMatch(nGroupIndex, &szStart, &szEnd);

            // The group is the character range [szStart, szEnd).
            ptrdiff_t nLength = szEnd - szStart;
            printf("%u: \"%.*s\"\n", nGroupIndex, static_cast<int>(nLength), szStart);
        }
        return 0;
    }

Output:

    0: "http"
    1: "search.microsoft.com"
    2: "/us/Search.asp"
    3: "qu=atl&boolean=ALL"
    4: "results"

The regular expression used in the example is:

    ({[^:/?#]+}:)?(//{[^/?#]*})?{[^?#]*}(?{[^#]*})?(#{.*})?

It contains five match groups, each marked with braces; the parentheses around them only delimit the parts of the expression. The first group is {[^:/?#]+}, followed by a literal colon. Inside the brackets, ^ means "none of the characters that follow", so the first group runs from the start of the string up to the first :, /, ?, or #. Applied to the string being matched, it yields http.

4. Custom abbreviations for match strings

For convenience, ATL predefines shorthand for some frequently used regular expressions. For example, \d stands for ([0-9]) and \n stands for (\r|(\r?\n)). These abbreviations are defined in CAtlRECharTraitsA, CAtlRECharTraitsW, and similar classes, which are passed as the template parameter to CAtlRegExp and CAtlREMatchContext. By supplying our own traits class, we can define our own abbreviations.

    class CAtlRECharTraitsA
    {
        // (other members omitted)
        static const RECharType** GetAbbrevs()
        {
            static const RECharType *s_szAbbrevs[] =
            {
                "a([a-zA-Z0-9])",               // alpha numeric
                "b([ \\t])",                    // white space (blank)
                "c([a-zA-Z])",                  // alpha
                "d([0-9])",                     // digit
                "h([0-9a-fA-F])",               // hex digit
                "n(\r|(\r?\n))",                // newline
                "q(\"[^\"]*\")|(\'[^\']*\')",   // quoted string
                "w([a-zA-Z]+)",                 // simple word
                "z([0-9]+)",                    // integer
                NULL
            };
            return s_szAbbrevs;
        }
    };

The code above, extracted from atlrx.h, shows that ATL defines its abbreviations through the GetAbbrevs() function. To define a new abbreviation, we only need to write:

    class MyRegTraits : public ATL::CAtlRECharTraitsA
    {
    public:
        static const RECharType** GetAbbrevs()
        {
            static const RECharType *s_szAbbrevs[] =
            {
                "a([a-zA-Z0-9])",               // alpha numeric
                "b([ \\t])",                    // white space (blank)
                "c([a-zA-Z])",                  // alpha
                "d([0-9])",                     // digit
                "h([0-9a-fA-F])",               // hex digit
                "n(\r|(\r?\n))",                // newline
                "q(\"[^\"]*\")|(\'[^\']*\')",   // quoted string
                "w([a-zA-Z]+)",                 // simple word
                "z([0-9]+)",                    // integer
                "e([0-8]+)",                    // our own addition
                NULL
            };
            return s_szAbbrevs;
        }
    };

That is, we define a traits class derived from CAtlRECharTraitsA, redefine GetAbbrevs() in it, and add the abbreviations we need. The following example uses the "\e" abbreviation defined in our own class:

    int main()
    {
        ATL::CAtlRegExp<MyRegTraits> re;
        re.Parse("\\e+");
        ATL::CAtlREMatchContext<MyRegTraits> mc;
        BOOL res1 = re.Match("678", &mc);   // returns TRUE: match succeeds
        res1 = re.Match("999", &mc);        // returns FALSE: match fails
        return 0;
    }

When constructing the ATL::CAtlRegExp and ATL::CAtlREMatchContext objects, pass MyRegTraits as the traits parameter, and you can use your own abbreviations directly.
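The table returned by GetAbbrevs() has a simple layout: each entry begins with the one-letter abbreviation, the rest of the string is the expression it stands for, and a null pointer ends the list. The following is a minimal sketch of how a table in that layout could be searched. It illustrates the layout only and is not ATL's actual parser code; the names kAbbrevs and ExpandAbbrev are invented for this example.

```cpp
#include <cstddef>
#include <string>

// A small table in the same layout as the one shown from atlrx.h:
// first character = abbreviation letter, remainder = its expansion.
static const char* const kAbbrevs[] = {
    "d([0-9])",      // digit
    "w([a-zA-Z]+)",  // simple word
    "e([0-8]+)",     // the custom entry added in MyRegTraits
    nullptr          // terminator
};

// Hypothetical lookup: returns the expansion for `letter`,
// or an empty string when the table has no such entry.
std::string ExpandAbbrev(const char* const* table, char letter)
{
    for (std::size_t i = 0; table[i] != nullptr; ++i) {
        if (table[i][0] == letter)
            return std::string(table[i] + 1);  // skip the letter itself
    }
    return std::string();
}
```

With this table, ExpandAbbrev(kAbbrevs, 'e') gives ([0-8]+), which is why "\e+" in the example above accepts "678" but has nothing to match in "999".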

5. Conclusion

Although the C++ community already has well-known regular expression libraries such as boost::regex and GRETA, the regular expression library in ATL, as part of a template library shipped with VC++, is still very convenient. Because ATL is officially released by Microsoft, it has good documentation, rigorous testing, and official Microsoft technical support. In addition, when developing COM components with ATL, you can readily draw on the power of its regular expression library.

My knowledge is limited, so this article inevitably contains mistakes. Criticism and corrections are welcome by mail: firingme@sina.com
