Introduction to boost tokenizer

Source: Internet
Author: User
-------------------------
1. Introduction

Boost tokenizer converts a character sequence into a series of tokens. You can supply a TokenizerFunction to control how the sequence is split. If none is specified, the default splits on whitespace and also drops punctuation marks.


2. A few simple examples

The following is a simple example:

    // simple_example_1.cpp
    #include <iostream>
    #include <boost/tokenizer.hpp>
    #include <string>

    int main() {
        using namespace std;
        using namespace boost;
        string s = "This is,  a test";
        tokenizer<> tok(s);
        for (tokenizer<>::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
            cout << *beg << "\n";
        }
    }

The result is as follows:
This
is
a
test

Punctuation marks have been filtered out here.

The following example splits a string into tokens of fixed lengths (offsets):

    // simple_example_3.cpp
    #include <iostream>
    #include <boost/tokenizer.hpp>
    #include <string>

    int main() {
        using namespace std;
        using namespace boost;
        string s = "12252001";
        int offsets[] = {2, 2, 4};   // three token lengths are specified here
        offset_separator f(offsets, offsets + 3);
        tokenizer<offset_separator> tok(s, f);
        for (tokenizer<offset_separator>::iterator beg = tok.begin(); beg != tok.end(); ++beg) {
            cout << *beg << "\n";
        }
    }

The result is as follows:
12
25
2001

3. What is a TokenizerFunction?

A TokenizerFunction is a function object that finds the next token in a sequence according to a splitting rule. Three TokenizerFunction models are currently provided:

* escaped_list_separator is mainly used to parse strings in CSV format.

explicit escaped_list_separator(char e = '\\', char c = ',', char q = '\"')
escaped_list_separator(string_type e, string_type c, string_type q)

* offset_separator is mainly used to split a sequence into tokens of specified lengths.

template <typename Iter>
offset_separator(Iter begin, Iter end, bool bwrapoffsets = true, bool breturnpartiallast = true)

* char_separator is mainly used to split a sequence on specific delimiter characters.

explicit char_separator(const char* dropped_delims,
                        const char* kept_delims = "",
                        empty_token_policy empty_tokens = drop_empty_tokens)

4. A simple example of parsing /etc/passwd

    /**
     * @auth lemo.lu
     * @date 2011.11.03
     *
     * Example of Boost tokenizer template usage. This example uses a
     * delimiter separator (char_separator).
     */
    // STL headers
    #include <cstddef>                   // size_t
    #include <fstream>                   // ifstream
    #include <iostream>                  // cout
    #include <string>                    // string, getline
    // Boost
    #include <boost/tokenizer.hpp>       // Boost tokenizer

    int main()
    {
        std::ifstream passwdFile("/etc/passwd");
        // holds one line of the passwd file; it must outlive the tokenizer,
        // which only stores iterators into it
        std::string passwdLine;

        typedef boost::tokenizer<boost::char_separator<char> > passwdTokenizer;
        // TokenizerFunction: drop the delimiter ":", keep no delimiters,
        // and keep empty fields
        boost::char_separator<char> tokenSep(":", "", boost::keep_empty_tokens);
        // names of the passwd fields
        static const char* passwd_st[] = {
            "Account", "password", "UID", "GID", "GECOS", "Dir", "Shell"
        };
        const std::size_t fieldCount = sizeof(passwd_st) / sizeof(passwd_st[0]);

        // iterate over the passwd file line by line
        while (std::getline(passwdFile, passwdLine))
        {
            passwdTokenizer tok(passwdLine, tokenSep);
            std::size_t passwd_c = 0;
            for (passwdTokenizer::iterator curTok = tok.begin();
                 curTok != tok.end() && passwd_c < fieldCount; ++curTok)
                std::cout << passwd_st[passwd_c++] << ":" << *curTok << std::endl;
            std::cout << "---------------------" << std::endl;
        }
    }

Some results are as follows:

Account:root
password:x
UID:0
GID:0
GECOS:root
Dir:/root
Shell:/bin/bash
---------------------
Account:daemon
password:x
UID:1
GID:1
GECOS:daemon
Dir:/usr/sbin
Shell:/bin/sh
---------------------

5. Reference

http://www.boost.org/doc/libs/1_47_0/libs/tokenizer/