String splitting functions strtok and strtok_r (repost)

Source: Internet
Author: User
Tags function prototype strtok


1. An application example


A classic example found online is splitting a string into a struct. Suppose we have the following structure:



typedef struct person {
    char name[25];
    char sex[10];
    char age[4];
} person;



The task is to extract each person's name, sex, and age from the string char buffer[INFO_MAX_SZ] = "Fred male 25,John male 62,Anna female 16";

One possible approach is a two-level loop. The outer loop uses ',' (comma) as the delimiter to separate the records of the three people; the inner loop then splits each of those substrings with ' ' (space) as the delimiter to obtain the name, sex, and age.

This approach should give us what we want. To keep things simple, we call strtok and first store every substring in an array of string pointers; at the end, the program prints everything stored in the array so that we can check the result. The program would look like this:


int in = 0;
char buffer[INFO_MAX_SZ] = "Fred male 25,John male 62,Anna female 16";
char *p[20];
char *buf = buffer;

/* Outer loop: split the records on ','. */
while ((p[in] = strtok(buf, ",")) != NULL)
{
    buf = p[in];
    /* Inner loop: split one record on ' '. */
    while ((p[in] = strtok(buf, " ")) != NULL)
    {
        in++;
        buf = NULL;
    }
    buf = NULL;
}

printf("Here we have %d strings\n", in);
for (int j = 0; j < in; j++)
{
    printf(">%s<\n", p[j]);
}








When the program runs, only the first person's information is extracted. The program did not behave as we expected. Why?

The reason is as follows. In the first iteration of the outer loop, strtok replaces the comma after "Fred male 25" with '\0', and its internal saved pointer points to the character 'J' that follows the comma. During the first run of the inner loop, "Fred", "male", and "25" are extracted one after another. After "25" is extracted, the internal pointer is changed to point to the '\0' that follows "25". When the inner loop finishes (it actually calls strtok four times, the last call returning NULL), the second iteration of the outer loop begins. Because the first argument is now NULL, strtok resumes from the position held in its internal pointer. Unfortunately, that pointer now points to '\0', so strtok has nothing left to split and returns NULL. The outer loop ends, and we have only the first person's information.






It seems that strtok cannot be used to extract the information of several people with a two-level loop. Is there another way? There is.

Here is one solution: treat both ',' (comma) and ' ' (space) as delimiters at the same time, so that a single loop solves the problem.



in = 0;
buf = buffer; /* buffer must hold the original, unsplit string again */

/* The delimiter string ", " means: split on ',' or on ' '. */
while ((p[in] = strtok(buf, ", ")) != NULL)
{
    switch (in % 3)
    {
    case 0:
        printf("person %d: name!\n", in / 3 + 1);
        break;
    case 1:
        printf("person %d: sex!\n", in / 3 + 1);
        break;
    case 2:
        printf("person %d: age!\n", in / 3 + 1);
        break;
    }
    in++;
    buf = NULL;
}

printf("Here we have %d strings\n", in);
for (int j = 0; j < in; j++)
{
    printf(">%s<\n", p[j]);
}









Although this program produces the desired result, it is not a very good solution: before extracting anything, you must know exactly how many members the struct contains. It is clearly less intuitive than the two-level loop.



If we insist on the two-level loop, is there a suitable function to replace strtok? Yes: strtok_r.





2. strtok_r and its use


strtok_r is the thread-safe version of strtok on Linux. It is not included in string.h on Windows. To use it there, one option is to search the Internet for the source code of a Linux implementation and copy it into your program. There should be other options as well, such as using the GNU C Library. I downloaded the GNU C Library, found the strtok_r implementation in its source code, and copied it over, which can be seen as a combination of the first and second options.



The prototype of strtok_r is char *strtok_r(char *str, const char *delim, char **saveptr);



The following English description of strtok_r comes from http://www.linuxhowtos.org/manpages/3/strtok_r.htm; each quoted paragraph is followed by my own rendering of it.



The strtok_r() function is a reentrant version of strtok(). The saveptr argument is a pointer to a char * variable that is used internally by strtok_r() in order to maintain context between successive calls that parse the same string.

strtok_r is the reentrant version of strtok. The saveptr parameter, of type char **, points to a char * variable in which strtok_r keeps its splitting context, so that successive calls can continue to split the same source string.

On the first call to strtok_r(), str should point to the string to be parsed, and the value of saveptr is ignored. In subsequent calls, str should be NULL, and saveptr should be unchanged since the previous call.

On the first call to strtok_r, str must point to the string to be split, and the value of saveptr is ignored. On subsequent calls, str must be NULL, and saveptr must be the value left by the previous call, unmodified.

Different strings may be parsed concurrently using sequences of calls to strtok_r() that specify different saveptr arguments.

Several different strings can be split at the same time by interleaving sequences of strtok_r calls, as long as each sequence passes its own saveptr argument.



The strtok() function uses a static buffer while parsing, so it's not thread safe. Use strtok_r() if this matters.

strtok uses a static buffer while splitting a string, so it is not thread-safe. If thread safety matters, use strtok_r.






In effect, strtok_r takes the pointer that strtok keeps hidden inside itself and exposes it as a parameter, so that it is passed in, saved, and can even be modified by the caller. To continue splitting the same source string, the caller must not only set the str argument to NULL but also pass the saveptr saved by the previous call.

Remember the example of extracting a struct in the previous section? With strtok_r we can extract everyone's information in the two-level loop form.


int in = 0;
char buffer[INFO_MAX_SZ] = "Fred male 25,John male 62,Anna female 16";
char *p[20];
char *buf = buffer;
char *outer_ptr = NULL;
char *inner_ptr = NULL;

/* Outer loop: split the records on ',' using outer_ptr as context. */
while ((p[in] = strtok_r(buf, ",", &outer_ptr)) != NULL)
{
    buf = p[in];
    /* Inner loop: split one record on ' ' using inner_ptr as context. */
    while ((p[in] = strtok_r(buf, " ", &inner_ptr)) != NULL)
    {
        in++;
        buf = NULL;
    }
    buf = NULL;
}

printf("Here we have %d strings\n", in);
for (int j = 0; j < in; j++)
{
    printf(">%s<\n", p[j]);
}






Compared with the strtok version, the strtok_r version has two additional pointers, outer_ptr and inner_ptr. outer_ptr marks the extraction position for each person, that is, for the outer loop; inner_ptr marks the extraction position of each field within one person, that is, for the inner loop. The process is as follows:



(1) First iteration of the outer loop: outer_ptr is ignored and the whole source string is scanned. "Fred male 25" is extracted, the delimiter ',' is overwritten with '\0', and outer_ptr is left pointing to 'J'.

(2) First iteration of the inner loop: inner_ptr is ignored and the result of the first outer iteration, "Fred male 25", is scanned. "Fred" is extracted, the delimiter ' ' is overwritten with '\0', and inner_ptr is left pointing to 'm'.

(3) Second iteration of the inner loop: the inner_ptr left by the first inner iteration is passed in and the first argument is NULL. Starting from 'm', where inner_ptr points, "male" is extracted, the delimiter ' ' is overwritten with '\0', and inner_ptr is left pointing to '2'.

(4) Third iteration of the inner loop: the inner_ptr left by the second inner iteration is passed in and the first argument is NULL. Starting from '2', where inner_ptr points, "25" is extracted; because no ' ' is found, inner_ptr is left pointing to the '\0' after "25".

(5) Fourth iteration of the inner loop: the inner_ptr left by the third inner iteration is passed in and the first argument is NULL. Because inner_ptr points to '\0', nothing can be extracted and NULL is returned. The inner loop ends.

(6) Second iteration of the outer loop: the outer_ptr left by the first outer iteration is passed in and the first argument is NULL. Starting from 'J', where outer_ptr points, "John male 62" is extracted, the delimiter ',' is overwritten with '\0', and outer_ptr is left pointing to 'A'. (The strtok version gets stuck at this step.)

And so on: each iteration of the outer loop extracts all the information of one person, and the inner loop then splits that result a second time into the individual fields.



As you can see, strtok_r makes the formerly internal pointer explicit by providing the saveptr parameter, which increases the flexibility and safety of the function.





3. Source code of strtok and strtok_r

There are many versions of the implementations of these two functions. My strtok_r comes from the GNU C Library, and my strtok calls strtok_r, so the source code of strtok_r is given first.


/*
 * strtok_r.c:
 * Implementation of strtok_r for systems which don't have it.
 *
 * This is taken from the GNU C Library and is distributed under the terms of
 * the LGPL. See copyright notice below.
 *
 */

#ifdef HAVE_CONFIG_H
#include "configuration.h"
#endif /* HAVE_CONFIG_H */

#ifndef HAVE_STRTOK_R

static const char rcsid[] = "$Id: strtok_r.c,v 1.1 2001/04/24 14:25:34 chris Exp $";

#include <string.h>

#undef strtok_r

/* Parse S into tokens separated by characters in DELIM.
   If S is NULL, the saved pointer in SAVE_PTR is used as
   the next starting point. For example:
        char s[] = "-abc-=-def";
        char *sp;
        x = strtok_r(s, "-", &sp);      // x = "abc", sp = "=-def"
        x = strtok_r(NULL, "-=", &sp);  // x = "def", sp = NULL
        x = strtok_r(NULL, "=", &sp);   // x = NULL
                // s = "abc\0-def\0"
*/
char *strtok_r(char *s, const char *delim, char **save_ptr) {
    char *token;

    if (s == NULL) s = *save_ptr;

    /* Scan leading delimiters. */
    s += strspn(s, delim);
    if (*s == '\0')
        return NULL;

    /* Find the end of the token. */
    token = s;
    s = strpbrk(token, delim);
    if (s == NULL)
        /* This token finishes the string. */
        *save_ptr = strchr(token, '\0');
    else {
        /* Terminate the token and make *save_ptr point past it. */
        *s = '\0';
        *save_ptr = s + 1;
    }
    return token;
}

#endif /* !HAVE_STRTOK_R */



The overall flow of the code is as follows:

(1) Check whether the parameter s is NULL. If it is, splitting starts from the position stored in *save_ptr; if it is not, splitting starts from s.

(2) Skip all delimiters at the beginning of the string to be split.

(3) Check whether the current position is '\0'. If it is, return NULL (this is the NULL return value discussed earlier).

(4) Save the current position as token and call strpbrk on token to look for a delimiter. If none is found, *save_ptr is set to the position of the terminating '\0' of the string and token is left unchanged. If one is found, the delimiter is overwritten with '\0', which truncates (extracts) token, and *save_ptr is set to the character after the delimiter.

(5) Finally, whether or not a delimiter was found, token is returned.



strtok can be understood as using an internal static variable to hold the save_ptr of strtok_r, which is therefore invisible to the caller. The code is as follows:


        char *strtok(char *s, const char *delim)
        {
            static char *last;
            return strtok_r(s, delim, &last);
        }

