[Repost] Multi-level string splitting in C (how to use strtok and strtok_r)
This article describes how to split a string on several levels of delimiters in C. It explains the weakness of strtok and shows how strtok_r avoids it.
Address: http://www.jb51.net/article/43744.htm
1. Introduction to strtok
As we all know, strtok splits a string into tokens according to the delimiters supplied by the caller (the delimiter argument may contain more than one character, such as ", ") and keeps going until it reaches the terminating "\0".
For example, take the delimiter "," and the string "Fred,John,Ann".
Using strtok, we can extract the three strings "Fred", "John", and "Ann".
The C code for this is:

    int in = 0;
    char buffer[] = "Fred,John,Ann";
    char *p[4]; /* three tokens plus one slot for the final NULL */
    char *buf = buffer;

    while ((p[in] = strtok(buf, ",")) != NULL) {
        in++;
        buf = NULL;
    }

As the code shows, the first call to strtok takes the address of the target string as its first argument (buf = buffer), and every later call takes NULL as its first argument (buf = NULL). The pointer array p[] stores the results of the split: p[0] = "Fred", p[1] = "John", p[2] = "Ann". The contents of buffer are changed to "Fred\0John\0Ann\0".
2. The weakness of strtok
Let's change the task. We have the string "Fred male 25,John male 62,Anna female 16", and we want to parse it and store the fields in a struct:

    struct person {
        char name[25];
        char sex[7]; /* large enough for "female" plus the terminator */
        char age[4];
    };

One way to do this is to first extract each comma-separated record and then split that record on spaces. For example, we cut out "Fred male 25" and then split it into "Fred", "male", and "25". I wrote the small program below to demonstrate the process:
    #include <stdio.h>
    #include <string.h>

    #define INFO_MAX_SZ 255

    int main(void)
    {
        int in = 0;
        char buffer[INFO_MAX_SZ] = "Fred male 25,John male 62,Anna female 16";
        char *p[20];
        char *buf = buffer;

        /* Outer loop: split on commas. */
        while ((p[in] = strtok(buf, ",")) != NULL) {
            buf = p[in];
            /* Inner loop: split the current record on spaces. */
            while ((p[in] = strtok(buf, " ")) != NULL) {
                in++;
                buf = NULL;
            }
            p[in++] = "***"; /* marks the end of one record */
            buf = NULL;
        }

        printf("Here we have %d strings\n", in);
        for (int j = 0; j < in; j++)
            printf(">%s<\n", p[j]);
        return 0;
    }
The output of this program is:

    Here we have 4 strings
    >Fred<
    >male<
    >25<
    >***<

This is only a fragment of the data, not what we want. Why? Because strtok keeps its position in a single static pointer, and the inner loop overwrites the position that the outer loop depends on. Let me walk through how the program runs:
The trace below shows the buffer after each step. Every "\0" is a delimiter that strtok has overwritten, and the comments note where strtok's built-in pointer is left.

1. "Fred male 25,John male 62,Anna female 16" // outer loop starts
2. "Fred male 25\0John male 62,Anna female 16" // built-in pointer at "John"; enter the inner loop
3. "Fred\0male 25\0John male 62,Anna female 16" // inner call restarts at "Fred"; built-in pointer now at "male"
4. "Fred\0male\025\0John male 62,Anna female 16" // built-in pointer reaches the "\0" after "25"
5. "Fred\0male\025\0John male 62,Anna female 16" // inner loop hits "\0" and returns to the outer loop
6. "Fred\0male\025\0John male 62,Anna female 16" // outer loop resumes from the same "\0" and ends as well
3. Using strtok_r
In this case we should use strtok_r, the reentrant version of strtok:

    char *strtok_r(char *s, const char *delim, char **ptrptr);

Compared with strtok, we have to supply a pointer for strtok_r to work with, instead of relying on a built-in pointer as strtok does. Code:
    #include <stdio.h>
    #include <string.h>

    #define INFO_MAX_SZ 255

    int main(void)
    {
        int in = 0;
        char buffer[INFO_MAX_SZ] = "Fred male 25,John male 62,Anna female 16";
        char *p[20];
        char *buf = buffer;

        char *outer_ptr = NULL; /* position of the comma-level split */
        char *inner_ptr = NULL; /* position of the space-level split */

        while ((p[in] = strtok_r(buf, ",", &outer_ptr)) != NULL) {
            buf = p[in];
            while ((p[in] = strtok_r(buf, " ", &inner_ptr)) != NULL) {
                in++;
                buf = NULL;
            }
            p[in++] = "***"; /* marks the end of one record */
            buf = NULL;
        }

        printf("Here we have %d strings\n", in);
        for (int j = 0; j < in; j++)
            printf(">%s<\n", p[j]);
        return 0;
    }
This time the output is:

    Here we have 12 strings
    >Fred<
    >male<
    >25<
    >***<
    >John<
    >male<
    >62<
    >***<
    >Anna<
    >female<
    >16<
    >***<
Let me walk through how this program runs. The trace below shows the buffer after each step. Every "\0" is a delimiter that strtok_r has overwritten, and the comments note where outer_ptr and inner_ptr point.

1. "Fred male 25,John male 62,Anna female 16" // outer loop starts
2. "Fred male 25\0John male 62,Anna female 16" // outer_ptr at "John"; enter the inner loop
3. "Fred\0male 25\0John male 62,Anna female 16" // inner_ptr at "male"
4. "Fred\0male\025\0John male 62,Anna female 16" // inner_ptr reaches the "\0" after "25"
5. "Fred\0male\025\0John male 62,Anna female 16" // inner loop hits "\0" and returns to the outer loop; outer_ptr is still at "John"
6. "Fred\0male\025\0John male 62\0Anna female 16" // outer loop continues from outer_ptr and enters the inner loop again
Note that these functions modify the original string.
Therefore, when char *test2 = "feng,ke,wei" is passed as the first argument, a memory error occurs: the content that test2 points to is stored in the string constant area, and that area cannot be modified. With char test1[] = "feng,ke,wei", the content of test1 is stored on the stack, so it can be modified.