sscanf()
2011-11-23 09:07:56 I have been writing a crawler recently, and the lack of regular expression support in the C++ STL has been a real headache. Using regular expressions would mean installing the Boost library, but out of laziness and concern about runtime efficiency I never installed Boost after downloading it. So every function the crawler uses to extract information is hand-written, which has proved to be hard work with poor results!
This morning I happened to be reading a book and came across a magic function, sscanf(). I had seen it before, but only today did I realize how powerful it is. The following content is taken from Baidu Encyclopedia:
Name:
sscanf() - reads data matching a specified format from a string.
Function prototype:
int sscanf(const char *, const char *, ...);
int sscanf(const char *buffer, const char *format [, argument] ...);
buffer: the string holding the data to read
format: the format control string
argument: optional pointers to the variables that receive the converted values
sscanf reads data from buffer and stores it in the arguments according to format.
Header file:
#include <stdio.h>
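Return value:
On success, sscanf returns the number of input items successfully converted and assigned, which can be fewer than requested, or even 0 if an early matching failure occurs. If the end of the string is reached before the first conversion completes, it returns EOF (typically -1).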
Note:
Like scanf, sscanf is used for input, but scanf takes the keyboard (stdin) as its input source, while sscanf takes a fixed string.
The second parameter (the format string) can be one or more of {%[*][width][{h | l | I64 | L}]type | ' ' | '\t' | '\n' | non-% character}
Notes:
1. * can also be used in the format. A conversion with an asterisk (such as %*d or %*s) skips the matched data without storing it (that is, the data is not read into any argument).
2. {a | b | c} means choose one of a, b, and c; [d] means d is optional.
3. width is the maximum number of characters to read.
4. {h | l | I64 | L}: the size of the argument. h denotes short; l denotes long (or double when used with floating-point types, as in %lf); I64 is a Microsoft-specific prefix for a 64-bit integer; L denotes long double.
5. type: there are many, such as %s and %d.
6. Special case: %*[width][{h | l | I64 | L}]type means that input matching this conversion is consumed but no value is written to any argument.
If the first conversion fails to match, 0 is returned; otherwise the number of assigned items is returned.
Set operations are supported:
%[a-z] matches any character from a to z, greedily (as many as possible)
%[aB'] matches a, B, or ', greedily
%[^a] matches any character other than a, greedily
Examples:
1. Common usage.
char buf[512];
sscanf("123456", "%s", buf); // buf is the array name; this stores 123456 into buf as a %s string
printf("%s\n", buf);
Result: 123456
2. Take a string of a specified length. The following example takes a string of at most 4 bytes.
sscanf("123456", "%4s", buf);
printf("%s\n", buf);
Result: 1234
3. Take the string up to a specified character. The following example reads until a space is encountered.
sscanf("123456 abcdedf", "%[^ ]", buf);
printf("%s\n", buf);
Result: 123456
4. Take a string that contains only characters from a specified set. The following example takes a string containing only the digits 1 to 9 and lowercase letters.
sscanf("123456abcdedfBCDEF", "%[1-9a-z]", buf);
printf("%s\n", buf);
Result: 123456abcdedf
With the set changed to digits and uppercase letters:
sscanf("123456abcdedfBCDEF", "%[1-9A-Z]", buf);
printf("%s\n", buf);
Result: 123456
5. Take the string up to any character from a specified set. The following example reads until an uppercase letter is encountered.
sscanf("123456abcdedfBCDEF", "%[^A-Z]", buf);
printf("%s\n", buf);
Result: 123456abcdedf
6. Given the string iios/12ddwdff@122, get the string between / and @: first skip "iios/", then store everything up to (but not including) '@' in buf.
sscanf("iios/12ddwdff@122", "%*[^/]/%[^@]", buf);
printf("%s\n", buf);
Result: 12ddwdff
7. Given the string "hello, world", keep only world. (Note: the "," is followed by a space. %s stops at whitespace, and the * makes sscanf discard the first string it reads.)
sscanf("hello, world", "%*s%s", buf);
printf("%s\n", buf);
Result: world
%*s means the first %s match is skipped, that is, "hello," is skipped.
If there is no space, %*s consumes the whole string and nothing is stored in buf.
sscanf works somewhat like a regular expression but is far less powerful, so for complex string processing a real regular expression is recommended.
//-------------------------------------------------------
Use it to split a string such as 2006:03:18:
int a, b, c;
/* sscanf("2006:03:18", "%d:%d:%d", a, b, c); */ /* Wrong: the address-of operator & is required before a, b, and c. Corrected by huanmie_09 */
sscanf("2006:03:18", "%d:%d:%d", &a, &b, &c);
And 2006:03:18 - 2006:04:18:
char sztime1[16] = "", sztime2[16] = "";
sscanf("2006:03:18 - 2006:04:18", "%s - %s", sztime1, sztime2);
But later I needed to handle 2006:03:18-2006:04:18.
Only the spaces on either side of '-' were removed, yet that breaks %s's notion of where a string ends.
Did I need to design a new function to handle this? It would not be complicated, but to keep all the code in a uniform style I would have had to change many places and replace every existing sscanf call with my own split function. I thought I had no choice, and fell asleep deeply dissatisfied with sscanf. I woke up to find that I did not have to.
The type field of a format specification includes a type written %[]. If the string to be read is not delimited by whitespace, you can use %[].
%[] is similar to a regular expression. [a-z] reads all characters in the range a-z, and [^a-z] reads all characters except a-z.
So the problem was solved:
sscanf("2006:03:18-2006:04:18", "%[0-9,:]-%[0-9,:]", sztime1, sztime2);
In a question posted by softmse (Jake), a very cool sscanf use case came up; after studying it I found that sscanf really is awesome. Here is a summary.
Original question:
iios/12ddwdff@122
How can I get the string between / and @?
Is there a function for this in C?
Code:
#include <stdio.h>
int main(void)
{
    const char *s = "iios/12ddwdff@122";
    char buf[20];
    /* skip everything up to '/', match the '/', then read up to '@' */
    sscanf(s, "%*[^/]/%[^@]", buf);
    printf("%s\n", buf);
    return 0;
}
Result: 12ddwdff