Extracting and filtering data in C with GNU regex.h
Today @SVCHAO got me interested in regular expressions again, so I picked them back up and tried them out in Notepad++. I have not forgotten the basic syntax. Using regular expressions in an embedded project, however, still looks a little difficult.
First, a note on the basic syntax.
Single-character matching
A set of characters is written in square brackets. For example, [0135678] means that any one of 0, 1, 3, 5, 6, 7, 8 can match at this position.
A leading ^ inside the square brackets excludes characters. For example, [^azAZ] means that any character except a, z, A and Z can match at this position.
A hyphen (-) indicates a range. For example, [0-9a-zA-Z] matches any digit or letter.
Shorthand expressions exist for common single-character sets. For example, \w represents [0-9a-zA-Z_].
Use {} to describe the number of repetitions. For example, \w{2} matches two consecutive characters that are each a digit, a letter, or an underscore.
Capture match results using...
The notes above are my own summary.
To suit the embedded use case, I found this library:
SLRE (Super Light Regular Expression library, on GitHub)
But it does not seem to support {}, and I have not found any greedy... So I have put it on hold for now.
After that, I familiarized myself with regex.h from GNU.
It turns out that Python's regex interface is similar to the POSIX one.
My troublesome machine had not been updated for a long time. After a while, I finally got the environment working well enough to test.
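#include <stdio.h>
#include <string.h>   /* memset */
#include <sys/types.h>
#include <regex.h>

int main(int char_c, char** char_v)
{
    const char* p_str = " Url = www.google.com.hk ;";
    /* "\\S" in C source becomes the two characters \S in the pattern */
    const char* p_reg = " {0,}(\\S+) {0,}= {0,}(\\S+) {0,};";
    regex_t reg;
    regmatch_t matchs[20];
    int r;

    memset(matchs, 0, sizeof(matchs));
    r = regcomp(&reg, p_reg, REG_EXTENDED);
    if (r != 0)
        return 1;   /* the pattern did not compile */
    r = regexec(&reg, p_str, 20, matchs, 0);
    /* set a breakpoint here to inspect r and matchs */
    regfree(&reg);
    return r;
}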
Because this is only self-test code, I did not print anything; I simply set a breakpoint and looked at the result.
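REG_EXTENDED must be passed to regcomp. Otherwise the pattern is treated as a POSIX basic regular expression and some features are unavailable (for example, unescaped ( ) and { } are then ordinary characters).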
Another point that surprised me is that matchs[0] stores the range matched by the entire p_reg. So if you only need the results of the () groups defined in the pattern, start from matchs[1].
It took me a long time to work out C's string escaping syntax... The result is \ and... However, there seems to be no more efficient solution.
The next step is either to add the greedy part to SLRE by hand, or to take GNU's regex.c and trim it down...
More on that later.
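#include <stdio.h>
#include <sys/types.h>
#include <regex.h>
#include <string.h>

int main(int char_c, char** char_v) {
    const char* p_str = " Url = www.google.com.hk ;";
    const char* p_reg = " {0,}(\\S+) {0,}= {0,}(\\S+) {0,};";
    regex_t reg;
    regmatch_t matchs[20];
    int r, i;

    r = regcomp(&reg, p_reg, REG_EXTENDED);
    if (r != 0) {
        printf("err");
        return 1;   /* nothing to free: the pattern did not compile */
    }
    r = regexec(&reg, p_str, 20, matchs, 0);
    if (r != 0) {
        printf("err");
        regfree(&reg);
        return 1;
    }
    /* matchs[0] is the whole match; matchs[1..] are the () groups.
       Unused entries have rm_so == -1 and are skipped. */
    for (i = 0; i < 20; i++) {
        int start_index = matchs[i].rm_so;
        int end_index = matchs[i].rm_eo;
        int len = end_index - start_index;
        if (start_index >= 0) {
            char dis_buffer[256];
            /* truncate so the copy always fits in dis_buffer */
            if (len >= (int)sizeof(dis_buffer))
                len = (int)sizeof(dis_buffer) - 1;
            strncpy(dis_buffer, &p_str[start_index], len);
            dis_buffer[len] = '\0';
            printf("catch:%s\n", dis_buffer);
        }
    }
    regfree(&reg);
    return 0;
}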
Since it took me a long time to get the captured output... I decided to paste the data capture code as well...
Environment: g++ with Eclipse CDT.
I am used to debugging in this IDE, so I do not switch to another one.
catch: Url = www.google.com.hk ;
catch:Url
catch:www.google.com.hk
The test output is correct...