[Disclaimer: All rights reserved. Reprinting is welcome, but not for commercial purposes. Contact email: feixiaoxing@163.com]
Yesterday we wrote a simple string search function. Simple as it is, it works. Still, on closer analysis even a function this small has room for improvement. Where? Take a look.
Here is the code before optimization, pasted again so it is easier to analyze:
#include <string.h>   /* for strlen */

char *strstr(const char *str, const char *data)
{
    int index;
    int len;

    if (NULL == str || NULL == data)
        return NULL;

    len = (int)strlen(data);
    /* strlen(str) is recomputed on every pass through the loop */
    while (*str && (int)strlen(str) >= len) {
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }

        if (index == len)
            return (char *)str;

        str++;
    }

    return NULL;
}
I wonder what you noticed. The while condition in the original contains a very expensive operation: the length of str is recomputed every time str advances. If str is much longer than data, the total cost of these repeated strlen calls is considerable. Instead, we can check only that at least len characters remain:
/* Returns 1 if str holds at least len characters, 0 otherwise. */
int check_length_of_str(const char *str, int len)
{
    int index;

    for (index = 0; index < len; index++) {
        if ('\0' == str[index])
            return 0;
    }

    return 1;
}

char *strstr(const char *str, const char *data)
{
    int index;
    int len;

    if (NULL == str || NULL == data)
        return NULL;

    len = (int)strlen(data);
    /* the length check now reads at most len characters */
    while (*str && check_length_of_str(str, len)) {
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }

        if (index == len)
            return (char *)str;

        str++;
    }

    return NULL;
}
The code above solves the length problem: each check now examines at most len characters instead of the whole remaining string. But we are still not quite satisfied. It would be even better to drop the per-iteration length check altogether. Is that possible? Notice that every time a comparison fails, str advances by exactly one character. So we only need to test whether the one new character that enters the window is '\0'. The code can therefore also be written as follows:
char *strstr(const char *str, const char *data)
{
    int index;
    int len;

    if (NULL == str || NULL == data)
        return NULL;

    len = (int)strlen(data);
    /* compare the two lengths once, before the loop */
    if ((int)strlen(str) < len)
        return NULL;

    while (*str) {
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }

        if (index == len)
            return (char *)str;

        /* exactly len characters remain, so no later position can match */
        if ('\0' == str[len])
            break;

        str++;
    }

    return NULL;
}
Unlike the first optimization, this version compares the two lengths once, before entering the while loop. After that first check, no further length computation is needed: on each iteration we only have to test whether the n-th element, str[len], is '\0'. The (n-1)-th element, str[len - 1], has already been checked and must be a valid character. Why? Think it over.
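To put rough numbers on the three versions above, the sketch below counts only the characters read by the length checks, under a simplified model that is an assumption of this example and not part of the original code: a pattern of m characters is searched for in a text of n characters (1 <= m <= n), no match is ever found, and each of the n - m + 1 window positions is tried once. The function names are hypothetical.

```c
/* Rough cost model (an assumption of this sketch): characters read by the
 * length checks alone, for a text of n characters and a pattern of m
 * characters (1 <= m <= n), when no match is found. The final failing
 * check and the one-off strlen(data) are ignored for simplicity. */

/* Original version: strlen(str) walks the whole remaining text at every
 * window position. */
unsigned long cost_strlen_each_step(unsigned long n, unsigned long m)
{
    unsigned long reads = 0;
    unsigned long pos;

    for (pos = 0; pos + m <= n; pos++)
        reads += n - pos;          /* characters left at this position */

    return reads;
}

/* First optimization: check_length_of_str reads m characters per position. */
unsigned long cost_bounded_check(unsigned long n, unsigned long m)
{
    return (n - m + 1) * m;
}

/* Second optimization: one strlen(str) up front, then a single read of
 * str[len] per position. */
unsigned long cost_single_lookahead(unsigned long n, unsigned long m)
{
    return n + (n - m + 1);
}
```

For n = 1000 and m = 5, this model gives 500490, 4980 and 1996 reads respectively. The first count grows roughly with the square of n, while the other two grow linearly.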
(2) KMP Algorithm
The KMP algorithm exists, in essence, to eliminate unnecessary search steps. How does it do that? Let an example do the talking. Assume there are two strings:
A: baaaaabcd
B: aaaab
So what happens when we search for B in A? Let's take a look:
/*      1 2 3 4 5 6 7 8 9
*   A:  b a a a a a b c d
*   B:    a a a a b
*       1 2 3 4 5 6 7 8 9
*/
When B is compared with A starting from B's first element, aligned with the 2nd element of A, only the last element turns out to differ: the 6th element of A is a, while the 5th element of B is b. Under the ordinary string search algorithm, the search position in A then moves one step to the right and the comparison starts over. But characters 2-5 of A have already been compared, and those four elements match the first four elements of B. In that case, B should only need to compare its last element with the 7th element of A. If the redundant comparison steps can be skipped, won't the search be faster?
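One way to make this idea concrete is the textbook formulation of KMP, sketched below. It is not necessarily the version the author goes on to develop. The function name kmp_search and the fail table are assumptions of this sketch. The table records, for each prefix of the pattern, how many already-matched characters can be kept after a mismatch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; a textbook KMP search, not the author's code.
 * Returns a pointer to the first occurrence of pattern in text, or NULL.
 * NULL is also returned if the table cannot be allocated. */
char *kmp_search(const char *text, const char *pattern)
{
    size_t m, i, k;
    size_t *fail;
    char *found = NULL;

    if (NULL == text || NULL == pattern)
        return NULL;

    m = strlen(pattern);
    if (0 == m)
        return (char *)text;   /* empty pattern, as the C library's strstr */

    fail = malloc(m * sizeof *fail);
    if (NULL == fail)
        return NULL;

    /* fail[i] = length of the longest proper prefix of pattern[0..i]
     * that is also a suffix of pattern[0..i]. */
    fail[0] = 0;
    for (i = 1, k = 0; i < m; i++) {
        while (k > 0 && pattern[i] != pattern[k])
            k = fail[k - 1];
        if (pattern[i] == pattern[k])
            k++;
        fail[i] = k;
    }

    /* Scan the text once; k counts the pattern characters matched so far.
     * The index i into the text never moves backwards. */
    for (i = 0, k = 0; text[i] != '\0'; i++) {
        while (k > 0 && text[i] != pattern[k])
            k = fail[k - 1];   /* keep the longest reusable prefix */
        if (text[i] == pattern[k])
            k++;
        if (k == m) {
            found = (char *)text + (i + 1 - m);
            break;
        }
    }

    free(fail);
    return found;
}
```

For the strings A and B above, the table for aaaab is 0 1 2 3 0. When the mismatch occurs at the 6th element of A, this sketch keeps three matched characters, re-checks the 6th element of A against the 4th element of B, and then compares the 7th element of A with the last element of B. The match is reported at the 3rd element of A without ever stepping backwards in A.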