Write the algorithm step by step (string search, part 2)

Source: Internet
Author: User

[Disclaimer: All rights reserved. You are welcome to reprint this article, but do not use it for commercial purposes. Contact email: feixiaoxing@163.com]

 


Yesterday we wrote a simple string search function. It is simple, yet perfectly usable. On closer analysis, though, even such a simple function still has room for improvement. Where can we improve it? Let's take a look.

Here is the code before optimization, pasted again to make it easier to analyze:


#include <string.h>   /* strlen */

char *strstr(const char *str, const char *data)
{
    int index;
    int len;

    if (NULL == str || NULL == data)
        return NULL;

    len = (int)strlen(data);
    /* strlen(str) is evaluated again on every iteration */
    while (*str && (int)strlen(str) >= len) {
        /* compare the window str[0..len-1] with data */
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }

        if (index == len)
            return (char *)str;

        str++;
    }

    return NULL;
}

Did you spot the problem? The while condition contains a very expensive operation: the length of str is recomputed every time str advances. Each strlen call walks the whole remainder of str, so when str is much longer than data, the repeated length checks cost far more than the comparisons themselves. The version below replaces that call with a helper that looks at no more than len characters.
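
To make the cost concrete, here is a small sketch that counts how many characters the length check alone reads. The names `g_scanned`, `counted_strlen`, `search_counting` and `scanned_for_length` are hypothetical helpers introduced for this illustration, and the test text (a run of 'a' characters searched for "ab") is an assumed worst-ish case in which no match is ever found.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter: characters read by the length checks only */
static long g_scanned = 0;

/* A strlen that records how many characters it walks over */
static int counted_strlen(const char *s)
{
    int n = 0;

    while (s[n] != '\0')
        n++;
    g_scanned += n + 1;   /* the terminating '\0' is read as well */
    return n;
}

/* Same loop shape as the search above, with the length check instrumented */
static const char *search_counting(const char *str, const char *data)
{
    int index;
    int len = (int)strlen(data);

    while (*str && counted_strlen(str) >= len) {
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }
        if (index == len)
            return str;
        str++;
    }
    return NULL;
}

/* Characters read by the length checks for a text of n 'a's and data "ab" */
long scanned_for_length(int n)
{
    static char text[4096];

    assert(n >= 0 && n < (int)sizeof text);
    memset(text, 'a', (size_t)n);
    text[n] = '\0';
    g_scanned = 0;
    (void)search_counting(text, "ab");
    return g_scanned;
}
```

Under these assumptions, `scanned_for_length(100)` returns 5150 and `scanned_for_length(1000)` returns 501500: ten times the input costs roughly a hundred times the work in length checks alone.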


/* Returns 1 if str holds at least len characters before its '\0', else 0 */
int check_length_of_str(const char *str, int len)
{
    int index;

    for (index = 0; index < len; index++) {
        if ('\0' == str[index])
            return 0;
    }

    return 1;
}

char *strstr(const char *str, const char *data)
{
    int index;
    int len;

    if (NULL == str || NULL == data)
        return NULL;

    len = (int)strlen(data);
    /* the length check now reads at most len characters */
    while (*str && check_length_of_str(str, len)) {
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }

        if (index == len)
            return (char *)str;

        str++;
    }

    return NULL;
}

The code above solves the length-check problem: each check is now short, because it only has to look at len characters. But we are still not fully satisfied. It would be better not to scan for the length on every iteration at all. Is that possible? Notice that each time a comparison fails, str advances by exactly one character. So only one new character enters the window, and we only need to check whether that one character is '\0'. Therefore, our code can also be written in the following form.


char *strstr(const char *str, const char *data)
{
    int index;
    int len;

    if (NULL == str || NULL == data)
        return NULL;

    len = (int)strlen(data);
    /* one full length check, done once before the loop */
    if ((int)strlen(str) < len)
        return NULL;

    while (*str) {
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }

        if (index == len)
            return (char *)str;

        /* the window already reaches the end of str: nothing left to try */
        if ('\0' == str[len])
            break;

        str++;
    }

    return NULL;
}

Unlike the first optimization, this version compares the lengths of the two strings once, before entering the while loop. After that first check, no further length check is needed: on each iteration we only have to test whether the element at index len is '\0'. The len elements before it (indices 0 to len-1) have already been checked and must be valid characters. Why? You can think about it.
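
One way to reason about the question: before the loop, strlen(str) >= len guarantees that the first len characters are not '\0'. When an iteration ends with str[len] != '\0' and str++, the new window is the old str[1..len], which again contains no '\0'. The sketch below restates the final version with that invariant written as a comment and adds a small check against the library strstr. The names `find_substring` and `agrees_with_libc` are hypothetical, chosen so the code does not clash with the standard function. Note one assumption-level difference: for an empty str and empty data this version returns NULL, while the library function returns str.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name, so it can coexist with the standard strstr */
const char *find_substring(const char *str, const char *data)
{
    int index;
    int len;

    if (NULL == str || NULL == data)
        return NULL;

    len = (int)strlen(data);
    if ((int)strlen(str) < len)
        return NULL;

    /* Invariant at the top of each iteration: str[0..len-1] hold no '\0',
       so reading str[len] stays inside the string (it may be the '\0'). */
    while (*str) {
        for (index = 0; index < len; index++) {
            if (str[index] != data[index])
                break;
        }
        if (index == len)
            return str;

        if ('\0' == str[len])   /* window already touches the end */
            break;
        str++;
    }
    return NULL;
}

/* Returns 1 when find_substring and the library strstr give the same answer */
int agrees_with_libc(const char *str, const char *data)
{
    return find_substring(str, data) == strstr(str, data);
}
```

For instance, `agrees_with_libc("baaaaabcd", "aaaab")` and `agrees_with_libc("hello", "xyz")` both return 1.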

 


(2) KMP Algorithm

The KMP algorithm is essentially a way to eliminate unnecessary search steps. How does it do that? Let's use an example. Assume there are two strings:

A: baaaaabcd

B: aaaab

So what happens when we search for B in A? Let's take a look:


/*      1 2 3 4 5 6 7 8 9
 *  A:  b a a a a a b c d
 *  B:    a a a a b
 *        1 2 3 4 5
 */

When B is compared with A starting from A's 2nd element, only the last element differs: the 6th element of A is a, while the 5th element of B is b. The ordinary string search algorithm would now move one position to the right in A and start the comparison over. But in fact we have already compared elements 2-5 of A, and those four elements match the first four elements of B. In this case, B only needs to compare its last element with the 7th element of A. If the redundant comparisons can be skipped, won't the search be faster?
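
The text stops before giving an implementation, so the following is only a minimal sketch of the commonly taught failure-table formulation of KMP, not necessarily the author's version. The names `build_failure_table`, `kmp_search` and `MAX_PATTERN` are hypothetical, and the fixed table size is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PATTERN 256   /* assumption: patterns are shorter than this */

/* fail[i] = length of the longest proper prefix of pat[0..i]
   that is also a suffix of pat[0..i] */
static void build_failure_table(const char *pat, int len, int *fail)
{
    int i, k = 0;

    fail[0] = 0;
    for (i = 1; i < len; i++) {
        while (k > 0 && pat[i] != pat[k])
            k = fail[k - 1];
        if (pat[i] == pat[k])
            k++;
        fail[i] = k;
    }
}

/* Returns the 0-based index of the first match, or -1 if there is none */
int kmp_search(const char *text, const char *pat)
{
    int fail[MAX_PATTERN];
    int len = (int)strlen(pat);
    int i, k = 0;   /* k = number of pattern characters matched so far */

    assert(len < MAX_PATTERN);
    if (len == 0)
        return 0;

    build_failure_table(pat, len, fail);
    for (i = 0; text[i] != '\0'; i++) {
        /* on a mismatch, fall back inside pat; i never moves backward */
        while (k > 0 && text[i] != pat[k])
            k = fail[k - 1];
        if (text[i] == pat[k])
            k++;
        if (k == len)
            return i - len + 1;
    }
    return -1;
}
```

With the strings above, `kmp_search("baaaaabcd", "aaaab")` returns 2, meaning the match starts at the 3rd element of A. After the mismatch at A's 6th element, the sketch falls back to B's 4th element, rechecks that single character, and then compares B's last element with A's 7th, without ever stepping back in A.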
