This article summarizes and implements common library functions for C-language string processing.
1. String comparison
int strcmp(const char *s1, const char *s2);
strcmp compares two strings character by character (the comparison is case-sensitive). It returns a value less than 0 if s1 is less than s2, a value greater than 0 if s1 is greater than s2, and 0 if the two are equal. The C standard guarantees only the sign of the result; many implementations return the difference between the first pair of characters that differ, compared as unsigned char. The implementation is as follows:
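#include <assert.h>
#include <stddef.h>

int my_strcmp(const char *s1, const char *s2)
{
    /* Important: validate the arguments first. */
    assert(NULL != s1 && NULL != s2);
    /* Advance while both strings continue and the characters match. */
    while (*s1 != '\0' && *s2 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    /* Compare as unsigned char, as the standard strcmp does. */
    return (unsigned char)*s1 - (unsigned char)*s2;
}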
Note that the function begins with a parameter check to prevent runtime errors when an argument is null.
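To illustrate the return-value convention, here is a minimal sketch built on the standard strcmp. The helper names str_equal and str_order are hypothetical and exist only for this example. Because only the sign of the result is portable, str_order reduces it to -1, 0, or 1.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the two strings are equal, 0 otherwise. */
int str_equal(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    return strcmp(s1, s2) == 0;
}

/* Hypothetical helper: reduces the strcmp result to -1, 0, or 1,
   because only the sign of the result is guaranteed. */
int str_order(const char *s1, const char *s2)
{
    int r;
    assert(s1 != NULL && s2 != NULL);
    r = strcmp(s1, s2);
    return (r > 0) - (r < 0);
}
```

Code that tests the result with == 0, < 0, or > 0, rather than against a specific difference, behaves the same on every conforming implementation.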
strcmp is the most commonly used string comparison function. The usual idiom for testing equality is if (!strcmp(s1, s2)) {...}. To compare only a specified number of characters instead of the entire strings, use the following function:
int strncmp(const char *s1, const char *s2, size_t n);
Its usage and return values are similar to those of strcmp, except that it compares at most the first n characters and stops earlier if either string ends. The implementation is as follows:
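#include <assert.h>
#include <stddef.h>

int my_strncmp(const char *s1, const char *s2, size_t n)
{
    /* Important: validate the arguments first. */
    assert(NULL != s1 && NULL != s2);
    if (n == 0)
        return 0;
    /* cnt counts the characters examined, including the current one. */
    size_t cnt = 1;
    while (*s1 != '\0' && *s2 != '\0' && *s1 == *s2 && cnt < n) {
        s1++;
        s2++;
        cnt++;
    }
    /* Compare as unsigned char, as the standard strncmp does. */
    return (unsigned char)*s1 - (unsigned char)*s2;
}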
In addition to the parameter check, pay attention to the special case n = 0, for which the function always returns 0.
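A common use of strncmp is testing whether a string starts with a given prefix. The following is a minimal sketch; the helper name has_prefix is hypothetical and not part of any library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s begins with prefix, 0 otherwise.
   Only the first strlen(prefix) characters of s are examined. */
int has_prefix(const char *s, const char *prefix)
{
    assert(s != NULL && prefix != NULL);
    return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

With an empty prefix, n is 0 and strncmp returns 0, so every string is reported as having the empty prefix. This matches the n = 0 special case discussed above.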
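There are other string comparison functions for special requirements, such as stricmp, memcmp, and memicmp. An i in the name means that case is ignored; the mem prefix means that a block of memory, rather than a null-terminated string, is compared. Of these, only memcmp is part of the C standard library; stricmp and memicmp are compiler-specific extensions.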
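Where a case-insensitive comparison is needed in portable code, one can be sketched with tolower from <ctype.h>. The name my_stricmp is chosen here for illustration; the sketch assumes single-byte characters and the current C locale, and is not the implementation used by any particular library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Sketch of a case-insensitive comparison: each character is converted
   to lower case before comparing. The casts to unsigned char are needed
   because tolower must not be given a negative value other than EOF. */
int my_stricmp(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    for (;;) {
        int c1 = tolower((unsigned char)*s1++);
        int c2 = tolower((unsigned char)*s2++);
        /* Stop at the first difference or at the end of both strings. */
        if (c1 != c2 || c1 == '\0')
            return c1 - c2;
    }
}
```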
2. String search
The simplest case is searching a string for a single character:
char *strchr(const char *s, int c);
The parameter c is declared as int for historical reasons, which we will not discuss here; it is converted to char before the search. The function returns a pointer to the first occurrence of c in s, or NULL if c does not occur. Note that the terminating '\0' is considered part of the string, so it can be searched for as well. The implementation is as follows:
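#include <assert.h>
#include <stddef.h>

char *my_strchr(const char *s, int n)
{
    assert(s != NULL);
    char c = (char)n;
    /* do-while so that the terminating '\0' is also tested. */
    do {
        if (*s == c)
            return (char *)s;
    } while (*s++);
    return NULL;
}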
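As a usage sketch, repeated calls to the standard strchr can count how often a character occurs, and searching for '\0' yields the end of the string. The helper names count_char and end_offset are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the occurrences of c in s. */
size_t count_char(const char *s, int c)
{
    size_t count = 0;
    assert(s != NULL);
    /* The terminator occurs exactly once; handle it separately so the
       loop below never steps past the end of the string. */
    if ((char)c == '\0')
        return 1;
    while ((s = strchr(s, c)) != NULL) {
        count++;
        s++; /* continue the search after this match */
    }
    return count;
}

/* Hypothetical helper: offset of the terminating '\0', found with strchr.
   The result equals strlen(s). */
size_t end_offset(const char *s)
{
    assert(s != NULL);
    return (size_t)(strchr(s, '\0') - s);
}
```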
To search for a substring, use strstr:
char *strstr(const char *s1, const char *s2);
The function returns a pointer to the first occurrence of the string s2 in s1, or NULL if s2 does not occur in s1. The implementation is as follows:
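#include <assert.h>
#include <string.h>

char *my_strstr(const char *s1, const char *s2)
{
    assert(NULL != s1 && NULL != s2);
    size_t len = strlen(s2);
    /* An empty s2 matches at the start of s1, even if s1 is empty. */
    if (len == 0)
        return (char *)s1;
    while (*s1) {
        if (!strncmp(s1, s2, len))
            return (char *)s1;
        s1++;
    }
    return NULL;
}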
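As a usage sketch, the standard strstr can be called in a loop to count non-overlapping occurrences of a substring. The helper name count_substr is hypothetical, and returning 0 for an empty substring is a choice made for this example, not something the library prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of sub in s. */
size_t count_substr(const char *s, const char *sub)
{
    size_t count = 0;
    size_t len;
    assert(s != NULL && sub != NULL);
    len = strlen(sub);
    /* strstr matches an empty string everywhere, which would loop
       forever below, so an empty sub is treated as "no occurrences". */
    if (len == 0)
        return 0;
    while ((s = strstr(s, sub)) != NULL) {
        count++;
        s += len; /* skip past this match */
    }
    return count;
}
```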
The C standard library does not define functions such as strnchr or strnstr that limit the search range. Of course, we can define such a function ourselves as needed, for example:
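#include <stddef.h>
#include <string.h>

/* Finds s2 within the first n characters of s1.
   Named my_strnstr to avoid clashing with the strnstr some systems provide. */
char *my_strnstr(const char *s1, const char *s2, size_t n)
{
    const char *p;
    size_t len = strlen(s2);
    if (len == 0) {
        return (char *)s1;
    }
    /* Only accept matches that fit entirely within the first n characters. */
    for (p = s1; *p && ((size_t)(p - s1) + len <= n); p++) {
        if ((*p == *s2) && (strncmp(p, s2, len) == 0)) {
            return (char *)p;
        }
    }
    return NULL;
}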
3. String copying
The most common string copy function is strcpy:
char *strcpy(char *dst, const char *src);
strcpy copies the null-terminated string pointed to by src, including the terminating '\0', into the array pointed to by dst. src and dst must not overlap (C99 expresses this with the restrict qualifier on both parameters), and dst must have enough space to hold the copied string.
Note that the function returns dst. This allows the call to be nested inside other expressions, such as strlen(strcpy(s, t)).
The function is implemented as follows:
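#include <assert.h>
#include <stddef.h>

char *my_strcpy(char *dst, const char *src)
{
    assert(NULL != dst && NULL != src);
    char *p = dst;
    /* Copy each character, including the terminating '\0'. */
    while ((*dst++ = *src++) != '\0')
        ;
    return p;
}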
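The nested call strlen(strcpy(s, t)) can be sketched as follows, using the standard strcpy. The helper name copy_and_measure is hypothetical, and the sketch assumes the caller has made dst large enough for src.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst and returns the length of the
   copy in a single expression. This works because strcpy returns dst.
   Assumption: dst has room for strlen(src) + 1 characters. */
size_t copy_and_measure(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    return strlen(strcpy(dst, src));
}
```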
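strcpy is dangerous because it does not check whether the space pointed to by dst is large enough to hold the string being copied, which creates a risk of buffer overflow. This is a classic vulnerability that attackers have exploited for decades. For this reason strncpy, which copies at most n characters, is often recommended instead. Be aware, however, that strncpy does not write a terminating '\0' when src contains n or more characters, and that it pads dst with '\0' when src is shorter than n: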
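#include <assert.h>
#include <stddef.h>

char *my_strncpy(char *dst, const char *src, size_t n)
{
    assert(NULL != dst && NULL != src);
    char *p = dst;
    /* Copy at most n characters from src. */
    while (n > 0 && *src != '\0') {
        *dst++ = *src++;
        n--;
    }
    /* Pad the rest with '\0', as the standard strncpy does. */
    while (n > 0) {
        *dst++ = '\0';
        n--;
    }
    return p;
}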
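Since strncpy leaves dst unterminated when src has n or more characters, a common pattern is to copy one character less than the buffer size and write the terminator explicitly. The sketch below shows this pattern; the helper name copy_truncated and its dst_size parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into a buffer of dst_size bytes,
   truncating if necessary, and always leaves dst null-terminated
   (unless dst_size is 0, in which case nothing is written). */
char *copy_truncated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return dst;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0'; /* strncpy may not have written a terminator */
    return dst;
}
```

The caller cannot tell from the return value whether truncation happened; if that matters, compare strlen(src) with dst_size before copying.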
Another function worth noting is strdup:
char *strdup(const char *s);
Unlike strcpy, strdup allocates memory for the copy itself and returns a pointer to the new string, or NULL if the allocation fails. The caller must therefore release the copy with free once it is no longer needed. strdup is specified by POSIX and was not part of the C standard before C23.
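In the spirit of the other implementations in this article, the behavior of strdup can be sketched with malloc and memcpy. The name my_strdup is chosen for illustration, and this is only one plausible way to write it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of strdup: allocates strlen(s) + 1 bytes and copies s into them.
   Returns NULL if the allocation fails. The caller owns the result and
   must release it with free. */
char *my_strdup(const char *s)
{
    size_t size;
    char *copy;
    assert(s != NULL);
    size = strlen(s) + 1; /* include the terminating '\0' */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}
```

A typical caller checks the result for NULL, uses the copy, and then passes it to free exactly once.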
In addition, the memcpy function is similar to strncpy but does not stop at a null character: it always copies exactly n bytes.
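A minimal sketch of memcpy, written in the same style as the other functions here, makes the difference visible: the loop depends only on n and never inspects the bytes it copies. The name my_memcpy is chosen for illustration; like the standard function, the sketch assumes the two regions do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of memcpy: copies exactly n bytes from src to dst.
   A '\0' byte is copied like any other byte and does not end the copy.
   Assumption: the source and destination regions do not overlap. */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    assert(dst != NULL && src != NULL);
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Copying the six bytes of "ab\0cd" this way preserves the 'c' and 'd' after the embedded null, whereas strncpy would replace them with '\0' padding.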
4. String concatenation
String concatenation appends a copy of one string to the end of another.
char *strcat(char *s1, const char *s2);
The function is implemented as follows:
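#include <assert.h>
#include <string.h>

char *my_strcat(char *s1, const char *s2)
{
    assert(NULL != s1 && NULL != s2);
    char *p = s1;
    /* Find the end of s1, then copy s2 there. */
    while (*s1)
        s1++;
    strcpy(s1, s2);
    return p;
}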
Similarly, strcat is unsafe because it also assumes that the buffer is large enough to hold the concatenated string. In most cases, the bounded version is preferable:
char *strncat(char *s1, const char *s2, size_t n);
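strncat appends at most n characters from s2 and then always writes a terminating '\0', so s1 needs room for strlen(s1) + n + 1 characters. A minimal sketch in the style of the functions above follows. The names my_strncat and append_bounded are chosen for illustration, and append_bounded with its dst_size parameter is a hypothetical convenience wrapper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of strncat: appends at most n characters of s2 to s1 and
   always terminates the result. */
char *my_strncat(char *s1, const char *s2, size_t n)
{
    char *p = s1;
    assert(s1 != NULL && s2 != NULL);
    while (*s1)
        s1++;
    while (n > 0 && *s2 != '\0') {
        *s1++ = *s2++;
        n--;
    }
    *s1 = '\0';
    return p;
}

/* Hypothetical wrapper: appends src to the string in a buffer of
   dst_size bytes, truncating so that the result, including its
   terminator, never exceeds the buffer. */
char *append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used;
    assert(dst != NULL && src != NULL);
    used = strlen(dst);
    if (used + 1 < dst_size)
        my_strncat(dst, src, dst_size - used - 1);
    return dst;
}
```

Note that n limits the number of characters appended, not the total size of the buffer, which is why the wrapper passes dst_size - used - 1.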