A deep understanding of C pointers, part 5: pointers and strings
Basic Concepts
Strings can be allocated in different areas of memory, and pointers are normally used to operate on them. A string is a sequence of characters terminated by the ASCII character NUL, written \0. Strings are usually stored in arrays or in memory allocated from the heap. Not every character array is a string, however: a character array may contain no NUL character at all.
C has two types of strings.
* Single-byte string: a sequence of char values.
* Wide string: a sequence of wchar_t values.
The wchar_t data type represents wide characters; its width is implementation-defined and is commonly 16 or 32 bits. Both kinds of string end with NUL (for a wide string, a wide NUL character). Wide characters are mainly used to support non-Latin character sets, which is useful for applications that must support multiple languages.
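As a minimal sketch of the wide-string case, the two helpers below measure a wide string with the standard wcslen function. The names wide_length and wide_storage_bytes are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>
#include <wchar.h>

// Number of wide characters in ws, not counting the terminating wide NUL.
size_t wide_length(const wchar_t *ws)
{
    return wcslen(ws);
}

// Bytes needed to store ws, including its terminator. The result depends
// on sizeof(wchar_t), which differs between platforms.
size_t wide_storage_bytes(const wchar_t *ws)
{
    return (wcslen(ws) + 1) * sizeof(wchar_t);
}
```

For example, wide_length(L"hello") is 5 on every platform, while wide_storage_bytes(L"hello") is 12 where wchar_t is 16 bits wide and 24 where it is 32 bits wide.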
The length of a string is the number of characters it contains, not counting the NUL. When allocating memory for a string, remember to reserve space for the NUL. Note that NUL is a character, defined as \0, and is different from NULL, typically defined as ((void*)0), which is a special pointer value.
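The difference between length and storage, and between a string and a plain character array, can be sketched with two small helpers. The names bytes_needed and is_terminated are assumptions made for this example, not standard functions.

```c
#include <stddef.h>
#include <string.h>

// Bytes needed to store a copy of s: its length plus one for the NUL.
size_t bytes_needed(const char *s)
{
    return strlen(s) + 1;
}

// Returns 1 if the first n bytes of buf contain a NUL character, which
// means buf holds a string, and 0 otherwise.
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}
```

An array such as char raw[3] = {'a', 'b', 'c'}; is a valid character array but not a string, so is_terminated(raw, 3) returns 0 and passing raw to strlen would read past the end of the array.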
A character constant is a character sequence enclosed in single quotes. It normally consists of one character but may be written with several, as in an escape sequence such as '\n'. In C, a character constant has type int: the size of a char is 1 byte, while the size of a character literal is sizeof(int) bytes.
printf("%zu\n", sizeof('a')); // 4 on platforms where int is 4 bytes
There are three ways to declare a string: as a literal, as a character array, or as a character pointer. A string literal is a sequence of characters enclosed in double quotation marks and is often used for initialization. String literals are located in the string literal pool. If you declare an array of 32 characters, it can hold at most 31 characters of text, because the string must end with NUL. Where a string lives in memory depends on where it is declared.
When a literal is defined, it is usually placed in the literal pool. If the same literal is used several times, the pool usually holds only one copy. Literals are generally considered immutable. Some compilers have an option to turn pooling off, in which case each use gets its own copy. It does not matter where a string literal is used; it has no notion of scope.
Some compilers allow string literals to be modified, but the C standard leaves the result undefined, so it is a good idea to declare strings that should not be modified as constants.
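A minimal sketch of that advice: the pointer to the literal is declared const, and any change is made on a writable copy instead. The names greeting and editable_copy are hypothetical.

```c
#include <stddef.h>
#include <string.h>

// Pointer to a literal, declared const so the compiler rejects writes
// such as greeting[7] = 'e'.
static const char *greeting = "hello man";

// Copies the literal into a caller-supplied writable buffer and changes
// one character of the copy. Returns 1 on success, 0 if dst is too small.
int editable_copy(char *dst, size_t dst_size)
{
    if (dst_size < strlen(greeting) + 1) {
        return 0;
    }
    strcpy(dst, greeting);
    dst[7] = 'e'; // modifies the copy only; the literal is untouched
    return 1;
}
```

With a 16-byte buffer the copy ends up holding "hello men", while greeting still points at the unchanged literal.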
How a string is initialized depends on whether the variable is declared as an array or as a pointer. The memory used by the string is either an array or a block of memory pointed to by the pointer. The characters can come from a string literal or from elsewhere, such as standard input.
char head[] = "hello man";
printf("size of head is : %zu\n", sizeof(head)); // size of head is : 10
We can see that "hello man" has 9 characters, but the size of the array is 10, because it includes the NUL character. You can also use the strcpy function to initialize an array.
char head1[10];
strcpy(head1, "hello man");
An array name cannot be used as an lvalue, so an assignment such as head1 = "hello man"; does not compile.
Dynamic memory allocation offers more flexibility and lets the memory outlive the function that allocated it. Normally, malloc and strcpy are used to initialize such a string.
char* head2;
head2 = (char*) malloc(strlen("hello man") + 1);
strcpy(head2, "hello man");
Note that when you compute the length to pass to malloc, you must reserve space for the NUL and use strlen rather than sizeof. sizeof returns the size of an array or a pointer, not the length of a string. If a pointer is initialized with a string literal, the pointer points into the string literal pool. Do not assign a character literal to the pointer itself, because a character literal has type int. You can, however, assign a character literal through the pointer after dereferencing it.
*(head2 + 7) = 'e';
printf("head2 is %s\n", head2); // head2 is hello men
In short, a string may be located in global or static memory (a global or static array), in the string literal pool, on the heap (malloc), or in a function's stack frame (a local char array). Where a string is located determines how long it exists and which parts of the program can access it. A string in global memory exists for the whole run and can be accessed by multiple functions; a static string also exists for the whole run, but only the function that defines it can access it by name; a string on the heap can be accessed by multiple functions and exists until it is released.
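The locations listed above can be put side by side in one sketch. All names here (global_text, static_text, literal_text, heap_text, stack_text_length) are made up for the illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Global array: exists for the whole run and is visible to every function.
char global_text[] = "global";

// Static local array: also exists for the whole run, but only this
// function can refer to it by name.
const char *static_text(void)
{
    static char text[] = "static";
    return text;
}

// String literal: the returned pointer refers to the literal pool.
const char *literal_text(void)
{
    return "literal";
}

// Heap string: exists until the caller passes it to free().
char *heap_text(void)
{
    char *p = malloc(strlen("heap") + 1);
    if (p != NULL) {
        strcpy(p, "heap");
    }
    return p;
}

// Stack string: the array lives in this call's stack frame, so only a
// value computed from it is returned, never its address.
size_t stack_text_length(void)
{
    char local[] = "stack";
    return strlen(local);
}
```

Only heap_text hands ownership to the caller, which must call free on the result; the other strings are managed by the program itself.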
Standard string operations
The standard way to compare strings is the strcmp function. Its prototype is as follows.
int strcmp(const char* s1, const char* s2);
If the two strings are equal, 0 is returned. If s1 is lexicographically greater than s2, a positive number is returned. If s1 is less than s2, a negative number is returned.
char command[16];
printf("enter a command :");
scanf("%15s", command); // the width limit keeps the input within the array
if (strcmp(command, "quit") == 0) {
    printf("you typed quit!\n");
} else {
    printf("i don't know what you typed!\n");
}
Note that if (command == "quit") were used here, the address of command would be compared with the address of the string literal rather than their contents, so the test would always fail.
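A short sketch of content comparison follows. The helper names is_quit and compare_sign are hypothetical; compare_sign only normalizes the result of strcmp, because the standard guarantees the sign of the return value but not its magnitude.

```c
#include <string.h>

// Returns 1 when the text in command equals "quit". The characters are
// compared, not the addresses.
int is_quit(const char *command)
{
    return strcmp(command, "quit") == 0;
}

// Collapses the result of strcmp to -1, 0, or 1.
int compare_sign(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);
    return (r > 0) - (r < 0);
}
```

Code that tests strcmp(a, b) == 1 is therefore not portable; test for > 0, < 0, or == 0 instead.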
Strings are copied with the strcpy function. Its prototype is as follows:
char* strcpy(char* s1, const char* s2);
A common use is an application that reads a series of strings and stores each one using the minimum amount of memory. First, create a character array long enough to hold the longest string the user is allowed to enter, and read the string into it. Once the string has been read, allocate a block sized to its actual length and copy the string there.
char mynames[32];
char* myname[30];
size_t count = 0;
printf("enter a name please:");
scanf("%31s", mynames); // the width limit keeps the input within the array
myname[count] = (char*) malloc(strlen(mynames) + 1);
strcpy(myname[count], mynames);
count++;
You can repeat this operation in a loop.
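One way the loop might look is sketched below. To keep the example free of console input, it takes its strings from an array rather than from scanf; that substitution, the limit MAX_NAMES, and the names store_names and free_names are all assumptions of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAMES 30

// Copies up to MAX_NAMES strings from input into heap blocks sized to
// each string. Returns the number of strings actually stored.
size_t store_names(char *names[], const char *input[], size_t n)
{
    size_t count = 0;
    while (count < n && count < MAX_NAMES) {
        char *copy = malloc(strlen(input[count]) + 1);
        if (copy == NULL) {
            break; // out of memory: keep what was stored so far
        }
        strcpy(copy, input[count]);
        names[count] = copy;
        count++;
    }
    return count;
}

// Releases every string stored by store_names.
void free_names(char *names[], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(names[i]);
    }
}
```

Each stored name occupies exactly strlen + 1 bytes, and the caller releases them all with free_names when they are no longer needed.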
Two pointers can reference the same string. Two pointers that reference the same address are called aliases. Assigning one pointer to another copies only the address of the string, not the string itself.
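The difference between an alias and a real copy shows up as soon as the string is modified. In this sketch the helpers capitalize and duplicate are hypothetical names.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Upper-cases the first character of whatever string s points to.
void capitalize(char *s)
{
    if (*s != '\0') {
        *s = (char) toupper((unsigned char) *s);
    }
}

// Returns an independent heap copy of s, or NULL if allocation fails.
// The caller frees the result.
char *duplicate(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL) {
        strcpy(copy, s);
    }
    return copy;
}
```

Given char name[] = "jack"; char *alias = name; char *copy = duplicate(name);, calling capitalize(alias) changes name to "Jack" because both pointers refer to the same characters, while copy still holds "jack".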
String concatenation merges two strings. The strcat function is generally used to perform this operation. Its prototype is:
char* strcat(char* s1, const char* s2);
The following shows how to use a buffer to concatenate strings.
const char* error = "ERROR: ";
const char* errormsg = "not enough memory!";
char* _buffer = (char*) malloc(strlen(error) + strlen(errormsg) + 1);
strcpy(_buffer, error);
strcat(_buffer, errormsg); // append; a second strcpy would overwrite the prefix
printf("%s\n", _buffer); // ERROR: not enough memory!
printf("%s\n", error);
printf("%s\n", errormsg);
If strcat(error, errormsg) were used directly, it would write past the end of the error literal and might overwrite whatever follows it in memory, because no separate memory was allocated for the combined string. A common mistake when concatenating strings is failing to allocate space for the new string. In addition, do not pass a character literal in place of a string literal as an argument to this function.
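The buffer technique can be wrapped in a small function so that the allocation is never forgotten. This is a sketch, and the name concat is an assumption.

```c
#include <stdlib.h>
#include <string.h>

// Returns a newly allocated string holding s1 followed by s2, or NULL
// if allocation fails. The caller must free the result.
char *concat(const char *s1, const char *s2)
{
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    char *result = malloc(len1 + len2 + 1); // +1 for the NUL
    if (result == NULL) {
        return NULL;
    }
    strcpy(result, s1); // copy the first part
    strcat(result, s2); // append the second part
    return result;
}
```

Neither argument is modified, so passing string literals is safe here.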
Passing and returning strings
First, define a function.
size_t strLength(char* string)
{
    size_t length = 0;
    while (*(string++)) {
        length++;
    }
    return length;
}
char simpleArray[] = "simple string";
char* simplePtr = (char*) malloc(strlen("simple string") + 1);
strcpy(simplePtr, "simple string");
printf("%zu\n", strLength(simplePtr)); // 13
To call this function with a pointer, you only need to pass the pointer name. The function also works with arrays, because the array name is interpreted as an address.
printf("%zu\n", strLength(simpleArray)); // 13
You can also apply the address-of operator to element 0 of the array, but this is needlessly verbose: strLength(&simpleArray[0]).
Declare the parameter as a pointer to a constant char to prevent the function from modifying the string. If you want a function to return a string that it initializes, you must decide whether the caller is responsible for releasing the allocated memory. If the function allocates memory dynamically and returns a pointer to it, the caller is responsible for eventually freeing that memory, which requires the caller to know how the function is meant to be used.
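As an illustration of a pointer-to-constant parameter, the hypothetical function count_char below only reads the string it is given.

```c
#include <stddef.h>

// Counts how many times c occurs in s. Because s points to const char,
// a statement such as *s = 'x' inside this function would not compile.
size_t count_char(const char *s, char c)
{
    size_t count = 0;
    while (*s != '\0') {
        if (*s == c) {
            count++;
        }
        s++; // the pointer itself may move; only the characters are read-only
    }
    return count;
}
```

The function accepts string literals, arrays, and heap strings alike, since a char* argument converts implicitly to const char*.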
The main function is usually the first function an application executes. For a command-line program, it is common to enable certain features by passing it some information. For example, the ls command on Linux accepts options such as -la to change its behavior. C supports command-line arguments through argc and argv. The first parameter, argc, is an integer giving the number of arguments passed. The system normally passes at least one argument, which is the name of the executable file. The second parameter, argv, is usually regarded as a one-dimensional array of string pointers, each of which references one command-line argument.
int main(int argc, char** argv)
{
    int i = 0;
    while (i < argc) {
        printf("argv[%d]is %s\n", i, argv[i]); // argv[0]is ./mysender
        i++;
    }
    return 0;
}
As you can see, when no arguments are supplied, the only entry is ./mysender, the name of my compiled executable. Now run the command ./mysender -f jack -la limit = 100. The output is:
argv[0]is ./mysender
argv[1]is -f
argv[2]is jack
argv[3]is -la
argv[4]is limit
argv[5]is =
argv[6]is 100
Because I put spaces around the "=" sign, "=" is treated as an argument of its own. Arguments are separated by spaces, so an argument cannot itself contain a space unless it is quoted on the command line.
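Since argv is just an array of string pointers, looking for a particular option is an ordinary string search. The helper find_option is a hypothetical name used for this sketch.

```c
#include <string.h>

// Returns the index of the first argument equal to name, or -1 if it is
// absent. Index 0 (the program name) is skipped.
int find_option(int argc, char *argv[], const char *name)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], name) == 0) {
            return i;
        }
    }
    return -1;
}
```

Called from main as find_option(argc, argv, "-la"), it would return 3 for the command line shown above.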
When a function returns a string, it actually returns the address of the string. This may be the address of a string literal, of dynamically allocated memory, or of a local string variable.
Consider the first case. When a function returns a pointer to a static string, note that calling it again from a different place overwrites the previous result. A string literal is not always treated as a constant. Some compilers have an option that turns off the string literal pool; declaring the string as a constant prevents it from being modified. If the returned memory is dynamically allocated, take care to avoid memory leaks. Returning the address of a local string variable is an error, because once the function returns, its memory may be overwritten by another stack frame. For example, if you declare and initialize a character array inside a function and then return its address, the memory occupied by the array is no longer safe to use: it is at risk of being overwritten by the stack frames of other functions.
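The overwrite problem with a static buffer, and the dynamic alternative, can be sketched as follows. The function names format_static and format_dynamic and the message format are assumptions made for this example.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

// Formats into a single static buffer. Every call returns the same
// address, so a later call overwrites the text of an earlier one.
const char *format_static(const char *name, int id)
{
    static char buffer[64];
    snprintf(buffer, sizeof buffer, "Item: %s ID: %d", name, id);
    return buffer;
}

// Formats into a fresh heap block sized to the text. Results are
// independent of each other, but the caller must free each one.
char *format_dynamic(const char *name, int id)
{
    // With a NULL buffer and size 0, snprintf only reports the length.
    int needed = snprintf(NULL, 0, "Item: %s ID: %d", name, id);
    if (needed < 0) {
        return NULL;
    }
    char *buffer = malloc((size_t) needed + 1);
    if (buffer != NULL) {
        snprintf(buffer, (size_t) needed + 1, "Item: %s ID: %d", name, id);
    }
    return buffer;
}
```

If a = format_static("Axle", 25) is followed by b = format_static("Piston", 55), both a and b point at the text for Piston. The dynamic version keeps the two results apart at the cost of two calls to free.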
Function pointers and strings
Passing a function pointer is a very flexible way to control program execution.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Pointer to a function that compares two strings the way strcmp does.
typedef int (*fptroperation)(const char*, const char*);

// Returns a newly allocated lower-case copy of string; the caller frees it.
char* stringToLower(const char* string)
{
    char* tmp = (char*) malloc(strlen(string) + 1);
    char* start = tmp;
    while (*string != 0) {
        *(tmp++) = tolower((unsigned char) *(string++));
    }
    *tmp = 0;
    return start;
}

// Case-sensitive comparison.
int compare(const char* s1, const char* s2)
{
    return strcmp(s1, s2);
}

// Case-insensitive comparison: compares lower-case copies of both strings.
int compareIgnoreCase(const char* s1, const char* s2)
{
    char* t1 = stringToLower(s1);
    char* t2 = stringToLower(s2);
    int result = strcmp(t1, t2);
    free(t1);
    free(t2);
    return result;
}

// Bubble sort that orders the strings using the supplied comparison function.
void sort(char* array[], int size, fptroperation operation)
{
    int swap = 1;
    while (swap) {
        swap = 0;
        int l = 0;
        while (l < size - 1) {
            if (operation(array[l], array[l+1]) > 0) {
                swap = 1;
                char* tmp = array[l];
                array[l] = array[l+1];
                array[l+1] = tmp;
            }
            l++;
        }
    }
}

void display(char* names[], int size)
{
    int i = 0;
    while (i < size) {
        printf("%s ", names[i]);
        i++;
    }
    printf("\n");
}

int main(void)
{
    char* names[] = {"jack", "rose", "Titanic", "hello", "World"};
    char* newnames[] = {"jack", "rose", "Titanic", "hello", "World"};
    sort(names, 5, compare);
    display(names, 5); // Titanic World hello jack rose
    sort(newnames, 5, compareIgnoreCase);
    display(newnames, 5); // hello jack rose Titanic World
    return 0;
}
In this example, the function pointer is used to compare strings under different rules.
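The same idea appears in the standard library: qsort takes a function pointer that decides the order. The sketch below is an alternative to the hand-written sort above, not a replacement for it, and the names compare_entries and sort_names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// qsort passes pointers to the array elements. Each element is itself a
// pointer to a string, so the arguments must be dereferenced once before
// the strings can be compared.
static int compare_entries(const void *a, const void *b)
{
    const char *const *s1 = a;
    const char *const *s2 = b;
    return strcmp(*s1, *s2);
}

// Sorts an array of string pointers in place, case-sensitively.
void sort_names(const char *names[], size_t count)
{
    qsort(names, count, sizeof names[0], compare_entries);
}
```

Only the pointers are rearranged; the characters of each string stay where they are. Passing strcmp to qsort directly would be a mistake, because it would treat the addresses of the array elements as if they were strings.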