Python Source Code Analysis Notes 0: C Language Basics Review
To analyze the Python source code, a solid grounding in C is indispensable, especially pointers and structs. This article first reviews the C basics to make the later code reading easier.
1 About ELF Files
On Linux, the object files and executables produced by compiling C are in ELF format. An executable is divided into segments, while an object file is divided into sections. A segment contains one or more sections, and the complete section and segment information can be viewed with the readelf command. Consider an example:
char pear[40];
static double peach;
int mango = 13;
char *str = "Hello";
static long melon = 2001;

int main()
{
    int i = 3, j;
    pear[5] = i;
    peach = 2.0 * mango;
    return 0;
}
This is a simple piece of C code; let us work out where each variable is stored. mango and melon belong to the .data section, as does the pointer variable str itself. pear, an uninitialized non-static global, is placed in the COMMON section, while peach, being static and uninitialized, goes straight to .bss. The static qualifier on peach and melon means they can be used only within this file. The string "Hello" that str points to is stored in the .rodata section. The main function belongs to the .text section, and its local variables i and j are given space on the stack at run time. Note that the COMMON section exists to support strong and weak symbols; after the final link into an executable, pear ends up in the BSS segment. (GCC 10 and later default to -fno-common, which places such variables in .bss directly.) Similarly, in the traditional layout the .text and .rodata sections belong to the same segment of the executable.
For more on ELF, see the book "A Programmer's Self-Cultivation" (程序员的自我修养).
2 About pointers
Pointers are the part of C that learners fear most. "Pointers on C", "Expert C Programming", and "High-Quality C++/C Programming Guide" all explain pointers very well, and they are worth reading for a systematic review. Here I summarize some basic points that are easy to get wrong. The environment is a 32-bit Ubuntu 14.10 system, and the compiler is GCC.
2.1 Common pointer mistakes
/*** Pointer pitfalls, example 1: demo1.c ***/
int main()
{
    char *str = "HelloWorld"; // [1]
    str[1] = 'M';             // [2] fails at run time
    char arr[] = "hello";     // [3]
    arr[1] = 'M';
    return 0;
}
In demo1.c, we define a pointer to a string and an array holding a string, and then modify one character through each of them. Compiling and running the program shows that [2] fails. Why? Generating the assembly code with the command gcc -S demo1.c shows that the HelloWorld at [1] is stored in the .rodata section, which is read-only, while the array at [3] is stored on the stack. So [2] fails and [3] works. In C, creating a string constant and assigning it to a pointer, as in [1], leaves the string in the .rodata section. If the string initializes an array, the characters are stored on the stack or in the .data section (for example, [3] is stored on the stack). Example 2 shows more error-prone points.
/*** Pointer pitfalls, example 2: demo2.c ***/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 100 /* assumed value: the original size was lost in extraction */

char *getmemory(int num)
{
    char *p = (char *)malloc(sizeof(char) * num);
    return p;
}

/* Wrong on purpose: p is a copy of the caller's pointer, so the caller never sees the allocation */
void getmemory2(char *p)
{
    p = (char *)malloc(sizeof(char) * BUF_SIZE);
}

char *getstring()
{
    char *string = "HelloWorld";
    return string;
}

/* Wrong on purpose: returns the address of a local array */
char *getstring2()
{
    char string[] = "HelloWorld";
    return string;
}

void paramarray(char a[])
{
    printf("sizeof(a)=%zu\n", sizeof(a)); // sizeof(a)=4: an array parameter is passed as a pointer
}

int main()
{
    int a[] = {1, 2, 3, 4};
    int *b = a + 1;
    printf("delta=%td\n", b - a); // delta=1: pointer subtraction counts elements; the addresses differ by 4 bytes
    printf("sizeof(a)=%zu, sizeof(b)=%zu\n", sizeof(a), sizeof(b)); // sizeof(a)=16, sizeof(b)=4
    paramarray((char *)a);

    // Dereferencing an address outside the program's address space causes a segmentation fault
    /* int *p = 0; *p = 17; */

    char *str = NULL;
    str = getmemory(BUF_SIZE);
    strcpy(str, "Hello");
    free(str);  // free the memory
    str = NULL; // avoid a wild pointer

    // Wrong version: the function parameter receives a copy of the pointer
    /* char *str2 = NULL; getmemory2(str2); strcpy(str2, "Hello"); */

    char *str3 = getstring();
    printf("%s\n", str3);

    // Wrong version: a pointer into the stack is returned, and the compiler issues a warning
    /* char *str4 = getstring2(); */
    return 0;
}
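The two "wrong versions" in demo2.c can be repaired in several ways. The following is only a sketch of one common approach, not the original author's code: the function names get_memory_out and get_string_into are hypothetical. The first takes the address of the caller's pointer so the allocation is visible after the call; the second copies into a buffer the caller owns instead of returning the address of a local array.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical corrected counterpart of getmemory2: the caller passes the
 * ADDRESS of its pointer, so the assignment is visible after the call.
 * Returns 0 on success, -1 on failure. */
int get_memory_out(char **out, size_t num)
{
    if (out == NULL)
        return -1;
    *out = malloc(num);
    return *out != NULL ? 0 : -1;
}

/* Hypothetical corrected counterpart of getstring2: instead of returning
 * the address of a local array, copy into a buffer owned by the caller.
 * Returns 0 on success, -1 if the buffer is missing or too small. */
int get_string_into(char *dst, size_t dst_size)
{
    static const char text[] = "HelloWorld";
    if (dst == NULL || dst_size < sizeof text)
        return -1;
    memcpy(dst, text, sizeof text); /* sizeof text includes the '\0' */
    return 0;
}
```

A caller would write char *s = NULL; get_memory_out(&s, 16); and later free(s), exactly as with getmemory.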
2.2 Pointers and arrays
Pointers and arrays were partly covered in 2.1. In C, pointers and arrays can be used interchangeably in some cases: for example, given char *str = "helloworld", the second character can be accessed either as str[1] or as *(str+1).
In addition, arrays and pointers are equivalent as function parameters. However, pointers and arrays are not equivalent in some places, and this requires special attention.
For example, if I define an array char a[9] = "abcdefgh"; (note that a terminating '\0' is appended automatically after the string), reading the character 'b' with a[1] works like this:
- First, the array a has an address, which we assume is 9980.
- Then compute the offset, which is the index times the element size. Here the index is 1 and the size of char is also 1, so adding the offset to 9980 gives 9981, the address of the element at index 1 of array a. (For an array of int, the offset would be 1 * 4 = 4.)
- Read the value at address 9981, which is 'b'.
If instead you define a pointer char *a = "abcdefgh"; and use a[1] to read the element at index 1, the process differs from the array case:
- First, the pointer a itself has an address, assumed to be 4541.
- Then read the value of a from address 4541. That value is the address of the string "abcdefgh", assumed to be 5081.
- Then proceed as before: add the offset 1 to 5081 and read the value at address 5082, which is 'b'.
As the steps above show, a pointer access takes one more step than an array access, although the results look the same. This makes the following error easier to understand. An array is defined in demo3.c and then declared and used as a pointer in demo4.c, which is clearly wrong. Changing the declaration to extern char p[]; makes it correct (you can also write extern char p[3]; a mismatch between the declared size and the actual size is not checked here). Always make sure the definition and the declaration match.
/*** demo3.c ***/
char p[] = "helloworld";

/*** demo4.c ***/
#include <stdio.h>

extern char *p; /* wrong: p is defined as an array in demo3.c */

int main()
{
    printf("%c\n", p[1]);
    return 0;
}
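The extra indirection described above can be observed from within a program. The sketch below is an illustration only (the function names are made up for this example): for an array, the address of the object is the address of its first element, while a pointer object lives at one address and the characters it points to live at another.

```c
#include <stddef.h>

static char arr[9] = "abcdefgh";      /* the array itself holds the characters */
static const char *ptr = "abcdefgh";  /* the pointer holds the address of the characters */

/* For an array, the object and its first element start at the same address. */
int array_address_is_first_element(void)
{
    return (const void *)&arr == (const void *)&arr[0];
}

/* For a pointer, the pointer object and the characters are separate objects. */
int pointer_address_is_first_element(void)
{
    return (const void *)&ptr == (const void *)&ptr[0];
}

/* Both forms read the same character, but the pointer form first loads
 * the address stored in ptr. */
char second_via_array(void)   { return arr[1]; }
char second_via_pointer(void) { return *(ptr + 1); }

/* sizeof sees the whole array: 8 characters plus the terminating '\0'. */
size_t array_size(void) { return sizeof arr; }
```

This is why declaring the array from demo3.c as extern char *p goes wrong: the first bytes of the characters themselves would be interpreted as an address.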
3 About typedef and #define
typedef and #define are both used often, but they are not the same. A typedef name applies to every declarator in a declaration, whereas a #define is only textual substitution. So in a declaration of several variables, a typedef guarantees that all of them have the same type, while #define does not. In addition, a typedef fully encapsulates a type: no extra type specifiers can be added to it in a declaration, whereas a macro type name can be extended. As shown in the code.
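/* Identifiers that were lost in extraction (char_ptr, x, y) are reconstructed */
#define int_ptr int *
int_ptr i, j;        // i is of type int *, but j is of type int

typedef char *char_ptr;
char_ptr c1, c2;     // c1 and c2 are both of type char *

#define peach int
unsigned peach x;    // OK

typedef int banana;
unsigned banana y;   // error: a typedef'd type cannot be extended with other type specifiers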
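The difference in multi-variable declarations can be checked at compile time. The following is a small sketch that assumes a C11 compiler (it uses _Generic); the names INT_PTR_MACRO and int_ptr_t are invented for this illustration.

```c
#include <stddef.h>

#define INT_PTR_MACRO int *   /* plain textual substitution */
typedef int *int_ptr_t;       /* a real type name */

/* Returns 1 if the SECOND variable declared through the macro is a pointer. */
int macro_second_is_pointer(void)
{
    INT_PTR_MACRO a = NULL, b = 0;   /* expands to: int *a = NULL, b = 0; */
    (void)a;
    (void)b;
    return _Generic(b, int *: 1, default: 0);
}

/* Returns 1 if the SECOND variable declared through the typedef is a pointer. */
int typedef_second_is_pointer(void)
{
    int_ptr_t a = NULL, b = NULL;    /* both a and b are int * */
    (void)a;
    (void)b;
    return _Generic(b, int *: 1, default: 0);
}
```

With the macro, only the first declarator picks up the *, so the first function returns 0 and the second returns 1.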
typedef is also common in struct definitions, as in the following code. Note that [1] and [2] are very different. When you define struct foo with typedef as in [1], you define the struct type foo in addition to the struct tag foo, so you can declare variables directly with foo. With the definition in [2], you cannot declare a variable with bar, because bar is a struct variable, not a struct type.
Note also that struct tags have their own namespace, so a member may have the same name as the struct tag. [3] is therefore legal, although it is best avoided. The following section explores structs in more detail, since the Python source code relies on them heavily.
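/* Variable names that were lost in extraction (f1, f2, b1, b2) are reconstructed */
typedef struct foo { int i; } foo;  // [1]
struct bar { int i; } bar;          // [2]

struct foo f1;  // OK: uses the struct tag foo
foo f2;         // OK: uses the struct type foo
struct bar b1;  // OK: uses the struct tag bar
bar b2;         // error: bar is already a struct variable; it can be used directly, e.g. bar.i = 4;

struct foobar { int foobar; };      // [3] a legal definition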
4 About structs
When learning data structures, linked lists and trees are usually defined with structs. For example:
struct node { int data; struct node* next;};
This definition may look strange at first: how can the member next be declared as a pointer to struct node when struct node has not been fully defined yet?
4.1 Incomplete types
This is where C's incomplete types come in. C types can be divided into function types, object types, and incomplete types. Object types are further divided into scalar and non-scalar types. Arithmetic types (such as int, float, char) and pointer types are scalar types, while fully defined structs, unions, arrays, and so on are non-scalar types. An incomplete type is a type whose definition is not yet complete, such as the following:
struct s;
union u;
char str[];
A variable with an incomplete type can be completed by later declarations. For example, the following two declarations of the array str are legal:
char str[];
char str[10];
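As a quick check of this rule, the sketch below places both declarations at file scope in one translation unit and reports the size the compiler ends up with. The helper name size_of_str is made up for this illustration.

```c
#include <stddef.h>

char str[];     /* tentative definition: element type known, size not yet */
char str[10];   /* a later declaration completes the type */

/* After the second declaration, str has the complete type char[10]. */
size_t size_of_str(void)
{
    return sizeof str;
}
```

Had sizeof str appeared between the two declarations, the compiler would reject it, because the type was still incomplete at that point.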
In addition, if two source files define the same variable, they can still be compiled and linked as long as the definitions are not all strong definitions. For example, the following is legal, but if you change int i; in file1.c into a strong definition such as int i = 5;, linking fails. (This relies on common symbols, the default in GCC before version 10; newer versions need -fcommon.)
// file1.c
int i;

// file2.c
int i = 4;
4.2 Incomplete struct types
Incomplete struct types are very important. Take the definition of struct node mentioned at the start: when the compiler reaches struct node *next, struct node is still an incomplete type, so next is a pointer to an incomplete type. The pointer itself is a complete type, because on a 32-bit system a pointer takes 4 bytes no matter what it points to. Once the definition ends, struct node becomes a complete type, and next is then a pointer to a complete type.
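To make this concrete, here is a minimal singly linked list built on the self-referential struct node. It is only a sketch; the helper names list_push, list_sum, and list_free are chosen for this example.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;  /* struct node is still incomplete here; a pointer to it is fine */
};

/* Prepends a value and returns the new head, or NULL if allocation fails
 * (the old list is left untouched in that case). */
struct node *list_push(struct node *head, int data)
{
    struct node *n = malloc(sizeof *n);  /* legal: the type is complete by now */
    if (n == NULL)
        return NULL;
    n->data = data;
    n->next = head;
    return n;
}

/* Adds up the data fields of every node in the list. */
int list_sum(const struct node *head)
{
    int sum = 0;
    for (; head != NULL; head = head->next)
        sum += head->data;
    return sum;
}

/* Releases every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Replacing the member with struct node next; (no pointer) would not compile, since the size of an incomplete type is unknown.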
4.3 Structure initialization and size
Struct initialization is relatively simple. The point to watch is that when a struct contains a pointer and you want to copy a string into it (or similar), the pointer needs memory of its own. The code below defines a struct student variable stu and a pointer pstu to that struct type. Although defining stu implicitly allocates memory for the struct itself, copying a string to the memory that name points to requires allocating that memory explicitly.
#include <stdlib.h>
#include <string.h>

struct student {
    char *name;
    int age;
} stu, *pstu;

int main()
{
    stu.age = 13;                  // OK
    // strcpy(stu.name, "hello");  // error: name has no memory allocated yet
    stu.name = (char *)malloc(6);
    strcpy(stu.name, "hello");     // OK
    return 0;
}
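One way to keep the allocation and the copy together is to wrap them in a helper. The sketch below is an assumption-laden illustration, not part of the original code: student_set_name and student_clear are hypothetical names, and the helper assumes name is either NULL or a pointer previously obtained from malloc.

```c
#include <stdlib.h>
#include <string.h>

struct student {
    char *name;
    int age;
};

/* Hypothetical helper: allocates exactly enough memory for the new name,
 * copies it, and releases any previous name. Assumes s->name is NULL or
 * heap-allocated. Returns 0 on success, -1 on failure. */
int student_set_name(struct student *s, const char *name)
{
    char *copy;
    if (s == NULL || name == NULL)
        return -1;
    copy = malloc(strlen(name) + 1);  /* +1 for the terminating '\0' */
    if (copy == NULL)
        return -1;
    strcpy(copy, name);
    free(s->name);                    /* free(NULL) is a no-op */
    s->name = copy;
    return 0;
}

/* Hypothetical helper: frees the name and resets the pointer so that it
 * does not become a wild pointer. */
void student_clear(struct student *s)
{
    if (s == NULL)
        return;
    free(s->name);
    s->name = NULL;
}
```

A caller would start from struct student stu = { NULL, 13 }; so that the first call has a well-defined pointer to free.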
The size of a struct involves alignment. The alignment rules are:
- The starting address of a struct variable is an integer multiple of the size of its widest member (with #pragma pack(n), an integer multiple of the smaller of the widest member's size and n; by default n=8).
- The struct size is an integer multiple of the size of its widest member (with #pragma pack(n), of the smaller of that size and n).
- The offset of each member relative to the start of the struct is an integer multiple of that member's own size (with #pragma pack(n), of the smaller of n and the member's size).
As a result, the structs S1 and S2 below have the same members, but because the field order differs, their sizes differ: sizeof(S1) = 8, while sizeof(S2) = 12. With #pragma pack(2), sizeof(S1) = 8 and sizeof(S2) = 8.
typedef struct node1 {
    int a;
    char b;
    short c;
} S1;

typedef struct node2 {
    char b;
    int a;
    short c;
} S2;
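The sizes quoted above can be checked with sizeof and offsetof. The sketch below assumes a typical ABI where int is 4 bytes and short is 2 bytes, and a compiler that supports #pragma pack(push, n), such as GCC, Clang, or MSVC. The packed variants P1 and P2 and the helper names are invented for this example.

```c
#include <stddef.h>

typedef struct node1 { int a; char b; short c; } S1;
typedef struct node2 { char b; int a; short c; } S2;

#pragma pack(push, 2)   /* members aligned to at most 2 bytes */
typedef struct packed1 { int a; char b; short c; } P1;
typedef struct packed2 { char b; int a; short c; } P2;
#pragma pack(pop)

size_t s1_size(void) { return sizeof(S1); }  /* a at 0, b at 4, c at 6 */
size_t s2_size(void) { return sizeof(S2); }  /* b at 0, a at 4, c at 8, then tail padding */
size_t p1_size(void) { return sizeof(P1); }
size_t p2_size(void) { return sizeof(P2); }  /* b at 0, a at 2, c at 6 */

/* Offset of the int member: padded up to 4 normally, only up to 2 when packed. */
size_t s2_offset_of_a(void) { return offsetof(S2, a); }
size_t p2_offset_of_a(void) { return offsetof(P2, a); }
```

Ordering members from widest to narrowest, as in S1, is the usual way to avoid the extra padding without resorting to packing.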
4.4 Flexible arrays
A flexible array member means that the last member of a struct may be an array of unknown size, so that a variable-length string can be stored in the struct, as shown in the code. Note that the flexible array must be the last member of the struct and that it does not count toward the struct's size. You can also write the array as char str[0] (a GNU extension that predates C99), which has the same meaning.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct flexarray {
    int len;
    char str[];
} *pfarr;

int main()
{
    char s1[] = "Hello, World";
    pfarr = malloc(sizeof(struct flexarray) + strlen(s1) + 1);
    pfarr->len = strlen(s1);
    strcpy(pfarr->str, s1);
    printf("%zu\n", sizeof(struct flexarray)); // 4
    printf("%d\n", pfarr->len);                // 12
    printf("%s\n", pfarr->str);                // Hello, World
    free(pfarr);
    return 0;
}
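The allocation pattern is easy to get wrong, so it is often wrapped in a constructor-style function. The following is a sketch under that assumption; flex_new is a hypothetical name, and len is declared as size_t here rather than int.

```c
#include <stdlib.h>
#include <string.h>

struct flexarray {
    size_t len;
    char str[];   /* flexible array member: must be the last member */
};

/* Hypothetical constructor: allocates the header and the string in ONE
 * block, so a single free() releases both. Returns NULL on failure. */
struct flexarray *flex_new(const char *s)
{
    size_t len;
    struct flexarray *p;

    if (s == NULL)
        return NULL;
    len = strlen(s);
    p = malloc(sizeof *p + len + 1);   /* header + characters + '\0' */
    if (p == NULL)
        return NULL;
    p->len = len;
    memcpy(p->str, s, len + 1);
    return p;
}
```

Compared with a char * member, this keeps the length and the characters contiguous in memory and needs only one allocation.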
5 Summary
- In C, const does not create a true constant, so a const variable cannot be used to define an array size. For example, const int N = 3; int a[N]; is wrong at file scope and in C89 (inside a function, C99 treats it as a variable-length array).
- Pay attention to memory allocation and release, and avoid wild pointers.
- In C, it is legal to link a weak symbol together with a strong symbol of the same name.
- Note the difference between pointers and arrays.
- typedef and #define are different.
- Take care when initializing structs that contain pointers and when using flexible arrays.
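For the first point in the summary, the usual workarounds are an enum constant or a macro, both of which are integer constant expressions. A small sketch, with the names N, M, and the helper functions chosen only for illustration:

```c
#include <stddef.h>

enum { N = 3 };   /* an enumeration constant is an integer constant expression */
#define M 4       /* a macro is replaced by the literal before compilation */

static int a[N];  /* legal at file scope, unlike a size taken from "const int" */
static int b[M];

/* Number of elements in each array, computed from the array types. */
size_t a_count(void) { return sizeof a / sizeof a[0]; }
size_t b_count(void) { return sizeof b / sizeof b[0]; }
```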
6 References
- "Expert C Programming"
- "Linux C Programming: One-Stop Learning"
- "High-Quality C++/C Programming Guide"
- Struct byte alignment
- Flexible arrays