C global variables (1)

Source: Internet
Author: User

For a programmer, being hooked on a programming language is one kind of fun; poking at that language's dark corners is another. Today we are going to look at the crazy side of a classic language.

Global variables are an important topic in the syntax and semantics of C. First, they can be understood from three different perspectives: to the programmer, a global variable is a variable that holds some content; to the compiler and linker, it is a symbol to be resolved; to the computer, it is a piece of memory with an address.

Second, consider syntax and semantics. In terms of scope, a global variable declared with the static keyword is visible only within its own file; without static, it is visible across the entire module or project. In terms of lifetime, it is static: it exists for the entire run of the program or module. It is precisely this cross-unit visibility and continuous lifetime that often make global variables an entry point for attacks on code, so this point is very important to understand. In terms of storage, a global variable that is defined and initialized is allocated space in the data segment (.data) at compile time. One that is defined but not initialized (a tentative definition) is placed in the .bss segment, which is zero-filled when the program is loaded. One that is only declared is merely a symbol in the compiler's symbol table, with no space allocated; it is resolved to the corresponding address at link time or at run time.

Let us see what interesting things happen when non-static global variables are compiled, linked, and run, and along the way look at how the C compiler and linker resolve symbols. The following example applies to both ANSI C and GNU C; the author's build environment is GCC 4.4.3 on Ubuntu. (GCC 10 and later default to -fno-common, so reproducing the behavior described here with a newer compiler requires passing -fcommon.)

 
 
    /* t.h */
    #ifndef T_H
    #define T_H
    int a;   /* uninitialized definition, repeated in every file that includes t.h */
    #endif

    /* foo.c */
    #include <stdio.h>
    #include "t.h"

    struct {
        char a;
        int b;
    } b = { 2, 4 };

    int main();

    void foo()
    {
        /* Addresses are cast to unsigned long so the format matches the argument. */
        printf("foo:\t(&a)=0x%08lx\n\t(&b)=0x%08lx\n"
            "\tsizeof(b)=%zu\n\tb.a=%d\n\tb.b=%d\n\tmain:0x%08lx\n",
            (unsigned long)&a, (unsigned long)&b, sizeof b, b.a, b.b,
            (unsigned long)main);
    }

    /* main.c */
    #include <stdio.h>
    #include "t.h"

    void foo();

    int b;
    int c;

    int main()
    {
        foo();
        printf("main:\t(&a)=0x%08lx\n\t(&b)=0x%08lx\n"
            "\t(&c)=0x%08lx\n\tsize(b)=%zu\n\tb=%d\n\tc=%d\n",
            (unsigned long)&a, (unsigned long)&b, (unsigned long)&c,
            sizeof b, b, c);
        return 0;
    }

The Makefile is as follows:

 
 
    test: main.o foo.o
    	gcc -o test main.o foo.o

    main.o: main.c
    foo.o: foo.c

    clean:
    	rm *.o test

Running the program prints:

 
 
    foo:    (&a)=0x0804a024
            (&b)=0x0804a014
            sizeof(b)=8
            b.a=2
            b.b=4
            main:0x080483e4
    main:   (&a)=0x0804a024
            (&b)=0x0804a014
            (&c)=0x0804a028
            size(b)=4
            b=2
            c=0

This project contains four global variable definitions: the header t.h defines an int a; main.c defines two uninitialized ints, b and c; and foo.c defines an initialized struct b. foo.c also declares the function main, which is not a variable at all: the name main simply stands for an address in the code segment. Because each C source file is compiled separately, t.h is included in two translation units, so int a is defined twice. The variable b is likewise defined in both source files, with different types. Yet the build does not fail. The linker gives only one warning:

 
 
    /usr/bin/ld: Warning: size of symbol 'b' changed from 4 in main.o to 8 in foo.o

Running the program shows that b is 4 bytes in size in main.c but 8 bytes in foo.c. This is because sizeof is resolved at compile time, and the two source files declare b with different types. What is surprising is that a and b each have the same address in main.c and in foo.c. In other words, a and b are each defined twice, b even with different types, yet only one copy of each exists in the memory image. We also see that the value of b printed in main.c is actually the value of member a, the first member of struct b in foo.c. This confirms the inference above: even when multiple definitions exist, there is only one initialized copy in memory. Finally, c is an independent variable, unaffected by any of this.

Why? The answer lies in how the C compiler and linker resolve global symbols that are defined more than once. During compilation, the compiler records each global symbol, together with whether it is strong or weak, in the symbol table of the relocatable object file. A strong symbol is a function or a global variable that is defined and initialized, such as struct b in foo.c. A weak symbol is a global variable that is defined but not initialized, such as the ints b and c in main.c and the a that both source files take from the header. When a symbol is defined multiple times, the GNU linker (ld) resolves it using the following rules:

  • Multiple strong symbols with the same name are not allowed.
  • If there is one strong symbol and one or more weak symbols, the strong symbol is chosen.
  • If there are only weak symbols, the one with the largest size is chosen. If the sizes are equal, the first one in link order is chosen.

In the example above, the global variables a and b are each defined more than once. If we also initialized b in main.c, there would be two strong symbols, which violates rule 1, and the linker would report an error. As written, rule 2 applies and only a warning is given; at run time, the strong symbol from foo.c is the one that is used. Variable a is a weak symbol in both files, so rule 3 applies: the two definitions have the same size, and one is selected according to the link order of the object files.

This rule is really a pitfall of the C language. The toolchain's indulgence toward multiple definitions of a global variable means that a variable can be modified for no apparent reason, leading to unpredictable program behavior. If the seriousness of this is not yet clear, let me give another example.

