[Reprinted] Linking UNIX Programs

Last Update:2018-12-04 Source: Internet

Author: User

Reprint address: http://bbs.chinaunix.net/archiver?tid-779785.html
UNIX Link Processing

We already know that linking is the process of connecting a symbol referenced in one module with its definition in another module. We also know that linking comes in two forms, dynamic and static. Either way, the linker searches every module in the program, including every library file used, for definitions of external symbols that are not defined in the referencing module. If no definition of a referenced symbol is found, the linker reports an error and the creation of the executable file fails.

The main difference between static and dynamic linking lies in what the linker does after it finds the definition of a symbol:

For static linking, the linker copies the object code of the symbol definitions referenced by the user program from the static library (archive) into the final executable file. In this case, the binding of the program's external symbol references to their definitions is completed when the executable file is created.

For dynamic linking, the contents of the shared object (dynamic link library) are mapped into the virtual address space of the user process at run time. The linker only records in the final executable file where the object code of the external symbol definitions can be found. In this case, the binding of external symbol references to their definitions is completed while the program is running.

In this section we discuss the linking process in detail: some default settings of the compilation system, how users build their own dynamic or static libraries, how to link these libraries into programs, and how dynamic link libraries are implemented. With this understanding, readers will be able to organize their source files efficiently and improve both development efficiency and program maintainability.

Default settings

In the previous section, we used the command:

$ cc -o myprog myprog.c myfunc.c

to generate an executable file. Here cc generates an object file for each C source file and links them to produce an executable program. We call each generated object file a relocatable object file, because it contains symbol references that have not yet been bound to their definitions, that is, they have not yet been assigned addresses in memory.

Note, however, that printf() called in myprog.c and isdigit() called in myfunc.c are not defined in our own program. These two functions are provided by the standard C library. By default, the linker automatically searches the standard C library for the definitions of functions called by the user program.
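To make the example concrete, here is a minimal sketch of what myfunc.c could contain. The article never shows the body of testinput(), so the signature and behavior below (return 1 when the string consists only of decimal digits) are assumptions made purely for illustration. The call to isdigit() is the kind of external reference that the linker resolves from the standard C library.

```c
/* myfunc.c (hypothetical contents; the article does not show this file) */
#include <ctype.h>    /* isdigit() is declared here and defined in libc */
#include <stddef.h>   /* NULL */

/*
 * testinput: assumed helper that returns 1 if s is a non-empty string
 * made up only of decimal digits, and 0 otherwise.
 */
int testinput(const char *s)
{
    if (s == NULL || *s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: isdigit() requires a value in that range. */
        if (!isdigit((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

myprog.c would then declare int testinput(const char *s); and call it, leaving that reference unresolved in myprog.o until the linker binds it to the definition in myfunc.o.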

The standard C library comes in two versions, dynamic and static. Their file names are libc.so and libc.a, used for dynamic and static linking respectively. By default, the linker links calls to standard C library functions dynamically (using libc.so), that is, the called functions are bound to the program at run time. Some standard functions are not defined in libc.so because of design omissions. For those functions the linker uses libc.a and links statically, so their code is copied into the executable file. For symbol references that can be resolved either statically or dynamically, the programmer can choose which linking method to use. We will show how later.

The standard C function library contains the following types of functions:

Standard I/O functions, such as fopen, fread, fwrite, and fclose;
String operation functions, such as strcat, strcmp, and strcpy;
Classification functions for integers that encode 8-bit characters, such as isalpha, isupper, islower, and isdigit;
Character, integer, or string conversion functions, such as atoi, atol, strtoul, and atof;
System calls implemented in the form of library functions, such as open, read, write, and close.

For functions not listed here and their descriptions, refer to the relevant manual pages.
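As a rough illustration of these categories, the sketch below uses one function from each group in a single routine. The helper name report_value and its behavior are hypothetical and not part of the original program; only the library functions themselves come from the list above (snprintf stands in for the standard I/O family).

```c
#include <ctype.h>    /* isdigit: character classification */
#include <stdio.h>    /* snprintf: standard I/O family */
#include <stdlib.h>   /* atoi: string-to-integer conversion */
#include <string.h>   /* strcmp: string operations */
#include <unistd.h>   /* write: system call wrapped by a library function */

/*
 * report_value (hypothetical helper): parse a decimal field and print it.
 * Returns the parsed value, or -1 if the field is "none", is not numeric,
 * or cannot be written to standard output.
 */
int report_value(const char *field)
{
    char line[64];
    int value, len;

    if (strcmp(field, "none") == 0)           /* string operation */
        return -1;
    if (!isdigit((unsigned char)field[0]))    /* classification */
        return -1;

    value = atoi(field);                      /* conversion */
    len = snprintf(line, sizeof line, "value=%d\n", value);  /* formatting */
    if (len > 0 && write(STDOUT_FILENO, line, (size_t)len) < 0)  /* system call */
        return -1;
    return value;
}
```

Every one of these calls is an external reference in the compiled object file, and all of them are satisfied by the implicit link against the standard C library.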
cc links dynamically with the standard C library automatically, based on the following conventions:
By convention, the name of a shared object (dynamic link library) begins with the prefix lib and ends with the suffix .so. An archive (static library) begins with the prefix lib and ends with the suffix .a. The shared version of the standard C library is therefore named libc.so, and the static version is named libc.a.

The -l option of the cc command understands these conventions. That is, given:

$ cc ... -lx

the linker searches for the dynamic library libx.so or the static library libx.a. By default, cc automatically passes the -lc option to the linker. So -l is really a linker option.

By default, within a given directory the linker first looks for the dynamic version libx.so. Only if that fails does it select the static library libx.a.
By default, the linker searches for the required libraries in the system's standard directories, /usr/ccs/lib and /usr/lib. The standard libraries supplied with the compilation system are generally stored in /usr/ccs/lib.
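The naming and search conventions above can be sketched as a small function that produces the file names tried for a given -l option. This is only a model of the description in the text, not the linker's actual code. The function name nth_candidate is hypothetical, and the order in which the two directories are visited is an assumption (the text lists them but does not state an order).

```c
#include <stdio.h>
#include <string.h>

/* Default search directories named in the text; their order is an assumption. */
static const char *const default_dirs[] = { "/usr/ccs/lib", "/usr/lib" };

#define NUM_DIRS (sizeof default_dirs / sizeof default_dirs[0])

/*
 * nth_candidate: write into buf the index-th file name tried for -l<name>.
 * Within each directory the shared object (.so) is tried before the
 * archive (.a). Returns 0 on success, -1 on a bad argument or short buffer.
 */
int nth_candidate(size_t index, const char *name, char *buf, size_t buflen)
{
    const char *suffix = (index % 2 == 0) ? "so" : "a";
    int n;

    if (name == NULL || strlen(name) == 0 || index >= 2 * NUM_DIRS)
        return -1;
    n = snprintf(buf, buflen, "%s/lib%s.%s",
                 default_dirs[index / 2], name, suffix);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For -lm, calling nth_candidate with indexes 0 through 3 yields /usr/ccs/lib/libm.so, /usr/ccs/lib/libm.a, /usr/lib/libm.so, and /usr/lib/libm.a in that order.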

With these conventions in mind, it is clear that the default cc command line directs the linker to search /usr/ccs/lib/libc.so rather than the corresponding static library. Later we will discuss how to link a program with the static version of a library (for example, with libc.a instead of libc.so), how users can build their own static or dynamic libraries, and how to link programs with them. Of course, if the default cc command line meets your needs, no additional link processing is required.

Linking with standard library functions

libc.so is a single object file that contains the code of every function in the standard C library. When a program calls a function in the library and is linked dynamically, the entire contents of libc.so are mapped into the virtual address space of the corresponding process at run time.

This is not the case for static libraries (archives). The code of each function, or each group of related functions, is kept in its own object file, and these object files are then collected into an archive. When the programmer requests static linking with the standard C library on the command line, the linker searches the archive for the code of the called functions and copies it into the final executable file. So with static linking, the final executable contains only the code it actually needs.

So how can the linker's default dynamic linking be changed to static linking? The method is to add the -dn option to the cc command line, as shown below:
$ cc -o myprog -dn myprog.c myfunc.c

In this way, for the standard C library function calls in myprog.c and myfunc.c, the linker searches libc.a for the object code and copies it into the final executable file.

A more complex program may call more than just the functions in the standard C library. For example, a program that uses functions such as sin() and cos() may need to be linked with the math library. The compilation system provides dynamic versions only of libc and libdl (the set of functions that control dynamic linking). Therefore, unless you have installed a dynamic version of the math library, libm.so, in a standard location, the linker looks up the math functions in libm.a in the standard location. Of course, you need to add a -l option to the cc command line, as shown below:
$ cc file1.c file2.c -lm

Note that the -dn option is not specified in the command line above. The linker therefore still tries to link every library function call dynamically. For example, for the standard C library calls in file1.c and file2.c, the linker searches libc.so for their definitions. Later we will see how to specify, for each library, whether to link with its static version or its dynamic version.

Note also that the linker searches a static library only to resolve undefined external references it has already encountered. The position of the -l option on the command line is therefore very important. For example:

$ cc -dn file1.c -lm file2.c

Here the linker searches libm.a only to resolve the math function calls in file1.c. If file2.c also calls a math function, the linker cannot find the definition of that function and the link fails.

So unless it is clear which functions each C file calls, it is best to put the -l options at the end of the command line. The -l option may appear multiple times to specify several different libraries.
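The effect of option order can be modeled with a toy left-to-right resolver. This is a deliberately simplified sketch, not how any real linker is implemented: symbols are bits in a mask, archives are assumed to have no references of their own, and the implicit -lc is left out. The assumption that file1.c calls sin() while file2.c calls cos() is made up for the example.

```c
#include <stddef.h>

enum item_kind { ITEM_OBJECT, ITEM_ARCHIVE };

/* One command-line item in the toy model. */
struct link_item {
    enum item_kind kind;
    unsigned needs;    /* bitmask of symbols the item references */
    unsigned defines;  /* bitmask of symbols the item can define */
};

/* Symbols used in this toy model (illustrative only). */
#define SYM_MAIN (1u << 0)
#define SYM_SIN  (1u << 1)
#define SYM_COS  (1u << 2)

/*
 * link_left_to_right: process items in command-line order and return the
 * bitmask of symbols still undefined at the end (0 means the link succeeds).
 * An archive contributes only definitions that are undefined at the moment
 * it is reached; references seen later do not cause it to be searched again.
 */
unsigned link_left_to_right(const struct link_item *items, size_t count)
{
    unsigned defined = 0, undefined = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (items[i].kind == ITEM_OBJECT) {
            defined |= items[i].defines;
            undefined |= items[i].needs;
            undefined &= ~defined;
        } else {
            unsigned pulled = undefined & items[i].defines;
            defined |= pulled;
            undefined &= ~pulled;
        }
    }
    return undefined;
}
```

Modeling file1.c, -lm, file2.c in that order leaves SYM_COS undefined, while file1.c, file2.c, -lm resolves everything, which matches the advice to put -l options last.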

Next part: creating static and dynamic libraries.

UNIX System Development: Creating Static and Dynamic Libraries

UNIX systems and various software packages provide developers with a large number of libraries. In general, however, these libraries cannot meet every need, and most developers write many functions of their own to suit their development and research requirements. These functions can be linked with the programs that call them by naming their source files on the command line, but this approach has some disadvantages:

Every program that calls these functions has to recompile their code at build time, which wastes computing time.
A source file usually contains more than one function definition. Compiled this way, the code of many unrelated functions is copied into the final executable file, which wastes storage for no reason and slows down execution.
Maintenance is inconvenient. Because one source file is used by several programs, a change to the source file may cause unexpected trouble. And so on.

All of this suggests compiling our own functions into a library that multiple programs can call, just like the standard library functions. UNIX systems do provide tools for this. With them, we can place functions not only in a static library but also in a dynamic library.

The following describes how to generate a static library.

We know that a static library, also known as an archive, is simply a collection of object files. These object files are generated by compiling the functions' source code with cc. Generating a static library therefore takes two steps:

1. Compile the source file containing the function code into an object file. For example, myfunc.c can be compiled into an object file with the following command:

$ cc -c myfunc.c

When there are multiple source files, simply list them all on the cc command line.

This step produces an object file for each source file. In the example above, it produces myfunc.o.

2. Collect the object files into a static library file. This is done mainly with the ar command, for example:

$ ar r $HOME/lib/libtest.a myfunc.o
ar: creating /home/yxz/libtest.a (screen output)

Here $HOME/lib/libtest.a is the full path name of the static library to be generated. We assume that the $HOME/lib directory already exists. Note that the static library name should follow the libx.a convention so that it can be specified with the -l option on the cc command line. myfunc.o is the name of the object file to be collected into the archive. When there are multiple object files, simply list each of them.

Once the archive libtest.a has been generated, myprog.c can be compiled as follows:

$ cc -L $HOME/lib -o myprog myprog.c -ltest

Here the -L option tells the linker to search the $HOME/lib directory for libraries (it still searches the standard locations automatically as well). We will describe this in more detail in the next section. The final -ltest option tells the linker to look in libtest.so or libtest.a for the definition of testinput(), which is referenced in myprog.c. Because no libtest.so has been generated, the linker can only search libtest.a. In addition, because the -dn option is not specified on the command line, for the default -lc option the linker searches libc.so rather than libc.a.

Although generating a static library is relatively simple, that very simplicity makes static linking inefficient in some cases. Compared with dynamic linking, it has the following obvious shortcomings:
Static library limitations:
1: Each generated executable file contains its own copy of the function code, which consumes a lot of disk space. (Though disk space is hardly a problem, is it?)
2: At run time, each process loads the code of every function it calls into its own address space, so when several processes call the same function there are several copies of its code in memory, occupying memory for no reason.
3: Symbol references must be resolved at link time. Therefore, when a function definition is updated, every program that calls the function must be relinked.

Built on virtual memory management, dynamic linking overcomes these limitations of static linking and lets the system as a whole run more efficiently. That is why, by default, the linker links dynamically whenever it can (that is, whenever it finds a dynamic version of the library function).

The core issue in dynamic linking is generating the dynamic link library (shared object). Next we describe how to generate a dynamic link library and discuss some principles for creating one.

No other tools are needed to create a dynamic link library; the cc command is enough. You need to add the -K PIC and -G options to the command line. We can create a dynamic version of libtest as follows:

$ cc -K PIC -G -o $HOME/lib/libtest.so myfunc.c

Here -o $HOME/lib/libtest.so specifies the full path name of the dynamic link library to be generated. As with static libraries, dynamic library names should follow the libx.so convention. The -G option tells cc to organize the object code of each file in the format of a dynamic link library.

The -K PIC option is required when generating a dynamic link library. We already know that dynamic linking is built on page-based virtual memory management. Under this scheme, memory is shared between processes in units of pages, and pages can be shared as long as they are not modified at run time. However, a shared page may have a different virtual address in each process, so the actual addresses of the code can only be determined at run time. (This process is called address relocation.) If a process writes to a shared page while relocating a reference to a shared object, the operating system creates a private copy of that page for the process, and the benefit of page sharing is lost. A program must therefore keep page modifications to a minimum, and the way to do so is to use position-independent code.

Position-independent code can be loaded anywhere in a process's address space. Because it does not depend on absolute addresses, it runs correctly at a different virtual address in every process that uses it, and no page has to be modified along the way. The -K PIC option tells the compilation system to generate position-independent object files. The relocatable references in the object file are then moved out of the text segment into a table in the data segment.

Once the dynamic library has been generated, it can be used on the cc command line. For example:

$ cc -L $HOME/lib -o myprog myprog.c -ltest

This time, although the static version libtest.a is also present in the $HOME/lib directory, the linker searches libtest.so first.

Now that we know how to generate a dynamic library, let us look at some principles for creating one. All of them are aimed at improving performance.

Dynamic library performance improvement:
Performance improvement mainly involves two aspects.
The first is to minimize the data segment of the dynamic link library. Only code is shared. The data segment of a dynamic library cannot be shared by multiple processes, so the system allocates a separate copy in memory for every process that uses the library. To truly reduce the memory footprint of a dynamic link library, we must therefore keep the data segment of the shared object as small as possible. There are roughly four ways to do this:

(1) Use automatic (stack) variables whenever possible. Do not use global or static variables where automatic variables will do.
(2) Pass parameters through function interfaces rather than global variables. This also improves the maintainability of the program.
(3) Keep functions that use a large number of global variables out of the dynamic link library. Such functions are better placed in a static library.
(4) Make the dynamic link library self-contained. That is, when generating a dynamic link library, link calls to functions from other libraries statically rather than dynamically. Otherwise, a process that calls a function in the dynamic library receives not only a copy of that library's data segment but also copies of the data segments of the other dynamic libraries it is linked with, and the cost outweighs the gain.
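Points (1) and (2) can be illustrated with two versions of the same small routine. Both functions are hypothetical and written only for this comparison. The first keeps a static buffer, which adds writable data that every process using the library gets its own copy of. The second takes the buffer through its interface, so the library itself holds no such data.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Version that enlarges the library's data segment: the static buffer
 * occupies per-process writable data for the lifetime of the program.
 */
const char *format_id_static(int id)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "id-%d", id);
    return buf;
}

/*
 * Preferred version: the caller supplies the buffer (typically an automatic
 * variable on its own stack), so the library carries no writable data.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int format_id(int id, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "id-%d", id);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A caller of the second version would declare something like char name[32]; as a local variable and pass it in, which also makes the function safe to call from several places without one result overwriting another.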

The second aspect of dynamic linking performance is minimizing paging activity. Even though processes that use a shared library do not write to its shared pages, they can still incur page faults, which degrade the performance of dynamic linking. Two approaches help with this problem:

(1) Improve locality of reference. This has two parts. The first is to keep rarely used function definitions that the library itself does not depend on out of the shared library. If the shared library is filled with unrelated functions that only a few unrelated processes call occasionally, locality suffers and paging becomes frequent. The second is to group related functions together, on the same page where possible, to improve locality of reference.

For example, suppose func1() calls func2() and func3(), and the code of all three functions is on the same page. When the code of func1() is executed, the code of func2() and func3() is loaded into memory at the same time. Now suppose the code of func2() is not on the same page as the code of func1(). Then another page may have to be brought in while func1() is executing, and system efficiency suffers.

(2) Adjust the page layout. This mainly means arranging the object files of the shared library so that the code of frequently used functions does not straddle a page boundary. To do this, first find out the size of the system's memory pages. Then run the nm command to display the offset of each symbol in the object file. With that information, reorder the functions so that the ones that must span a page boundary are the least frequently used, which maximizes the page hit rate.
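The check described in (2) comes down to simple arithmetic on offsets. The sketch below is one way to express it, assuming the offsets and sizes have already been read from a tool such as nm and that the library is mapped at a page-aligned address. The function names are hypothetical.

```c
#include <unistd.h>

/*
 * spans_page_boundary: given a function's offset and size in bytes and the
 * page size, return 1 if the function's code crosses from one page into
 * the next, and 0 otherwise.
 */
int spans_page_boundary(unsigned long offset, unsigned long size,
                        unsigned long page_size)
{
    if (size == 0 || page_size == 0)
        return 0;
    /* Compare the page of the first byte with the page of the last byte. */
    return (offset / page_size) != ((offset + size - 1) / page_size);
}

/* system_page_size: query the page size at run time; 0 if unavailable. */
unsigned long system_page_size(void)
{
    long n = sysconf(_SC_PAGESIZE);

    return n > 0 ? (unsigned long)n : 0;
}
```

With a 4096-byte page, a 200-byte function at offset 4000 crosses a boundary, while the same function moved to offset 4096 does not.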

The performance considerations analyzed above are, in effect, shortcomings of dynamic linking. So dynamic linking should not displace static linking in every situation. Ideally, a library should be provided in both static and dynamic versions. This is mainly because some users may not find the dynamic library suitable for their applications, and some UNIX systems do not support shared objects. As the introduction to creating static and dynamic libraries above shows, providing both does not take much effort.

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.