Principles of Apache and DSO

Source: Internet
Author: User

Apache HTTP Server is a modular program: the administrator chooses which functionality to include in the server by selecting a set of modules. Modules can be statically compiled into the httpd binary when the server is built. Alternatively, modules can be compiled as Dynamic Shared Objects (DSOs) that exist separately from the main httpd binary. DSO modules can be compiled when the server is built, or compiled and added later using the Apache Extension Tool (apxs).

This article describes how to use DSO modules and how the mechanism works.

Working Principle

DSO is short for Dynamic Shared Object. It is a dynamic linking mechanism available on all modern Unix-derived operating systems. It lets a program load code that was built in a special format at run time: while the program is running, the required code is brought from external storage into memory and executed. Apache has supported DSO since version 1.3. Apache has long used a module concept to extend its functionality, and internally it uses a dispatch-list-based approach to link extension modules into the Apache core. Apache was therefore well suited from the start to use DSO to load its modules at run time.

Let's take a look at the structure of the Apache program. It is a fairly complex four-layer structure in which each layer is built on top of the one below it.

The fourth layer consists of third-party libraries used by Apache modules, for example OpenSSL. In the official Apache distribution this layer is generally empty, but in a real Apache installation a layer made up of such libraries is almost always present.

The third layer consists of optional additional function modules, such as mod_ssl and mod_perl. Each module at this layer usually implements one separate, self-contained piece of Apache functionality. None of these modules is strictly required: a minimal Apache runs without any module from this layer.

The second layer is Apache's base functionality, which is also the essential core of Apache. This layer includes the Apache kernel and http_core (the Apache core module). Together they implement the basic HTTP functionality: handling resources (through file descriptors and memory segments), maintaining the pre-forked child process model, listening on the TCP/IP sockets of the configured virtual servers, passing HTTP requests to the processes that handle them, tracking HTTP protocol state, and read/write buffering, along with many additional functions such as URL and MIME header parsing and DSO loading. This layer also provides the Apache application programming interface (API). The real functionality of Apache lives in the modules, and for those modules to fully control Apache's processing the kernel has to provide an API. Finally, this layer includes a general-purpose code library (libap), a library that implements regular expression matching (libregex), and a small operating system abstraction library (libos).

The lowest layer is the platform-specific functionality of the operating system, which can be any of the modern Unix variants, Win32, OS/2, MacOS, or even just a POSIX subsystem.

Figure 1: Functional hierarchy of the Apache modules

What is interesting about this structure is that layers 3 and 4 are only loosely coupled to layer 2, while the modules within layer 3 depend on one another. An important consequence is that the code of layers 3 and 4 cannot be statically linked against the code of the lowest layer, so DSO becomes the way to solve the problem. Combined with DSO, the structure becomes very flexible: the Apache kernel (technically, the mod_so module rather than the kernel) can load the parts needed to implement layer 3 and layer 4 functionality at startup time rather than at install time.

Modern Unix-like systems all provide a clever mechanism for dynamically linking and loading Dynamic Shared Objects (DSOs), which makes it possible to load code compiled in a special format into the address space of an executable program at run time.

The loading can be done in two ways: automatically, by a system program called ld.so when the executable is started; or manually, from inside the running program, through the programmatic interface to the Unix loader, namely the system calls dlopen()/dlsym().

In the first method, the DSOs are usually called shared libraries or DSO libraries and are named libfoo.so or libfoo.so.1.2. They reside in a system directory (usually /usr/lib), and the link to the executable program is established at build time by passing -lfoo to the linker. This hard-codes library references into the executable file, so that at startup the Unix loader can locate libfoo.so in /usr/lib, in paths hard-coded through linker options such as -R, or in paths configured through the environment variable LD_LIBRARY_PATH. The loader then resolves any still-unresolved symbols in the executable that are available in the DSO.

The DSO usually does not reference symbols in the executable program (because it is a reusable library of general code), so no further resolving is needed. The executable does not have to do anything on its own to use the symbols from the DSO, because the Unix loader does all the resolving. (In fact, the code that invokes ld.so is part of the run-time startup code linked into every executable that is not statically bound.) The advantage of dynamically loading common library code is obvious: the library code needs to be stored only once, in a system library such as libc.so, which saves disk space for every program.

In the second method, the DSOs are usually called shared objects or DSO files and can have any file name (although the canonical name is foo.so). These files usually stay inside a program-specific directory, and no link to the executable that uses them is established automatically. Instead, the executable loads the DSO into its address space manually at run time by calling dlopen(). At that point no symbols from the DSO are resolved for the executable. The Unix loader does, however, automatically resolve any still-unresolved symbols in the DSO from the set of symbols exported by the executable and the DSO libraries it has already loaded (especially all symbols from the ubiquitous libc.so). This way the DSO learns the symbol set of the executable as if it had been statically linked with it in the first place.

Finally, to take advantage of the DSO's API, the executable has to resolve particular symbols from the DSO with dlsym() for later use, for example inside dispatch tables. In other words, the executable has to manually resolve every symbol it needs. The advantage of this mechanism is that optional parts of the program do not have to be loaded (and therefore use no memory) until the program actually needs them. When required, these parts can be loaded dynamically to extend the functionality of the base program.
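The two manual steps of the second method, dlopen() followed by dlsym(), can be sketched in C roughly as follows. This is a minimal illustration and not Apache code: the helper names resolve_symbol, call_parse_int and the type parse_int_fn are hypothetical, and the sketch assumes a dynamically linked program on a Unix-like system that provides <dlfcn.h>.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (not part of Apache): load the DSO at dso_path and
 * resolve one symbol from it. Passing NULL as dso_path asks dlopen() for a
 * handle to the running program and the libraries it has already loaded. */
void *resolve_symbol(const char *dso_path, const char *symbol)
{
    /* Step 1: map the DSO into our address space at run time. */
    void *handle = dlopen(dso_path, RTLD_NOW);
    if (handle == NULL) {
        return NULL;    /* dlerror() would describe the failure */
    }

    /* Step 2: the program itself looks up every symbol it needs.
     * The handle is deliberately left open, because closing it could
     * unmap the code that the returned address points to. */
    return dlsym(handle, symbol);
}

/* Hypothetical dispatch-table entry type for a function such as atoi(). */
typedef int (*parse_int_fn)(const char *);

/* Resolve a function by name and call it through a function pointer.
 * Returns fallback if the DSO or the symbol cannot be found. */
int call_parse_int(const char *dso_path, const char *symbol,
                   const char *text, int fallback)
{
    parse_int_fn fn;
    void *address = resolve_symbol(dso_path, symbol);

    if (address == NULL) {
        return fallback;
    }
    /* Common POSIX idiom for turning the object pointer returned by
     * dlsym() into a function pointer. */
    *(void **)(&fn) = address;
    return fn(text);
}
```

A caller could, for example, use call_parse_int(NULL, "atoi", "42", -1) to look up atoi() among the already loaded libraries. On some systems (older glibc versions, for instance) the program has to be linked with -ldl.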

Although this DSO mechanism sounds straightforward, there is at least one difficult step: resolving symbols from the executable program for the DSO when a DSO is used to extend a program (the second method). The reason is that "reverse resolving" DSO symbols from the executable's symbol set goes against the library design (where the library knows nothing about the programs that use it), and it is neither available on all platforms nor standardized. In practice, the global symbols of the executable are often not re-exported and therefore cannot be used by a DSO. To use DSO to extend a program at run time, you therefore have to find a way to force the linker to export all global symbols.

The shared library approach is the typical one, because it is what the DSO mechanism was designed for, and it is used by almost all types of libraries that the operating system provides. Using shared objects to extend a program, on the other hand, is something that few programs do.

As of 1998, only a few software packages used the DSO mechanism to actually extend their functionality at run time, such as Perl 5 (through its XS mechanism and the DynaLoader module) and Netscape Server. Apache joined this group with version 1.3, because it already used a module concept to extend its functionality and internally used a dispatch-list-based approach to link external modules into the Apache core. Apache was therefore a natural candidate for using DSO to load its modules at run time.
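The dispatch-list idea mentioned above can be illustrated with a small C sketch. It is a deliberately simplified model, not Apache's actual module structure: the names demo_module, register_module, dispatch_request and the two toy handlers are all hypothetical. The point it tries to show is that the core only walks a list of function pointers, so it makes no difference to the core whether a record was compiled in statically or registered after being loaded from a DSO.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified module record. Apache's real module structure
 * has many more fields; this only illustrates the dispatch-list idea. */
typedef struct demo_module {
    const char *name;
    /* Returns 1 if the module handled the request, 0 to decline it. */
    int (*handler)(const char *uri, char *out, size_t out_len);
    struct demo_module *next;
} demo_module;

/* Head of the dispatch list kept by the "core". */
static demo_module *module_list = NULL;

/* Append a module at the tail so modules run in registration order. */
void register_module(demo_module *mod)
{
    demo_module **tail = &module_list;

    while (*tail != NULL) {
        tail = &(*tail)->next;
    }
    mod->next = NULL;
    *tail = mod;
}

/* Walk the list until a module accepts the request. Returns the name of
 * the module that handled it, or NULL if every module declined. */
const char *dispatch_request(const char *uri, char *out, size_t out_len)
{
    demo_module *mod;

    for (mod = module_list; mod != NULL; mod = mod->next) {
        if (mod->handler(uri, out, out_len)) {
            return mod->name;
        }
    }
    return NULL;
}

/* Toy handler that only answers requests for "/status". */
int status_handler(const char *uri, char *out, size_t out_len)
{
    if (strcmp(uri, "/status") != 0) {
        return 0;
    }
    snprintf(out, out_len, "server is up");
    return 1;
}

/* Toy catch-all handler that accepts every request. */
int default_handler(const char *uri, char *out, size_t out_len)
{
    snprintf(out, out_len, "file for %s", uri);
    return 1;
}
```

In this model, a DSO-loaded module would simply be one more demo_module record whose address was obtained at run time and passed to register_module().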

Advantages and disadvantages

The DSO-based features described above have the following advantages:

  • The server package is more flexible at run time, because the actual server process can be assembled at run time through LoadModule directives in httpd.conf instead of through configure options at build time. For example, a single Apache installation can run several different server instances (standard and SSL versions, minimal and extended versions [mod_perl, PHP3], and so on).
  • The server package can easily be extended with third-party modules even after installation. This is at least a great benefit for vendor package maintainers, who can build one Apache core package and then create additional packages containing extensions such as PHP3, mod_perl, and mod_fastcgi.
  • Apache module prototyping becomes easier. With the combination of DSO and apxs you can work outside the Apache source tree, and you only need an apxs -i command followed by an apachectl restart to bring a new version of the module you are developing into the running Apache server.

DSO has the following disadvantages:

  • The DSO mechanism cannot be used on every platform, because not every operating system supports dynamically loading code into the address space of a program.
  • The server is about 20% slower at startup, because of the symbol resolving overhead that the Unix loader now has.
  • On some platforms the server is about 5% slower at run time, because position independent code (PIC) sometimes needs complicated assembler tricks for relative addressing, which are not necessarily as fast as absolute addressing.
  • DSO modules cannot be linked against other DSO-based libraries (ld -lfoo) on all platforms. For instance, a.out-based platforms usually do not provide this functionality, while ELF-based platforms do. The DSO mechanism therefore cannot be used for all types of modules. In other words, modules compiled as DSO files are restricted to symbols from the Apache core, from the C library (libc) and all other dynamic or static libraries used by the Apache core, or from static library archives (libfoo.a) that contain position independent code. The only ways to use other code are to make sure that the Apache core itself already contains a reference to it, or to load the code yourself with dlopen().

Module Implementation

Related modules

  • mod_so

Related directives

  • <IfModule>
  • LoadModule

Apache's DSO support for loading individual modules is based on the mod_so module, which must be statically compiled into the Apache core. It is the only module besides core that cannot itself be built as a DSO. Practically all other distributed Apache modules can be built as DSOs by enabling the DSO build for each of them with the configure option --enable-module=shared, as described in the installation documentation. Once a module has been compiled into a DSO named mod_foo.so, you can use the LoadModule directive of mod_so in httpd.conf to load the module when the server is started or restarted.

Use the command-line option -l (for example, httpd -l) to see which modules have been compiled into the server.

To simplify the creation of DSO files for Apache modules (especially third-party modules), a support program named apxs (APache eXtenSion) is available. It can be used to build DSO-based modules outside of the Apache source tree. The idea is simple: when Apache is installed, the make install procedure installs the Apache C header files and puts the platform-dependent compiler and linker flags for building DSO files into the apxs program. This lets you use apxs to compile module sources without the Apache distribution source tree and without having to deal with the platform-dependent compiler and linker flags for DSO support.

Usage Overview

A brief summary of the DSO features of Apache 2.0:

  1. Build and install a distributed Apache module, say mod_foo.c, into its own DSO mod_foo.so:

$ ./configure --prefix=/path/to/install --enable-foo=shared
$ make install

  2. Build and install a third-party Apache module, say mod_foo.c, into its own DSO mod_foo.so:

$ ./configure --add-module=module_type:/path/to/3rdparty/mod_foo.c --enable-foo=shared
$ make install

  3. Configure Apache for later installation of shared modules:

$ ./configure --enable-so
$ make install

  4. Build and install a third-party Apache module, say mod_foo.c, into its own DSO mod_foo.so outside of the Apache source tree using apxs:

$ cd /path/to/3rdparty
$ apxs -c mod_foo.c
$ apxs -i -a -n foo mod_foo.la

In all cases, once the shared module has been compiled, you must use a LoadModule directive in httpd.conf to tell Apache to activate the module.
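As an illustration of that last step, the httpd.conf entries for the example module might look like the fragment below. The module identifier foo_module and the relative path modules/mod_foo.so are assumptions that follow from the mod_foo naming used above. They depend on how the module names its module structure and on where it was installed.

```
# httpd.conf (assumed module identifier and path, for illustration only)
LoadModule foo_module modules/mod_foo.so

<IfModule mod_foo.c>
    # Directives that only make sense when mod_foo is loaded go here.
</IfModule>
```

The <IfModule> block is optional. It keeps the configuration valid when the module is not loaded.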

References:

Ralf S. Engelschall, "Apache 1.3 Dynamic Shared Object (DSO) Support"

Apache 2.0 documentation, "Dynamic Shared Object (DSO) Support"
