Level: Intermediate
Sean A. Walberg (sean@ertw.com), Senior Network Engineer
June 07, 2007
Applications built on the LAMP (Linux, Apache, MySQL, and PHP/Perl) architecture are constantly being developed and deployed. However, server administrators often have little control over the application itself, because it was written by someone else. This three-part series discusses many of the server configuration items that affect application performance. This second article focuses on the steps you can take to optimize Apache and PHP.
Linux, Apache, MySQL, and PHP (or Perl) form the basis of the LAMP architecture used by many Web applications. Many open source packages built on the LAMP components are available to solve a variety of problems. As the load on an application increases, bottlenecks in the underlying infrastructure become more and more apparent in the form of slow responses to user requests. The previous article showed how to tune the Linux system and covered the basics of LAMP and performance measurement. This article focuses on the Web server components: Apache and PHP.

Optimize Apache
Apache is a highly configurable piece of software. It has a great many features, but each one comes at a price. To some extent, tuning Apache means allocating resources appropriately and stripping the configuration down to only what is needed.

Configure the MPM
Apache is modular in that features can easily be added and removed. At the core of Apache, the Multi-Processing Module (MPM) provides this modularity for the basic work of managing network connections and dispatching requests. MPMs let you use threads and even port Apache to another operating system.

Only one MPM can be active at a time, and it must be compiled in statically with --with-mpm=(worker|prefork|event).

The traditional model of one process per request is called prefork. A newer, threaded model is called worker; it uses multiple processes, each with multiple threads, to get better performance with lower overhead. The latest, the event MPM, is an experimental model that keeps separate pools of threads for different tasks. To determine which MPM is currently in use, run httpd -l.

Which MPM to choose depends on many factors. Until the event MPM leaves experimental status it should not be considered, so the choice is between using threads and not using them. On the surface, threading sounds better than forking, provided all the underlying modules are thread safe, including all the libraries used by PHP. Prefork is the safer choice; if you choose worker, test carefully. The performance gains also depend on the libraries that come with your distribution and on your hardware.

Regardless of which MPM you choose, it must be configured appropriately. In general, configuring an MPM means telling Apache how to control the number of workers that are running, whether they are threads or processes. The important configuration options for the prefork MPM are shown in Listing 1.

Listing 1. Prefork MPM configuration
    StartServers 50
    MinSpareServers 15
    MaxSpareServers 30
    MaxClients 225
    MaxRequestsPerChild 4000

Sidebar: Compiling your own software
When I first started with UNIX, I insisted on compiling the software for every system I worked on. Eventually, maintenance and updates caught up with me, so I learned how to build packages to simplify the task. Later, I realized that most of the time I was duplicating work the distribution had already done. Now, for the most part, I try to stick with whatever the distribution provides and roll my own packages only when necessary.

Similarly, you may find that the vendor's packages beat the latest and greatest code when it comes to maintainability. Sometimes performance tuning conflicts with system administration goals. If you use a commercial Linux distribution or rely on third-party support, you may have to take the vendor's support policy into account.

If you do strike out on your own, learn how to build packages that work with your distribution and how to integrate them into your patching system. This ensures that the software, along with any changes you make, is built consistently and can be used across multiple systems. Also subscribe to the appropriate mailing lists and RSS feeds so that you hear about software updates promptly.

In the prefork model, each request is handled by its own process. Spare processes are kept idle to handle incoming requests, which reduces startup latency. The configuration in Listing 1 starts 50 processes as soon as the Web server comes up and tries to keep between 15 and 30 idle servers running. The hard limit on the number of processes is set by MaxClients. Although a process can handle many consecutive requests, Apache kills a process after it has served 4,000 connections (MaxRequestsPerChild), which reduces the risk of memory leaks.

Configuring the threaded MPMs is similar, except that you must determine how many threads and how many processes to use. The Apache documentation explains all the necessary parameters and calculations.

Choosing the values involves some trial and error. The most important value is MaxClients. The goal is to allow enough worker processes or threads to run without causing the server to swap excessively. If more requests arrive than can be handled, at least those up to this limit are served and the rest wait. If MaxClients is too high, all clients get poor service, because the Web server tries to swap out one process so that another can run. Too low a value may deny service unnecessarily. Checking the number of processes running under high load, and the memory used by all the Apache processes, gives you a good idea of how to set this value. If MaxClients must go above 256, set ServerLimit to the same number. Read the MPM documentation carefully for the details.

How many servers to start and keep spare depends on the role of the server. If the server runs only Apache, you can use moderate values, as shown in Listing 1, because the machine can be used to the full. If the system is shared with a database or other servers, limit the number of spare servers that are running.
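The memory-based reasoning behind MaxClients can be sketched as simple arithmetic. The following Python helper is only an illustration: the function name, its parameters, and the sample figures are hypothetical, and the per-process memory figure is something you would measure yourself on a loaded server.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_process_mb, server_limit=256):
    """Estimate a MaxClients value that should fit in physical memory.

    total_ram_mb    physical memory on the machine (assumed figure)
    reserved_mb     memory set aside for the OS, a database, and so on (assumed)
    per_process_mb  measured memory use of one Apache process under load (assumed)
    server_limit    the threshold (256) above which ServerLimit must also be raised

    Returns (max_clients, needs_server_limit).
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    if available_mb <= 0:
        return 0, False
    # Round down: one process too many is what pushes the server into swap.
    max_clients = int(available_mb // per_process_mb)
    return max_clients, max_clients > server_limit


if __name__ == "__main__":
    # Hypothetical machine: 4 GB of RAM, 1 GB reserved, 12 MB per Apache process.
    print(estimate_max_clients(4096, 1024, 12))
```

Treat the result as a starting point for the trial and error described above, not as a final setting; shared memory between processes usually makes the real limit somewhat higher.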

Use Options and AllowOverride effectively
Each request that Apache handles is subject to a complex set of rules that dictate the constraints or special instructions the Web server must follow. Access to a folder may be restricted to particular IP addresses, or a user name and password may be required. These options also cover the handling of particular files, for example whether a directory listing is provided, how files are processed, or whether the output should be compressed.

These configurations appear in httpd.conf as containers. A <Directory> container applies its configuration to a location on disk, whereas a <Location> container applies it to a path in the URL. Listing 2 shows a Directory container in action.

Listing 2. A Directory container applied to the root directory
    <Directory />
        AllowOverride None
        Options FollowSymLinks
    </Directory>
In Listing 2, the configuration between the <Directory> and </Directory> tags applies to the given directory and everything under it; in this example, the given directory is the root directory. The AllowOverride directive specifies that users are not allowed to override any options (more on this later). The FollowSymLinks option is enabled, which lets Apache follow symbolic links to serve a request, even if the target file is outside the directory tree that holds the Web files. This means that if a file in the Web directory is a symbolic link to /etc/passwd, the Web server serves that file when it is requested. With -FollowSymLinks the feature is disabled, and the same request returns an error to the client.

This scenario is a cause for concern on two fronts. The first is performance. If FollowSymLinks is disabled, Apache must check every component of the file name (the directories and the file itself) to make sure that none of them is a symbolic link, which incurs extra overhead in the form of disk operations. A companion option, SymLinksIfOwnerMatch, follows a symbolic link only when the owner of the file is the same as the owner of the link, and it requires the same checks. For the best performance, use the options in Listing 2.

The second is security, and security-conscious readers should be alarmed by now. Security is always a trade-off between functionality and risk. In this case, the functionality is speed, and the risk is unauthorized access to files on the system. One mitigating factor is that a LAMP application server is usually dedicated to a specific function, so there are no users who could create dangerous symbolic links. If you do need to restrict symbolic links, you can constrain the restriction to a particular area of the file system, as shown in Listing 3.

Listing 3. Constraining FollowSymLinks in users' directories
    <Directory />
        Options FollowSymLinks
    </Directory>
    <Directory /home/*/public_html>
        Options -FollowSymLinks
    </Directory>
In Listing 3, FollowSymLinks is removed from any public_html directory in a user's home directory, and from all of its subdirectories.

As you have seen, options can be configured per directory through the main server configuration. Users can override this server configuration themselves (if the administrator permits it with an AllowOverride statement) by dropping a file called .htaccess into a directory. This file contains additional server directives that are loaded and applied on every request for the directory that holds the .htaccess file. Although the discussion above assumed a system with no users, many LAMP applications use this feature to control access and to rewrite URLs, so it is worth understanding how it works.

Even when an AllowOverride statement limits what users may change, Apache must still look for the .htaccess file to see whether there is any work to be done. A parent directory can specify directives that apply to requests for its child directories, which means that Apache must search every component of the directory tree leading to the requested file. As you can imagine, this causes a lot of disk activity on every request.

The simplest solution is not to allow any overrides, which removes the need for Apache to check for .htaccess. Any special configuration then goes directly into httpd.conf. Listing 4 shows the additions to httpd.conf that password protect a user's project directory, rather than putting the directives in a .htaccess file and relying on AllowOverride.

Listing 4. Moving .htaccess configuration into httpd.conf
    <Directory /home/user/public_html/project/>
        AuthUserFile /home/user/.htpasswd
        AuthName "uber secret project"
        AuthType basic
        Require valid-user
    </Directory>
If the configuration is moved into httpd.conf and AllowOverride is disabled, disk usage can be reduced. One user's project directory may not attract many hits, but consider how much this technique saves when it is applied to a busy site.

Sometimes it is not possible to eliminate .htaccess files entirely. In that case, just as an option can be restricted to a specific part of the file system, overrides can be scoped too, as in Listing 5.

Listing 5. Limiting the scope of .htaccess checking
    <Directory />
        AllowOverride None
    </Directory>
    <Directory /home/*/public_html>
        AllowOverride AuthConfig
    </Directory>
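The effect of scoping overrides on the number of .htaccess lookups can be sketched in Python. This is a simplified model and not Apache's actual lookup code: the function name and the override_root parameter are hypothetical, and the wildcard in /home/*/public_html is assumed to have been expanded to one concrete directory.

```python
from pathlib import PurePosixPath


def htaccess_candidates(file_path, override_root="/"):
    """List the .htaccess files that would be consulted for file_path.

    Assumes overrides are allowed only in override_root and below it
    (a simplification of per-directory AllowOverride settings).
    """
    directory = PurePosixPath(file_path).parent
    root = PurePosixPath(override_root)
    # Walk from the top of the file system down to the file's directory.
    chain = [directory, *directory.parents][::-1]
    return [
        str(d / ".htaccess")
        for d in chain
        if d == root or root in d.parents
    ]


if __name__ == "__main__":
    path = "/home/user/public_html/project/notes.html"
    # Overrides allowed everywhere: every directory in the path is checked.
    print(htaccess_candidates(path))
    # Overrides scoped to public_html: only two directories are checked.
    print(htaccess_candidates(path, "/home/user/public_html"))
```

In this model, scoping overrides to public_html cuts the lookups for the sample request from five directories to two.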
With Listing 5 in place, Apache still looks for .htaccess files in parent directories, but it stops at the public_html directory because the feature is disabled for the rest of the file system. For example, if a file that maps to /home/user/public_html/project/notes.html is requested, only the public_html and project directories are searched.

One final tip about per-directory configuration is in order. Any article about Apache tuning will tell you to disable DNS lookups with the HostnameLookups off directive, because trying to reverse resolve every IP address that connects to your server is a waste of resources. However, any restriction based on host name forces the Web server to perform a reverse lookup on the client's IP address and then a forward lookup on the result to verify the authenticity of the name. It is therefore wise to avoid access controls based on the client's host name, and to scope them when they are necessary.

Persistent connections
When a client connects to the Web server, it is allowed to issue multiple requests over the same TCP connection, which reduces the latency associated with multiple connections. This is useful when a Web page references several images: the client can request the page and then all the images over one connection. The downside is that the worker process on the server has to wait for the client to close the session before it can move on to the next request.

Apache lets you configure how persistent connections, called keepalives, are handled. In httpd.conf, KeepAlive On enables them, and MaxKeepAliveRequests 5 allows the server to handle five requests on a connection before the connection is forcibly closed. KeepAlive Off disables persistent connections. KeepAliveTimeout, also set at the global level, determines how long Apache waits for another request before the session is closed.
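The cost of a worker that sits waiting on an idle connection can be estimated with a rough model. The sketch below is an assumption-laden illustration, not a measurement: it takes the worst case in which every client leaves its connection idle for the full timeout, and the function name and sample numbers are hypothetical.

```python
def connections_per_second(max_clients, service_seconds, keepalive_timeout):
    """Upper bound on new connections per second the workers can absorb.

    Each worker is assumed to be tied up for the time spent serving the
    requests on a connection plus the full keepalive timeout afterwards.
    """
    hold_seconds = service_seconds + keepalive_timeout
    if hold_seconds <= 0:
        raise ValueError("total hold time must be positive")
    return max_clients / hold_seconds


if __name__ == "__main__":
    # Hypothetical server: 225 workers, 0.5 seconds of real work per connection.
    for timeout in (15, 2, 0):
        rate = connections_per_second(225, 0.5, timeout)
        print(f"timeout={timeout:>2}s -> about {rate:.1f} connections/second")
```

Under these assumptions a long timeout sharply reduces how many new connections the same pool of workers can accept, which is why the timeout deserves as much attention as the decision to enable keepalives at all.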
Handling persistent connections is not a one-size-fits-all configuration. For some Web sites it is better to disable keepalives (KeepAlive Off), while for others enabling them brings a huge benefit. The only way to find out is to try both configurations and see which works better. If you do enable keepalives, it is wise to use a small timeout, such as 2 seconds (KeepAliveTimeout 2). This gives a client that wants to make another request enough time to do so, without leaving worker processes idle while they wait for a request that may never come.

Compression
The Web server can compress its output before sending it back to the client. This makes the pages sent across the Internet smaller, at the expense of CPU cycles on the Web server. For servers that can afford the CPU overhead, this is a good way to make pages download faster; it is not unusual for a page to shrink to a third of its size after compression.

Images are usually compressed already, so compression should be limited to text output. Apache provides compression through mod_deflate. Although mod_deflate is easy to turn on, it involves many complexities that the manuals explain in detail. This article does not cover the compression configuration, but it does provide links to the relevant documentation (see References).

Optimize PHP
PHP is the engine that runs the application code. Install only the modules you plan to use, and configure your Web server so that PHP is used only for script files (usually those ending in .php) rather than for all static files.

Opcode caching
When a PHP script is requested, PHP reads the script and compiles it into Zend opcode, a binary representation of the code to be executed. PHP then executes the opcode and throws it away. An opcode cache saves the compiled opcode and reuses it the next time the page is called, which saves a considerable amount of time. Several caches are available; I usually use eAccelerator.

To install eAccelerator, you need the PHP development libraries on your computer. Because different Linux distributions put files in different places, it is best to get the installation instructions directly from the eAccelerator Web site (see References for a link). Your distribution may also already package an opcode cache that you only need to install.

However you get eAccelerator onto your system, a few configuration options deserve attention. The configuration file is usually /etc/php.d/eaccelerator.ini. eaccelerator.shm_size defines the size of the shared cache, where the compiled scripts are stored. The value is in megabytes (MB). The right size depends on your application; eAccelerator provides a script that shows the status of the cache, including memory usage. 64 MB is a good start (eaccelerator.shm_size="64"). If the value you choose is not accepted, you must raise the kernel's maximum shared memory size: add kernel.shmmax=67108864 to /etc/sysctl.conf and run sysctl -p to make the setting take effect. The kernel.shmmax value is in bytes.

If the shared memory allocation is exhausted, eAccelerator must purge old scripts from memory. This is disabled by default; eaccelerator.shm_ttl = "60" specifies that when eAccelerator runs out of shared memory, any script that has not been accessed in the last 60 seconds is purged.

Another popular alternative to eAccelerator is the Alternative PHP Cache (APC). Zend also sells a commercial opcode cache that includes an optimizer for further efficiency gains.

php.ini
PHP is configured in php.ini. Four important settings control how much of the system's resources PHP can consume, as shown in Table 1.

Table 1. Resource-related settings in php.ini
Setting | Description | Recommended value
max_execution_time | How many CPU seconds a script can use | 30
max_input_time | How long (in seconds) a script waits for input data | 60
memory_limit | How much memory (in bytes) a script can use before it is killed | 32M
output_buffering | How much data (in bytes) to buffer before sending it to the client | 4096
The specific numbers depend on your application. If you accept large files from users, max_input_time may have to be increased, either in php.ini or by overriding it in code. Similarly, a program that is heavy on CPU or memory may need larger values. The goal is to limit the damage done by a runaway program, so disabling these settings globally is not recommended.

One note about max_execution_time: it refers to the CPU time of the process, not the absolute time. A program that does a lot of I/O and very little computation may therefore run for much longer than max_execution_time. This is also why max_input_time can be larger than max_execution_time.

The amount of logging PHP performs is configurable. In a production environment, disabling all but the most critical logs saves disk writes. If you need the logs to troubleshoot a problem, you can turn logging back up as needed. error_reporting = E_COMPILE_ERROR|E_ERROR|E_CORE_ERROR turns on enough logging to spot problems while removing a lot of chatter from scripts.
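The difference between CPU time and absolute (wall-clock) time is easy to demonstrate. The sketch below uses Python rather than PHP and only illustrates the distinction; it says nothing about how PHP enforces its limit internally. The measure helper is a hypothetical name, and time.sleep stands in for a script that spends its time waiting on I/O.

```python
import time


def measure(work):
    """Run work() and return (wall_seconds, cpu_seconds) it consumed."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# Sleeping stands in for waiting on a slow upload, a disk, or a database.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall-clock: {wall:.3f}s, CPU: {cpu:.3f}s")
```

The waiting task takes a noticeable amount of wall-clock time but almost no CPU time, which is the situation in which a script can outlive a limit expressed in CPU seconds.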
Conclusion
This article focused on tuning the Web server, both Apache and PHP. For Apache, the general idea is to eliminate extra checks the Web server must perform, such as processing .htaccess files. You must also tune the multi-processing module to balance the system resources used against the idle workers available for incoming requests. The best thing you can do for PHP is to install an opcode cache. Keeping an eye on a few resource settings also ensures that scripts do not hog system resources and slow the system down for everything else.

The next and final article in this series looks at tuning the MySQL database. Stay tuned!

References
Learning
- For more information, see the original article on the developerWorks worldwide site.
- "Use Application Tracking to quantitatively analyze performance changes" (developerWorks, 2006) describes how to use application tracking to demonstrate the effects of Apache configuration changes.
- "New features in PHP v5.2, Part 1: Using the new memory manager" (developerWorks) covers the latest changes to memory handling in PHP 5.2. PHP is constantly improving its use of system resources.
- mod_deflate is an Apache module that compresses output on the fly. The same result can be achieved through output compression in PHP.
- Caching precompressed versions of static files such as JavaScript and CSS is another way to improve performance. It is better still to compress and concatenate all of your JavaScript and CSS.
- The Apache documentation on multi-processing modules is worth reading. You can learn what each module does and follow the links to the specific documentation for the MPM you choose.
- Find more resources for Linux developers in the Linux zone of the developerWorks China site.
- Stay current with developerWorks technical events and webcasts.
Obtain products and technologies
- If your distribution does not include eAccelerator, the install-from-source instructions will help you.
- Alternative PHP Cache and Zend Platform are alternatives to eAccelerator.
- Siege lets you simulate users to find out how much traffic your site can handle.
- Sooner or later, you will want to cache some elements of your site and spread the load across multiple Web servers. Squid's accelerator mode (also called a reverse proxy) and the Linux Virtual Server project are good tools for this.
- Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2, Lotus, Rational, Tivoli, and WebSphere.
- Build your next Linux development project with IBM trial software, available for download directly from developerWorks.
Discussion
- Join the developerWorks community by participating in developerWorks blogs.