Introduction
Major web properties such as Wikipedia, Facebook, and Yahoo! use the LAMP architecture to serve millions of requests a day, while web application software such as WordPress, Joomla, Drupal, and SugarCRM uses it to let organizations deploy web-based applications with ease.
The strength of this architecture is its simplicity. While stacks built on .NET and Java™ technologies may call for powerful hardware, expensive software stacks, and complex performance tuning, the LAMP stack runs on commodity hardware atop an open source software stack. Because that software stack is a loose collection of components rather than a monolithic whole, performance tuning can be a challenge: each component needs to be analyzed and tuned.
However, a few simple performance tasks can have a huge impact on sites of any size. In this article, we'll explore five such tasks designed to optimize the performance of LAMP applications. They require few, if any, architectural changes to your application, making them a safe and easy way to improve the responsiveness and reduce the hardware requirements of your web applications.
Using opcode caching
The easiest way to improve the performance of any PHP application (the "P" in LAMP, of course) is to use an opcode cache. It's the one thing I make sure is in place for any web site I work on, because the performance impact is large (response times are often cut in half simply by having an opcode cache present). The big question for most people unfamiliar with PHP is why the improvement is so dramatic. The answer lies in how PHP handles web requests. Figure 1 gives an overview of the PHP request process.
Figure 1. PHP Request
Because PHP is an interpreted language rather than a compiled one like C or Java, the entire parse-compile-execute cycle is performed on every request. You can see how this can be time- and resource-consuming, especially when scripts rarely change between requests. After a script is parsed and compiled, it exists in a machine-ready state as a series of opcodes. This is where an opcode cache comes in: it caches these compiled scripts as a series of opcodes, avoiding the parse and compile steps on each request. Figure 2 shows how this workflow operates.
Figure 2. PHP request using opcode cache
So when cached opcodes for a PHP script exist, we can skip the parse and compile steps of the request process, execute the cached opcodes directly, and output the results. A check on the script file handles the case where you have changed it: on the first request after a change, the script is automatically recompiled and re-cached for subsequent requests, replacing the old cached version.
Opcode caches have been around in PHP for a long time, with some of the earliest dating back to the heyday of PHP V4. Today, a few popular options are actively developed and used:
- The Alternative PHP Cache (APC) is probably the most popular opcode cache for PHP (see Resources). It was developed by several core PHP developers, with significant contributions to its speed and stability from Facebook and Yahoo! engineers. It also supports a number of other speed improvements for handling PHP requests, including a user cache component, which is discussed later in this article.
- WinCache is an opcode cache actively developed by the Microsoft® Internet Information Services (IIS) team, intended for use only with IIS web servers on Windows® (see Resources). The main motivation for developing it was to make PHP a first-class development platform on the Windows-IIS-PHP stack, since APC is known not to work well there. It is functionally similar to APC, including a user cache component, as well as a built-in session handler so you can use WinCache for session storage.
- eAccelerator is a fork of one of the original PHP caches, the Turck MMCache opcode cache (see Resources). Unlike APC and WinCache, it is strictly an opcode cache and optimizer, so it contains no user cache component. It is fully compatible with both UNIX® and Windows stacks, and is popular with sites that don't intend to take advantage of the extra features APC or WinCache provide. This is a common scenario when you use a solution such as memcache to provide a separate user cache server for a multiple-web-server environment.
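As a sketch of what turning on one of these caches looks like, the php.ini lines below enable APC; the extension path and memory size are illustrative assumptions, not values from this article, so adjust them for your system:

```ini
; Load the APC extension (the file name and path vary by platform)
extension=apc.so

; Enable the opcode cache and give it shared memory (size is illustrative)
apc.enabled=1
apc.shm_size=64M

; Check file modification times so changed scripts are recompiled
apc.stat=1
```

After restarting the web server, `phpinfo()` should show an APC section confirming the cache is active.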
Without a doubt, an opcode cache is the first step in speeding up PHP, because it eliminates the need to parse and compile a script on every request. Once you've taken that step, you should see an improvement in response time and server load. But there is more you can do to optimize PHP, which we'll look at next.
Optimize your PHP settings
While implementing opcode caching is a big win for performance, there are a number of other optimizations you can make by tuning the settings in your php.ini file. These settings are more appropriate for production instances; you may not want to make them on development or test instances, because they can make debugging application problems more difficult.
Let's look at a few of the items that matter most for performance.
Options that should be disabled
There are several php.ini settings, kept mainly for backward compatibility, that should be disabled:
register_globals
-This feature was enabled by default before PHP V4.2; it automatically assigns incoming request variables to ordinary PHP variables. Besides causing significant security issues (mixing unfiltered incoming request data with the contents of ordinary PHP variables), it adds overhead to every request. Disabling it makes your application more secure and improves performance.
magic_quotes_*
-This is another legacy of PHP V4 that automatically escapes risky incoming form data. It was intended as a security feature to sanitize incoming data before it is sent to a database, but it is not very effective, because it does not protect users against common SQL injection attacks. Disabling it removes another piece of per-request overhead; most database layers support prepared statements, which handle the risk better.
always_populate_raw_post_data
-This is only needed if you must inspect the entire payload of raw, unfiltered incoming POST data for some reason. Otherwise, it keeps an extra copy of the POST data in memory on every request, which is unnecessary.
However, disabling these options on legacy code can be risky, because the code may depend on them being set for proper execution. You should never develop new code that relies on these options, and, where possible, you should look for ways to refactor existing code away from them.
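In php.ini, turning these legacy options off looks like the sketch below; confirm your code does not depend on them before disabling:

```ini
; Legacy options that cost performance and create security risks
register_globals = Off
magic_quotes_gpc = Off
magic_quotes_runtime = Off
magic_quotes_sybase = Off
always_populate_raw_post_data = Off
```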
Options that should be enabled or adjusted
There are several php.ini options you can enable or tune to increase your script speed:
output_buffering
-You should make sure this option is enabled, because it flushes output back to the browser in blocks instead of on every echo or print statement; the latter significantly slows down your request response time.
variables_order
-This directive controls the order in which the EGPCS (Environment, Get, Post, Cookie, and Server) variables of an incoming request are parsed. If you don't use certain superglobals, such as the environment variables, you can safely drop them from this order for a slight speedup, since PHP then avoids parsing them on every request.
date.timezone
-This directive, added in PHP V5.1, sets the default time zone used by the date and DateTime functions. If you do not set it in the php.ini file, PHP performs a number of system calls to figure out the time zone, and in PHP V5.3 a warning is issued on every request.
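As a sketch, the corresponding production php.ini lines might look like this; the buffer size, variable order, and time zone are illustrative assumptions you should adapt to your site:

```ini
; Flush output in 4 KB blocks rather than per echo/print
output_buffering = 4096

; Skip parsing environment variables on each request
variables_order = "GPCS"

; Avoid per-request time zone lookups (and warnings in PHP V5.3)
date.timezone = "America/New_York"
```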
These settings are "quick wins" that should be configured on your production instances. As far as PHP is concerned, there is one more thing to consider: your application's use of require() and include() (and their counterparts require_once() and include_once()). Optimizing your PHP configuration and code around these calls prevents unnecessary file status checks on each request, reducing response time.
Manage your require() and include() calls
In performance terms, file status calls (that is, calls into the underlying file system to check whether a file exists) are quite expensive. One of the biggest sources of file status calls is the require() and include() statements used to bring code into a script. The sibling require_once() and include_once() calls are more problematic still, because they must verify not only that the file exists but also that it has not been included before.
So what's the best way to deal with this? There are a few things you can do to speed things up.
- Use absolute paths for all require() and include() calls. This makes it completely clear to PHP which file you want to include, so it does not need to scan the entire include_path looking for it.
- Keep the number of entries in include_path low. This helps in situations where providing an absolute path for every require() and include() call is impractical, typically in large legacy applications, by reducing the number of locations PHP checks for the files you include.
APC and WinCache also cache the results of PHP's file status checks, making repeated file system checks unnecessary. They are most effective when you keep include file names static rather than variable-driven, so it is worth trying to do so wherever possible.
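To make the absolute-path suggestion concrete, here is a minimal, self-contained sketch; the temp-directory layout and the demo_greet() helper are invented purely for illustration, and in a real application you would point the root at your codebase (for example via __DIR__):

```php
<?php
// Sketch: resolve includes against an absolute root instead of relying on
// include_path. The helper file is fabricated in the system temp directory
// only so this example runs on its own.
$appRoot = sys_get_temp_dir() . '/lamp_include_demo';
@mkdir($appRoot . '/lib', 0777, true);
file_put_contents(
    $appRoot . '/lib/helpers.php',
    '<?php function demo_greet($name) { return "Hello, $name"; }'
);

// Absolute path: PHP opens the file directly, with no include_path scan.
require $appRoot . '/lib/helpers.php';

echo demo_greet('LAMP'), "\n";
```

Because the path is absolute, PHP performs a single file status check instead of one per include_path entry.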
Optimize your database
Database optimization can quickly become an expert topic, and I don't have nearly enough room to treat it fully here. But if you want to optimize the speed of your database, there are a few first steps to take that should help with the most common problems.
Put the database on its own machine
Database queries can become quite intensive, often consuming 100% of a CPU while executing even simple SELECT statements against a reasonably sized dataset. If your web server and database server both compete for CPU time on a single machine, this will undoubtedly slow down your requests. So I consider the first step to be putting the web server and database server on separate machines, and making sure the database server is the beefier of the two (database servers love lots of memory and multiple CPUs).
Design and index your tables properly
Probably the biggest database performance problems come from poor database design and missing indexes. SELECT statements are typically the most common queries run in a typical web application. They are also the most time-consuming queries run on a database server, and they are the most sensitive to proper indexing and database design, so consult the following tips for achieving optimal performance.
- Make sure every table has a primary key. This gives the table a default ordering and a fast path for joins with other tables.
- Make sure any foreign keys (that is, keys linking a record in one table to a record in another) are properly indexed. Many databases automatically impose constraints on these keys so that the value really does match a record in the other table, which helps take this work off your hands.
- Try to limit the number of columns in a table. Having too many columns in a table makes scans take longer than when only a few columns are present. In addition, if you have a table with many rarely used columns, you are wasting disk space on NULL-valued fields. This is also true for variable-size fields such as text or BLOBs, where the table can grow far larger than necessary. In those cases, consider splitting the extra columns into a separate table and joining them on the record's primary key.
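A sketch of these guidelines in MySQL DDL, using the accounts/leads naming from the queries later in this article; the exact columns and the lead_notes side table are invented for illustration:

```sql
-- Every table gets a primary key; foreign key columns are indexed.
CREATE TABLE accounts (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE leads (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    account_id INT UNSIGNED NOT NULL,
    name       VARCHAR(100) NOT NULL,
    -- The FOREIGN KEY constraint gives account_id a usable index
    FOREIGN KEY (account_id) REFERENCES accounts (id)
);

-- Rarely used large fields live in a side table joined on the primary key.
CREATE TABLE lead_notes (
    lead_id INT UNSIGNED NOT NULL PRIMARY KEY,
    notes   TEXT,
    FOREIGN KEY (lead_id) REFERENCES leads (id)
);
```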
Profiling queries running on the server
The best way to improve database performance is to analyze which queries are running on your database server and how long they take to run. Almost every database has tools for this. For MySQL, you can use the slow query log to find problematic queries. To use it, set slow_query_log to 1 in the MySQL configuration file, set log_output to FILE, and the queries will be recorded in hostname-slow.log. You can set the long_query_time threshold to control how many seconds a query must run before it is considered a "slow query." I recommend starting with the threshold at 5 seconds and lowering it toward 1 second over time, depending on your dataset. If you look in the file, you will see detailed query entries similar to Listing 1.
Listing 1. MySQL Slow Query log
/usr/local/mysql/bin/mysqld, Version: 5.1.49-log, started with:
Tcp port: 3306  Unix socket: /tmp/mysql.sock
Time                 Id Command    Argument
# Time: 030207 15:03:33
# User@Host: user[user] @ localhost.localdomain [127.0.0.1]
# Query_time: 13  Lock_time: 0  Rows_sent: 117  Rows_examined: 234
use sugarcrm;
SELECT * from accounts inner join leads on accounts.id = leads.account_id;
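The slow query log settings described above go in the MySQL configuration file (my.cnf); this is a sketch using the 5-second starting threshold suggested earlier:

```ini
[mysqld]
slow_query_log  = 1
log_output      = FILE
# Start at 5 seconds and lower toward 1 second over time
long_query_time = 5
```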
The key item to look at is Query_time, which shows how long the query took. Rows_sent and Rows_examined are also worth watching, because they can point to a badly written query if it examines or returns far too many rows. You can dig deeper into how a query is executed by prefixing it with EXPLAIN, which returns the query plan instead of the result set, as shown in Listing 2.
Listing 2. MySQL EXPLAIN results
mysql> EXPLAIN SELECT * from accounts inner join leads on accounts.id = leads.account_id;
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
| id | select_type | table    | type   | possible_keys            | key     | key_len | ref                       | rows | Extra |
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
|  1 | SIMPLE      | leads    | ALL    | idx_leads_acct_del       | NULL    | NULL    | NULL                      |    1 |       |
|  1 | SIMPLE      | accounts | eq_ref | PRIMARY,idx_accnt_id_del | PRIMARY | 108     | sugarcrm.leads.account_id |    1 |       |
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
2 rows in set (0.00 sec)
The MySQL manual covers EXPLAIN output in much more depth (see Resources), but one important thing to look for is a type column of ALL, since that means MySQL is doing a full table scan and found no key it could use to execute the query. Adding an index in such cases helps you increase query speed significantly.
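Following on from the EXPLAIN output above, the usual fix is to index the column the join uses; the index name here is invented for illustration:

```sql
-- Give the join on leads.account_id a usable key so the
-- "type: ALL" full table scan becomes an indexed lookup.
CREATE INDEX idx_leads_account_id ON leads (account_id);
```

Re-running the EXPLAIN afterward should show the new index in the key column instead of NULL.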
Effective caching of data
As we saw in the previous section, the database tends to be the biggest pain point for your web application's performance. But what if the data you're querying doesn't change very often? In that case, a good option is to store the results locally instead of running the query on every request.
Two of the opcode caches we explored earlier, APC and WinCache, provide facilities for exactly this: you can store PHP data directly in a shared memory segment for fast retrieval. Listing 3 provides a concrete example.
Listing 3. Example of caching database results using APC
<?php
function getListOfUsers()
{
    // apc_fetch() returns false on a cache miss
    $list = apc_fetch('getListOfUsers');
    if ($list === false) {
        $conn = new PDO('mysql:dbname=testdb;host=127.0.0.1', 'dbuser', 'dbpass');
        $sql = 'SELECT id, name FROM users ORDER BY name';
        $list = array();
        foreach ($conn->query($sql) as $row) {
            $list[] = $row;
        }
        apc_store('getListOfUsers', $list, 300); // cache for 5 minutes
    }
    return $list;
}
We only need to execute the query once. After that, we push the results into the APC cache under the getListOfUsers key. From then on, until the cache entry expires, you can fetch the result array straight from the cache and skip the SQL query.
APC and WinCache are not the only options for a user cache; memcache and Redis are other popular choices that don't require the user cache to run on the same server as the web server. That gives better performance and flexibility, especially as your web application scales out across multiple web servers.
In this article, we explored five simple ways to tune the performance of your LAMP application. We looked at PHP-level techniques, using an opcode cache and optimizing the PHP configuration, as well as at optimizing your database design with proper indexing. Finally, we saw how a user cache (APC, for example) lets you avoid repeated database calls when data doesn't change frequently.