LAMP website performance optimization

Source: Internet
Author: User
Tags: website performance

Let's start with the background. This is a LAMP online testing website with modest average daily page views (PV), rising to tens of thousands during competitions. The database currently holds about 300,000 records, and the server configuration is high. Recently, performance problems appeared: when the access volume increases, the pressure on the database server becomes very high, and the CPU usage of MySQL can routinely reach 400%. At that point the service is basically unavailable and the website pages cannot be opened. In theory, the data volume is not large, the traffic is not large, and the server configuration is high, so this situation is abnormal. I investigated the problem for a long time and finally found the cause yesterday. I am recording it here as a reference for similar problems in the future.

In a screenshot from the glances monitoring tool, the value 496.6 shown in red is the CPU usage of MySQL.

Previously, the only fix was to restart the MySQL service. However, that is only temporary: after the restart, the CPU usage climbs back up in less than a minute, so it is not a long-term solution.

Then I considered modifying the configuration files of the database and Apache to increase the cache space, but the effect was not obvious.

Later, I wondered whether the router was the problem. At present, two servers are connected through a small router to form a small LAN: one runs Apache (under very little pressure) and the other hosts the database (under great pressure), so all data exchange goes through the router first. I began to doubt the router's performance, so I simply connected the two servers directly with a network cable (each server has two NICs). The two servers could then communicate directly, but this still did not solve the high database pressure.

Next came all kinds of confusion. I thought of various options, such as switching to nginx, switching to another database, or setting up a database cluster, but all of them are troublesome and would also require modifying the PHP code. The key point is that I find PHP unpleasant to work with...

One day I came across memcached, a caching system. Suppose a query statement takes a lot of time. Without a cache, the database is accessed every time the page is refreshed, so the database pressure is relatively high. With a cache, if the requested data is already cached, it is read directly from the cache and no database query is needed. When the access volume is large, this can significantly reduce the number of database queries, which of course reduces the pressure on the database. First, a piece of code without caching:

 
$list = $statics->getfirsttenlist();

The following is the cache code:

 
$list = null;
// Try the cache first; query the database only on a miss.
if (!($list = $cache->get('top_ten_list'))) {
    $list = $statics->getfirsttenlist();
    // Store the result for 7200 seconds.
    $cache->set('top_ten_list', $list, 7200);
}

The code without caching calls the query function and fetches the data from the database every time. The cached version is different. Before querying, it first checks whether the cache holds the data, using $cache->get(). If the data is found in the cache, the database query does not need to run. If the cache server does not have it, the data is queried from the database and then saved to the cache server with $cache->set(), together with a cache lifetime. Here it is 7200 seconds; after 7200 seconds the entry expires and is refreshed on the next request.

Another important point is that every cache entry needs its own name. The cache server distinguishes entries by name, so two different entries must not share one; otherwise the data read from the cache is not the data we need. For example, suppose there are two competitions and the ranking page of each is cached. If both entries are named ranklist, refreshing the ranking page of the first competition stores the first competition's ranking on the cache server. If you then refresh the ranking page of the second competition, ranklist already exists in the cache, so it is read directly from the cache, but the data returned is not what we need. That case is still relatively harmless. If the colliding entries held different types of data, the website would be a mess.
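The naming rule can be sketched in PHP as follows. This is only a minimal sketch, not the site's actual code: `ArrayCache` is a hypothetical in-memory stand-in for the cache server, and `rankListKey()`, `loadRankList()` and `getRankList()` are illustrative names that do not appear in the original source. The idea is simply to build the cache name from the competition ID so that the two ranking pages never share an entry.

```php
<?php
// Hypothetical in-memory stand-in for a cache server such as memcached.
// It only mimics get()/set() with a lifetime in seconds.
class ArrayCache
{
    private $items = [];

    public function get($key)
    {
        if (!isset($this->items[$key])) {
            return false; // cache miss
        }
        list($value, $expiresAt) = $this->items[$key];
        if ($expiresAt < time()) {
            unset($this->items[$key]); // entry has expired
            return false;
        }
        return $value;
    }

    public function set($key, $value, $ttl)
    {
        $this->items[$key] = [$value, time() + $ttl];
    }
}

// Illustrative helper: one cache name per competition.
function rankListKey($contestId)
{
    return 'ranklist_' . $contestId;
}

// Illustrative stand-in for the real database query.
function loadRankList($contestId)
{
    return ['contest' => $contestId, 'rows' => ['alice', 'bob']];
}

function getRankList($cache, $contestId)
{
    $key = rankListKey($contestId);
    $list = $cache->get($key);
    if ($list === false) {
        $list = loadRankList($contestId); // query only on a miss
        $cache->set($key, $list, 7200);   // keep for 7200 seconds
    }
    return $list;
}

$cache = new ArrayCache();
$first = getRankList($cache, 1);
$second = getRankList($cache, 2);
```

With a single shared name such as ranklist, the second call would have returned the first competition's data; with per-competition names each page gets its own entry.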

Caching is a good thing, so the plan was to reduce the pressure on the server by adding a large number of caches. But I later found that this would not work. Our website is an online evaluation system and a real-time dynamic system: when a user submits an answer, they want to see the judgment result immediately, and when the answer is correct, the ranking must be updated immediately. With a cache, the information read from the cache is not the latest, so real-time rankings and similar information cannot be displayed. It is like booking a ticket and only being told much later that the booking succeeded; the waiting in between is the hardest part.

Another limitation is that caching does not suit a large number of distinct pages. For example, the server hosts 2000 questions, one question per page, and caching every page would reduce cache performance. Some data, however, does not change much over a long time, for example the total number of questions. For such data you can consider adding a cache with a lifetime of half a day or one day, so that it is refreshed once or twice a day.

Since heavy caching does not suit a highly real-time system, I did not add many caches. The database pressure was still just as high, and I still could not find the root cause of the problem.

I then found a book on MySQL performance optimization and started studying it. My first thought was to add indexes, so I used SHOW INDEX FROM table_name to check, and found that the commonly queried fields were already indexed. I even began to wonder whether too many indexes were increasing the database pressure. After some tests (deleting some indexes), the problem persisted. It seems the OJ's problem could not be solved by tuning indexes.

Later, I found the statement SHOW PROCESSLIST. This command shows which threads are running and how long each has been executing.

The output I captured was taken when the load was not very high. It showed two SQL queries that had already been running for seven seconds, and another operation that was copying data to a temporary table. During a real competition it is much worse: several SQL statements run for more than 10 seconds, and the total number of SQL query statements exceeds 200. Of course, most of those threads are blocked. From this output we can clearly see which SQL statements are wasting time. Next, we need to find out where those statements are issued. Searching the PHP source by hand would be tedious, so here the powerful grep command helps us find them.
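The check described above, picking the long-running statements out of the SHOW PROCESSLIST output, can be sketched in PHP. This is an illustration under assumptions: `slowQueries()` is a hypothetical helper, and the sample rows are invented for the example rather than taken from the real capture. The keys Id, Time, State and Info are column names that SHOW PROCESSLIST prints.

```php
<?php
// Return the process-list rows that carry a statement and have been
// running for at least $minSeconds, longest first.
function slowQueries(array $rows, $minSeconds)
{
    $slow = array_filter($rows, function ($row) use ($minSeconds) {
        return $row['Info'] !== null && $row['Time'] >= $minSeconds;
    });
    usort($slow, function ($a, $b) {
        return $b['Time'] - $a['Time'];
    });
    return $slow;
}

// Hypothetical sample rows, not the real capture from the article.
$rows = [
    ['Id' => 11, 'Time' => 7,  'State' => 'Sending data',
     'Info' => 'SELECT COUNT(*) FROM solution'],
    ['Id' => 12, 'Time' => 0,  'State' => '',
     'Info' => 'SHOW PROCESSLIST'],
    ['Id' => 13, 'Time' => 12, 'State' => 'Copying to tmp table',
     'Info' => 'SELECT COUNT(*) FROM solution'],
    ['Id' => 14, 'Time' => 30, 'State' => '',
     'Info' => null], // idle connection with no statement
];

$slow = slowQueries($rows, 5);
foreach ($slow as $row) {
    echo $row['Id'], ' ', $row['Time'], 's ', $row['Info'], "\n";
}
```

The Info text of the slowest rows is what you would then search for in the PHP source.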

The search shows that these statements all live in one file. Opening that file reveals a function that contains the query statement above.

Where is this function called? Let's search again.

The search turns up eight similar statements. What do these eight statements do? The first two count the questions submitted today from the 300,000-row table, and the following ones count the totals for this week and the AC and submission totals for this year. The solution table, which records submission status, currently holds about 300,000 rows and changes constantly: whenever someone submits, new rows are inserted and some rows are updated. In database theory, counting queries conflict with inserts and updates. Here the table has to be locked while the query runs, so that other updates do not affect the query result. In other words, while one of these queries is executing, other statements in the database are blocked and forced to sleep. During the submission peak, the number of blocked statements can exceed 200 and the CPU usage of MySQL can reach 500%. As a result, the database cannot serve requests at all!

Next, we need to find out which pages show these statistics, that is, which pages call this function (the powerful grep again).

When I saw the results, my heart sank. Why? These statistics appear on multiple pages. I opened one of the page files and found the call.

(This is the search result after my changes. Most of the calls have since been commented out by me; originally none of them were commented out.)

In an inconspicuous sidebar, I found the statistics above. I asked some users: have you noticed these statistics? Do you follow them? The answer was just what I expected: no! Nobody cares. What is the point of these statistics? I don't know what the designer had in mind with this feature. The problem did not show up when the data volume was small; as the data grew, it was exposed. These eight numbers slowed down the entire website!

The solution is also clear. There are two options: 1. delete the statistics, or 2. add a cache for the statistics. Based on feedback from various sides, I decided to keep the statistics only on the overall ranking page, cached and refreshed every six hours, and to remove them from all other pages!
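The six-hour refresh can be sketched as below. This is a minimal sketch under stated assumptions, not the code actually deployed: `cachedStats()` and the `$compute` callback are hypothetical names, the `$store` array stands in for the cache server, and the returned figure is made up. The current time is passed in as `$now` so the expiry logic is easy to follow.

```php
<?php
// Recompute the statistics at most once every six hours.
const STATS_TTL = 21600; // six hours in seconds

function cachedStats(array &$store, callable $compute, $now)
{
    if (isset($store['stats']) && $store['stats']['expires'] > $now) {
        return $store['stats']['value']; // still fresh: no database work
    }
    $value = $compute();                 // the expensive counting queries
    $store['stats'] = ['value' => $value, 'expires' => $now + STATS_TTL];
    return $value;
}

// Hypothetical stand-in for the eight counting queries. It records how
// often it runs so the effect of the cache is visible.
$runs = 0;
$compute = function () use (&$runs) {
    $runs++;
    return ['today_submissions' => 120]; // made-up figure
};

$store = [];
cachedStats($store, $compute, 0);     // first view: runs the queries
cachedStats($store, $compute, 3600);  // one hour later: served from cache
cachedStats($store, $compute, 21600); // six hours later: recomputed
echo $runs, "\n";
```

However many times the ranking page is viewed, the counting queries run only once per six-hour window instead of once per page view.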

After fixing this, I tried the website again. It is fast! I was moved to tears! The performance problem that had plagued us for so long was finally solved! Orz
