In php, strtotime function performance analysis is recently implemented in the game data statistics background. the most basic function is to analyze the registered login log to display user data. In the company's internal testing, the number of users is very small, so no performance problems are found. However, in the real test environment from the past two days, the number of users rushed in, starting from the afternoon, the number of online users began to get stuck, and data was returned in a few seconds; the query speed of the number of registrants is okay. In the evening, the number of online users cannot be opened after loading times out. Although they do not know what bugs they have on the game end, there are often problems with gamer logins, resulting in a low number of online users and registered users. However, I cannot query the data volume at the same speed. this is embarrassing.
Now they are checking the game BUG. I am also checking the performance of the statistics background code. First, let's explain that the data I use is from the Slave Database. they use the master database for the game. Besides, there are only a few administrators here, which cannot affect the performance of the game server.
Today, the project leader directs all the databases to the servers in the company. I copied a copy to the local machine to see what went wrong with the performance of the statistics platform. Then I found that even the registration statistics were very slow, and the server was returned in about two seconds. the local host took over 20 seconds and often timed out (the default PHP configuration was 30 seconds ); no need to say online statistics. After reading the database, about 3500 registration records (with false data) are recorded every five minutes, and 288 records are recorded every day. Of course, this is certainly not the first round of database queries in a loop, so it will be scolded.
The logic for counting the number of registrars in a time period is also very simple, that is, the data is traversed once in each time period, and the time size is compared, which is equal to + 1. But why is it that the logic is as simple as 1 million loops? how can it take so long for half a minute?
The key issue is time comparison. we all know that timestamp is a relatively scientific method to compare the time, the time recorded in the database is generally in the form of YYYY-mm-dd HH: ii: ss. the strtotime function in PHP is converted into a timestamp. However, after the blessing of 288 for * 3500 foreach, the execution time here is as long as half a minute.
$ NowDayDT = strtotime (date ('Y-m-D'); $ __startt = microtime (TRUE); for ($ I = 0; $ I <$ allTime; $ I + = $ gapTime) {$ count = 0; // $ startDT = $ nowDayDT + $ I for data comparison; $ endDT = $ nowDayDT + $ I + $ gapTime; // Display $ xAxis1 = date ('H: I ', $ nowDayDT + $ I); $ xAxis2 = date ('H: I ', $ nowDayDT + $ I + $ gapTime); foreach ($ rawData as $ line) {$ time = strtotime ($ line ['log _ dt ']); if ($ startDT <= $ time & $ time <$ endDT) {$ count ++ ;}}$ resAr R [] = ['Date' => $ xAxis1 .'~ '. $ XAxis2, 'number' => $ count];} echo microtime (TRUE)-$ __startt;
In this case, there is basically no way to use this strtotime function. what is the difference between the time and the time? The answer is simple and crude. in PHP, you can directly compare two date and time strings! The modified code is as follows. Then the running time is about 0.3 seconds.
$ __Startt = microtime (TRUE); for ($ I = 0; $ I <$ allTime; $ I + = $ gapTime) {$ count = 0; // $ startDT = date ('Y-m-d H: I: S', $ nowDayDT + $ I) for data comparison ); $ endDT = date ('Y-m-d H: I: S', $ nowDayDT + $ I + $ gapTime ); // Display $ xAxis1 = date ('H: I ', $ nowDayDT + $ I); $ xAxis2 = date ('H: I ', $ nowDayDT + $ I + $ gapTime); foreach ($ rawData as $ line) {$ time = $ line ['log _ dt ']; if ($ startDT <= $ time & $ time <$ endDT) {$ count ++ ;}}$ resArr [] = [ 'Date' => $ xAxis1 .'~ '. $ XAxis2, 'number' => $ count];} echo microtime (TRUE)-$ __startt;
Traversal and optimization
You may find a problem: a foreach is nested in the for statement. this is a bit worrying about the performance. is it necessary to completely traverse the foreach in it? It is not necessary. You only need to sort the SQL data by time. The optimized time comparison algorithm is as follows:
For {... foreach ($ rawData as $ line) {$ time = $ line ['log _ dt ']; // strtotime ($ line ['log _ dt']); // The optimization algorithm calculates if ($ time <$ startDT) continue; // if ($ time >=$ endDT) break is skipped if ($ time >=$ endDT) earlier than the start time; // end at $ count ++ if the end time is greater than the end time; // otherwise, the algorithm is qualified // The original algorithm // if ($ startDT <= $ time & $ time <$ endDT) {// $ count ++; //}}...}
Here, we use the continue and break keywords to skip a loop and end the entire loop. In this case, a large part of the subsequent data can be skipped during the first statistical period of the day. The total traversal time is reduced to about 0.12 seconds.
In summary, in large data processing, we should try to avoid data conversion in traversal and avoid using some complex functions. Such as strtotime