I never used to know a good way to profile web page performance. Then recently, while comparing page rendering speed between Python and PHP, I stumbled on a method so simple it's almost embarrassing that I hadn't thought of it: do what PHP applications like the Discuz forum do and print "this page was generated in X seconds" right in the generated page. Then, as you keep hitting the page during testing, you can see at a glance which operations cause bottlenecks and whether your fixes actually work.
That is how I discovered that simplecd took about 0.2 seconds to generate its home page, which I really couldn't stand: the Discuz forum home page averages only 0.02 seconds, and it is undoubtedly more complex than simplecd's. The gap can't be blamed on the Python language; it can only mean that Discuz is well optimized and my code isn't.
Even without profiling, it was obvious the database was the drag: generating the front page hit simplecd's three SQLite databases 42 times, an extremely inefficient design left over from the project's history. But most of those 40-odd queries are actually very fast; careful analysis showed that only two of them are big performance killers, and the rest are not slow at all.
The first big one is getting the total number of records:
SELECT count(*) FROM verycd;
This operation takes a long time every single run, because the database has to lock the table and walk the primary key to count rows; the bigger the table, the longer it takes — O(n) time, where n is the table size. The fix is very easy: just keep a stored count of the current number of records and change it only when data is added or deleted, so reading it becomes O(1).
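A minimal sketch of this O(1) counter idea, using an in-memory SQLite database; the `counters` table and its names are illustrative, not simplecd's actual schema:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('CREATE TABLE verycd (id INTEGER PRIMARY KEY, title TEXT)')
# a one-row counter table stands in for the stored count
c.execute('CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER)')
c.execute("INSERT INTO counters VALUES ('verycd_count', 0)")

def add_item(title):
    # bump the counter in the same transaction as the insert,
    # so the stored count never drifts from the real row count
    c.execute('INSERT INTO verycd (title) VALUES (?)', (title,))
    c.execute("UPDATE counters SET value = value + 1 "
              "WHERE name = 'verycd_count'")
    conn.commit()

def item_count():
    # O(1): read the stored counter instead of COUNT(*) over the table
    c.execute("SELECT value FROM counters WHERE name = 'verycd_count'")
    return c.fetchone()[0]

for t in ('a', 'b', 'c'):
    add_item(t)
print(item_count())  # → 3
```

The same idea works with memcached's own counter operations, but keeping the count next to the data makes the update-on-write rule easy to see.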
The second big one is getting the list of the 20 most recently updated records:
SELECT verycdid, title, brief, updtime FROM verycd
  ORDER BY updtime DESC LIMIT 20;
Since updtime is indexed, the query itself is just an index lookup — so why is this operation slow? Because my data is inserted in publish order, not update order, so fetching these 20 rows means doing I/O in at least 20 different places on disk. The solution is to make it do I/O in one place: cache the result of this statement, and refresh the cache only when new data is added or existing data changes. That alone makes it about 20 times faster :)
Next come the 20 small cases: getting each entry's publisher and hit count.
SELECT owner FROM LOCK WHERE id = xxxx;
SELECT hits FROM stat WHERE id = xxxx;
Why not use an SQL join here and save some queries? For architectural reasons, these data live in different databases: stat is the click-tracking database and takes frequent inserts, so it is stored in MySQL, while lock and verycd are read-heavy tables dominated by selects that, thanks to MySQL's tragic index usage and paging efficiency, ended up in SQLite3 — so a join simply isn't possible -.-
In short, this isn't hard either, and the solution is the same as before: cache everything.
So, across my whole example, optimizing the page's performance comes down to one phrase: cache the database queries. I believe most web applications are like this :)
Finally, memcached. Since we've decided to cache, using a file cache would still involve disk I/O; better to cache directly in memory, where I/O is much faster. And memcached is, as the name implies, exactly that.
Memcached is a very powerful tool because it supports distributed shared-memory caching; all the big sites use it, but for a small site it's just as good, as long as you have the memory to spare. The front page probably needs no more than 10K of cache — and besides, I'm swimming in memory these days, why would I worry about that?
Configuration and startup: since this isn't a dedicated cache machine, just tweak the memory size and port and you're done:
vi /etc/memcached.conf
/etc/init.d/memcached restart
To use it from a Python web app:
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
memcache is essentially a map structure, and two functions cover most uses:
The first is set(key, value, timeout), which simply maps key to value; timeout specifies when the mapping expires.
The second is get(key), which returns the value that key points to.
So for an ordinary SQL query you can do this:
sql = 'SELECT count(*) FROM verycd'
c = sqlite3.connect('verycd.db').cursor()

# the original way
c.execute(sql)
count = c.fetchone()[0]

# the cached way
from hashlib import md5
key = md5(sql).hexdigest()
count = mc.get(key)
if not count:
    c.execute(sql)
    count = c.fetchone()[0]
    mc.set(key, count, 60*5)  # cache for 5 minutes
The md5 is there to make the key distribution more uniform; the rest of the code is intuitive enough that I won't explain it.
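The pattern above generalizes naturally into a small helper. This is a sketch under assumptions: any object exposing memcache-style get/set works, and a dict-backed stand-in (`DictCache`, my own invention) is used here so the example runs without a memcached server:

```python
import sqlite3
from hashlib import md5

class DictCache:
    """Dict-backed stand-in for memcache.Client (illustrative only)."""
    def __init__(self):
        self.d = {}
    def get(self, key):
        return self.d.get(key)
    def set(self, key, value, timeout):
        self.d[key] = value  # the stand-in ignores the timeout

def cached_query(cache, cursor, sql, timeout=300):
    # md5 of the SQL text spreads keys evenly, as in the snippet above
    key = md5(sql.encode()).hexdigest()
    result = cache.get(key)
    if result is None:
        cursor.execute(sql)
        result = cursor.fetchall()
        cache.set(key, result, timeout)
    return result

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE verycd (id INTEGER PRIMARY KEY)')
cur.executemany('INSERT INTO verycd VALUES (?)', [(1,), (2,)])
cache = DictCache()
print(cached_query(cache, cur, 'SELECT count(*) FROM verycd')[0][0])  # → 2
```

Swapping `DictCache()` for `memcache.Client(['127.0.0.1:11211'])` gives the real shared cache; the helper itself doesn't change.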
After optimizing statements 1 and 2, the average front-page generation time dropped to 0.02 seconds — the same order of magnitude as Discuz — and after optimizing statement 3 as well, the final result came down to about 0.006 seconds. A few lines of memcached code improved performance by 3,300%, and at last I can look Discuz straight in the eye :)