A tutorial on using memcached to reduce the number of database queries in Python _python

Source: Internet
Author: User
Tags md5 memcached in python

I never knew how to better optimize the performance of the Web page, and then recently do a python and php similar web page rendering speed comparison, accidentally found a very simple and stupid but I have not found a good way (had to BS myself): Directly like some PHP applications such as Discuz forum, In the generated Web page to print out "How many seconds this page is generated," and then in the Non-stop access to the Web test, it is very intuitive to see what operation will lead to bottlenecks, how to solve the bottleneck.

So I found SIMPLECD in the home, unexpectedly need 0.2 seconds or so, really can not endure: Contrast discuz Forum home page average generation only 0.02 seconds, and Discuz Forum's homepage is undoubtedly more complex than the homepage of SIMPLECD; it makes me so sad, Because this is necessarily not the Python language caused by the gap, can only say that I did not do optimization and Discuz program optimization is very good consequences.


In fact, no analysis can be sure that the database is a drag, simplecd in the generation of the home page needs to be in the SQLite three database for more than 42 queries, is a historical cause of the extremely inefficient design; but most of these 40 queries, in fact, are very fast queries, A careful analysis of two is a large performance, the others are not slow.

The first big one is: Get the number of data

SELECT Count (*) from VERYCD

This operation takes a lot of time, this is because every time the database is locked and then traverse the number of primary key statistics, the greater the amount of data time-consuming, the time is O (n), n is the database size; Actually, it's very easy to solve the problem, as long as there's a random number of current data, only when adding and deleting data, the change is OK, so the time is O (1) of the

The second big one is: Get the latest updated list of 20 data

SELECT verycdid,title,brief,updtime from VERYCD to
 
  updtime DESC LIMIT 20;

Because the index is done on the updtime, the real query time is the time to search the index. But why is this operation slow? Since my data was inserted according to publish time, it's definitely necessary to do I/O in at least 20 different places when I display it by update time, so it's slow. The solution is to have it do I/O in one place. That is, the return result of this statement is cached unless the database adds new data/changes the original data. So it's 20 times times faster:

The next 20 small case: Get the Publisher and click the number of information

SELECT owner from LOCK WHERE id=xxxx;
 
SELECT hits from stat WHERE id=xxxx;

Why not use the SQL JOIN statement here to save some business? Because of the schema reason this data is placed in different databases, stat is a kind of database with CTR, because it needs to be frequently inserted so that it is stored in MySQL, while lock and VERYCD are databases that require a large number of select operations, Because of the MySQL tragedy index usage and paging efficiency are stored in the Sqlite3 database, so cannot join-.-

Anyway, this is not a problem, just like the solution, all caching

So looking at my example, optimize the performance of the Web page can be word, cache database query, you can. I believe most web apps are like this:


Finally turn to memcached, since the plan is to cache, with files to do caching or disk I/O, as directly cached in memory, memory I/O can be much faster. So memcached as the name suggests is such dongdong.

Memcached is a very powerful tool, because it can support the distributed shared memory cache, large stations use it, for small sites, as long as the memory, which is also good things; the memory buffer size needed for the home page is estimated to be no more than 10K, not to mention I am now memory tyrants, but also care about this?

Configuration run: Because there is no good match, change the memory and port on the line

vi/etc/memcached.conf
 
/etc/init.d/memcached Restart

Used in Python Web applications

Import memcache
 
mc = memcache. Client ([' 127.0.0.1:11211 '], debug=0)

Memcache is actually a map structure, the most commonly used is two functions:

    1. The first is set (Key,value,timeout), which is simply mapping the key to value,timeout when the mapping fails
    2. The second is the Get (key) function, which returns the value that the key points to

So you can do this for a normal SQL query.

sql = ' SELECT COUNT (*) from VERYCD '
 
c = sqlite3.connect (' verycd.db '). Cursor ()
 
 
 
# Original processing
 
c.execute (SQL)
 
count = C.fetchone () [0]
 
 
 
# The way it is now processed from
 
hashlib import MD5
 
key=md5 (SQL)
 
count = Mc.get (key)
 
if Not count:
 
  c.execute (sql)
 
  count = C.fetchone () [0]
 
  mc.set (key,count,60*5) #存5分钟

Where the MD5 is to make the key distribution more evenly, the other code is very intuitive I do not explain.


After you have optimized statement 1 and Statement 2, the average generation time of the first page has been reduced to 0.02 seconds, and discuz a level, and then after the optimization of statement 3, the final result is the first page generation time reduced to 0.006 seconds, after memcached a few lines of code optimization, performance increased by 3,300%. Finally can straighten up to see DISCUZ)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.