Original: How Django-based Disqus supports 8 billion page views (PV) per month
This article was translated by the base Saint OMG for Bó Lè Online. Do not reprint without permission.
English source: Matt Robenolt. You are welcome to join the translation team.
Disqus now handles 8 billion page views per month, processing 45,000 requests per second. Serving comments to that many different people has taught us a few things. It is well known that Disqus uses Django to handle most of its web traffic. Choosing any web framework means trading off development speed against performance, a quick start against customization, and so on. Disqus leans toward rapid development and an easy start, while still taking performance and flexible customization into account.
So why are web frameworks slow?
On the surface, a web framework seems slow because it contains a lot of code your app does not need, and that is a natural first impression. In practice, slowness is rarely caused by framework bloat or the choice of language. It usually comes from your request communicating with other services on the network. In our case, these 'other services' are PostgreSQL, Redis, Cassandra, and memcached. Slow database queries and network latency are what typically drag down the performance of a robust framework like Django.
To get around these delays, people use a variety of caching techniques. The most common approach is to use Django's built-in cache library.
A typical application-level cache looks like this:
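from django.core.cache import cache

def get_stuff():  # wrapper added for illustration so that `return` is valid
    data = cache.get('stuff')
    if data is None:
        # Cache miss: query the database, then store the result
        data = list(Stuff.objects.all())
        cache.set('stuff', data)
    return data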
If you are familiar with Django, this should be a very common pattern. This form of caching is simple and straightforward, and it works in most cases. With memcached it is fast enough, but the application still has to do a lot of work to answer each request.
Handling 45,000 requests per second
We have already cached the things that are slow to compute. However, at 45,000 requests per second there is still plenty of work left. We may be returning JSON, rendering an HTML template, or simply parsing HTML and executing Django middleware. The point is that we want to get this work out of the way sooner and let Django do what it does best: handle only unique data.
How many of the 45,000 requests per second are unique? How many produce a response that differs from the next one? Is it really necessary to repeat the same work when the result is identical? We need to cache the entire HTTP response so that the repetitive work is not done at all.
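To make the idea of caching a whole response concrete, here is a toy model in Python. It is a sketch under stated assumptions, not how any real HTTP cache is implemented: the class `ResponseCache`, the backend function `render_page`, and the URLs are invented for illustration, and expiry and invalidation are left out entirely.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)


class ResponseCache:
    """Toy full-response cache; a stand-in for an HTTP cache, not a real one."""

    def __init__(self, backend: Callable[[str], Response]) -> None:
        self.backend = backend
        self.store: Dict[str, Response] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> Response:
        if url in self.store:
            self.hits += 1            # repeated request: backend is never called
            return self.store[url]
        self.misses += 1
        response = self.backend(url)  # unique request: do the real work once
        if response[0] == 200:        # only keep successful responses
            self.store[url] = response
        return response


backend_calls = []


def render_page(url: str) -> Response:
    """Hypothetical backend view; records each call so real work can be counted."""
    backend_calls.append(url)
    return 200, {"Content-Type": "text/html"}, f"<h1>{url}</h1>".encode()


cache = ResponseCache(render_page)
for url in ["/thread/1", "/thread/1", "/thread/2", "/thread/1"]:
    cache.get(url)
```

Of the four requests, only the two unique URLs reach `render_page`; the other two are answered from the stored responses without running any backend code.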
Introducing Varnish
So what is Varnish? Varnish sits between the load balancer and the Django backend as an HTTP caching layer. This means it can cache the entire HTTP response, so requests that are not unique never have to hit the Django servers.
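One common way for a backend to tell an HTTP cache how long a response may be reused is the standard Cache-Control response header. The article does not describe the headers or Varnish configuration Disqus actually uses, so the following is only a minimal sketch: the WSGI app, the helper `call`, the JSON body, and the 60-second lifetime are all assumptions chosen for illustration.

```python
def app(environ, start_response):
    """Minimal WSGI app whose responses a shared HTTP cache may store."""
    body = b'{"comments": []}'
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Assumed policy: any shared cache may reuse this response for 60 s.
        ("Cache-Control", "public, max-age=60"),
    ]
    start_response("200 OK", headers)
    return [body]


def call(wsgi_app, path="/"):
    """Invoke a WSGI app in-process and return (status, headers dict, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call(app, "/api/comments")
```

In a Django view, the same header can be set with the `cache_control` decorator from `django.views.decorators.cache`. Which responses are safe to mark as public depends on the application, since anything personalized must not be shared between users.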
Previously, Varnish was a black box to us. We installed it with minimal configuration and, to be honest, it worked very well. But I thought we could do more with it.
I spent some time learning more about Varnish and what we could use it for. Over time, we were able to serve thousands of requests per second without hitting the Django servers. Today, of the 45,000 requests per second, only 15,000 reach our application servers. The rest are handled by Varnish, which runs very fast and efficiently.
Because this has been so useful to us and such a good learning experience, it has become the subject of my recent talks.
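The two figures quoted here imply how much load the cache layer removes. The short calculation below uses only the numbers from the text; the variable names are illustrative.

```python
from fractions import Fraction

total_rps = 45_000    # requests per second arriving overall (from the text)
backend_rps = 15_000  # requests per second that still reach Django (from the text)

absorbed_rps = total_rps - backend_rps
absorbed_share = Fraction(absorbed_rps, total_rps)

print(f"handled without Django: {absorbed_rps} req/s")
print(f"share of traffic kept off the app servers: {absorbed_share} (~{float(absorbed_share):.0%})")
```

In other words, 30,000 requests per second, two thirds of the traffic, never reach the application servers.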
Recently, I gave a talk at the DjangoCon conference in Chicago. It is aimed at people unfamiliar with Varnish, in the hope of inspiring them to learn more. I am excited about this talk because the topic is rarely discussed by application developers. It is the talk I would have liked to hear a few years ago, and I hope it helps people understand how HTTP works and how to manage their interactions with it using tools like Varnish. Video link: Receives an HTTP for great good
Before that, I attended VUG7 (the Varnish User Group meeting) in New York and presented some of the tricks we used to solve our problems. That talk covered much of the Varnish configuration language we use. Video link: Caching is hard: Varnish @ Disqus
Learning Varnish will not solve all your problems, but it is worth the time to learn it and evaluate what it can do for you.
If this kind of work appeals to you and, like me, you spend at least five days a week yelling at a computer, get in touch with us right away. We are hiring!
(Note: the comments on the original post are also excellent and worth a look.)
Additional Information:
Disqus is a third-party social commenting system that primarily provides comment hosting for website owners. WordPress, Blogger, Tumblr, and other blogging platforms offer Disqus comment plugins. The main goal of Disqus is to connect the relatively isolated comment systems of today's websites into one large social network by providing a powerful third-party commenting system. Through social features such as reply notifications, comment sharing, and sharing of popular posts, Disqus helps site owners increase user activity and traffic.
With Disqus, users can comment on different sites without registering a new account on each one: a Disqus account or a third-party platform account is enough. All of a user's comments are stored in their Disqus account dashboard, where they are easy to look up and review. When someone replies to their comments, users can choose to be notified by email, so they can keep track of every follow-up. Social networking is also well integrated into the comment system: when a user sees comments on a site that match their own views, they can follow the commenter, and from then on all of that commenter's comments appear in the user's own dashboard. (Excerpt from Baidu Encyclopedia)