Create a MySQL model _ MySQL that implements the Disqus comment Template

Source: Internet
Author: User
Tags disqus django website
This article mainly introduces how to create a MySQL model that implements the Disqus comment template. the database of the Disqus website uses PostgreSQL, while the author uses MySQL for implementation, for more information, see PostgreSQL, which has been considered a substitute for MySQL for a long time. However, during that period, there was no way to bring him to the level that MySQL could achieve. These problems have not been solved in recent years, and many interesting tools have been generated to make up for PG. We use two Slony and pgbouncer in Disqus. Slony allows us to copy data (and sometimes partition), while pgbouncer solves the problem of retaining links and connection pools for us.

In addition, let's look at their language: I am very happy this week to learn how to use recursive queries in PGSQL8.4. they are so powerful. This is what I really want to discuss with you in this article. MySQL allows us to work well, but you can only do it in the engine structure. Although it is still true in PG, you have more options. Therefore, I want to talk about the clues of the tree.

As we all know, Disqus is not only the biggest Django website (we have nearly 1 million visits each month), but also the biggest online comment system. We have provided many features for thousands of websites. the most basic thing is to make comments as clues to the tree structure.

PostgreSQL provides many clue-based solutions. The most commonly used (and most efficient) method is to improve the version's pre-order traversal. Simply put, it adds a left and right order. they will be updated when you add comments. We also have another standard method (Reddit is very happy to use), that is, "retrieve everything and complete the operation in the memory ". In fact, not only Reddit does this.


Let's take a look at what PGSQL provides for us. we can also find two options (the lowest version is 8.4 ). One of them is the built-in module of PG called ltree. It allows you to store the full path of a node (all parent nodes), and allows you to query them using standard SQL statements. It is useful when you sort by the "earliest release", because it is changed to simply sorting by "ltree-column. However, as in most cases, Disqus is not that simple.

Our second solution is recursive query. It took me a long time to understand how he worked, but when I understood it, I was deeply attracted by his ability. Postgre provides many features not available in MySQL, such as the over () modifier. They are really doing very well.

This is a big problem. Now, Disqus and Reddit are the same method for processing multithreading as other online solutions, which are very simple. I am not talking about poor code writing, but his optimization is not what he should do. It wasn't until some people (you, Mr. Obama) started to use this program and everyone wanted to reply to it that we found the problem. We once again think of Django (even though they are getting bigger) and grouping them through business logic.

Since 8.4, we have been able to use recursive queries to solve this problem (in many cases we have already started doing this ourselves, although it may be a bit complicated.

Therefore, let's give a basic example. We have a comment model that looks a bit like this:

create table comments (  id SERIAL PRIMARY KEY,  message VARCHAR,  author VARCHAR,  parent_id INTEGER REFERENCES comments(id));insert into comments (message, author, parent_id)  values ('This thread is really cool!', 'David', NULL), ('Ya David, we love it!', 'Jason', 1), ('I agree David!', 'Daniel', 1), ('gift Jason', 'Anton', 2),  ('Very interesting post!', 'thedz', NULL), ('You sir, are wrong', 'Chris', 5), ('Agreed', 'G', 5), ('Fo sho, Yall', 'Mac', 5);

What we are doing now is to establish a basic evaluation model. For our messages, the author's parent comments (this is optional ). Now, let's learn how to use recursive queries to easily re-order the IDs in this datd in ascending order.


WITH RECURSIVE cte (id, message, author, path, parent_id, depth) AS (  SELECT id,    message,    author,    array[id] AS path,    parent_id,    1 AS depth  FROM  comments  WHERE  parent_id IS NULL   UNION ALL   SELECT comments.id,    comments.message,    comments.author,    cte.path || comments.id,    comments.parent_id,    cte.depth + 1 AS depth  FROM  comments  JOIN cte ON comments.parent_id = cte.id  )  SELECT id, message, author, path, depth FROM cteORDER BY path;

Sweet, right? Oh, wait. are you confused? So the query I 've been searching for is more complicated than a bunch of amazing bugs.
Pgexperts points us to the right path.

Now, I will not drill too much, because there is a better tutorial to process recursive queries in this mode, but we have finished our results.

We have to deal with a huge information set, and some comments have nearly thousands of replies. If 99% of comments only have 100 replies, it is not a problem to put them into the memory, but when they start to increase, we will end up wasting a lot of time. Recursive Queries in PGSQL allow us to easily hand over this work to the database (sometimes they process much faster than we do ), in addition, it saves us a lot of time and resources for network propagation and web processing.

There is an example that gives you a more intuitive understanding of how efficient it is. we have seen only the SQL processing time of large databases (25 results instead of 1000 results are returned) it saved nearly 500% of the time. This does not even cover our program-level costs. Yes, No. These SQL statements are five times faster than other databases at the database layer.

All in all, as a supporter of MySQL, I am shocked by the performance, scale, and flexibility achieved by Disqus using PostgreSQL. I am very much looking forward to discovering what we can do through this platform and looking for challenges that are still waiting for us.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.