Implementing a High-Performance Counter in Django

A counter is a very common functional component. Using the number of unread messages as an example, this post describes the key points of implementing a high-performance counter in Django.

The beginning of the story: .count()

Suppose you have a notification model class that stores all of the site's notifications:

The code is as follows:


class Notification(models.Model):
    """A simplified notification model with two fields:

    - `user_id`: ID of the user who owns the message
    - `has_readed`: whether the message has been read
    """

    user_id = models.IntegerField(db_index=True)
    has_readed = models.BooleanField(default=False)


As a matter of course, at first you'll get the number of unread messages for a user with this query:

The code is as follows:


# Get the number of unread messages for the user with ID 3074
Notification.objects.filter(user_id=3074, has_readed=False).count()


This works fine while your notification table is small, but as the business grows, the message table accumulates billions of rows, and many lazy users pile up thousands of unread messages.

At this point you need a counter that tracks the number of unread messages per user. Compared with the count() above, getting the real-time unread count then only requires a simple primary-key query (or something even cheaper).

A better approach: set up a counter

First, let's set up a new table to store the number of unread messages per user.

The code is as follows:


class UserNotificationsCount(models.Model):
    """This model holds the number of unread messages for each user."""

    user_id = models.IntegerField(primary_key=True)
    unread_count = models.IntegerField(default=0)

    def __str__(self):
        return '%s: %s unread' % (self.user_id, self.unread_count)

We give each registered user a corresponding UserNotificationsCount record to hold their unread message count. Getting the count then only requires UserNotificationsCount.objects.get(pk=user_id).unread_count.
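For the read path, a minimal sketch might look like the following; the helper name get_unread_count is introduced here for illustration (it is not part of the original code), and it assumes a user may not have a counter row yet:


def get_unread_count(user_id):
    """Return the unread message count for a user, defaulting to 0."""
    try:
        return UserNotificationsCount.objects.get(pk=user_id).unread_count
    except UserNotificationsCount.DoesNotExist:
        # Hypothetical fallback: users without a counter row have no unread messages yet
        return 0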

Next comes the key question: how do we know when to update the counter, and what shortcuts does Django offer for this?

Challenge: Update your counters in real time

For our counter to work correctly, we must update it in real time, which means:

1. When a new unread message arrives, increment the counter by 1
2. When a message is deleted and it had not been read, decrement the counter by 1
3. When the user reads an unread message, decrement the counter by 1

Let's resolve these situations one by one.

Before presenting the solutions, we need to introduce a Django feature: signals. Signals are an event notification mechanism provided by Django that lets you listen for certain custom or predefined events and call your own handlers when those events occur.

For example, django.db.models.signals.pre_save and django.db.models.signals.post_save are triggered before and after a model's save() method runs. Functionally they are similar to the triggers provided by a database.
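As a quick, generic illustration of how a receiver is wired up (this is not yet the counter logic; the handler name and the print statement are placeholders), Django also offers the django.dispatch.receiver decorator as an alternative to calling connect() the way the code further below does:


from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=Notification)
def on_notification_saved(sender, instance, created, **kwargs):
    # "created" is True only when the row was just inserted, not on later updates
    if created:
        print('New notification for user %s' % instance.user_id)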

For more about signals, see the official documentation. Now let's look at what signals can do for our counter.

1. When a new message arrives, increment the counter

This is the easiest case. With Django's signals, we can implement the counter update with just a few lines of code:

The code is as follows:


from django.db.models.signals import post_save, post_delete

def incr_notifications_counter(sender, instance, created, **kwargs):
    # Only update when the instance was just created and has_readed is still the default False
    if not (created and not instance.has_readed):
        return

    # Call update_unread_count to increment the counter by 1
    NotificationController(instance.user_id).update_unread_count(1)

# Listen for the Notification model's post_save signal
post_save.connect(incr_notifications_counter, sender=Notification)


This way, whenever a new notification is created via Notification.objects.create() or .save(), our NotificationController is notified and increments the counter.

Note, however, that because our counter relies on Django signals, any code that inserts notifications with raw SQL instead of going through the Django ORM will bypass the counter. It is therefore best to standardize how notifications are created and route everything through a single API, for example as sketched below.
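A minimal sketch of such an API; the function name create_notification is hypothetical and not from the original article, but the idea is simply that all callers go through the ORM so the post_save signal always fires:


def create_notification(user_id):
    # Going through the ORM guarantees that post_save (and thus the counter update) fires
    return Notification.objects.create(user_id=user_id)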

2. When a message is deleted and was unread, decrement the counter

With the experience from the first case, this one is simple: we only need to listen for Notification's post_delete signal. Here is some example code:

The code is as follows:


def decr_notifications_counter(sender, instance, **kwargs):
    # When the deleted message was still unread, decrement the counter
    if not instance.has_readed:
        NotificationController(instance.user_id).update_unread_count(-1)

post_delete.connect(decr_notifications_counter, sender=Notification)


At this point, Notification delete events also update our counter properly.

3. When the user reads an unread message, decrement the counter

Next, when the user reads an unread message, we also need to update our unread message counter. You might say: what's so hard about that? Can't I just update the counter manually in the method that marks a message as read?

Like this:

The code is as follows:


class NotificationController(object):

    # ...

    def mark_as_readed(self, notification_id):
        notification = Notification.objects.get(pk=notification_id)
        # No need to mark a notification that has already been read
        if notification.has_readed:
            return

        notification.has_readed = True
        notification.save()
        # Update our counter here. Looks great, right?
        self.update_unread_count(-1)


With a few simple tests, the counter appears to work well. But this approach has a fatal flaw: it cannot handle concurrent requests correctly.

For example, suppose there is an unread notification with ID 100, and two requests arrive at the same time, both trying to mark it as read:

The code is as follows:


# Because of two concurrent requests, these two calls happen almost simultaneously
NotificationController(user_id).mark_as_readed(100)
NotificationController(user_id).mark_as_readed(100)

Obviously, both calls will successfully mark the notification as read, because under concurrency the if notification.has_readed check cannot be relied on. As a result our counter is decremented twice, even though only one message was actually read.

So, how should such a problem be solved?

Fundamentally, there is only one way to resolve data conflicts caused by concurrent requests: locking. Here are two relatively simple solutions.

Using the database's SELECT ... FOR UPDATE

SELECT ... FOR UPDATE is a database-level feature designed specifically for concurrent access to the same rows. Mainstream relational databases such as MySQL and PostgreSQL support it, and newer versions of the Django ORM even provide a shortcut for it (select_for_update). For details, consult the documentation of the database you are using.

With SELECT FOR UPDATE, our code might look like this:

The code is as follows:


from django.db import transaction

class NotificationController(object):

    # ...

    def mark_as_readed(self, notification_id):
        # Make the SELECT FOR UPDATE and the subsequent UPDATE happen inside one transaction
        # (commit_on_success is the pre-Django-1.6 API; newer versions use transaction.atomic)
        with transaction.commit_on_success():
            # select_for_update ensures that only one of the concurrent requests is
            # processed at a time; the others wait for the lock to be released
            notification = Notification.objects.select_for_update().get(pk=notification_id)
            # No need to mark a notification that has already been read
            if notification.has_readed:
                return

            notification.has_readed = True
            notification.save()
            # Update our counter here
            self.update_unread_count(-1)

Besides SELECT FOR UPDATE, there is an even simpler way to solve this problem.

Using update() for an atomic modification

In fact, simply turning the change into a single UPDATE statement solves the concurrency problem:

The code is as follows:


def mark_as_readed(self, notification_id):
    affected_rows = Notification.objects.filter(pk=notification_id, has_readed=False) \
                                        .update(has_readed=True)
    # affected_rows is the number of rows actually modified by the UPDATE statement,
    # so it is 0 when the message had already been read
    self.update_unread_count(-affected_rows)

In this way, concurrent mark-as-read operations also affect our counter correctly.
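The same pattern extends naturally to marking several messages as read in one statement. A sketch, assuming a hypothetical mark_all_as_readed method on the same controller:


def mark_all_as_readed(self, notification_ids):
    # A single UPDATE flips only the rows that were still unread
    affected_rows = Notification.objects.filter(
        pk__in=notification_ids, has_readed=False,
    ).update(has_readed=True)
    # Subtract exactly the number of rows that actually changed
    self.update_unread_count(-affected_rows)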

Performance?

So far we have described how to implement an unread message counter that updates correctly. We might update the counter directly with an UPDATE statement, like this:

The code is as follows:


from django.db.models import F

def update_unread_count(self, count):
    # Update our counter with a single UPDATE statement
    UserNotificationsCount.objects.filter(pk=self.user_id) \
                                  .update(unread_count=F('unread_count') + count)


In a production environment, however, this approach can cause serious performance problems: if the counter is updated frequently, the flood of UPDATEs puts a lot of pressure on the database. So to build a truly high-performance counter, we need to buffer the changes and write them to the database in batches.

With Redis's sorted set, we can do this very easily.

Using a sorted set to buffer counter changes

Redis is a very useful in-memory database, and the sorted set is one of the data types it provides: an ordered set. With it, we can very simply buffer all counter changes and then write them back to the database in batches.

The code is as follows:


RK_NOTIFICATIONS_COUNTER = 'ss_pending_counter_changes'

def update_unread_count(self, count):
    """The modified update_unread_count method"""
    # Note: this argument order (key, member, amount) matches redis-py < 3.0;
    # in redis-py 3.0+ the signature is zincrby(name, amount, value)
    redisdb.zincrby(RK_NOTIFICATIONS_COUNTER, str(self.user_id), count)

# We also need to modify the method that returns a user's unread count so that it
# takes into account the buffered changes in Redis that have not yet been written
# back to the database; a sketch of that read path follows below.
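The read path referred to in the comment above could look roughly like this. A sketch that updates the earlier get_unread_count helper, assuming the module-level redisdb client and the RK_NOTIFICATIONS_COUNTER key used above; it adds the delta still buffered in Redis to the value stored in the database:


def get_unread_count(user_id):
    # Value already written back to the database
    try:
        db_count = UserNotificationsCount.objects.get(pk=user_id).unread_count
    except UserNotificationsCount.DoesNotExist:
        db_count = 0

    # Delta still sitting in the Redis sorted set (None if nothing is pending)
    pending = redisdb.zscore(RK_NOTIFICATIONS_COUNTER, str(user_id))
    return db_count + int(pending or 0)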

With the code above, counter updates are buffered in Redis. We still need a script to write the buffered data back to the database.

With a custom Django management command, we can do this very easily:

The code is as follows:


# file: management/commands/notification_update_counter.py
# -*- coding: utf-8 -*-
from django.core.management.base import BaseCommand
from django.db.models import F

from notification.models import UserNotificationsCount
from notification.utils import RK_NOTIFICATIONS_COUNTER
from base_redis import redisdb

import logging
logger = logging.getLogger('stdout')


class Command(BaseCommand):
    help = 'Update UserNotificationsCount objects, writing changes from Redis back to the database'

    def handle(self, *args, **options):
        # First, use zrange to get all user IDs that have buffered changes
        for user_id in redisdb.zrange(RK_NOTIFICATIONS_COUNTER, 0, -1):
            # To read and remove the member atomically, use a Redis pipeline
            pipe = redisdb.pipeline()
            pipe.zscore(RK_NOTIFICATIONS_COUNTER, user_id)
            pipe.zrem(RK_NOTIFICATIONS_COUNTER, user_id)
            count, _ = pipe.execute()
            if not count:
                continue
            count = int(count)

            logger.info('Updating unread count for user %s: count %s' % (user_id, count))
            UserNotificationsCount.objects.filter(pk=user_id) \
                                          .update(unread_count=F('unread_count') + count)


After that, the buffered changes can be written back to the database in batches by running python manage.py notification_update_counter. We can also add this command to crontab so it runs periodically.

Summary

That's it: a simple "high-performance" unread message counter is complete. To recap, the main points were:

1. Use Django's signals to catch model create/delete events and update the counter
2. Use the database's SELECT FOR UPDATE to handle concurrent modifications correctly
3. Use Redis's sorted set to buffer counter updates

Hope this helps. :)
