Django multi-database processing (vertical and horizontal database sharding)

Source: Internet
Author: User
Tags: database sharding

Vertical sharding splits databases by application: for example, one database for the blog and another for the forum. Horizontal sharding distributes the data of one application across several databases according to a rule: for example, blog posts spread over five databases based on the user ID.
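As a rough illustration of the two strategies, the sketch below picks a database alias first by application (vertical) and then by user ID (horizontal). The alias names, the `SHARD_COUNTS` table, and the `pick_database` helper are made up for this example and are not part of Django.

```python
# Hypothetical routing table: application name -> number of horizontal shards.
SHARD_COUNTS = {
    "blog": 5,   # blog posts are spread over blog_0 .. blog_4
    "forum": 1,  # the forum lives in a single database, forum_0
}


def pick_database(app, user_id):
    """Return a database alias such as 'blog_2'.

    Vertical sharding chooses the group by application name;
    horizontal sharding chooses the member of the group by user ID.
    """
    shard_count = SHARD_COUNTS[app]
    return "%s_%d" % (app, user_id % shard_count)


print(pick_database("blog", 12))
print(pick_database("forum", 12))
```

With a modulo rule like this, all rows for one user land in the same database, so per-user queries never need to touch more than one shard.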

For vertical sharding, see the following article:
Easy multi-database support for Django

The main idea of that article is to subclass Manager to create a MultiDBManager, use MultiDBManager in place of the default manager when defining a model, and specify the database when constructing the MultiDBManager.

The core of the design is to set the database connection values on settings from the outside, so that the DB layer picks them up internally. The code is as follows: copy each database's values from the settings dictionary onto settings, then open the database connection.

    for name, database in settings.DATABASES.iteritems():
        for key, value in database.iteritems():
            setattr(settings, key, value)
        # do something
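To make the effect of that loop concrete, here is a small self-contained sketch in Python 3 syntax. It uses a plain namespace object in place of Django's settings and a hypothetical `connect()` function in place of the real connection code; neither is Django API, and the database names are examples.

```python
from types import SimpleNamespace

# Stand-in for django.conf.settings (assumption: the real object is richer).
settings = SimpleNamespace(
    DATABASE_NAME="master",
    DATABASES={
        "blog": {"DATABASE_NAME": "blog"},
        "forum": {"DATABASE_NAME": "forum"},
    },
)


def connect():
    """Hypothetical connection code that reads the global settings."""
    return "connected to %s" % settings.DATABASE_NAME


results = {}
for name, database in settings.DATABASES.items():
    # Copy this database's values over the top-level settings ...
    for key, value in database.items():
        setattr(settings, key, value)
    # ... then open the connection, which picks up the overridden values.
    results[name] = connect()

print(results)
```

After the loop the global settings still hold the values of the last database visited, which is one reason this approach is fragile.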

This is actually quite awkward: Django was not designed with multiple databases in mind, so the connection reads its configuration directly from settings.

For horizontal sharding, see the following discussion, although its author only outlines the idea:
Proposal: user-friendly API for multi-database support

Because my own code is messy (it is mixed with other functionality), I only provide the ideas and some of the code. Suppose the scenario is this: a user's messages (msg) are distributed across two databases based on the user ID (a number), so the target database is the user ID modulo 2.
1. settings.py

    DATABASE_ENGINE = 'mysql'
    DATABASE_NAME = 'master'
    DATABASE_USER = '001'
    DATABASE_PASSWORD = ''
    DATABASE_HOST = '127.0.0.1'
    DATABASE_PORT = '000000'
    DATABASES = dict(
        msg_0 = dict(
            DATABASE_NAME = 'msga',
        ),
        msg_1 = dict(
            DATABASE_NAME = 'msg1',
        ),
    )

2. Simulate a settings.py file. There are other ways to do this; I am using a crude one.

    class DBSetting:
        def __init__(self, settings, db):
            self.DEBUG = False
            # Start from the top-level defaults in settings.py.
            self.DATABASE_HOST = settings.DATABASE_HOST
            self.DATABASE_PORT = settings.DATABASE_PORT
            self.DATABASE_NAME = settings.DATABASE_NAME
            self.DATABASE_USER = settings.DATABASE_USER
            self.DATABASE_PASSWORD = settings.DATABASE_PASSWORD
            if db:
                # Override with whatever the entry for this database defines.
                dbdi = settings.DATABASES[db]
                if 'DATABASE_HOST' in dbdi:
                    self.DATABASE_HOST = dbdi['DATABASE_HOST']
                if 'DATABASE_PORT' in dbdi:
                    self.DATABASE_PORT = int(dbdi['DATABASE_PORT'])
                if 'DATABASE_NAME' in dbdi:
                    self.DATABASE_NAME = dbdi['DATABASE_NAME']
                if 'DATABASE_USER' in dbdi:
                    self.DATABASE_USER = dbdi['DATABASE_USER']
                if 'DATABASE_PASSWORD' in dbdi:
                    self.DATABASE_PASSWORD = dbdi['DATABASE_PASSWORD']
                if 'DATABASE_SEGMENT' in dbdi:
                    self.DATABASE_SEGMENT = dbdi['DATABASE_SEGMENT']
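The same "defaults plus per-database overrides" idea can be written more compactly by looping over the key names. The following is only a sketch in Python 3 syntax: `shard_settings`, `OVERRIDABLE`, and the sample values (including the port) are assumptions for illustration, and unlike the class in step 2 it neither converts the port to an integer nor handles DATABASE_SEGMENT.

```python
from types import SimpleNamespace

# Keys that a per-database entry may override.
OVERRIDABLE = (
    "DATABASE_HOST", "DATABASE_PORT", "DATABASE_NAME",
    "DATABASE_USER", "DATABASE_PASSWORD",
)


def shard_settings(settings, db=None):
    """Build a settings-like object for one database.

    Starts from the top-level defaults and overlays whatever the
    database's entry in settings.DATABASES defines.
    """
    merged = SimpleNamespace(DEBUG=False)
    overrides = settings.DATABASES[db] if db else {}
    for key in OVERRIDABLE:
        setattr(merged, key, overrides.get(key, getattr(settings, key)))
    return merged


# Stand-in for the settings module from step 1 (values are examples).
settings = SimpleNamespace(
    DATABASE_HOST="127.0.0.1", DATABASE_PORT="3306", DATABASE_NAME="master",
    DATABASE_USER="001", DATABASE_PASSWORD="",
    DATABASES={"msg_1": {"DATABASE_NAME": "msg1"}},
)

shard = shard_settings(settings, "msg_1")
print(shard.DATABASE_NAME, shard.DATABASE_HOST)
```

Here only the database name differs for msg_1; the host, port, and credentials fall back to the top-level values.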

3. Modify django/db/backends/__init__.py in the Django source tree and reinstall Django after the change. You do not need to edit the installed copy under /usr/lib/python2.5/site-packages/django/ directly.

    class BaseDatabaseWrapper(local):
        # Add a constructor that takes settings.
        def __init__(self, settings, **kwargs):
            self.connection = None
            self.queries = []
            self.options = kwargs
            self.settings = settings

        # ... the code is omitted here ...

        def cursor(self):
            # Pass the stored settings on to the subclass.
            cursor = self._cursor(self.settings)
            if self.settings.DEBUG:
                return self.make_debug_cursor(cursor)
            return cursor

4. We now have a connection that accepts its connection parameters, similar to the one in the first reference link. Next, handle the places where the database connection is used: one is the QuerySet, the other is insert.


    from django.db import models
    from django.db.models import sql
    from django.db.models.query import QuerySet
    # connpool is the author's own connection-pool module (not shown here).

    class MultiManager(models.Manager):
        # group is the vertical shard ID, such as blog or bbs.
        def __init__(self, group, *args, **kwargs):
            self.group = group
            self.segment = None  # set later through choiceconn()
            super(MultiManager, self).__init__(*args, **kwargs)

        # segment identifies a database within the group.
        # For example, with blog_0 and blog_1 in settings.py, segment may be 0 or 1.
        def choiceconn(self, segment):
            self.segment = segment

        def _getconn(self):
            if self.group:
                if self.segment:
                    key = self.group + '_' + str(self.segment)
                else:
                    key = self.group + '_0'  # connect to the first instance by default
                conn = connpool.getconn(key)
                return conn
            else:
                return None

        def get_query_set(self):
            conn = self._getconn()
            if conn:
                query = sql.Query(self.model, conn)
                queryset = QuerySet(self.model, query)
            else:
                queryset = super(MultiManager, self).get_query_set()
            return queryset

        def _insert(self, values, return_id=False, raw_values=False):
            conn = self._getconn()
            if conn:
                query = sql.InsertQuery(self.model, conn)
                query.insert_values(values, raw_values)
                ret = query.execute_sql(return_id)
                query.connection._commit()
                return ret  # the original listing stops before returning the result
            # Without a group, fall back to the stock manager behavior.
            return super(MultiManager, self)._insert(
                values, return_id=return_id, raw_values=raw_values)

5. Model definitions also need extra handling. A typical call looks like UserMsg.objects.filter() or UserMsg.objects.get(). With horizontal sharding you must specify which database to connect to, so I defined an objects(seg) method that takes the database number. Likewise, saving normally looks like usermsg.save(); with sharding, change it to usermsg.save(seg).


    class UserMsg(models.Model):
        msgid = models.AutoField(primary_key=True)
        userid = models.IntegerField()
        message = models.CharField(max_length=128)
        _default_manager = MultiManager('msg')

        @staticmethod
        def objects(seg):
            UserMsg._default_manager.choiceconn(seg)
            return UserMsg._default_manager

        def save(self, seg):
            UserMsg._default_manager.choiceconn(seg)
            super(UserMsg, self).save()


6. Usage in views is much the same as in ordinary Django; the only differences are the two methods above.


Further thoughts:

1. Database transactions. With multiple databases, a transaction cannot span different databases, so the architecture must account for failures and compensating transactions. To handle transactions within a single database, you must either rewrite transaction.py or obtain the current connection before handling the transaction.
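A compensating transaction can be sketched as follows. This is an illustration only: two in-memory SQLite databases stand in for two shards, and the `msg` table layout and the `send_to_both` helper are assumptions rather than code from the application described here.

```python
import sqlite3

# Two independent in-memory databases stand in for two shards.
db_a = sqlite3.connect(":memory:")
db_b = sqlite3.connect(":memory:")
for db in (db_a, db_b):
    db.execute("CREATE TABLE msg (userid INTEGER, message TEXT NOT NULL)")


def send_to_both(sender_id, receiver_id, message_a, message_b):
    """Write one row to each database; undo the first if the second fails."""
    cur = db_a.execute(
        "INSERT INTO msg (userid, message) VALUES (?, ?)",
        (sender_id, message_a))
    db_a.commit()  # committed: a rollback on db_b cannot undo this
    try:
        db_b.execute(
            "INSERT INTO msg (userid, message) VALUES (?, ?)",
            (receiver_id, message_b))
        db_b.commit()
    except sqlite3.Error:
        db_b.rollback()
        # Compensating action: delete the row already committed in db_a.
        db_a.execute("DELETE FROM msg WHERE rowid = ?", (cur.lastrowid,))
        db_a.commit()
        return False
    return True


ok = send_to_both(1, 2, "hello", "hello")
failed = send_to_both(1, 2, "hello", None)  # NULL violates NOT NULL in db_b
print(ok, failed)
```

The compensation step can itself fail (for example if the process dies between the two commits), so a real system would also need to record pending operations somewhere and retry or reconcile them.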


2. Expansion. Adding a database means data has to be redistributed, which involves data migration. One approach lies in the distribution rule: design it with expansion in mind, so that a new rule stays compatible with the old one and existing data does not move. The other is to be able to move data smoothly between databases. Both problems are very troublesome.
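One possible way to keep a new rule compatible with an old one is to version the rule by ID range, so that IDs issued before the expansion keep resolving to the database they were originally written to. The sketch below assumes non-negative, ever-increasing user IDs; the `RULES` table, its thresholds, and `segment_for` are invented for this example.

```python
import bisect

# Hypothetical rule history: (first user ID covered, number of databases).
# IDs below 1000000 were assigned while only 2 databases existed;
# later IDs are spread over 4. Both numbers are made up for this example.
RULES = [
    (0, 2),
    (1000000, 4),
]
_STARTS = [start for start, _ in RULES]


def segment_for(user_id):
    """Return the database segment for a non-negative user ID.

    The rule in force when the ID range was opened is applied,
    so rows written under an older rule never have to move.
    """
    index = bisect.bisect_right(_STARTS, user_id) - 1
    _, shard_count = RULES[index]
    return user_id % shard_count


print(segment_for(7), segment_for(1000003))
```

The trade-off is that the new databases only receive new users, so load stays uneven until some old users are migrated.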


========================================================== ============================


For ID rules, please refer to my other post:

ID rules for horizontal database distribution


