How to optimize QuerySet query by the select_related and prefetch_related functions of Django (1), djangoqueryset

Source: Internet
Author: User

How to optimize QuerySet query by the select_related and prefetch_related functions of Django (1), djangoqueryset

When the database has a foreign key, the use of select_related () and prefetch_related () can greatly reduce the number of database requests, thus improving performance. This article uses a simple example to explain the functions of these two functions. Although QuerySet has been described in detail, this article attempts to analyze the working method from the SQL statements triggered by QuerySet, so as to further understand the specific operation method of Django.


I was going to write it into a separate article, but after writing select_related (), I found that the length is a little long. So I wrote it into a series, which may be two to three articles. The link to other articles will be added here.


1. instance background

Assume that the Personal Information System must record the home, residence, and city of each individual in the system. The database design is as follows:


The content of Models. py is as follows:

from django.db import modelsclass Province(models.Model):    name = models.CharField(max_length=10)    def __unicode__(self):        return self.nameclass City(models.Model):    name = models.CharField(max_length=5)    province = models.ForeignKey(Province)    def __unicode__(self):        return self.nameclass Person(models.Model):    firstname  = models.CharField(max_length=10)    lastname   = models.CharField(max_length=10)    visitation = models.ManyToManyField(City, related_name = "visitor")    hometown   = models.ForeignKey(City, related_name = "birth")    living     = models.ForeignKey(City, related_name = "citizen")    def __unicode__(self):        return self.firstname + self.lastname

NOTE 1: The created app is named "QSOptimize"

NOTE 2: For the sake of simplicity, the 'qsoptimize _ province 'table has only two data entries: Hubei province and Guangdong province, and the 'qsoptimize _ City' table has only three data entries: Wuhan, Shiyan city, and Guangzhou city.


2. select_related ()

You can use select_related to optimize the QuerySet for an OneToOneField and a foreign key field (ForeignKey ).


Functions and Methods

After the select_related () function is used for QuerySet, Django will obtain the object corresponding to the corresponding foreign key, so that it does not have to query the database as needed. The above example shows that if we need to print all the cities in the database and their provinces, the most direct approach is:

>>> citys = City.objects.all()>>> for c in citys:...   print c.province...
This will lead to a linear SQL query. If there are too many objects and each object has k foreign key fields, it will lead to n * k + 1 SQL query. In this example, three city objects cause four SQL queries:
SELECT `QSOptimize_city`.`id`, `QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`FROM `QSOptimize_city`SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name` FROM `QSOptimize_province`WHERE `QSOptimize_province`.`id` = 1 ;SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name` FROM `QSOptimize_province`WHERE `QSOptimize_province`.`id` = 2 ;SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name` FROM `QSOptimize_province`WHERE `QSOptimize_province`.`id` = 1 ;

Note: The SQL statement is output directly from Django's logger: 'django. db. backends '.



If we use the select_related () function:

>>> citys = City.objects.select_related().all()>>> for c in citys:...   print c.province...

There is only one SQL query, which obviously reduces the number of SQL queries:

SELECT `QSOptimize_city`.`id`, `QSOptimize_city`.`name`, `QSOptimize_city`.`province_id`, `QSOptimize_province`.`id`, `QSOptimize_province`.`name` FROM`QSOptimize_city` INNER JOIN `QSOptimize_province` ON (`QSOptimize_city`.`province_id` = `QSOptimize_province`.`id`) ;

Here we can see that Django uses inner join to obtain the province information. By the way, the results of this SQL query are as follows:

+ ---- + ----------- + ------------- + ---- + ----------- + | Id | name | province_id | id | name | + ---- + ----------- + ------------- + ---- + ----------- + | 1 | Wuhan | 1 | 1 | Hubei Province | 2 | Guangzhou City | 2 | Guangdong Province | 3 | Shiyan city | 1 | 1 | Hubei Province | + ---- + ----------- + 3 rows in set (0.00 sec)


Function usage supports the following three methods: * FieldsParameters

Select_related () accepts variable length parameters. Each parameter is the field name of the foreign key (content of the parent table) to be obtained, and the field name of the foreign key, and the foreign key of the foreign key .... To select a foreign key, use two underscores (_) to connect.

For example, we can use the following method to obtain the current resident province of Michael JACOB:

>>> Zhangs = Person. objects. select_related ('living _ province '). get (firstname = u "Zhang", lastname = u "3") >>> zhangs. living. province

The SQL query triggered is as follows:

SELECT 'qsoptimize _ person '. 'id', 'qsoptimize _ person '. 'firstname', 'qsoptimize _ person '. 'lastname', 'qsoptimize _ person '. 'hometown _ id', 'qsoptimize _ person '. 'Living _ id', 'qsoptimize _ City '. 'id', 'qsoptimize _ City '. 'name', 'qsoptimize _ City '. 'province _ id', 'qsoptimize _ province '. 'id', 'qsoptimize _ province '. 'name' FROM 'qsoptimize _ person 'inner join 'qsoptimize _ City' ON ('qsoptimize _ body '. 'Living _ id' = 'qsoptimize _ City '. 'id') inner join 'qsoptimize _ province 'ON ('qsoptimize _ City '. 'province _ id' = 'qsoptimize _ province '. 'id') WHERE ('qsoptimize _ person '. 'lastname' = '3' AND 'qsoptimize _ person '. 'firstname' = 'zhang ');

As you can see, Django used two inner join requests to complete the request, obtained the content of the city and province tables, and added them to the corresponding columns of the result table. In this way, zhangs is called. you do not have to perform SQL queries again during living.

+ ---- + ----------- + ---------- + Response + ----------- + ---- + ----------- + ------------- + ---- + ----------- + | id | firstname | lastname | response | living_id | id | name | province_id | id | name | + ---- + ----------- + ---------- + ------------- + ----------- + ---- + ----------- + ------------- + ---- + ----------- + | 1 | Zhang | 3 | 1 | 1 | Wuhan | 1 | 1 | Hubei Province | + ---- + ----------- + ---------- + ----------- + ---- + ----------- + ------------- + ---- + ----------- + 1 row in set (0.00 sec)



However, unspecified Foreign keys are not added to the results. At this time, if you need to obtain the hometown of James, you will perform an SQL query:

>>> zhangs.hometown.province

SELECT `QSOptimize_city`.`id`, `QSOptimize_city`.`name`, `QSOptimize_city`.`province_id` FROM `QSOptimize_city` WHERE `QSOptimize_city`.`id` = 3 ;SELECT `QSOptimize_province`.`id`, `QSOptimize_province`.`name` FROM `QSOptimize_province` WHERE `QSOptimize_province`.`id` = 1 

If no foreign key is specified, the query is performed twice. If the depth is deeper, more queries are performed.



It is worth mentioning that, starting from Django 1.7, The select_related () function has changed its role mode. In this example, if you want to acquire the hometown and current province of Michael jacob at the same time, before 1.7 OnlyIn this way:

>>> Zhangs = Person. objects. select_related ('hometown _ province ', 'living _ province '). get (firstname = u "Zhang", lastname = u "3") >>> zhangs. hometown. province> zhangs. living. province

However, in versions 1.7 and later, you can perform chained operations like other functions of queryset:

>>> Zhangs = Person. objects. select_related ('hometown _ province '). select_related ('living _ province '). get (firstname = u "Zhang", lastname = u "3") >>> zhangs. hometown. province> zhangs. living. province
If you do this in versions earlier than 1.7, you will only get the result of the last operation. In this example, you only have the current location and no hometown. When you print your hometown province, two SQL queries are generated.

Depth Parameters

Select_related () accepts the depth parameter, which can determine the depth of select_related. Django recursively traverses all OneToOneField and ForeignKey in the specified depth. In this example:

>>> zhangs = Person.objects.select_related(depth = d)

D = 1 is equivalent to select_related ('hometown ', 'living ')

D = 2 is equivalent to select_related ('hometown _ province ', 'living _ province ')

No Parameter

Select_related () can also be left blank, which means that Django is required to be as deep as possible select_related. Example: zhangs = Person. objects. select_related (). get (firstname = u "", lastname = u "3 "). Note the following two points:

 

Summary

Ask a question about queryset in django

From itertools import chainfrom operator import attrgetter #... post = Post. objects. get (pk = post_id) # get blog likes = post. like_set.all () # Get favorite information # likes = Like. objects. filter (post = post) reblogs = Post. objects. filter (reblog_from = post) # Get forwarding information # merge favorite and forwarding information, and sort by time in reverse order. notes = sorted (chain (likes, reblogs ), key = attrgetter ('created _ at'), reverse = True )#... use itertools. the chain function merges the iteratable objects, and the query set is the iteratable object:
>>> List (chain ([1, 2, 3], 'abc') >>> [1, 2, 3, 'A', 'B ', 'C'] sort by object attributes using the sorted function ).


DJANGO problem: the delete method of queryset

Try the following statement to see if all the statements will be deleted.
> From models import Entry
> Query = Entry. all ()
> Entries = query. fetch (1)
> Db. delete (entries)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.