Optimizing QuerySet queries with select_related() in Python's Django framework

Source: Internet
Author: User

1. Background notes on examples

Suppose a personal information system needs to record each person's hometown, current place of residence, and the cities they have visited. The database is designed as follows:

The contents of models.py are as follows:

from django.db import models

class Province(models.Model):
    name = models.CharField(max_length=10)

    def __unicode__(self):
        return self.name

class City(models.Model):
    name = models.CharField(max_length=5)
    # Django 2.0 and later also require an on_delete argument on ForeignKey.
    province = models.ForeignKey(Province)

    def __unicode__(self):
        return self.name

class Person(models.Model):
    firstname = models.CharField(max_length=10)
    lastname = models.CharField(max_length=10)
    visitation = models.ManyToManyField(City, related_name="visitor")
    hometown = models.ForeignKey(City, related_name="birth")
    living = models.ForeignKey(City, related_name="citizen")

    def __unicode__(self):
        return self.firstname + self.lastname

Note 1: The app is named "Qsoptimize".

Note 2: For simplicity, the qsoptimize_province table contains only two rows, Hubei Province and Guangdong Province, and the qsoptimize_city table contains only three rows: Wuhan, Shiyan, and Guangzhou.

2. select_related()

For one-to-one fields (OneToOneField) and foreign key fields (ForeignKey), you can use select_related() to optimize a QuerySet.

Purpose and method

When select_related() is applied to a QuerySet, Django fetches the related foreign key objects in the same query, so no further database queries are needed when they are accessed later. With the models above, if we need to print every city in the database together with the province it belongs to, the most straightforward approach is:

>>> citys = City.objects.all()
>>> for c in citys:
...     print c.province
...

This makes the number of SQL queries grow linearly with the number of objects: with n objects that each have k foreign key fields, n*k+1 SQL queries are issued. In this example, the 3 city objects lead to 4 SQL queries:

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`, `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`

SELECT `qsoptimize_province`.`id`, `qsoptimize_province`.`name`
FROM `qsoptimize_province`
WHERE `qsoptimize_province`.`id` = 1;

SELECT `qsoptimize_province`.`id`, `qsoptimize_province`.`name`
FROM `qsoptimize_province`
WHERE `qsoptimize_province`.`id` = 2;

SELECT `qsoptimize_province`.`id`, `qsoptimize_province`.`name`
FROM `qsoptimize_province`
WHERE `qsoptimize_province`.`id` = 1;

Note: The SQL statements here are taken directly from the output of Django's 'django.db.backends' logger.
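
To reproduce this output, a settings.py fragment along the following lines should route the 'django.db.backends' logger to the console. This is a minimal sketch: the handler name "console" is an arbitrary choice, and Django only records SQL statements on this logger when DEBUG is True.

```python
# Fragment for settings.py; "console" is just an illustrative handler name.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Write log records to the terminal (stderr).
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        # Django emits executed SQL statements on this logger at DEBUG level.
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}
```

Django applies the LOGGING dictionary itself at startup, so nothing else needs to be called from settings.py.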

If we use the select_related() function:

>>> citys = City.objects.select_related().all()
>>> for c in citys:
...     print c.province
...

There is only one SQL query, which obviously drastically reduces the number of SQL queries:

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
`qsoptimize_city`.`province_id`, `qsoptimize_province`.`id`, `qsoptimize_province`.`name`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_province` ON (`qsoptimize_city`.`province_id` = `qsoptimize_province`.`id`);

Here we can see that Django uses the inner join to get information about the province. Incidentally, the results of this SQL query are as follows:

+----+-----------+-------------+----+--------------------+
| id | name      | province_id | id | name               |
+----+-----------+-------------+----+--------------------+
|  1 | Wuhan     |           1 |  1 | Hubei Province     |
|  2 | Guangzhou |           2 |  2 | Guangdong Province |
|  3 | Shiyan    |           1 |  1 | Hubei Province     |
+----+-----------+-------------+----+--------------------+
3 rows in set (0.00 sec)
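
The difference between the two approaches can be reproduced without Django. The sketch below uses Python's built-in sqlite3 module with simplified table names (city, province) and a hypothetical run() helper that counts statements. It imitates the query pattern only and is not how Django itself is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE province (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, province_id INTEGER);
    INSERT INTO province VALUES (1, 'Hubei Province'), (2, 'Guangdong Province');
    INSERT INTO city VALUES (1, 'Wuhan', 1), (2, 'Guangzhou', 2), (3, 'Shiyan', 1);
""")

counter = {"queries": 0}

def run(sql, params=()):
    """Execute one statement and count it (hypothetical helper)."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

# Lazy pattern: one query for the cities, then one more per city.
counter["queries"] = 0
lazy = []
for city_name, province_id in run("SELECT name, province_id FROM city ORDER BY id"):
    province_name = run("SELECT name FROM province WHERE id = ?", (province_id,))[0][0]
    lazy.append((city_name, province_name))
lazy_queries = counter["queries"]

# Joined pattern: a single statement returns cities and provinces together.
counter["queries"] = 0
joined = run(
    "SELECT city.name, province.name FROM city "
    "INNER JOIN province ON city.province_id = province.id ORDER BY city.id"
)
joined_queries = counter["queries"]

print(lazy_queries, joined_queries)
```

Both patterns return the same (city, province) pairs; only the number of round trips to the database differs.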


How to use

The function supports the following three usages.

*fields parameter

select_related() accepts a variable number of arguments. Each argument is the name of a foreign key field (pointing to a parent table) whose related object should be fetched. An argument can also name the foreign key of a foreign key, the foreign key of that foreign key, and so on. To follow a foreign key of a foreign key, join the field names with two underscores "__".
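
One way to picture the double-underscore notation is as a path through the relations. The helper below, build_relation_tree, is a hypothetical illustration written for this article, not part of Django: it splits each name on "__" and records which relations would have to be followed.

```python
def build_relation_tree(*fields):
    """Turn 'a__b' style names into a nested dict of relations to follow."""
    tree = {}
    for field in fields:
        node = tree
        # Each part of the name is one foreign key hop.
        for part in field.split("__"):
            node = node.setdefault(part, {})
    return tree

tree = build_relation_tree("living__province", "hometown")
print(tree)
```

Here "living__province" means two hops (person to city, then city to province), while "hometown" means a single hop.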

For example, to get the province where Zhang currently lives, you can write:

>>> zhangs = Person.objects.select_related('living__province').get(firstname=u"Zhang", lastname=u"three")
>>> zhangs.living.province

The SQL query that fires is as follows:

SELECT `qsoptimize_person`.`id`, `qsoptimize_person`.`firstname`,
`qsoptimize_person`.`lastname`, `qsoptimize_person`.`hometown_id`, `qsoptimize_person`.`living_id`,
`qsoptimize_city`.`id`, `qsoptimize_city`.`name`, `qsoptimize_city`.`province_id`, `qsoptimize_province`.`id`,
`qsoptimize_province`.`name`
FROM `qsoptimize_person`
INNER JOIN `qsoptimize_city` ON (`qsoptimize_person`.`living_id` = `qsoptimize_city`.`id`)
INNER JOIN `qsoptimize_province` ON (`qsoptimize_city`.`province_id` = `qsoptimize_province`.`id`)
WHERE (`qsoptimize_person`.`lastname` = 'three' AND `qsoptimize_person`.`firstname` = 'Zhang');

As you can see, Django uses two INNER JOINs to complete the request. It fetches the contents of the city and province tables and appends them as additional columns of the result, so no further SQL query is needed when zhangs.living.province is accessed.

+----+-----------+----------+-------------+-----------+----+-------+-------------+----+----------------+
| id | firstname | lastname | hometown_id | living_id | id | name  | province_id | id | name           |
+----+-----------+----------+-------------+-----------+----+-------+-------------+----+----------------+
|  1 | Zhang     | three    |           3 |         1 |  1 | Wuhan |           1 |  1 | Hubei Province |
+----+-----------+----------+-------------+-----------+----+-------+-------------+----+----------------+
1 row in set (0.00 sec)
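
To illustrate how one wide row like this can populate three nested objects, here is a small sketch using Python's built-in sqlite3 module. The table names are shortened (person, city, province), and slicing the row into dictionaries is an assumption made for illustration; Django's real object construction is more involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE province (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, province_id INTEGER);
    CREATE TABLE person (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT,
                         hometown_id INTEGER, living_id INTEGER);
    INSERT INTO province VALUES (1, 'Hubei Province');
    INSERT INTO city VALUES (1, 'Wuhan', 1), (3, 'Shiyan', 1);
    INSERT INTO person VALUES (1, 'Zhang', 'three', 3, 1);
""")

# One statement with two INNER JOINs, mirroring the query shown above.
row = conn.execute("""
    SELECT person.id, person.firstname, person.lastname,
           person.hometown_id, person.living_id,
           city.id, city.name, city.province_id,
           province.id, province.name
    FROM person
    INNER JOIN city ON person.living_id = city.id
    INNER JOIN province ON city.province_id = province.id
    WHERE person.firstname = ? AND person.lastname = ?
""", ("Zhang", "three")).fetchone()

# Slice the wide row into one record per table, then nest the records.
person = dict(zip(("id", "firstname", "lastname", "hometown_id", "living_id"), row[0:5]))
living = dict(zip(("id", "name", "province_id"), row[5:8]))
province = dict(zip(("id", "name"), row[8:10]))
living["province"] = province
person["living"] = living

print(person["living"]["province"]["name"])
```

Note that only the living relation is filled in. The person record still carries hometown_id, but no hometown object, because that relation was not part of the join.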

Foreign keys that were not specified, however, are not added to the result. If you then need the province of Zhang's hometown, Django issues further SQL queries:

>>> zhangs.hometown.province

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
`qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
WHERE `qsoptimize_city`.`id` = 3;

SELECT `qsoptimize_province`.`id`, `qsoptimize_province`.`name`
FROM `qsoptimize_province`
WHERE `qsoptimize_province`.`id` = 1

Because the hometown foreign key was not specified, two queries are issued here, one for each level. The deeper the chain of relationships, the more queries are needed.
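
The cost of following unspecified relations can be sketched with a toy cache. FakeDB and Record below are hypothetical stand-ins written for this article, not Django classes: each cache miss counts as one SQL query, and preloaded relations cost nothing.

```python
class FakeDB:
    """In-memory stand-in for a database that counts row lookups."""
    def __init__(self):
        self.queries = 0
        self.tables = {
            "city": {1: {"name": "Wuhan", "province_id": 1},
                     3: {"name": "Shiyan", "province_id": 1}},
            "province": {1: {"name": "Hubei Province"}},
        }

    def fetch(self, table, pk):
        self.queries += 1  # each lookup stands for one SQL query
        return dict(self.tables[table][pk])


class Record:
    """A row plus a cache of related rows that were already loaded."""
    def __init__(self, db, fields):
        self.db = db
        self.fields = fields
        self.cache = {}

    def related(self, name, table):
        # Follow foreign key `name`; hit the database only on a cache miss.
        if name not in self.cache:
            row = self.db.fetch(table, self.fields[name + "_id"])
            self.cache[name] = Record(self.db, row)
        return self.cache[name]


db = FakeDB()
zhang = Record(db, {"firstname": "Zhang", "hometown_id": 3, "living_id": 1})

# Pretend select_related('living__province') already filled these caches.
living = Record(db, dict(db.tables["city"][1]))
living.cache["province"] = Record(db, dict(db.tables["province"][1]))
zhang.cache["living"] = living

zhang.related("living", "city").related("province", "province")
preloaded_cost = db.queries                 # nothing to look up

zhang.related("hometown", "city").related("province", "province")
lazy_cost = db.queries - preloaded_cost     # one lookup per level

zhang.related("hometown", "city").related("province", "province")
repeat_cost = db.queries - preloaded_cost - lazy_cost  # cached after first access

print(preloaded_cost, lazy_cost, repeat_cost)
```

The second access to the hometown chain is free because the fetched rows were cached on first use, which is also how a related object behaves once it has been loaded.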

It is worth mentioning that select_related() changed the way it chains starting with Django 1.7. In this example, if you want both the province of Zhang's hometown and the province of his current residence, before 1.7 the only way is to pass both in a single call:

>>> zhangs = Person.objects.select_related('hometown__province', 'living__province').get(firstname=u"Zhang", lastname=u"three")
>>> zhangs.hometown.province
>>> zhangs.living.province

In version 1.7 and above, however, you can also chain the calls, as with other QuerySet methods:

>>> zhangs = Person.objects.select_related('hometown__province').select_related('living__province').get(firstname=u"Zhang", lastname=u"three")
>>> zhangs.hometown.province
>>> zhangs.living.province

If you chain the calls like this in a version below 1.7, only the last call takes effect. In this case only the place of residence is fetched, not the hometown, so printing the hometown province triggers two more SQL queries.
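
The two chaining behaviours can be contrasted with a toy class. SketchQuerySet below is a hypothetical model of the bookkeeping only; its accumulate flag is an invention for this illustration and does not exist in Django.

```python
class SketchQuerySet:
    """Toy object that only tracks which relations were requested."""
    def __init__(self, accumulate, related=()):
        self.accumulate = accumulate
        self.related = tuple(related)

    def select_related(self, *fields):
        if self.accumulate:
            # Django >= 1.7: each call adds to the earlier ones.
            new = self.related + fields
        else:
            # Django < 1.7: a later call replaces the earlier ones.
            new = fields
        return SketchQuerySet(self.accumulate, new)


new_style = (SketchQuerySet(accumulate=True)
             .select_related("hometown__province")
             .select_related("living__province"))
old_style = (SketchQuerySet(accumulate=False)
             .select_related("hometown__province")
             .select_related("living__province"))

print(new_style.related)
print(old_style.related)
```

With the accumulating behaviour both relations survive the chain; with the replacing behaviour only "living__province" is left.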

depth parameter

select_related() accepts a depth parameter, which determines how many levels of relationships select_related follows. Django recursively traverses all OneToOneField and ForeignKey fields within the specified depth. Note that the depth parameter was deprecated in Django 1.5 and removed in 1.7, so it is only available in older versions. For example:

>>> zhangs = Person.objects.select_related(depth=d)

d=1 is equivalent to select_related('hometown', 'living')

d=2 is equivalent to select_related('hometown__province', 'living__province')
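
The equivalence can be checked with a short recursive sketch. The RELATIONS table and the paths_for_depth function are hypothetical, written only to describe the example models. The visitation field is left out because it is a ManyToManyField, which select_related does not follow.

```python
# Hypothetical description of the example models: model -> {field: target model}
RELATIONS = {
    "person": {"hometown": "city", "living": "city"},
    "city": {"province": "province"},
    "province": {},
}

def paths_for_depth(model, depth):
    """List the deepest relation paths reachable from `model` within `depth` hops."""
    if depth == 0:
        return []
    paths = []
    for field, target in sorted(RELATIONS[model].items()):
        deeper = paths_for_depth(target, depth - 1)
        if deeper:
            # Extend the path with "__" for every further hop.
            paths.extend(field + "__" + rest for rest in deeper)
        else:
            paths.append(field)
    return paths

print(paths_for_depth("person", 1))
print(paths_for_depth("person", 2))
```

A depth larger than 2 gives the same result as 2 here, because province has no further foreign keys to follow.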

No parameters

select_related() can also be called without parameters, which asks Django to follow relationships as deeply as possible. For example: zhangs = Person.objects.select_related().get(firstname=u"Zhang", lastname=u"three"). But pay attention to two points:

1. Django has a built-in upper limit. For particularly complex table relationships, Django may stop recursing at a point you are not aware of, which may not match what you expect. I am not sure exactly how this limit works.
2. Django does not know which fields you will actually use, so it fetches all of them, which can waste resources and hurt performance.


Summary

    1. select_related mainly optimizes one-to-one and many-to-one relationships.
    2. select_related uses SQL JOIN statements and improves performance by reducing the number of SQL queries.
    3. You can name the fields to follow through variable-length arguments, and you can follow nested relationships by joining field names with a double underscore "__". Fields and depths that are not specified are not cached; if you access them, Django issues SQL queries again.
    4. You can also set the recursion depth with the depth parameter, and Django automatically caches all related fields within that depth. If you access a field beyond the specified depth, Django issues SQL queries again.
    5. It can also be called without parameters, in which case Django follows relationships as deeply as possible. Be aware of Django's recursion limit and of the wasted performance.
    6. With Django >= 1.7, chained select_related calls are equivalent to passing variable-length arguments. With Django < 1.7, a chained call discards the earlier select_related calls and keeps only the last one.

