How do Django's select_related and prefetch_related functions optimize QuerySet queries? (2)

Source: Internet
Author: User
Tags prefetch


This is the second article in this series. It covers the purpose, implementation, and usage of the prefetch_related() function.

The first article in this series is here


3. prefetch_related()

You can use prefetch_related() to optimize many-to-many fields (ManyToManyField) and one-to-many fields. You might object that there is no such thing as a OneToManyField. In fact, ForeignKey is a many-to-one field, and the reverse side of a ForeignKey is a one-to-many relationship.


Purpose and Approach

prefetch_related() and select_related() both aim to reduce the number of SQL queries, but they work differently. select_related() solves the problem inside a single SQL query by using JOIN. For many-to-many relationships, however, solving the problem in SQL is unwise, because the JOIN produces a very long result table, which increases SQL run time and memory usage. If there are n objects and the many-to-many field of object i has Mi related entries, the result table has M1 + M2 + ... + Mn rows.
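To make the row-count argument concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables person, city, and visit and their contents are hypothetical and only loosely mirror the models in this article; the sketch illustrates the arithmetic, not Django's internals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city   (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visit  (person_id INTEGER, city_id INTEGER);
    INSERT INTO person VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO city   VALUES (1, 'Wuhan'), (2, 'Guangzhou'), (3, 'Shiyan');
    -- M1 = 3, M2 = 1, M3 = 2 (hypothetical data)
    INSERT INTO visit  VALUES (1, 1), (1, 2), (1, 3), (2, 2), (3, 1), (3, 3);
""")

# JOIN approach: one row per (person, visited city) pair, so M1 + M2 + M3 rows,
# and each row repeats the person's columns.
joined = conn.execute("""
    SELECT person.id, person.name, city.id, city.name
    FROM person
    JOIN visit ON visit.person_id = person.id
    JOIN city  ON city.id = visit.city_id
""").fetchall()

# Separate-query approach: the person table is read once, n rows in total.
people = conn.execute("SELECT id, name FROM person").fetchall()

join_row_count = len(joined)    # M1 + M2 + M3
person_row_count = len(people)  # n
```

With this data the JOIN returns six rows for three people. The gap widens as the number of related entries per object grows.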


The approach taken by prefetch_related() is to query each table separately and then resolve the relationships between them in Python. Continuing the example above, if we want to get all the cities that Zhang has visited, we can use prefetch_related() like this:

>>> zhangs = Person.objects.prefetch_related('visitation').get(firstname=u"Zhang", lastname=u"3")
>>> for city in zhangs.visitation.all():
...     print(city)
...
The SQL queries triggered by the code above are as follows:

SELECT `qsoptimize_person`.`id`, `qsoptimize_person`.`firstname`,
       `qsoptimize_person`.`lastname`, `qsoptimize_person`.`hometown_id`,
       `qsoptimize_person`.`living_id`
FROM `qsoptimize_person`
WHERE (`qsoptimize_person`.`lastname` = '3'
       AND `qsoptimize_person`.`firstname` = 'Zhang');

SELECT (`qsoptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`,
       `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_person_visitation`
    ON (`qsoptimize_city`.`id` = `qsoptimize_person_visitation`.`city_id`)
WHERE `qsoptimize_person_visitation`.`person_id` IN (1);

The first SQL query only fetches the Person object for Zhang. The second query is the important one. It selects the rows of the relation table `qsoptimize_person_visitation` whose `person_id` is Zhang's id, then joins them with the `city` table using an INNER JOIN (here an equi-join) to produce the result table.

+----+-----------+----------+-------------+-----------+
| id | firstname | lastname | hometown_id | living_id |
+----+-----------+----------+-------------+-----------+
|  1 | Zhang     | 3        |           3 |         1 |
+----+-----------+----------+-------------+-----------+
1 row in set (0.00 sec)

+-----------------------+----+-------------+-------------+
| _prefetch_related_val | id | name        | province_id |
+-----------------------+----+-------------+-------------+
|                     1 |  1 | Wuhan       |           1 |
|                     1 |  2 | Guangzhou   |           2 |
|                     1 |  3 | Shiyan city |           1 |
+-----------------------+----+-------------+-------------+
3 rows in set (0.00 sec)
Clearly, Zhang has been to Wuhan, Guangzhou, and Shiyan.
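The "resolve the relationships in Python" step can be pictured roughly as follows. This is only a sketch of the idea, not Django's actual implementation: the helper attach_prefetched and the attribute name visitation_cache are hypothetical, and the rows are copied from the result tables above.

```python
from collections import defaultdict

# Rows as returned by the two queries (plain dicts and tuples stand in for model objects).
person_rows = [{"id": 1, "firstname": "Zhang"}]
city_rows = [
    # (_prefetch_related_val, id, name, province_id)
    (1, 1, "Wuhan", 1),
    (1, 2, "Guangzhou", 2),
    (1, 3, "Shiyan city", 1),
]


def attach_prefetched(people, rows, attr="visitation_cache"):
    """Group city rows by owner id and attach each group to its person."""
    by_owner = defaultdict(list)
    for owner_id, city_id, name, province_id in rows:
        by_owner[owner_id].append(
            {"id": city_id, "name": name, "province_id": province_id}
        )
    for person in people:
        # A person with no related rows gets an empty list.
        person[attr] = by_owner.get(person["id"], [])
    return people


people = attach_prefetched(person_rows, city_rows)
visited = [city["name"] for city in people[0]["visitation_cache"]]
```

The `_prefetch_related_val` column exists for exactly this purpose: it tells the Python side which owner each related row belongs to.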



Or suppose we want the names of all cities in Hubei Province:

>>> hb = Province.objects.prefetch_related('city_set').get(name__iexact=u"Hubei Province")
>>> for city in hb.city_set.all():
...     city.name
...

The SQL queries triggered:

SELECT `qsoptimize_province`.`id`, `qsoptimize_province`.`name`
FROM `qsoptimize_province`
WHERE `qsoptimize_province`.`name` LIKE 'Hubei Province';

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
WHERE `qsoptimize_city`.`province_id` IN (1);
The resulting tables:
+----+----------------+
| id | name           |
+----+----------------+
|  1 | Hubei Province |
+----+----------------+
1 row in set (0.00 sec)

+----+-------------+-------------+
| id | name        | province_id |
+----+-------------+-------------+
|  1 | Wuhan       |           1 |
|  3 | Shiyan city |           1 |
+----+-------------+-------------+
2 rows in set (0.00 sec)

We can see that prefetch_related() is implemented with an IN clause. As a result, when the QuerySet contains a very large number of objects, performance problems may arise, depending on the characteristics of the database.
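Whether a long IN list is a problem depends on the backend; some databases limit the number of bound parameters or handle very long lists poorly. The article does not describe a workaround. Purely as an illustration, one approach when issuing such queries by hand is to split the id list into batches. The helpers chunked and build_in_queries, the table name, and the batch size of 500 are all hypothetical.

```python
def chunked(ids, size):
    """Yield successive slices of `ids`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_in_queries(ids, batch_size=500):
    """Return (sql, params) pairs, one per batch of ids."""
    queries = []
    for batch in chunked(list(ids), batch_size):
        placeholders = ", ".join(["%s"] * len(batch))
        sql = "SELECT * FROM city WHERE province_id IN (%s)" % placeholders
        queries.append((sql, batch))
    return queries


queries = build_in_queries(range(1, 1201), batch_size=500)
batch_sizes = [len(params) for _, params in queries]
```

Each batch costs one extra round trip, so this trades a single very long query for a few moderate ones.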



Usage


*lookups parameter

In Django < 1.7, passing lookup strings is the only way to use prefetch_related(). Like select_related(), prefetch_related() can follow relationships across several levels by joining field names with double underscores. For example, to get the provinces of the cities visited by everyone whose first name is Zhang:

>>> zhangs = Person.objects.prefetch_related('visitation__province').filter(firstname__iexact=u'zhang')
>>> for i in zhangs:
...     for city in i.visitation.all():
...         print(city.province)
...
The SQL queries triggered:
SELECT `qsoptimize_person`.`id`, `qsoptimize_person`.`firstname`,
       `qsoptimize_person`.`lastname`, `qsoptimize_person`.`hometown_id`,
       `qsoptimize_person`.`living_id`
FROM `qsoptimize_person`
WHERE `qsoptimize_person`.`firstname` LIKE 'zhang';

SELECT (`qsoptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`,
       `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_person_visitation`
    ON (`qsoptimize_city`.`id` = `qsoptimize_person_visitation`.`city_id`)
WHERE `qsoptimize_person_visitation`.`person_id` IN (1, 4);

SELECT `qsoptimize_province`.`id`, `qsoptimize_province`.`name`
FROM `qsoptimize_province`
WHERE `qsoptimize_province`.`id` IN (1, 2);
Result:
+----+-----------+----------+-------------+-----------+
| id | firstname | lastname | hometown_id | living_id |
+----+-----------+----------+-------------+-----------+
|  1 | Zhang     | 3        |           3 |         1 |
|  4 | Zhang     | 6        |           2 |         2 |
+----+-----------+----------+-------------+-----------+
2 rows in set (0.00 sec)

+-----------------------+----+-------------+-------------+
| _prefetch_related_val | id | name        | province_id |
+-----------------------+----+-------------+-------------+
|                     1 |  1 | Wuhan       |           1 |
|                     1 |  2 | Guangzhou   |           2 |
|                     4 |  2 | Guangzhou   |           2 |
|                     1 |  3 | Shiyan city |           1 |
+-----------------------+----+-------------+-------------+
4 rows in set (0.00 sec)

+----+--------------------+
| id | name               |
+----+--------------------+
|  1 | Hubei Province     |
|  2 | Guangdong Province |
+----+--------------------+
2 rows in set (0.00 sec)


It is worth mentioning that chained prefetch_related() calls accumulate their lookups, just as select_related() does in 1.7.


Note that once a chained operation on the related objects changes the database query, the data cached by prefetch_related() is ignored. Django then queries the database again for that data, which hurts performance. "Changing the query" here means operations such as filter() and exclude() that alter the final SQL. all() does not change the final query, so it does not trigger a new database request.

For example, to get, for every person, the visited cities whose names contain the word "city", the following code leads to a large number of SQL queries:

plist = Person.objects.prefetch_related('visitation')
[p.visitation.filter(name__icontains=u"city") for p in plist]
With four people in the database, this causes 2 + 4 SQL queries:

SELECT `qsoptimize_person`.`id`, `qsoptimize_person`.`firstname`,
       `qsoptimize_person`.`lastname`, `qsoptimize_person`.`hometown_id`,
       `qsoptimize_person`.`living_id`
FROM `qsoptimize_person`;

SELECT (`qsoptimize_person_visitation`.`person_id`) AS `_prefetch_related_val`,
       `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_person_visitation`
    ON (`qsoptimize_city`.`id` = `qsoptimize_person_visitation`.`city_id`)
WHERE `qsoptimize_person_visitation`.`person_id` IN (1, 2, 3, 4);

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_person_visitation`
    ON (`qsoptimize_city`.`id` = `qsoptimize_person_visitation`.`city_id`)
WHERE (`qsoptimize_person_visitation`.`person_id` = 1
       AND `qsoptimize_city`.`name` LIKE '%city%');

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_person_visitation`
    ON (`qsoptimize_city`.`id` = `qsoptimize_person_visitation`.`city_id`)
WHERE (`qsoptimize_person_visitation`.`person_id` = 2
       AND `qsoptimize_city`.`name` LIKE '%city%');

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_person_visitation`
    ON (`qsoptimize_city`.`id` = `qsoptimize_person_visitation`.`city_id`)
WHERE (`qsoptimize_person_visitation`.`person_id` = 3
       AND `qsoptimize_city`.`name` LIKE '%city%');

SELECT `qsoptimize_city`.`id`, `qsoptimize_city`.`name`,
       `qsoptimize_city`.`province_id`
FROM `qsoptimize_city`
INNER JOIN `qsoptimize_person_visitation`
    ON (`qsoptimize_city`.`id` = `qsoptimize_person_visitation`.`city_id`)
WHERE (`qsoptimize_person_visitation`.`person_id` = 4
       AND `qsoptimize_city`.`name` LIKE '%city%');


Let us analyze these queries in detail.

As we all know, a QuerySet is lazy and accesses the database only when it is actually used. When the second line of Python code runs, the list comprehension iterates over plist, which triggers the database queries. The first two SQL queries are caused by prefetch_related().

Although the results of those two queries already contain all the city information we need, the loop body calls filter() on Person.visitation, which changes the database query. These calls therefore ignore the cached data and run new SQL queries.


But what if you really need this? In Django >= 1.7, you can use the Prefetch object described in the next section. If you are on Django < 1.7, you can do the filtering in Python:

plist = Person.objects.prefetch_related('visitation')
[[city for city in p.visitation.all() if u"city" in city.name] for p in plist]
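The difference between calling filter() and filtering the result of all() in Python can be seen in a small toy model. ToyRelatedManager, make_manager, and the CITIES data are hypothetical stand-ins, not Django code; each call to fetch simulates one SQL query.

```python
class ToyRelatedManager:
    """Toy stand-in for a related manager with a prefetch cache (not Django code)."""

    def __init__(self, fetch, cache=None):
        self._fetch = fetch  # callable that simulates a database query
        self._cache = cache  # list filled in by the prefetch step, or None

    def all(self):
        # all() does not change the query, so the cached rows can be reused.
        if self._cache is not None:
            return list(self._cache)
        return self._fetch(None)

    def filter(self, substring):
        # filter() changes the query, so the cache cannot answer it.
        return self._fetch(substring)


query_log = []
CITIES = {1: ["Wuhan", "Guangzhou", "Shiyan city"], 2: ["Guangzhou"]}


def make_manager(person_id):
    def fetch(substring):
        query_log.append((person_id, substring))  # one simulated SQL query
        rows = CITIES[person_id]
        return rows if substring is None else [c for c in rows if substring in c]
    return ToyRelatedManager(fetch, cache=CITIES[person_id])


managers = [make_manager(pid) for pid in CITIES]

cached = [m.all() for m in managers]
queries_after_all = len(query_log)       # all() served from the cache

filtered = [m.filter("city") for m in managers]
queries_after_filter = len(query_log)    # one extra query per person

in_python = [[c for c in m.all() if "city" in c] for m in managers]
queries_after_python = len(query_log)    # no further queries
```

Both approaches return the same cities, but only filter() adds one query per person.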


Prefetch object

In Django >= 1.7, you can use the Prefetch object to control the behavior of prefetch_related().

Note: Because I do not have a Django 1.7 environment installed, this section is based on the Django documentation and has not been tested.


Features of the Prefetch object:

Continuing the example above, find, among the cities visited by each person, those whose names contain "Wu" and those whose names contain "Zhou":

from django.db.models import Prefetch

wus = City.objects.filter(name__icontains=u"Wu")
zhous = City.objects.filter(name__icontains=u"Zhou")
plist = Person.objects.prefetch_related(
    Prefetch('visitation', queryset=wus, to_attr="wu_city"),
    Prefetch('visitation', queryset=zhous, to_attr="zhou_city"),
)
[p.wu_city for p in plist]
[p.zhou_city for p in plist]

Note: This code has not been tested in a real environment. If you find an error, please point it out.


By the way, Prefetch objects and string parameters can be mixed.

None

You can pass in None to clear the previous prefetch_related. Like this:

>>> prefetch_cleared_qset = qset.prefetch_related(None)
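How chained calls accumulate lookups, and how None clears them, can be modeled with a few lines of plain Python. ToyQuerySet is a hypothetical toy class written for this illustration; it only mimics the bookkeeping and is not Django's real QuerySet.

```python
class ToyQuerySet:
    """Toy model of how prefetch lookups accumulate (not Django's real class)."""

    def __init__(self, lookups=()):
        self.lookups = tuple(lookups)

    def prefetch_related(self, *lookups):
        # Each call returns a new object; the original is left unchanged.
        if lookups == (None,):
            return ToyQuerySet(())  # None clears the earlier lookups
        return ToyQuerySet(self.lookups + lookups)


qset = ToyQuerySet().prefetch_related("visitation").prefetch_related("hometown")
chained_lookups = qset.lookups

prefetch_cleared_qset = qset.prefetch_related(None)
cleared_lookups = prefetch_cleared_qset.lookups
```

Because every call returns a new object, clearing the lookups on prefetch_cleared_qset leaves qset itself untouched.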


Summary




