Use Django QuerySets effectively

Source: Internet
Author: User
Django's object-relational mapper (ORM) makes it easier to interact with a SQL database, but it is often considered less efficient and slower than raw SQL.

To use the ORM effectively, you need to understand how it queries the database. This article focuses on how to use Django's ORM efficiently when working with large datasets.

Django's querysets are lazy

A Django queryset represents a number of database rows, optionally narrowed by filters. For example, the following code represents all the people in the database whose first name is 'Dave':

person_set = Person.objects.filter(first_name="Dave")

The code above does not run any database query. You can take person_set and add more filters to it, or pass it to a function, and none of this touches the database. This is a good thing, because database queries are one of the factors that most affect web application performance.
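The laziness described here can be imitated in a few lines of plain Python. The sketch below is a toy model, not Django's implementation: LazyQuery, rows and query_log are hypothetical names invented for illustration, and a list of dicts stands in for the database table.

```python
# Toy model of a lazy queryset (NOT Django's implementation).
class LazyQuery:
    def __init__(self, rows, log, conditions=None):
        self.rows = rows                      # stands in for the database table
        self.log = log                        # one entry per simulated query
        self.conditions = dict(conditions or {})

    def filter(self, **kwargs):
        # Returns a new object with extra conditions; nothing is "run" here.
        return type(self)(self.rows, self.log, {**self.conditions, **kwargs})

    def __iter__(self):
        # The simulated query runs only when iteration starts.
        self.log.append(dict(self.conditions))
        matches = [
            row for row in self.rows
            if all(row[key] == value for key, value in self.conditions.items())
        ]
        return iter(matches)


rows = [
    {"first_name": "Dave", "last_name": "Smith"},
    {"first_name": "Dave", "last_name": "Jones"},
    {"first_name": "Anna", "last_name": "Smith"},
]
query_log = []

daves = LazyQuery(rows, query_log).filter(first_name="Dave")
dave_smiths = daves.filter(last_name="Smith")
queries_before_iteration = len(query_log)   # no query has run yet

last_names = [row["last_name"] for row in dave_smiths]
queries_after_iteration = len(query_log)    # iteration ran exactly one query
```

Building and refining the query object costs nothing in this model; the only recorded "query" is the one triggered by the list comprehension at the end.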

To actually fetch data from the database, you need to iterate over the queryset:

for person in person_set:
    print(person.last_name)

Django's queryset has a cache

When you iterate over a queryset, all matching rows are fetched from the database and converted into Django model instances. This is called evaluation. The instances are stored in the queryset's built-in cache, so that iterating over the same queryset again does not run the query a second time.
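As a rough illustration of this caching behaviour, here is a minimal plain-Python sketch. It is a toy model under stated assumptions, not Django's code: CachingQuery and its query_count attribute are hypothetical, and a list stands in for the database.

```python
# Toy model of a queryset's result cache (NOT Django's implementation).
class CachingQuery:
    def __init__(self, rows):
        self._rows = rows        # stands in for the database table
        self._cache = None       # filled on first evaluation
        self.query_count = 0     # number of simulated database queries

    def __iter__(self):
        if self._cache is None:
            # First iteration: run the simulated query and keep every row.
            self.query_count += 1
            self._cache = list(self._rows)
        # Later iterations are served from the stored list.
        return iter(self._cache)


dogs = CachingQuery(["Rex", "Fido"])
first_pass = list(dogs)
second_pass = list(dogs)
```

After both passes, dogs.query_count is 1: the second pass never reaches the simulated database. The price is that every row stays in memory for as long as the object lives.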

For example, the following code executes only one database query:

pet_set = Pet.objects.filter(species="Dog")

# The query is executed and cached.
for pet in pet_set:
    print(pet.first_name)

# The cache is used for subsequent iteration.
for pet in pet_set:
    print(pet.last_name)

An if statement triggers queryset evaluation

The most useful thing about the queryset cache is that it lets you test efficiently whether a queryset contains any rows, and iterate over it only if it does:

restaurant_set = Restaurant.objects.filter(cuisine="Indian")

# The `if` statement evaluates the queryset.
if restaurant_set:
    # The cache is used for the iteration.
    for restaurant in restaurant_set:
        print(restaurant.name)

The queryset cache is a problem if you don't need all the data

Sometimes you only want to know whether any rows exist, without iterating over them. In that case, a plain if statement still evaluates the entire queryset and fills the cache, even though you never use the data!

city_set = City.objects.filter(name="Cambridge")

# The `if` statement evaluates the queryset.
if city_set:
    # We don't need the rows, but the ORM has fetched all of them anyway!
    print("At least one city called Cambridge still stands!")

To avoid this, use the exists() method to check whether at least one matching row exists:

tree_set = Tree.objects.filter(type="deciduous")

# The `exists()` check avoids populating the queryset cache.
if tree_set.exists():
    # No rows were fetched from the database, which saves bandwidth and memory.
    print("There are still hardwood trees in the world!")
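To make the difference concrete at the database level, the sketch below uses Python's built-in sqlite3 module directly rather than Django. The table layout and the two SQL statements are assumptions chosen for illustration; the SQL that Django actually generates depends on the version and the database backend.

```python
import sqlite3

# In-memory table standing in for a hypothetical Tree model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, type TEXT)")
conn.executemany(
    "INSERT INTO tree (type) VALUES (?)",
    [("deciduous",)] * 1000 + [("evergreen",)] * 1000,
)

# Truth-testing a full result set: every matching row is transferred.
all_rows = conn.execute(
    "SELECT id, type FROM tree WHERE type = ?", ("deciduous",)
).fetchall()
rows_fetched_for_if = len(all_rows)

# Existence check: at most one tiny row is transferred.
exists_rows = conn.execute(
    "SELECT 1 FROM tree WHERE type = ? LIMIT 1", ("deciduous",)
).fetchall()
rows_fetched_for_exists = len(exists_rows)
has_deciduous = bool(exists_rows)

conn.close()
```

Both approaches answer the same yes/no question, but the first one moves a thousand rows to get there and the second moves one.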

When a queryset is very large, the cache becomes a problem

Loading thousands of records into memory at once is wasteful. Even worse, a huge queryset can tie up server processes and bring your application to a halt.

To iterate over the data without building the queryset cache, use the iterator() method, which fetches rows and discards them once they have been processed.

star_set = Star.objects.all()

# `iterator()` fetches only a small batch of rows from the database at a time, which saves memory.
for star in star_set.iterator():
    print(star.name)

Of course, because iterator() bypasses the cache, the query runs again every time the same queryset is iterated. So when you use iterator() on a large queryset, make sure your code does not iterate over it more than once.
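The cost of iterating twice without a cache can be shown with a small plain-Python sketch. StreamingQuery is a hypothetical toy class, not part of Django; its iterator() method is a generator that counts how often the simulated query runs.

```python
# Toy model of cache-free iteration (NOT Django's implementation).
class StreamingQuery:
    def __init__(self, rows):
        self._rows = rows        # stands in for the database table
        self.query_count = 0     # number of simulated database queries

    def iterator(self):
        # Each call that gets iterated runs the simulated query again;
        # rows are handed out one by one and never stored.
        self.query_count += 1
        for row in self._rows:
            yield row


names = ["Sirius", "Vega", "Rigel"]

# Two separate passes: the simulated query runs twice.
stars = StreamingQuery(names)
collected = [name for name in stars.iterator()]
longest = max(len(name) for name in stars.iterator())
queries_two_passes = stars.query_count

# One pass that gathers both results: the simulated query runs once.
stars_single = StreamingQuery(names)
collected_single = []
longest_single = 0
for name in stars_single.iterator():
    collected_single.append(name)
    longest_single = max(longest_single, len(name))
queries_single_pass = stars_single.query_count
```

Restructuring the work so that everything happens inside a single loop is usually the simplest way to keep the memory savings without paying for the query twice.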

If the queryset is large, the if statement is a problem

As shown above, the queryset cache is what makes combining an if statement with a for loop cheap: it allows a conditional loop over a queryset with a single query. For a very large queryset, however, filling the cache is not an option.

The simplest solution is to combine exists() with iterator(), which avoids the queryset cache at the cost of two database queries.

molecule_set = Molecule.objects.all()

# One database query to test if any rows exist.
if molecule_set.exists():
    # Another database query to start fetching the rows in batches.
    for molecule in molecule_set.iterator():
        print(molecule.velocity)

A more complex solution uses Python's lower-level iteration protocol: fetch the first element of iterator() with next() before starting the loop, and use the result to decide whether to loop at all.

from itertools import chain

atom_set = Atom.objects.all()

# One database query to start fetching the rows in batches.
atom_iterator = atom_set.iterator()

# Peek at the first item in the iterator.
try:
    first_atom = next(atom_iterator)
except StopIteration:
    # No rows were found, so do nothing.
    pass
else:
    # At least one row was found, so iterate over
    # all the rows, including the first one.
    # Chain onto atom_iterator, not atom_set, so that no second query runs.
    for atom in chain([first_atom], atom_iterator):
        print(atom.mass)
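The same peek-then-chain pattern works for any Python iterator, so it can be wrapped in a small reusable function. The sketch below is one possible way to do that; peek and rows are hypothetical names, and a generator stands in for the result of iterator().

```python
from itertools import chain


def peek(iterable):
    """Return (is_empty, iterator); the iterator still yields every item."""
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        # Nothing to iterate over: hand back an empty iterator.
        return True, iter(())
    # Put the consumed first item back in front of the remaining ones.
    return False, chain([first], iterator)


def rows():
    # Stand-in for queryset.iterator(): a generator that can be consumed only once.
    yield from ("hydrogen", "helium", "lithium")


is_empty, atoms = peek(rows())
seen = list(atoms)

empty_flag, nothing = peek(iter([]))
leftover = list(nothing)
```

The important detail is that the returned iterator wraps the one that was already started, so the underlying source is consumed exactly once and no item is lost or repeated.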

Prevent improper optimization

The queryset cache exists to reduce the number of database queries your program makes. In normal use, it ensures that the database is queried only when necessary.

The exists() and iterator() methods let you optimize your program's memory usage. However, because they do not populate the queryset cache, they can cause additional database queries.

So code with care, and if things start to slow down, look for the bottlenecks in your code and see whether a small queryset optimization would help.

