SQL queries the database, you can take a series of ways to improve the speed and performance of queries. For example, use case instead of update, using temporary tables and batch updates, and so on. This article introduces 7 ways to improve the query speed, please refer to the reader.
When SQL queries the database, proper adherence to some principles can make the job easier, and this article lists 7 principles that can be used flexibly to help you improve your SQL queries, and of course you can consult your DBA for more information.
1. Replace Update with Case
To update a record, we immediately think of update, which is very common and many developers often overlook this principle because using update looks very natural and very logical.
Suppose you pull a record from the Customer table and you want to mark an order of more than $100,000 as "Preferred", so you'll think of updating the Customerrank column to "Preferred" with an UPDATE statement, the problem is that the UPDATE statement has a log , which means that each record will be written two times, the solution to this problem is to embed the case statement in the SQL query, before writing the "Preferred" flag to the table, it will use the Order amount conditions to check each row, meet the criteria will be updated, the performance of the upgrade is amazing.
2. Do not blindly reuse code
This problem is also very common, in the work directly with someone else to write code is a happy thing, you know that the code can query the data you need, but the problem is often some data is not what you need, but we often do not want to make changes, so the returned dataset is often a superset, It is possible to solve the problem with more than one outer join or a WHERE clause, so it is best to check it when reusing the code, and if necessary, adapt it slightly.
3. Extract only the columns you need
This question is a bit similar to 2, but this time it is the specific column specified. Maybe we feel very happy when we use SELECT *, how easy it is! If you want to write out each column name, it is too much trouble, this is a lot of people's ideas, but this idea is wrong, because this will take out the extra data columns, I have seen the code of this error countless times, there was a developer on a 120 column , the millions of rows of data tables use SELECT * Query, but he will only use 35 of the columns, which is a huge waste of resources, we recommend refusing to write select *, you want to query what, the extra return result is useless for you, although does not affect the functionality you want to implement, but has a great impact on database performance.
4, as far as possible only to query the big table
This is also a mistake that I see a lot of people, for example, a stored procedure fetches data from a large table on a millions record, and the developer wants to extract customer information that resides in California and earns more than $40,000, so it first takes customers residing in California to a temporary table, Then we look at customers with more than $40,000 in revenue, put the query results in another temporary table, and finally, he connects the two temporary tables to find out the final results.
Maybe some people think I'm kidding, right? But the fact is that someone does, this should be done in a query, but the query of two large tables.
A slightly different situation is that when multiple steps in a process require a subset of large tables, each step may have to query a large table. The way to avoid multiple queries is to persist a subset of the first query, and then point to the persisted subset of the steps that follow.
5. Using temporary tables
This problem may be slightly troublesome to solve, but the effect is obvious, in fact, in many cases you can use temporary table, through temporary table can effectively reduce the operation of large tables, if you have to connect a table to a large table, and on the large table has the conditions, then you can export the data needed in the large table in the temporary table, The temporary table is then used to connect, so that the query speed can be significantly improved. You can also use temporal tables if you have multiple queries in your stored procedure that need to be connected to the same table.
6. Pre-stored data
This is my favorite, because it is a very old technology, often overlooked, if you have a report or stored procedures need to connect large tables, early extraction of data from large tables, persisted to another table, the report can use the stored data set, so as to improve overall performance efficiency.
Not all the time you have the opportunity to take advantage of this technology, but once you can make use of it, you will find it is an effective way to save server resources.
Unfortunately, many developers are trying to avoid this technology, actually only need to create a view to solve the problem, but this method of the problem is that every report needs it to run once, but for the same report, assuming that 10 minutes before the run, and now someone to run the report, Then the connection operation to the large table can be avoided. I recommend using this technique to pre-store data for tables that are frequently queried, which can save a lot of server resources.
7. Delete and update in batches
It's also an easy-to-ignore technique to do data deletion or update operations on a large table, which can be a nightmare if you're not doing it, the problem is that both operations are single transactions, and if you need to kill them, or if they run into problems at the time of execution, they must roll back the entire transaction, which can be very long, This is why when we delete hundreds of thousands of records, if the attempt to kill the process is almost useless, these operations will also affect other transactions, do not cause a dead loop, so should be used with caution.
The solution to this problem is a small number of deletions or updates, first of all, regardless of the reason for the end of the transaction, only need to roll back a small number of rows, in addition, small batches of data written to disk, the requirements for I/O are lower, concurrency can be greatly improved.
It is also a reminder that the delete and update operations should be performed as far as possible during off-peak hours.
Summarize
Follow these methods can always receive results, but in practice, we should choose one or several of the best solutions to improve the speed of our query execution, so as to save query time and improve query efficiency.