SQL statement optimization in the 2014 summary, and SQL statement in the 2014 Summary
I took over a small program for data statistics processing. It was logically nothing, but the data volume was a little large. Some tables had tens of millions of data .. because it is statistics, it is inevitable to connect various queries. It takes one hour to run this applet once it is written .. this is indeed a little unexpected, so proceed to the Optimization ..
1. Add an index
When you do not pay attention to this problem or when the data volume is small, it may be ignored. there is basically no difference in addition or not. However, when the data volume is large, the difference is very obvious. Without an index, the time difference in queries for data of less than 10 thousand and more than 0.1 million can be identified. note the following two points:
1) by default, we create a table, and the database automatically adds a default index to us. However, when the create table name as select statement is used to create a table, the database does not automatically create indexes. In this case, you need to manually add indexes.
2) which fields need to be indexed? Indexes must be added for fields that play a bridge (connection) role in our query statement, because the index speed can be improved only when the query condition contains an index field.
It can be said that index is the easiest way to increase the query speed. It can make a query that is several minutes completed within milliseconds, and the efficiency improvement is not half past one ..
2. narrow down the query set
Since it is a query, there must be a set to be queried, that is, the set following from. if this set can be reduced, the query Traversal Time will also be reduced a lot, and the overall query time will naturally decrease. there are two main methods:
1) create a temporary query table. if a table has a large amount of data, but we only query it in a dataset that meets certain conditions, then we can first query all the datasets that meet the specific conditions from this table and create a table. A typical application is create table xx as select XX where xx, in this way, we can get a temporary table that is much smaller than the original table, and all subsequent queries will be done from this new table ..
2) Optimize the query statement sequence. the condition statement after where is executed from the right to the left, so we can place the conditions that can remove more data to the right, so that the remaining data sets will be smaller after execution, the next condition query is faster. in the same way, when we use subqueries, removing more data from some subqueries will also improve the overall efficiency.
3) deduplication is a time-consuming operation. However, if a set is used repeatedly, deduplication will also improve the efficiency.
After optimization, the applet will be executed within 5 minutes .. it can be seen that the improvements brought about by the optimization of SQL statements are considerable .. of course, because my small program was not very good at the beginning, it was improved a lot after optimization, but it also shows that if you pay attention to the optimization of SQL statements, if you pay more attention to the processing of the program, you can avoid many performance problems to some extent ..