Optimization
For years, commercial database vendors have worked to improve query performance. Despite this sustained effort, relational database management systems (RDBMS) have not shown a significant improvement in performance. Users continue to endure poorly performing SQL statements, and database experts continue to struggle in the quagmire of SQL statement optimization.
This article explains how the database query optimizer works and why the problem persists, and then explores in depth where database query optimization is headed.
Database query optimizer
The database query optimizer is an integral part of the RDBMS server. Under cost-based optimization, its task is to optimize an SQL statement by generating alternative execution plans and selecting the one with the lowest estimated cost. It plays a crucial role in the performance of SQL statements.
When an SQL statement is sent to the RDBMS server, it is parsed and passed to the query optimizer. The optimizer performs query rewriting and expression evaluation to produce candidate execution plans. The number of candidate plans depends on the size of the plan space defined in the RDBMS. A cost estimate is calculated for each candidate, and the plan with the lowest estimated cost is chosen to execute the SQL statement. This approach has two unresolved problems: the optimizer cannot generate every possible execution plan, and its cost estimates are inaccurate.
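The effect of a bounded plan space can be illustrated with a toy model. The sketch below is not how any real optimizer is implemented: the row counts, the `selectivity` value, the cost formula, and the names `estimated_cost` and `choose_plan` are all assumptions made for illustration. It costs only the first few join orders and picks the cheapest of those.

```python
from itertools import permutations

# Hypothetical row counts; a real optimizer would read these from statistics.
ROW_COUNTS = {"orders": 1_000_000, "customers": 50_000, "regions": 20}


def estimated_cost(join_order, selectivity=0.001):
    """Toy cost model: total rows flowing through a left-deep join pipeline."""
    rows = ROW_COUNTS[join_order[0]]
    cost = rows
    for table in join_order[1:]:
        rows = rows * ROW_COUNTS[table] * selectivity  # estimated join output
        cost += rows
    return cost


def choose_plan(tables, plan_space_limit):
    """Cost only the first `plan_space_limit` join orders; return the cheapest."""
    candidates = list(permutations(tables))[:plan_space_limit]
    return min(candidates, key=estimated_cost)


tables = ["orders", "customers", "regions"]
narrow = choose_plan(tables, plan_space_limit=2)  # only 2 of the 6 join orders
full = choose_plan(tables, plan_space_limit=6)    # the whole toy plan space
print(narrow, estimated_cost(narrow))
print(full, estimated_cost(full))
```

Under these assumed numbers, the narrow search settles on a join order whose estimated cost is roughly twice that of the best order found by the full search. The cheaper plan exists but is never examined. The second problem, inaccurate estimates, would correspond to `estimated_cost` itself being wrong.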
A new SQL optimization concept: external SQL rewriting
It is common for an experienced programmer to make a problematic SQL statement run thousands of times faster. Drawing on knowledge of the specific database's characteristics, the developer restructures the syntax of the SQL statement and so helps the query optimizer arrive at a better execution plan, something the optimizer cannot achieve on its own. This raises the question of whether the same result can be achieved by having a computer simulate what the human expert does.
An external SQL rewriter (ESR) produces SQL statements that are semantically equivalent to the source statement but differ in syntax, taking the characteristics of the existing database into account. The rewritten statement then replaces the source SQL statement in the program's source code. When the program runs with the new statement, the query optimizer is steered toward a better-performing execution plan. Experienced programmers and database administrators have in fact long been hoping for such a tool. What is new about the concept is that, by building human knowledge and intelligence into a computer algorithm, it can break through the limits of the plan space and sidestep inaccurate cost estimates when searching for the best SQL statement.
The first advantage of ESR is that its optimization does not happen in real time, so the time available for rewriting and optimizing an SQL statement is theoretically unlimited. It can therefore try far more candidates than the database's internal query optimizer can. Users may spend up to a day working on the performance problems of a critical SQL statement; ESR can spend that time searching for possible execution plans. Another advantage is that ESR improves SQL performance without burdening the database query optimizer with additional statistics, which makes it an easy and safe way to improve the performance of the whole database. Finally, ESR is more flexible than the query optimizer inside the database: the SQL rewrite rule repository can be expanded easily, and the search space can grow with hardware upgrades.
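As a concrete illustration of two statements that are semantically equivalent but differ in syntax, the sketch below uses Python's standard `sqlite3` module as a stand-in for an RDBMS. The `emp` and `dept` tables and their contents are hypothetical. The join form is equivalent to the `IN` form here only because `dept.id` is a primary key, so the join cannot duplicate employee rows. Whether the two plans actually differ depends on the database and its version, so the code prints them rather than assuming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE emp  (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO dept VALUES (1, 'Paris'), (2, 'Oslo');
    INSERT INTO emp  VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 1);
""")

# Source statement and a hand-written rewrite with the same meaning.
original = ("SELECT name FROM emp "
            "WHERE dept_id IN (SELECT id FROM dept WHERE city = 'Paris')")
rewritten = ("SELECT e.name FROM emp e "
             "JOIN dept d ON d.id = e.dept_id WHERE d.city = 'Paris'")


def rows(sql):
    """Return the result rows in a canonical (sorted) order."""
    return sorted(conn.execute(sql).fetchall())


def plan(sql):
    """Return the human-readable steps of SQLite's chosen plan."""
    return [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql)]


same_result = rows(original) == rows(rewritten)
print("same result:", same_result)
print("original plan: ", plan(original))
print("rewritten plan:", plan(rewritten))
```

Comparing results on sample data is only a sanity check; it does not prove equivalence for all data. An ESR would have to guarantee equivalence through the rewrite rules themselves.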
How does ESR work?
For ESR to optimize SQL statements, it must get around the obstacles created by the limitations of the database's internal query optimizer. It must also have the intelligence to rewrite SQL statements, and it must know the characteristics of the target database in order to produce candidate SQL statements. These requirements are hard to meet, but developing such an algorithm is not impossible. In the domestic market, a fairly mature product of this kind, LECCO Technology's LECCO SQL Expert, has already been launched, with the Ying Hua Construction company acting as agent.
An SQL statement can be entered directly, or problematic SQL statements can be captured by scanning the source code. When ESR receives an SQL statement, it applies a recursive transformation algorithm to rewrite the source statement, producing every possible semantically equivalent candidate statement. The recursive SQL transformation engine, which has built-in artificial intelligence and SQL transformation rules, restructures the SQL statement according to the characteristics of the specific database.
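The recursive rewriting idea can be sketched in a few lines. This is a toy under stated assumptions, not the algorithm of any actual product: the two regex-based rules, the `RULES` list, and the function names are hypothetical, and the patterns only handle simple one-word operands. Each rule returns a rewritten statement when its precondition matches and `None` otherwise. The engine recurses on every new statement until no rule yields anything unseen.

```python
import re


def between_to_range(sql):
    """x BETWEEN a AND b  ->  (x >= a AND x <= b); None if not applicable."""
    m = re.search(r"(\w+) BETWEEN (\w+) AND (\w+)", sql)
    if not m:
        return None
    col, lo, hi = m.groups()
    return sql[:m.start()] + f"({col} >= {lo} AND {col} <= {hi})" + sql[m.end():]


def single_in_to_eq(sql):
    """x IN (v)  ->  x = v, for a single-value list; None if not applicable."""
    m = re.search(r"(\w+) IN \((\w+)\)", sql)
    if not m:
        return None
    return sql[:m.start()] + f"{m.group(1)} = {m.group(2)}" + sql[m.end():]


RULES = [between_to_range, single_in_to_eq]


def rewrite_all(sql, seen=None):
    """Recursively apply every rule, collecting all distinct rewrites."""
    if seen is None:
        seen = set()
    if sql in seen:
        return seen          # already explored; stops infinite recursion
    seen.add(sql)
    for rule in RULES:
        candidate = rule(sql)
        if candidate is not None:
            rewrite_all(candidate, seen)
    return seen


source = "SELECT * FROM t WHERE a BETWEEN 1 AND 9 AND b IN (5)"
variants = rewrite_all(source)
for v in sorted(variants):
    print(v)
```

With two independent rules that each apply once, the source statement yields four variants: the original, each rule applied alone, and both applied together. A real engine would work on a parsed syntax tree rather than on text, and would carry far more rules.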
Scalable plan space
User-defined quotas let users control the size of the search space. If the user is not satisfied with the results produced within the defined limit, or if no better-performing SQL statement is found, the user can enlarge the search space until more candidate SQL statements turn up. The ESR plan space is therefore far more flexible than that of the database query optimizer. Another thoughtful feature is that ESR can internally eliminate SQL statements whose execution plans duplicate one already found. This not only avoids the cost of repeated attempts but also ensures that every rewritten SQL statement kept has a different execution plan and therefore potentially different performance.
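The quota and the duplicate-plan filter can be sketched as follows, again using `sqlite3` as a stand-in. The `emp` table, the index name, and the functions `plan_signature` and `distinct_plans` are hypothetical, and treating the quota as "the number of candidates examined" is an assumption about how such a limit might work. Candidates whose plan text matches one already seen are dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    CREATE INDEX emp_dept ON emp (dept_id);
""")


def plan_signature(sql):
    """Use the text of SQLite's plan steps as a hashable plan identity."""
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return tuple(r[-1] for r in plan_rows)


def distinct_plans(candidates, quota):
    """Examine at most `quota` candidates; keep one statement per plan."""
    kept, seen = [], set()
    for sql in candidates[:quota]:
        signature = plan_signature(sql)
        if signature not in seen:
            seen.add(signature)
            kept.append(sql)
    return kept


candidates = [
    "SELECT name FROM emp WHERE dept_id = 1",
    "select name from emp where dept_id=1",        # same plan, different text
    "SELECT name FROM emp WHERE dept_id + 0 = 1",  # expression hides the index
]
kept = distinct_plans(candidates, quota=3)
for sql in kept:
    print(sql, "->", plan_signature(sql))
```

Raising `quota` corresponds to the user enlarging the search space. The filter means only statements with a new plan go forward to any later, more expensive step.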
Although ESR enlarges the plan space and produces more candidate SQL statements, it faces the same problem of inaccurate cost estimates as the database's internal query optimizer. Without actually executing every rewritten statement, it is not possible to know which candidate performs best. The practical approach is therefore to test-run all the candidate SQL statements. The ESR example in Figure 2 uses this method and provides a test option for finding the SQL statement with the best performance. The user can choose when to start the test and how long it is allowed to run. Just as important, the entire optimization process does not require the user to have a database expert's experience.
As we know, different SQL syntax can cause the database query optimizer to produce different execution plans, so ESR must also be able to simulate the manual rewriting of SQL statements. Recursive SQL syntax transformation is designed precisely to simulate the way a human transforms SQL. It combines a series of useful transformation rules and rewrites the SQL statement by triggering them one at a time. The rules are independent of one another; each acts like a closed gate that opens only when all of its necessary conditions are met. This ensures that the rewritten SQL statement is semantically equivalent to the original.
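The test-run step, executing every candidate and keeping the fastest within a user-set time limit, might look like the sketch below. The table `t`, its data, the candidate statements, and the function `run_trials` are assumptions for illustration. The result comparison against the first candidate is only a sanity check on this sample data, and single-run timings on a tiny in-memory table are noisy.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
conn.executemany("INSERT INTO t (v) VALUES (?)",
                 [(i % 100,) for i in range(20000)])


def run_trials(candidates, time_budget_s):
    """Run candidates in order until the budget is spent.

    The first candidate is treated as the source statement and always runs.
    Returns (fastest statement, {statement: elapsed seconds}).
    """
    deadline = time.perf_counter() + time_budget_s
    reference = None
    timings = {}
    for sql in candidates:
        if timings and time.perf_counter() > deadline:
            break                       # budget spent; skip the rest
        start = time.perf_counter()
        result = sorted(conn.execute(sql).fetchall())
        elapsed = time.perf_counter() - start
        if reference is None:
            reference = result          # result of the source statement
        elif result != reference:
            continue                    # differs on this data; discard
        timings[sql] = elapsed
    best = min(timings, key=timings.get)
    return best, timings


candidates = [
    "SELECT COUNT(*) FROM t WHERE v BETWEEN 10 AND 19",
    "SELECT COUNT(*) FROM t WHERE v >= 10 AND v <= 19",
    "SELECT COUNT(*) FROM t WHERE v IN (10,11,12,13,14,15,16,17,18,19)",
]
best, timings = run_trials(candidates, time_budget_s=5.0)
print("fastest:", best)
```

A production tool would repeat each run several times, control for caching, and run against representative data, since a single measurement can easily pick the wrong winner.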
The prospects for ESR technology
As hardware improves, ESR can incorporate more SQL transformation rules to handle new database features and more complex SQL statements; in theory, there is no limit to the number of rules. In the future, as database vendors become more open to letting users influence the optimizer, ESR will gain more control over the optimization strategy. There is no doubt that ESR will play an increasingly important role in sharing the workload of the database query optimizer.