[Speech Recording] RWP team talked about SQL optimization and recorded rwp team SQL
Speaking of SQL optimization, what is the first reaction in your mind as a reader? Index? Hint? Partition? Parameter? Execution Plan? Haha, let's talk about it. Today, Qu Zhuo from Oracle shared the overall idea of SQL optimization at the seventh data technology carnival, I hope to inspire you.
Author Profile:
Qu Zhuo (Christine), department manager of Oracle RWP in China and expert of Oracle database performance optimization, has a deep understanding of Oracle database performance optimization and has solved thorny performance problems for many domestic and foreign customers.
1Set a high goal
If you optimize an SQL statement from one hour to one minute, will you stop working? Will you consider optimizing it to 1 second?
At work, everyone is under pressure, and under pressure, it is easy to think less. The duration of running an SQL statement depends on the hardware and software environment it runs in. Whether an SQL statement can run faster is essentially:Can it make full use of hardware resources and software capabilities?.
It is very important to set a high goal for yourself for SQL optimization!
2Optimize those good SQL statements
With a high goal, we need to find those good SQL statements for optimization. So what is good SQL?
(1) valid SQL statements
The database is designed to execute SQL statements, not invalid SQL statements that report errors upon execution.
If you run an SQL statement and report a ORA error, this is an invalid SQL statement. It should not exist in your system, but should not be the optimization object.
If you run an SQL statement and report a ORA error, a failure parse is returned in the database. If you have failure parse in the AWR report, you should note that the consequences may be very serious.
(2) SQL statements that you know business meaning
In many cases, some SQL and PL/SQL stored procedures do not need to be executed at all. However, for various reasons, those SQL and PL/SQL stored procedures may have existed for a long time in the system, those who write the SQL and PL/SQL stored procedures may have been switching for a long time. For the so-called "stability", no one has moved those SQL and PL/SQL stored procedures. It makes no sense to optimize these SQL and PL/SQL stored procedures that do not need to be executed at all.
Therefore, before optimizing any SQL statement, you should first understand the business meaning of the SQL statement, determine that it actually needs to be executed, and then optimize it.
(3) Structured SQL
If an SQL statement contains the IN list and the IN list contains hundreds of values, the hundreds of values may come from another SQL statement rather than manual input. Since the number of values IN the IN list has an allowed upper limit, some SQL statements may look like the following:
If the number of hundreds, thousands, and tens of thousands of values IN the IN list is poor in SQL construction, should we first change it to a JOIN and then consider other values?
(4) No incorrect SQL statements are written.
If N tables are joined, there should be a N-1 JOIN condition.. If the JOIN condition is smaller than the N-1, cartesian join appears, and a duplicate value exists in the result set. Adding DISTINCT to the select list usually gives the SQL statement the correct result set. This is like taking the money from a bank. Actually, you only need to pay 1000 yuan, but you first get 2000 yuan and then save the remaining 1000 yuan, although the actual result is correct, you actually paid 1000 yuan.
When the amount of data processed by SQL is small, this option does not have a great impact on the response time. However, when the amount of data processed by SQL is large, this impact will be fully highlighted. In another example, if you only need to pay 1000 yuan, but you first get 10001000 yuan, and then save the remaining 10000000 yuan. At last, you will get 1000 yuan, but the bank staff will pay 10001000 yuan for the amount of time when they get the money for you, and the amount of time when they save the money back will increase by 10000000 yuan, it takes a long time for you to start your business.
The data type of the values in the WHERE condition in the SQL statement should be consistent with the data type of the corresponding column.. Otherwise, although the SQL statement does not report an error, the function is implicitly used to convert the column to the same data type as the corresponding value to execute the SQL statement. This implicit data type conversion may cause a ORA-01722 error and may cause the index of the corresponding column not to be used, this may cause situations where partition pruning is clearly available but not available, and the response time may be several orders of magnitude different.
3Give SQL a good execution environment
SQL must be executed in a good environment to achieve better performance. So what is a good execution environment?
Correct software patching is the first step in creating a good execution environment. Obviously, you have spent money to buy software. Obviously, software manufacturers have installed patches to make the software run better and faster. Why not install patches? Of course, patching is a technical activity. To correct software patches, you must follow the instructions of the software manufacturer or consult the software manufacturer.
Using the default init. ora parameter settings is also an important part of creating a good execution environment. Using the default init. ora parameter settings means that you are using the software designed by the Oracle internal R & D team. This means that you are using the software strictly tested by the Oracle internal testing team. Of course, there are some specific application software, such as Oracle's EBS, which requires modifying the init. ora parameter. This situation needs to be modified because those modifications have been strictly tested by the application software manufacturer.
If some parameters need to be modified as a temporary solution in case of a bug, you should change the corresponding parameters in time after the bug is fixed. Otherwise, the consequences may be very serious.
In addition, if you modify the init. ora parameter at will, it may cause after-sales problems.
4Optimize SQL from the perspective of Database Design
Currently, Oracle database software uses Cost Based Optimizer (CBO) and Cost-Based Optimizer.
Essentially, the optimizer is a series of algorithms. The optimizer generates an SQL Execution Plan based on input information. The entered information includes:
(1) Statistical information
The statistical information includes two aspects: the statistical information of the system and the statistical information of the actual user data.
System statistics. We recommend that you use the default settings. The statistical information of actual user data must be representative and be able to reflect the characteristics of the data.
(2) Constraints
Not null, PK, FK, UK, and other constraints. If the actual data needs to comply with the constraints, those constraints should exist in the database and the optimizer should know the existence of these constraints.
For example. Multiple tables are joined. If a table is only joined
The customer table only appears in the JOIN part, but does not appear in SELECTlist, nor in the query conditions, or in the group by and order by sections. If there is a foreign key constraint on the JOIN key (lo_custkey) in the lineorder table, the optimizer will know that the join of lo_custkey = c_custkey is always able to JOIN, in actual execution, the table 'customer 'will not be joined.
The execution plan can be as follows:
Have you kept your eyes open? the root of the customer table is not in the execution plan! The fastest JOIN you can do is not JOIN. Hahaha. This situation is called JOIN elimination. The prerequisite is the existence of related constraints.
(3) Schema Design
Schema Design, including data models, indexes, partitions, compression, and clustering (data is physically stored together based on the corresponding KEY values) has a very important impact on SQL Performance.
In some SQL statements, a table can be joined with itself for dozens of times because of poor data model design. At this time, if you only focus on SQL itself, the performance improvement that can be achieved will be very limited.
Schema Design is a university question. Every aspect can have a few orders of magnitude impact on SQL Performance. To optimize SQL statements, you must pay attention to schema design.
5Optimize SQL statements from execution perspective
To optimize SQL statements from execution perspective, consider the following aspects:
Access method: whether to Access data through indexing or full table scan.
The Join method is Nested Loop Join, Hash Join, or Merge Join.
The Join sequence is table A Join table B, then Join Table C, or vice versa.
During parallel execution, the data distribution method between the producer process group and consumer process group is hash, broadcast, or other distribution methods.
Whether the data is skewed, whether some KEY values correspond to a large amount of data, and other KEY values correspond to a very small amount of data.
With the development of technology, the value of data in enterprises has become increasingly prominent. The data technology carnival sponsored by ACoG and Yun and emo shares cutting-edge information and dry goods technologies around core technologies in the data and database fields, the road to enterprise transformation and strategic direction invite you to explore the value of data and create a better future together! The 8 th data technology carnival will kick off on March 13, November 16, 2018. It's a wonderful time to come!
Related reading:
Smart evolution of Bohai property and casualty data architecture
Smart cloud exploration and best practices in the financial industry
China Telecom's Oracle Sharding Architecture Application Case Study
Multi-tenant data integration practices from AT&T to Qinghai mobile
Meituan reviews the evolution and ideas of the database high availability architecture
Download Resources
Public Account: Data and cloud (OraNews) reply keyword acquisition
'Dedicated DTC', Dig DTC Conference PPT
'Dbalife', "DBA's Day" Poster
'Dba04', DBA Manual 4, classic chapter ebook
'Internal', Oracle RAC PPT
'122arch', Oracle 12.2 architecture Diagram
'Drawing oow', Oracle OpenWorld documents
'Prection', Lecture hall Course Materials