First, the optimizer basic knowledge
Before Oracle executes a SQL, it first analyzes the execution plan of the statement and then executes the execution plan. The execution plan for parsing statements is performed by the optimizer (Optimizer). In different cases, a single SQL may have multiple execution plans, but at some point there must be only one execution plan that is optimal and spends the least time.
I believe you will use Pl/sql Developer, Toad and other tools to see the execution plan of a statement, but you may have questions about rule, Choose, first rows, and all rows, because I was like this, At that time I also wondered why the different items were chosen, and the execution plan changed.
1. Optimization method of optimizer
Oracle Optimizer optimization method:
Rbo Way
The optimization method based on rules (rule-based optimization, referred to as RBO). When the optimizer analyzes SQL statements, it follows some of the rules that are predetermined within Oracle . For example, we usually go to the index when a column in a WHERE clause has an index.
CBO Way
Cost-based Optimization (cost-based optimization, referred to as CBO). According to the meaning of the word, it is to see the cost of the statement (cost), where the costs are mainly CPU and memory. When the optimizer determines whether to use this method, the main reference is the statistics of tables and indexes. Statistics give information about the size of the table, the number of rows, the length of each line, and so on. These statistics are not in the library at first, it is you do analyze after the appearance, a lot of time and the expiration of statistics will make the optimizer made a wrong execution plan, because we should update this information in a timely manner. In Oracle8 and later versions, theOracle Column recommends the CBO approach.
We want to understand, not necessarily go index is excellent, such as a table with only two rows of data, one IO can complete the whole table retrieval, and then go to the index when the need for two times IO, then the table to do a full table scan (scan) is the best.
The new version of Oracle is gradually abandoning support for rule, and even the rule approach, the final measure of SQL execution efficiency is how much resources are consumed by SQL execution? The cost of the optimization method, need table, index statistics, Requires regular analysis of multiple tables and indexes on a daily basis, but the statistics are also historical, sometimes not necessarily optimal, statistical information is equivalent to a person's experience, based on previous experience to determine how SQL should be executed (get optimized SQL execution path), so the specific optimization execution, the first manual analysis of SQL, See is the use of Rbo mode of consumption, or CBO consumption; The DBA's job is to make a variety of adjustments based on the current Oracle 's run-time logs, making the current Oracle operational efficiency as optimal as possible. During run time, Adopt hint to adopt the optimization method flexibly.
CHOOSE (optional)
Setting the default Oracle Optimizer allows you to use various declarations of optimizer_mode parameters in the Init.ora file, such as Rule,cost,choose,all_rows,first_rows. You will of course overwrite it at the SQL sentence level or at the session level.
In order to use the cost-based optimizer (CBO, cost-based Optimizer), you must frequently run the Analyze command to increase the accuracy of object statistics in the database (objects statistics).
If the optimizer mode of the database is set to selectivity (CHOOSE), the actual Oracle Optimizer pattern will be related to whether the Analyze command has been run. If the table has been analyze, the optimizer pattern will automatically become a CBO, whereas the database will use the rule-form optimizer .
By default,Oracle uses the Choose Optimizer , and in order to avoid unnecessary full table scans (scan), you must try to avoid using the Choose Optimizer , A rule-based or cost-based Oracle Optimizer is used directly.
2, Optimizer optimization mode (optermizer mode)
The optimization model includes the four Rule,choose,first rows,all rows, which we mentioned above. I would like to explain the following:
Rule: Needless to say, it's a rule-based approach. (Rbo optimization method)
Choolse: That's what we should be watching, and that's the way Oracle uses it by default. Refers to when a table or index has statistics, the way of the CBO, if the table or index does not have statistical information, the table is not particularly small, and the corresponding columns indexed, then go to the index, walk Rbo way.
By default,ORACLE uses the Choose Optimizer , and in order to avoid unnecessary full table scans (scan), you must try to avoid using the Choose Optimizer , A rule-based or cost-based optimizer is used directly.
First rows: It's similar to the Choose approach, and the difference is that when a table has statistics, it will be the quickest way to return the initial rows of the query, reducing the response time overall. (CBO optimizes the way to provide one of the fastest response times, depending on system requirements, usage)
All rows: That is what we mean by cost, when a table has statistics, it returns all the rows of the table in the quickest way, increasing the throughput of the query as a whole. There is no statistical information to go the rule-based approach. (CBO optimization, to provide maximum throughput, is to maximize the total execution) [Page]
First rows and all rows are conflicting. If you want to return to the user as soon as possible, there is no way to deliver more results, which is to return the rows (or records) that were retrieved initially, and all rows is to return all results to the user as much as possible, The user will not get the return result very quickly. Just like the empty car can run very fast, heavy loading can only run slowly;
3, how to set the choice of which optimization mode
A, instance level
We can set Optimizer_mode=rule, Optimizer_mode=choose, Optimizer_mode=first_rows, OPTIMIZER_ in the Init<sid>.ora file Mode=all_rows to choose 3 Four ways, if you do not set the Optimizer_mode parameter is the default choose this way.
Both Init.ora and Init<sid>.ora are in the $Oracle_home/dbs directory and can be used to find $Oracle_home-name init*. Ora View the init files in the directory.
Init.ora is valid for all instances; Init<sid>.ora is valid only for the specified instance.
B, Sessions level
Set by sql> ALTER session set optimizer_mode=<mode>;. You will overwrite the optimized mode set by Init.ora,init<sid>.ora, or you can use hint to force the selected optimization mode in the SQL statement as follows:
C, statement level
These need to use hint, such as:
Sql> SELECT A.userid,
2 B.name,
3 B.depart_name
4 from Tf_f_yhda A,
5 Tf_f_depart B
6 WHERE A.userid=b.userid;
Here, hint is used to enforce the rule based Optimization model.
Hint grammar, end, and middle fill in the mandatory optimization mode.
4, why sometimes a table of a field is clearly indexed, when the observation of some language implementation plan does not go to the index. How to solve it.
A, do not go to the index generally have the following several reasons
♀ you're using the All_rows method at the instance level.
♀ the statistics of your table (most probable cause)
♀ your table is small, as mentioned above, theOracle Optimizer does not think it is worthwhile to go to the index.
B, Solution
♀ can modify the Optimizer_mode parameter in Init<sid>.ora, change it to rule or choose, and restart the database. You can also use the hint mentioned in 4.
♀ deletion of statistical information
Sql>analyze table table_name Delete statistics;
♀ table Small not go index is right, do not need to tune.
other reasons for not taking the index:
1. Set up the composite index, but the query predicate does not use the first column of the combined index, there is an index SKIP scan concept here.
2. An index is established on a table column that contains a null value, and no index is used when using the Select COUNT (*) from table.
3, use the function on the index column does not use the index, if you must use the index can only establish a function index.
4. Indexes are not used when the indexed columns are implicitly type-converted. For example: SELECT * from t where Indexed_column = 5, while the Indexed_column column is indexed but the type is character type, Oracle produces an implicit type conversion, and the converted statement is similar to select * From T where To_number (indexed_column) = 5, it is similar to CASE3 when the index does not go. There are similar problems with date conversions, such as SELECT * from T where trunc (date_col) = Trunc (sysdate) date_col as indexed columns, so that writing does not go to the index, it can be written as a select * from T where date _col >= trunc (sysdate) and Date_col < Trunc (sysdate+1), this query is indexed.
5, not all cases using the index will speed up the query speed, the full scan table is sometimes faster, especially when the amount of data in the query is larger than the overall table, because scan table is more than read, when the Oracle optimizer Do not use an index without immediate coercion when you do not choose to use it, to prove that using the index does make it easier to use the mandatory index.
6, <>
7, like ' ' hundred semicolon in front.
5, Other related
A, how to see whether a table or index is statistical information
Sql>select * from User_tables 2 WHERE table_name=<table_name>
3 and num_rows is not null;
Sql>select * from User_indexes
2 WHERE table_name=<table_name>
3 and num_rows is not null;
B, if we first use the CBO approach, we should update the table and index statistics in time to avoid the form of unrealistic implementation plans.
sql> ANALYZE TABLE table_name COMPUTE STATISTICS;
sql> ANALYZE INDEX index_name estimate STATISTICS;
Two, 26 parameters of the optimizer
Impact system Performance Class variable parameters
(1) Checkpoint_process This parameter is set to TRUE or false based on whether the checkpoint is required. The checkpoint process (CHPT) establishes a static point when information about all buffers is written to disk. Making a mark in the archive log file indicates that a checkpoint has occurred. Checkpoints occur when the archive log is converted or when the number of blocks defined by the Log_checkpoint_interval is reached. When this parameter is set to True, the background process chpt to work. During the checkpoint, if the performance of the log write process (LGWR) is reduced, the CHPT process can be improved.
(2) Db_block_checkpoint_batch the value of this parameter is set to a large size, you can speed up the completion of the checkpoint. When the specified value is larger than the parameter db_block_checkpoint_batch, the effect is the same as the specified maximum value.
(3) Db_block_buffers This parameter is the number of database blocks that can be buffered in the SGA. This parameter determines the size of the SGA, which is a decisive factor for database performance. If you take a larger value, you can reduce the number of I/O, but requires a large memory space. The size of each buffer is determined by the parameter db_block_size.
(4) Db_block_size This parameter represents the size of the Oracle database block, in bytes, with a typical value of 2048 or 4096. This value cannot be changed once it is set. It affects the maximum value of the freelists parameter of the table and index.
(5) Db_files This parameter is the maximum number of data files that can be opened when the database is run.
(6) Db_file_multiblock_read_count This parameter represents the maximum number of blocks that can be read by an I/O operation at the time of the sequential scan, depending on the operating system, and the value of 4 to 16 or 32 is better.
(7) D1screte_transaction_enabled This parameter implements a simpler, faster rollback mechanism to improve the performance of certain transaction types. When set to true, some types of transactional performance can be improved.
(8) Log_archive_buffer_size the value of this parameter depends on the operating system, which, together with the Log_archive_buffer parameter, is used to adjust the running of the archived logs so that it runs as fast as possible, but not fast to degrade performance. You need to increase the value of these parameters only if you are archiving directly to a tape device, and the redo log buffer waits for the archive log buffer to become available.
(9) Log_archive_buffer This parameter specifies the number of buffers to use for archived logs.
(10) Log_buffer This parameter indicates the number of bytes allocated to the log buffer in the SGA, which reduces the number of log I/O when the value of the parameter is large. It is not advisable to use a value greater than or equal to 64K for a busy system. Default value-4 times times the size of the database block.
(11) Log_checkpoint_timeout This parameter indicates the time interval between two checkpoints, and a time based checkpoint is not allowed if 0 o'clock is specified.
(12) Log_checkpoint_interval This parameter is used to determine how often the checkpoint process is performed. This value sets the number of redo buffer blocks processed before the checkpoint.
(13) Log_files This parameter specifies the number of log files that the database can open during run time. You can reduce this value if you need a larger SGA space without multiple log files.
(14) Log_simultaneous_copies This parameter is the maximum number of log buffer replica latches, which is used to write log entries at the same time. To improve performance, you can set this parameter to twice times the number of CPUs to a single process system, which is mostly set to 0, when the latch is disconnected.
(15) Log_small_entry_max_size This parameter is used in conjunction with the Log_simultaneous_copies parameter. If the log entry is greater than this, the user process frees the log replication latch after allocating space to the buffer and obtaining the log replication latch.
(16) Optimizrer_mode if the value of this parameter is rule, the Oracle Optimizer chooses rule-based optimizations; If set to cost and there is statistics in the data dictionary, the Oracle optimizer Select the cost Based optimization method for optimization.
(17) Sequence_cache_entries This parameter indicates the number of cache sequences that can be cached in the SGA for direct access. The cache area is managed based on the least recently used (LRU) algorithm. Higher concurrency can be achieved if this value is set to a higher level.
(18) Sequence_cache_hash_buckets This parameter is used to speed up viewing the number of bucket addresses for the latest sequence of recent requests in the buffer zone, which is 8 bytes per barrel address. The buffer area is arranged in a hash table, which should be a prime number.
(19) Serializeable This parameter is used to ensure consistency of duplicate reads. When it is set to true, the query ensures that table-level reads are consistent to prevent modifications before the query is submitted.
(20) Shared_pool_size This parameter specifies the size of the shared pool, including shared cursors and stored procedures. In a multiuser system, a larger shared_pool_size value improves the execution performance of SQL statements, but a smaller value saves memory.
(21) Small_table_threshold This parameter determines the number of buffers used in the SGA for scanning, and if the number of tables is less than that value, the table can be read into the cache area as a whole. If the table is greater than this value, reuse the buffer immediately. The default value is generally used to make the best performance.
(22) Sort_area_tetained_size This is the maximum number of session memory for memory sorting. When the last-row is presented from the sort space, the memory is freed. If you want large memory to be sorted, assign a temporary segment, and the sort can be done on the disk. The maximum amount used for sorting can be specified by sort_area_size, without this parameter. You can assign more than one sort space of the same size, but it is generally required for complex queries.
(23) Sort_area_size This parameter specifies the maximum amount of PGA memory, in bytes, required for an external sort (disk). When the sort row is written to disk, the memory is freed. Increasing the value of this parameter can improve the sorting efficiency. This parameter is not normally adjusted unless the order size is large.
(24) Sort_spacemp_size This parameter is adjusted only when the amount of the order is large. You can set this parameter in the following way so that the sorting best uses disk space.
(25) SQLTrace This parameter is set to True, it can be tracked for improved performance information. Because tracing increases overhead, it is generally only true when information is collected. You can overwrite it with the alter session command when you are actually using it.
(26) Transaction This parameter sets the maximum number of concurrent transactions. If this value is large, you need to increase the SGA space and the number of rollback segments allocated. Recursive transactions can be allowed when the default value is greater than the process.
There are two ways to optimize Oracle Optimizer: rule-based optimization RBO(rule-based optimization) and cost-based optimization CBO(cost-based optimization), Among them the Rbo way has been abandoned in the oracle10g.
Rbo is a rule-based optimization approach in which the optimizer follows some rules that oracle internally predetermined when parsing SQL statements, such as indexing when a column in a WHERE clause has an index. Rbo chooses access plans based on the level of access paths and access paths available, and the higher the access path runs the slower the SQL speed, and if there are multiple paths, the lower-ranked path is selected.
CBO is a cost-or cost-optimized approach, cost or cost that is memory and CPU consumption. Optimization selects a cost-only access path based on the available access path, table, or index statistics. Therefore, in time to update the statistics, to avoid outdated statistics so that the optimizer select a wrong execution plan.
The CBO consists primarily of query converters (queries Transformer), evaluator (estimator), and plan Builder (planning generator).
First, query optimizer , because the form of query SQL statement may affect the resulting execution plan, query converter is to change the form of query statement to obtain more efficient execution plan, Oracle mainly provides four kinds of conversion technology:
1. View Mergin: When a query statement contains a view, it produces an independent "view query block" that affects the integrity of the statement, resulting in an undesirable execution plan. A view merge removes the view query block, merges the view into a whole query block, and improves the integrity of the execution plan.
2. Predicate propulsion (predicate pushing): The technique is to push the predicate of a view that cannot be merged into a view query block. These predicates are usually indexable or have strong filtering.
3. Non-nested subqueries (subquery unnesting): Subqueries and views are put into separate query blocks, and the query converter converts most subqueries into joins into the same query block, which cannot be merged in an efficient manner.
4. Materialized view query rewriting : When Query_rewrite_enabled=true, the converter looks for the materialized view associated with the statement and overwrites the statement.
Second, the evaluator evaluates the total cost of the plan by calculating three values: selectivity (selectivity), cardinality (cardinality), cost.
1. Selectivity : is a number greater than 0 less than 1, 0 means no records are selected, and 1 indicates that all records are selected. Statistical information and histograms are related to the accuracy of selectivity values. For example: Name= ' Davis ', if no statistical information evaluator would specify a default selectivity value based on the predicate being used, the evaluator would always assume that the selectivity of the equality predicate is smaller than the inequality predicate, and that the selectivity value is 1/count if there is statistical information and no histogram exists. Distinct name); If there is a statistic, there is a histogram, and the selectivity value is count (name) where name= ' Davis '/count (name) where name is not NULL.
2. Cardinality : Usually the number of rows in a table is called the base cardinality (base cardinality), and the number of rows left after filtering the conditions in the where is called the "valid cardinality" (Effective cardinality) ; The number of result set rows produced after the join operation is called the connection cardinality (join cardinality), and the number of rows after a field distinct is called the distinct cardinality, and so on.
3. Cost : is the unit that measures resource consumption. It can be understood as consuming I/O, CPU, memory for performing table scans, index scans, connections, sorting, and so on.
third, the program builder, Finally, the role of the Plan Builder is to generate a large number of execution plans, and then select one of the lowest overall cost, and if it finds that the cost of the current execution plan is already low, it will stop experimenting, and it will continue to experiment with other execution plans if the cost of the current plan is high.