SQL Index Optimization

Source: Internet
Author: User
Tags case statement create index

Preface
There are many ways to optimize the database, in the application layer, mainly based on index optimization. Based on the actual work experience, the author has made some expansion on the basis of the existing methods, summed up the optimization of the index-based SQL statement 18 Palm, hoping that one day you will be able to use a palm to tame the "dragon" in the service business
Master
Create the necessary indexes
This teaching of the Dragon 18 Palm, the master's only sentence: the establishment of the necessary index, which is the back of the Dragon 18 palm of the internal strength of the foundation. This seems to be easy and practical but difficult. It is difficult to determine which indexes are necessary and which are unnecessary. The final criterion for judging is whether these indexes are useful for our database performance. Specifically to the method, you must be familiar with all the SQL statements in the database application, from which you can count some of the common SQL that might affect performance, analyze and summarize the fields and their combinations as the WHERE condition clauses; on this basis, it is possible to preliminarily determine which fields of the tables should be indexed. Second, you must be familiar with the application. You must know which tables are tables with frequent data operations, which tables are often connected to other tables, which tables have large amounts of data, and for tables with large data volumes, the data distribution of each field, and so on. These tables, which meet the above criteria, must be focused, because the indexes on those tables will have a significant impact on the performance of the SQL statement. However, the following is a summary of a descent Dragon 18 palm internal strength of the basic foundation, indexing commonly used rules are as follows:

1, the table's primary key, the foreign key must have the index;

2, the data volume of more than 300 of the table should be indexed;

3. Tables that are often connected to other tables should be indexed on the connection field;

4. Fields that often appear in the WHERE clause, especially for large tables, should be indexed;

5, the index should be built on the field of high selectivity;

6, the index should be built on the small section, for large text fields or even long fields, do not build index;

7, the establishment of composite index needs careful analysis; try to consider using single-field indexes instead:

A, the correct choice of composite index of the main column field, generally is a better choice of fields;

B, how many fields of a composite index often appear in the WHERE clause at the same time? Is there very little or no single-field query? If it is, you can create a composite index, otherwise consider the single-field index;

C, if the composite index contains a field that often appears separately in the WHERE clause, it is decomposed into multiple single-field indexes;

D, if the composite index contains more than 3 fields, then carefully consider its necessity, consider reducing the composite field;

E, if the existing single-field index, and the number of composite indexes on these fields, you can generally delete the composite index;

8, frequent data operation of the table, do not set too many indexes;

9. Delete useless indexes and avoid negative impact on execution plan;

These are some common criteria for establishing an index. Word, the establishment of the index must be cautious, the need for each index should be carefully analyzed, to establish the basis. Because too many indexes and inadequate, incorrect indexes are not good for performance: Each index established on the table increases the storage overhead, and the index increases processing overhead for insert, delete, and update operations. In addition, too many composite indexes, in the case of single-field index, generally have no value; Conversely, it also reduces performance when data is being deleted, especially for tables that are frequently updated, with greater negative impact.

The first palm avoids the action on the column


Any action on a column can result in a full table scan, where the so-called operations include database functions, calculation expressions, and so on, to move the operation to the right of the equation whenever possible, or even to remove the function.


Example 1: The columns in the following SQL conditional statements are properly indexed, but the execution speed is very slow in the case of 300,000 rows of data:


SELECT * FROM record where SUBSTRB (cardno,1,4) = ' 5378 ' (13 seconds)


SELECT * from record where amount/30< 1000 (11 seconds)


SELECT * FROM record where TO_CHAR (Actiontime, ' yyyymmdd ') = ' 19991201 ' (10 seconds)


Because any action on a column in the WHERE clause results in a row-wise calculation of the SQL runtime, it has to perform a table scan without using the index above the column, and if the results are available at query compile time, it can be optimized by the SQL optimizer, using the index, avoiding table scans, Therefore, rewrite the SQL as follows:


SELECT * from the record where Cardno like ' 5378% ' (< 1 seconds)


SELECT * FROM record where amount < 1000*30 (< 1 seconds)


SELECT * from record where actiontime= to_date (' 19991201 ', ' YYYYMMDD ') (< 1 seconds)


The difference is very obvious!


The second palm avoids unnecessary type conversions


It is important to avoid potential data type conversions as much as possible. If you compare character data with numeric data, Oracle automatically converts the character type with the To_number () function, which results in a full table scan.


Example 2: Column col1 in table Tab1 is a character type (char), the following statement has a type conversion:


Select Col1,col2 from Tab1 where col1>10,


Should be written as: Select Col1,col2 from Tab1 where col1> ' 10 '.


Third Palm increase the scope limit of the query


Increase the scope of the query to avoid a full range of searches.


Example 3: The following query table record in time Actiontime less than March 1, 2001 data:


SELECT * FROM record where Actiontime < To_date (' 20010301 ', ' yyyymm ')


The query plan indicates that the above query has a full table scan of the table, and if we know that the earliest data in the table is January 1, 2001, then you can add a minimum time to make the query within a full scope. Modify the following: SELECT * from record where


Actiontime < To_date (' 20010301 ', ' yyyymm ')


and Actiontime > To_date (' 20010101 ', ' yyyymm ')


The latter SQL statement will take advantage of the index on the Actiontime field to improve query efficiency. Replace ' 20010301 ' with a variable, which can be more than half the chance to improve efficiency, depending on the probability of the value being taken. Similarly, for queries that are larger than a certain value, you can add "and column name <max (maximum)" in the WHERE clause if you know the current maximum possible value.


Try to remove "in", "or" from the palm of the four


WHERE clauses with "in", "or" often use worksheets to invalidate indexes, and if you do not produce a large number of duplicate values, consider taking the sentence apart; the disassembled clause should contain an index.


Example 4:select count (*) from stuff where id_no in (' 0 ', ' 1 ') (23 seconds)


You might consider separating the OR clauses:


Select COUNT (*) from stuff where id_no= ' 0 '


Select COUNT (*) from stuff where id_no= ' 1 '


Then make a simple addition, the query is faster than the original SQL statement.

Try to remove the "<>" from the palm

Remove the "<>" as much as possible, avoid full-table scans, and modify the "OR" if the data is an enumeration value with a fixed range of values.

Example 5:

UPDATE serviceinfo SET state=0 WHERE state<>0;

Because the above statement contains "<>", the execution plan uses a full table scan (table Accessfull) and does not use the index on the state field. In practice, because of the limitations of business logic, the field state is an enumeration value and can only be equal to 0, 1, or 2, and the value equals =1,2, so you can remove the "<>" and use the index to improve efficiency.

Modified to: UPDATE serviceinfo SET state=0 WHERE state = 1 OR state = 2. Further modifications can be referred to the 4th method.

The six palms remove is null and is not NULL in the WHERE clause

The is null and is not NULL in the WHERE clause will not use the index instead of a full table search, so it is necessary to remove the is null and is not NULL in the WHERE clauses by changing the query mode and discussing the situation.

Seventh Palm Index improves query efficiency when data distribution is uneven

Indexes can still be used to improve efficiency when the selectivity of the index is low, but the value distribution of the data varies widely. A, the data distribution is not uniform in the special case, the selective index is also to be created.

The amount of data in table serviceinfo is large, assuming there are 1 million rows, one of which has a field disposalcourseflag, and the value range is the enumeration value: [0,1,2,3,4,5,6,7]. According to the rules established by the previous index, "a low-selectivity field should not be indexed, the field has only 8 values, the index value is highly repeatable, the index selectivity is significantly lower, and therefore no index is built." However, the distribution of data values on this field is very special, as shown in the following table:


Range of Values 6 7

Percentage of total data 1% 98% 1%


Moreover, in the common query, the query disposalcourseflag<6 the situation is many and frequent, undoubtedly, if can establish the index, and is applied, then will greatly improve the query efficiency of this situation. Therefore, we need to make an index on this field.

Eighth Palm use hint to force the specified index

Use hint to force a specified index when the Oracle Optimizer is unable to use a reasonable index.

Continuing with the above 7 example, the Oracle default determines that the values of the columns in the table are evenly distributed across all data rows, that is, at 1 million data volumes, each of the Disposalcourseflag values has 125,000 rows corresponding to them. Assuming that the SQL search condition disposalcourseflag=2, using the index on the Disposalcourseflag column for data search efficiency, often not higher than the full table scan, Oracle therefore "blind" to the index, so that in the query path selection, Use indexes on other fields or even full table scans. According to our above analysis, the distribution of data values is very special, serious uneven. In order to use the index to improve efficiency, at this time, on the one hand can be separately on the field or the table with the Analyze statement analysis, to collect enough statistics on the column, so that Oracle can use the index when querying higher selectivity, on the other hand, you can take advantage of the hint hint, after the SELECT keyword, Add the "/*+index (table name, index name) */" method to force the Oracle Optimizer on that index.

For example: SELECT * from ServiceInfo where disposalcourseflag=1;

The above statement, the actual execution of Oracle with a full table scan, plus the blue hint section, used to index query. As follows:

Select/*+ INDEX (serviceinfo,ix_s_disposalcourseflag) */*

from ServiceInfo where disposalcourseflag=1;

Note that this approach can make code maintenance more difficult, and after the name of the index on that field is changed, the hint code for all the specified indexes must be synchronized, otherwise the hint hint will be ignored by Oracle.

Nineth Palm Masking Useless index

Continuing with the above 8 example, there are queries involving disposalcourseflag=6 in the actual query, and it would be very unwise and inefficient to use the index on that field at this point. So in this case, we need to mask the index in a special way so that Oracle chooses the index on the other field. For example, if the field is numeric, just after the field name of the expression, add "+ 0" for the character type and the string above: "| |" ""

such as: SELECT * from serviceinfo where disposalcourseflag+ 0 = 6 and Workno = ' 36 '.

However, do not screen out the index that is used, otherwise it will produce a low-efficiency full-table scan.
The tenth Palm decomposes complex queries, substituting constants for variables

For complex where condition combinations, where contains multiple indexed fields, consider using an if statement for discussion, and remove unnecessary external parameter conditions to reduce complexity so that the indexes on different fields are used in different situations.

Proceed to the above 9 example for the containing

Where (Disposalcourseflag < V_disposalcourseflag) or (V_disposalcourseflag is null) and .... Query, (here V_disposalcourseflag is an input variable, the value range may be [null,0,1,2,3,4,5,6,7]), you can consider the situation with the IF statement for discussion, similar to:

IF V_disposalcourseflag =1 Then

Where Disposalcourseflag = 1 and ....

elsif V_disposalcourseflag =2 Then

Where Disposalcourseflag = 2 and ....

。。。。。。

11th palm like clause as far as possible front-end matching

Because the like parameter is used very frequently, it is highly efficient to use the index for the LIKE clause.

Example 6:select * from the city where name like '%s% '

The execution plan for the above query uses a full table scan (table ACCESS complete), if it can be modified to:

SELECT * from the city where name like ' s% '

The execution plan for the query will then become (index RANGE SCAN), successfully using the index of the name field. This means that the Oraclesql optimizer recognizes the LIKE clause for the index, as long as the matching end of the query is a specific value. So when we do like queries, we should try to make the matching end of the query is a specific value, that is, using like ' s% '.

12th Palm with Case statement merging multiple scans

We often have to calculate different aggregates based on multiple sets of data tables. For example, the following example passes three independent queries:

Example 8:1) SELECT COUNT (*) from EMP where sal<1000;

2) Select COUNT (*) from EMP where Sal between and 5000;

3) Select COUNT (*) from EMP where sal>5000;

So we need to make three full table queries, but if we use case statements:

Select

Count (Sale when Sal <1000 then 1 else null end) Count_poor,

Count (sale when between and 1 else null end) Count_blue,

Count (Sale when Sal >5000 then 1 else null end) Count_poor

from EMP;

The results of this query are the same, but the execution plan only makes one full table query.

13th Palm Use Nls_date_format

Example 9:

SELECT * FROM record where TO_CHAR (Actiontime, ' mm ') = ' 12 '

The execution plan for this query will be a full table query, if we change Nls_date_format,

Sql>alert session Set nls_date_formate= ' MM ';

Now re-modify the query above:

SELECT * from record where actiontime= ' 12 '

This allows you to use the index on the Actiontime, and its execution plan will be (index RANGE SCAN).
14th Palm using a function-based index

The previous action on any column may result in a full table scan, for example:

SELECT * from emp where substr (ename,1,2) = ' SM ';

But this kind of query is often used in the customer service system, and we can create a function-based index with the SUBSTR function.

Create INDEX Emp_ename_substr on eemp (substr (ename,1,2));

This function-based index will come in handy when executing the above query, and the execution plan will be (index RANGE SCAN).

15th Palm Function-based index requires equality match

In the example above, we created the function-based index, but if we execute the following query:

SELECT * from emp where substr (ename,1,1) = ' S '

The resulting execution plan will still be (TABLE accessfull), because the function-based index can only take effect if the data column is able to match the equation, which requires a high level of planning and maintenance of the index. Note that adding an index to a table is a very risky operation because it causes many query execution plan changes. However, if we use a function-based index, this is not a problem because Oracle uses this type of index only if the query uses a matching built-in function.

16th Palm Use partition index

When you parse a partitioned index with the parse command, the range information for the data values for each partition is placed in the Oracle data dictionary. Oracle can use this information to extract data partitions that are only related to SQL queries.

For example, suppose you have defined a partitioned index, and an SQL statement needs an index scan in an index partition. Oracle accesses this index partition only, and a fast full scan of this index range is raised on this partition. Because you do not need to access the entire index, you increase the speed of the query.

17th Palm Use bitmap index

A bitmap index can essentially increase the query speed of a data column that uses less than 1000 unique data values because the retrieval in the bitmap index is done in RAM and is always faster than the traditional B-tree index. Creating bitmap indexes on data columns with fewer than 1000 unique data values can make execution more efficient.

The 18th Palm decides whether to use full table scan or index

As with all the tips, the last move will go back to the beginning, and finally we'll talk about whether or not you need to index, perhaps a full table scan faster. In most cases, full table scans may result in more physical disk input and output, but full table scans can sometimes be performed faster because of the existence of a high degree of parallelism. If the queried table is completely out of order, a query that returns fewer than 10% records may read most of the data blocks in the table, so using an index can make the query more efficient. However, if the table is very sequential, it may be faster to use a full table scan if the number of records queried is greater than 40%. Therefore, the overall principle of having an index range scan is:

1) for the original sorted table, only a query that reads fewer than 40% of the table records should use an index range scan. Conversely, a query that reads 40% more records than the number of table records should use a full table scan.

2) for unordered tables, only queries that read fewer than 7% of the table records should use index range scanning. Conversely, a query that reads 7% more records than the number of table records should use a full table scan.

Summarize

The above-mentioned moves can be used in combination with each other. And the various methods of interaction, close contact. This connection is both consistent and may bring conflict, when the conflict occurs, the need to choose according to the actual situation, there is no fixed pattern. The final determinant of SQL optimization is the mastery of Oracle's internal skill.

In addition, it is worth noting that Oracle's execution plan for SQL statements changes over time and with the accumulation and change of data, such as: cost-based optimization methods, with the increase in data volume, the optimizer may incorrectly select the index and adopt a full table scan. This may be due to the fact that the statistical information is outdated and does not parse the table in time after the data volume has changed, but it is necessary to use hint hints for SQL statements to enforce a reasonable index if the table is not properly indexed after analysis. But this hint hint can not be abused, because this method is too complex, lack of versatility and adaptability, but also increase the cost of maintenance, in contrast, based on the right shift of the function, remove "in, or,<>, are not NULL", decomposition of complex SQL statements and so on, but it is " Put it all the same ", can be assured of the use of bold.

At the same time, optimization is not "once and for all", it must be adjusted as the situation changes. When the database design changes, including changing the table structure: the addition, deletion, or renaming of fields and indexes; business logic changes: such as query mode, value range change and so on. In this case, the original optimization must also be adjusted to meet the demand for efficiency.

SQL Index Optimization

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.