SQL statement optimization for database performance optimization (go to Java companion)

Source: Internet
Author: User

First, the question of the proposed

In the early stage of application system development, because the database data is relatively small, for querying SQL statements, complex views of the writing of the SQL statement can not be used to write a variety of performance advantages and disadvantages, but if the application system submitted to the actual application, with the data in the database increases, The response speed of the system is one of the most important problems that the system needs to solve at present.

An important aspect of system optimization is the optimization of SQL statements. For the massive data, the speed difference between the inferior SQL statement and the high-quality SQL statement can reach hundreds of times, it can be seen that a system is not simply able to achieve its function, but to write high-quality SQL statements, improve the availability of the system.

In most cases, Oracle uses indexes to traverse tables more quickly, and the optimizer improves performance primarily based on defined indexes. However, if the SQL code written in the WHERE clause of the SQL statement is not reasonable, it will cause the optimizer to delete the index and use a full table scan, which is generally referred to as the poor SQL statement. When writing SQL statements, we should be aware of the principles by which the optimizer removes the index, which helps to write high-performance SQL statements.

Second, the SQL statement writing attention issues

The following is a detailed description of the issues that need to be noted in writing the WHERE clause of some SQL statements. In these where clauses, even if there are indexes on some columns, because poor SQL is written, the system cannot use the index while running the SQL statement, and it also uses a full table scan, which results in a very slow response.

1. Operator optimization

(a) in operator

The advantages of SQL in write are easier to write and easy to understand, which is more suitable for modern software development style. But SQL performance with in is always lower, and the steps taken from Oracle to parse SQL with in is the following differences from SQL without in:

Oracle attempts to convert it into a connection to multiple tables, and if the conversion is unsuccessful, it executes the subquery in the inside, then queries the outer table record, and if the conversion succeeds, it directly uses the connection method of multiple tables. This shows that using in SQL at least one more conversion process. General SQL can be converted successfully, but for the inclusion of grouping statistics and other aspects of SQL cannot be converted.

Recommended scenario: In a business-intensive SQL, try not to use the in operator, instead of using the EXISTS scheme.

(b) Not in operator

This action is not recommended for strong columns because it cannot apply the index of the table.

Recommended scenario: Replace with not EXISTS scheme

(c) is null or is not NULL operation (determines whether the field is empty)

Determining whether a field is empty generally does not apply an index because the index is not an index null value. You cannot use NULL as an index, and any column that contains null values will not be included in the index. Even if the index has more than one column, the column is excluded from the index as long as there is a column in the column that contains null.

This means that if a column has a null value, even indexing the column does not improve performance. Any statement optimizer that uses is null or is not NULL in the WHERE clause is not allowed to use the index.

Recommended scenario: Replace with other operations with the same function, such as: A is not null changed to A>0 or a> ", etc. The field is not allowed to be empty, but instead of a null value with a default value, such as the Status field in the requisition is not allowed to be empty, the default is the request.

(d) > and < operator (greater than or less than operator)

The greater than or less than the operator generally does not need to adjust, because it has an index will be indexed to find, but in some cases it can be optimized, such as a table has 1 million records, a numeric field A, 300,000 records of a=0,30 Records of the A=1,39 million records of a=2,1 Records of the a=3. There is a big difference between performing a>2 and a>=3, because Oracle finds the index of records for 2 and then compares them, while A>=3 Oracle locates the records index of =3 directly.

(e) Like operator

The LIKE operator can apply a wildcard query, where the wildcard combination may reach almost arbitrary queries, but if used poorly it can produce performance problems, such as the "%5400%" query does not reference the index, and the "x5400%" reference to the scope index.

A practical example: Use the user identification number behind the business number in the YW_YHJBQK table to query the business number YY_BH like '%5400% ' this condition will result in a full table scan, if changed to yy_bh like ' x5400% ' or yy_bh like ' b5400% ' will benefit The performance of the two-range query with YY_BH Index is certainly greatly improved.

A like statement with a wildcard character (%):

This is also the case with the above example. The current demand is such that the workers ' table should be queried for the person whose name contains Cliton. You can use the following query SQL statement:

select * from employee where last_name like ‘%cliton%‘;

This is because the wildcard character (%) appears at the beginning of the search term, so the Oracle system does not use the last_name index. In many cases it may not be possible to avoid this, but be sure to be in the bottom of your mind, so using a wildcard will slow down the query. However, when wildcards appear elsewhere in a string, the optimizer can take advantage of the index. The indexes are used in the following query:

select * from employee where last_name like ‘c%‘;

(f) UNION operator

The Union will filter out duplicate records after the table link is made, so the resulting set of results will be sorted after the table is connected, the duplicate records are deleted and the results returned. Most of the actual applications do not produce duplicate records, the most common being the process table and the History table Union. Such as:

select * from gc_dfys 
union 
select * from ls_jg_dfys

This SQL takes out the results of two tables at run time, then sorts the duplicate records with the sort space, and finally returns the result set, which may cause the disk to be sorted if the table data volume is large.

Recommended Scenario: Use the union ALL operator instead of union because the union all operation simply merges two results and returns.

select * from gc_dfys 
union all 
select * from ls_jg_dfys

(g) Join columns

For a joined column, the optimizer does not use the index, even if the last join value is a static value. Let's take a look at an example, assuming that there is a staff table (employee), for a worker's surname and name in two columns (First_Name and last_name), now to query a Bill Clinton Cliton.

Here is an SQL statement that takes a join query:

select * from employss where first_name||‘‘||last_name =‘Beill Cliton‘;

The above statement can be used to find out if there is a bill Cliton this employee, but it is important to note that the System optimizer does not use an index created based on last_name. When written in this SQL statement, the Oracle system can take an index created based on last_name.

where first_name =‘Beill‘ and last_name =‘Cliton‘;

(h) Order by statement

The order BY statement determines how Oracle will sort the returned query results. The ORDER BY statement has no special restrictions on the columns to be sorted, or it can be added to a column (like joins or additions). Any non-indexed item in the ORDER BY statement, or a computed expression, will slow down the query.

Double-check the order BY statement to find non-indexed items or expressions that degrade performance. The solution to this problem is to rewrite the order BY statement to use the index, or you can establish another index for the column you are using, and you should absolutely avoid using an expression in the ORDER BY clause.

(i) not

We often use logical expressions in the WHERE clause when querying, such as greater than, less than, equal to, and not equal to, and can also use and (with), or (or), and not (non). Not can be used to negate any logical operation symbol. The following is an example of a NOT clause:

where not (status =‘VALID‘)

If you want to use not, you should precede the phrase with parentheses and precede the phrase with the NOT operator. The NOT operator is included in another logical operator, which is the not equal to (<>) operator. In other words, the not is still in the operator, even if the not word is not explicitly added to the query where clause, see the following example:

where status <>‘INVALID‘;

For this query, it can be rewritten to not use not:

select * from employee where salary<3000 or salary>3000;

Although the results of these two queries are the same, the second query scenario is faster than the first query scenario. The second query allows Oracle to use indexes on salary columns, while the first query cannot use indexes.

2. Impact of SQL Writing

(a) The effect of SQL on the same performance of the same function.

As a SQL in a programmer wrote for Select * from Zl_yhjbqk

B programmer writes for Select * from Dlyx.zl_yhjbqk(prefixed with table owner)

C Programmers write for Select * from Dlyx. ZLYHJBQK(uppercase table name)

The D programmer writes for Select * from Dlyx. Zlyhjbqk(more spaces in the middle)

The result of the above four SQL is the same as the execution time after the Oracle analysis, but from the Oracle shared memory SGA, it can be concluded that Oracle performs a single analysis of each SQL and consumes shared memory.

If you write the SQL string and format exactly the same, then Oracle will only parse once, and the shared memory will only leave a single analysis, not only to reduce the time to analyze SQL, but also to reduce the duplication of shared memory information, Oracle can accurately count the frequency of SQL execution.

(b) The order of conditions behind the where is affected

The condition order after the WHERE clause has a direct effect on the query of the large data scale. Such as:

Select * from zl_yhjbqk where dy_dj = ‘1KV以下‘ and xh_bz=1 
Select * from zl_yhjbqk where xh_bz=1 and dy_dj = ‘1KV以下‘

The above two SQL DY_DJ (voltage level) and XH_BZ (PIN household sign) Two fields are not indexed, so the execution is full table scan, the first SQL DY_DJ = ' 1KV below ' condition in the recordset ratio is 99%, and xh_bz=1 ratio is only 0.5%, At the time of the first SQL 99% records are compared Dy_dj and xh_bz, while in the second SQL 0.5% records are DY_DJ and xh_bz comparisons, so that the second SQL CPU utilization is significantly lower than the first one.

(c) Impact of query table order

The order of the list in the table following the from will have a performance impact on SQL, and with no indexes and no statistical analysis of the tables by Oracle, Oracle will be linked in the order in which the tables appear, so that the order of the tables is not the same as the data that is consuming the server resource. (Note: If the table is statistically analyzed, Oracle will automatically link the small table and then the large table)

3. Utilization of SQL statement indexes

(a) Some optimizations for the condition fields

Fields that use function processing cannot take advantage of indexes such as:

substr(hbs_bh,1,4)=’5400’
优化处理:hbs_bh like ‘5400%’
trunc(sk_rq)=trunc(sysdate)
优化处理:sk_rq>=trunc(sysdate) and sk_rq

Fields that have an explicit or implicit operation cannot be indexed, such as:

ss_df+20>50
优化处理:ss_df>30
‘X’ || hbs_bh>’X5400021452’
优化处理:hbs_bh>’5400021542’
sk_rq+5=sysdate
优化处理:sk_rq=sysdate-5
hbs_bh=5401002554
优化处理:hbs_bh=’ 5401002554’

Note: This condition implicitly converts the HBS_BH to To_number because the Hbs_bh field is a character type.

A field operation that includes multiple tables in the condition cannot be indexed, such as:

hbs_bh=5401002554
优化处理:hbs_bh=’ 5401002554’

Note: This condition implicitly converts the HBS_BH to To_number because the Hbs_bh field is a character type.

A field operation that includes multiple tables in the condition cannot be indexed, such as:

ys_df>cx_df
无法进行优化 
qc_bh || kh_bh=’5400250000’
优化处理:qc_bh=’5400’ and kh_bh=’250000’

SQL statement optimization for database performance optimization (go to Java companion)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.