MySQL query statement optimization tips

Source: Internet
Author: User
This article introduces optimization techniques for MySQL query statements. MySQL optimization covers many areas: index optimization, query optimization, the query cache, server settings, the operating system and hardware, and the application level (web server, cache), among others. The tips recorded here were collected from around the web and are aimed at developers. They mainly concern the optimization of query statements; the other areas are not covered.

Overhead indicators:

Execution time

Number of checked rows

Number of returned rows

Several criteria for creating an index:

(1) A well-chosen index speeds up data reads; a poorly chosen index slows down the database's response.

(2) The more indexes a table has, the slower its updates become.

(3) For index-heavy workloads, consider MyISAM as the engine rather than InnoDB (MySQL stores indexes as B-trees). Keep in mind, however, that MyISAM does not support transactions, and that InnoDB indexes are B-trees as well, so measure before choosing.

(4) When the program, the database structure and the SQL statements have been optimized as far as they can go and the bottleneck still remains, it is time to consider a distributed cache such as memcached.

(5) Use EXPLAIN to analyze the performance of your SQL statements.

1. Optimization of COUNT

For example, count the cities whose id is greater than 5:

Statement A: select count(*) from world.city where id > 5;

Statement B: select (select count(*) from world.city) - count(*) from world.city where id <= 5;

If the table has more than 11 rows, statement A scans more rows than statement B, which scans only about six rows, so statement B is more efficient. If there is no WHERE clause, a plain select count(*) from world.city is faster still, because MySQL already knows the number of rows in the table (this holds for MyISAM, which stores the exact row count; InnoDB has to count the rows).
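The equivalence of the two statements can be checked with a small sketch. It uses Python's built-in sqlite3 module purely as a runnable stand-in for MySQL, and the `city` table with 100 generated rows is a made-up substitute for `world.city`. It only confirms that both forms return the same count; the difference in rows scanned has to be checked with EXPLAIN on a real MySQL server.

```python
import sqlite3

# SQLite stands in for MySQL here; the table and its 100 rows are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO city (id, name) VALUES (?, ?)",
    [(i, "city{}".format(i)) for i in range(1, 101)],  # ids 1..100
)

# Statement A: count the rows with id > 5 directly.
count_a = conn.execute("SELECT COUNT(*) FROM city WHERE id > 5").fetchone()[0]

# Statement B: total row count minus the small number of rows with id <= 5.
count_b = conn.execute(
    "SELECT (SELECT COUNT(*) FROM city) - COUNT(*) FROM city WHERE id <= 5"
).fetchone()[0]

print(count_a, count_b)
```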

2. Avoid using incompatible data types

For example, float and int, char and varchar, and binary and varbinary are incompatible. A data type mismatch can prevent the optimizer from performing optimizations that would otherwise be possible.

In the program, keep the number of database accesses to the minimum that the feature requires. Use search conditions to minimize the number of rows read from the table and the size of the result set, which also reduces the load on the network. Split operations that can be separated so that each one responds faster. When writing SQL for a data window, try to put the indexed column first in the select list. Keep the algorithm structure as simple as possible.

When querying, do not use the * wildcard as in SELECT * FROM T1; if you only need a few columns, name them: SELECT COL1, COL2 FROM T1. Where possible, also limit the number of rows in the result set, for example SELECT COL1, COL2, COL3 FROM T1 LIMIT 300 (MySQL uses LIMIT rather than SELECT TOP 300), because in many cases users do not need that much data. Avoid database cursors in applications: cursors are very useful tools, but they carry more overhead than ordinary set-oriented SQL statements. Retrieve data in a specific order.

3. Operations on an indexed field prevent the index from being used

Avoid applying functions or expressions to fields in the WHERE clause. Doing so causes the engine to abandon the index and perform a full table scan. For example:

SELECT * FROM T1 WHERE F1/2 = 100

should be changed to:

SELECT * FROM T1 WHERE F1 = 100*2
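The effect can be observed in a query plan. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN instead of MySQL's EXPLAIN, and the table layout and the index name `idx_t1_f1` are hypothetical. On MySQL the output format differs, but the same comparison can be made.

```python
import sqlite3

# SQLite stands in for MySQL; the table layout and index name are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (id INTEGER PRIMARY KEY, F1 INTEGER, note TEXT)")
conn.execute("CREATE INDEX idx_t1_f1 ON T1 (F1)")
conn.executemany(
    "INSERT INTO T1 (F1, note) VALUES (?, ?)", [(i, "x") for i in range(1000)]
)

def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN for sql."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Only the plans are compared here, not the result sets: SQLite's "/" is
# integer division, so the two predicates are not strictly equivalent in SQLite.
plan_expr = plan("SELECT * FROM T1 WHERE F1 / 2 = 100")   # expression on the column
plan_plain = plan("SELECT * FROM T1 WHERE F1 = 100 * 2")  # bare column

print(plan_expr)   # a full scan of T1
print(plan_plain)  # an index search using idx_t1_f1
```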

4. Avoid using != or <>, IS NULL, IS NOT NULL, IN, NOT IN, and so on

These operators can leave the system unable to use an index, so it has to search the table data directly. For example, in SELECT id FROM employee WHERE id != 'B%', the optimizer cannot use an index to determine which rows will match, so it has to search every row in the table. How strongly this applies depends on the operator and the data (MySQL can often use an index for IN and IS NULL), so check the plan with EXPLAIN. An IN subquery can often be replaced with EXISTS.

5. Use numeric fields whenever possible

Some developers and database administrators like to design fields that hold numeric information as character types. This reduces query and join performance and increases storage overhead, because the engine compares strings character by character when processing queries and joins, whereas a numeric type needs only one comparison.

6. Use the EXISTS and NOT EXISTS clauses sensibly

As follows:

SELECT SUM(T1.C1) FROM T1 WHERE (SELECT COUNT(*) FROM T2 WHERE T2.C2=T1.C2) > 0

SELECT SUM(T1.C1) FROM T1 WHERE EXISTS (SELECT * FROM T2 WHERE T2.C2=T1.C2)

The two produce the same result, but the latter is clearly more efficient, because it does not generate a large number of locking table scans or index scans. If you only want to check whether a record exists in a table, do not use COUNT(*): it is inefficient and wastes server resources. Use EXISTS instead. For example:

IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'XXX')

can be written as:

IF EXISTS (SELECT * FROM table_name WHERE column_name = 'XXX')
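That the COUNT(*) form and the EXISTS form return the same result can be verified with a few rows. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, with invented data for T1 and T2. It demonstrates equivalence only; the efficiency difference depends on the server and should be measured there.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (C1 INTEGER, C2 INTEGER);
    CREATE TABLE T2 (C2 INTEGER);
    INSERT INTO T1 VALUES (10, 1), (20, 2), (30, 3);
    INSERT INTO T2 VALUES (1), (1), (3);
""")

# COUNT(*) form: counts every matching T2 row for each T1 row.
sum_count = conn.execute(
    "SELECT SUM(T1.C1) FROM T1 "
    "WHERE (SELECT COUNT(*) FROM T2 WHERE T2.C2 = T1.C2) > 0"
).fetchone()[0]

# EXISTS form: can stop at the first matching T2 row.
sum_exists = conn.execute(
    "SELECT SUM(T1.C1) FROM T1 "
    "WHERE EXISTS (SELECT * FROM T2 WHERE T2.C2 = T1.C2)"
).fetchone()[0]

print(sum_count, sum_exists)
```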

7. If you can use BETWEEN, do not use IN.

8. If you can use DISTINCT, do not use GROUP BY.

9. Try not to use the SELECT INTO statement.

The SELECT INTO statement locks the table and prevents other users from accessing it.

10. Force the query optimizer to use an index if necessary

SELECT * FROM T1 WHERE nextprocess = 1 AND processid IN (8, 32, 45)

changed to:

SELECT * FROM T1 FORCE INDEX (IX_ProcessID) WHERE nextprocess = 1 AND processid IN (8, 32, 45)

The FORCE INDEX hint makes the query optimizer use the index IX_ProcessID to execute the query.

11. Eliminate sequential access to rows in large tables

Even when every column being checked is indexed, some forms of the WHERE clause force the optimizer to use sequential access. For example:

SELECT * FROM orders WHERE (customer_num=104 AND order_num>1001) OR order_num=1008

One solution is to use UNION to avoid sequential access:

SELECT * FROM orders WHERE customer_num=104 AND order_num>1001 UNION SELECT * FROM orders WHERE order_num=1008

This way the query can be processed along an index path. [If the table holds a lot of data but the query conditions narrow the result set down to a small one, the second statement is faster.]

12. Avoid searching indexed character data with patterns that do not start at the beginning of the value

This also prevents the engine from using the index.

See the following examples:

SELECT * FROM T1 WHERE NAME LIKE '%L%'

SELECT * FROM T1 WHERE SUBSTRING(NAME,2,1)='L'

SELECT * FROM T1 WHERE NAME LIKE 'L%'

Even if the NAME field is indexed, the first two queries cannot use the index to speed things up; the engine has to examine every row in the table one by one. The third query can use the index. Do not habitually write '%L%' (which causes a full table scan) when 'L%' will do.

13. Although UPDATE and DELETE statements have a basically fixed form, here are some suggestions for UPDATE statements

(1) Try not to modify the primary key field.

(2) When modifying a VARCHAR field, try to replace the content with a value of the same length.

(3) Minimize UPDATE operations on tables that have UPDATE triggers.

(4) Avoid updating columns that are replicated to other databases.

(5) Avoid updating columns that have many indexes.

(6) Avoid updating columns that appear in the WHERE clause condition.

14. Use UNION ALL instead of UNION where you can

UNION ALL does not perform the SELECT DISTINCT step, which saves a lot of unnecessary work.

Using UNION across several different databases is an interesting optimization method. UNION returns the combined data of two unrelated tables and guarantees that no duplicate rows appear, which means the data must also be sorted. Sorting is expensive, especially for large tables.

UNION ALL can be much faster. If you already know that your data contains no duplicate rows, or you do not care whether duplicates appear, UNION ALL is the better choice. You can also avoid duplicate rows in the application logic, so that UNION ALL and UNION return the same results, but UNION ALL does not sort.
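The difference in results between the two operators is easy to see on a tiny data set. The following sketch assumes two made-up tables `a` and `b` and runs on Python's built-in sqlite3 module as a stand-in for MySQL; the UNION and UNION ALL semantics shown are standard SQL and apply to MySQL as well.

```python
import sqlite3

# SQLite stands in for MySQL; tables a and b are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (v INTEGER);
    CREATE TABLE b (v INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (3), (4);
""")

# UNION removes the duplicate value 3; UNION ALL keeps both copies.
union_rows = conn.execute("SELECT v FROM a UNION SELECT v FROM b").fetchall()
union_all_rows = conn.execute("SELECT v FROM a UNION ALL SELECT v FROM b").fetchall()

print(len(union_rows), len(union_all_rows))
```

If the application already guarantees that the inputs do not overlap, both queries return the same rows and the duplicate-elimination work done by UNION is wasted.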

15. Field data type optimization

(1) Avoid NULL. NULL requires special handling in most databases, and MySQL is no exception: it needs more code, more checks, and special index logic. Some developers do not realize that NULL is the default when a table is created. In most cases you should declare columns NOT NULL, or use a special value such as 0 or -1 as the default.

(2) Use the smallest field that fits. After MySQL reads data from disk it keeps it in memory, and reading it costs CPU cycles and disk I/O, so the smaller the data type, the less space it takes and the more efficiently it can be read from disk or packed into memory. Do not be dogmatic about shrinking data types, though: if the application changes later there will be no room left. Altering a table requires rebuilding it, which may in turn force code changes, and that is a headache, so you need to find a balance.

(3) Prefer fixed-length types.

17. Optimizing LIMIT pagination over large data sets (when the offset is very large, LIMIT becomes very inefficient)

A simple technique for making LIMIT more efficient is to apply the offset on a covering index rather than on the full rows. (A covering index is one that satisfies the select from the index alone, without a second lookup into the table.) You then join the keys extracted from the covering index back to the full rows and fetch the columns you need, which is more efficient. Consider the following query:

mysql> select film_id, description from sakila.film order by title limit 50, 5;

If the table is very large, it is best to write this query as follows:

mysql> select film.film_id, film.description from sakila.film inner join (select film_id from sakila.film order by title limit 50, 5) as lim using(film_id);
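A small sketch can confirm that the rewritten query selects the same page of rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, with a hypothetical `film` table of 200 generated rows and an assumed index on `title` instead of the real sakila database. The performance benefit itself only shows up on a large table in MySQL and is not measured here.

```python
import sqlite3

# SQLite stands in for MySQL; the film table and its 200 rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
)
conn.execute("CREATE INDEX idx_title ON film (title)")
conn.executemany(
    "INSERT INTO film (film_id, title, description) VALUES (?, ?, ?)",
    [(i, "title-{:04d}".format(i), "description {}".format(i)) for i in range(1, 201)],
)

# Plain pagination: the offset is applied while reading full rows.
plain = conn.execute(
    "SELECT film_id, description FROM film ORDER BY title LIMIT 50, 5"
).fetchall()

# Deferred join: the offset is applied to the keys only, then the five
# selected keys are joined back to the table to fetch the other columns.
deferred = conn.execute(
    "SELECT film.film_id, film.description FROM film "
    "INNER JOIN (SELECT film_id FROM film ORDER BY title LIMIT 50, 5) AS lim "
    "USING (film_id)"
).fetchall()

print(sorted(plain) == sorted(deferred))
```

The outer query has no ORDER BY of its own, so the comparison sorts both result lists; add an outer ORDER BY title if the page must come back in title order.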

18. Inserting multiple rows into the same table at one time in the program

For example, the following statements:

insert into person(name,age) values('xboy', 14);

insert into person(name,age) values('xgirl', 15);

insert into person(name,age) values('nia', 19);

It is more efficient to combine them into a single statement:

insert into person(name,age) values('xboy', 14), ('xgirl', 15), ('nia', 19);
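In application code, the single statement is usually built from a list of rows. The helper below, `build_multi_insert`, is a hypothetical name and only a sketch: it uses Python's built-in sqlite3 module with `?` placeholders as a stand-in, whereas common MySQL drivers for Python use `%s` placeholders. The table and column names are assumed to come from trusted code, not from user input.

```python
import sqlite3

def build_multi_insert(table, columns, rows):
    """Build one parameterized multi-row INSERT (hypothetical helper).

    table and columns are interpolated into the SQL text, so they must be
    trusted identifiers; the row values are passed as bound parameters.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join(row_placeholder for _ in rows)
    )
    params = [value for row in rows for value in row]
    return sql, params

# SQLite stands in for MySQL; the person table mirrors the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")

sql, params = build_multi_insert(
    "person", ["name", "age"], [("xboy", 14), ("xgirl", 15), ("nia", 19)]
)
conn.execute(sql, params)  # one statement, one round trip
inserted = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(sql, inserted)
```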

19. Do not index the columns you merely select; it is pointless

Indexes belong on the columns used to filter and sort, such as those in WHERE and ORDER BY.

SELECT id,title,content,cat_id FROM article WHERE cat_id = 1;

For the statement above, an index on id, title or content does nothing to optimize it. An index on the foreign key cat_id, however, makes a significant difference.

20. Optimizing ORDER BY statements in MySQL

(1) Index optimization for ORDER BY + LIMIT. If an SQL statement looks like:

SELECT [column1],[column2],…. FROM [TABLE] ORDER BY [sort] LIMIT [offset],[LIMIT];

it is easy to optimize: create an index on the [sort] field.

(2) Index optimization for WHERE + ORDER BY + LIMIT, such as:

SELECT [column1],[column2],…. FROM [TABLE] WHERE [columnX] = [VALUE] ORDER BY [sort] LIMIT [offset],[LIMIT];

If you still use the index from the first example, the index can be used but not efficiently. A more efficient approach is to create a composite index (columnX, sort).

(3) Index optimization for WHERE + IN + ORDER BY + LIMIT, such as:

SELECT [column1],[column2],…. FROM [TABLE] WHERE [columnX] IN ([value1],[value2],…) ORDER BY [sort] LIMIT [offset],[LIMIT];

If you use the index from the second example, this statement will not achieve the expected result (the index helps only with [sort], while the WHERE part shows Using where; Using filesort in EXPLAIN), because columnX matches multiple values.

I have not found a better approach so far and am waiting for expert advice.

(4) WHERE + ORDER BY on multiple columns + LIMIT, for example:

SELECT * FROM [table] WHERE uid=1 ORDER BY x,y LIMIT 0,10;

For this statement you might add an index on (x, y, uid). In fact, (uid, x, y) works better, because of the way MySQL sorts.
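The effect of column order in the composite index can be illustrated with a query plan. This sketch rests on assumptions: it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the table `t`, its data and the index name `idx_demo` are hypothetical. The planners differ in detail, but the reasoning is the same: with uid first, the equality condition narrows the index range and the rows within it are already ordered by x, y.

```python
import sqlite3

def plan_details(index_columns):
    """Build table t with one index on index_columns and return the plan text.

    SQLite stands in for MySQL; table, data and index name are invented.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, uid INTEGER, "
        "x INTEGER, y INTEGER, body TEXT)"
    )
    conn.execute(
        "CREATE INDEX idx_demo ON t ({})".format(", ".join(index_columns))
    )
    conn.executemany(
        "INSERT INTO t (uid, x, y, body) VALUES (?, ?, ?, ?)",
        [(i % 10, i % 7, i % 13, "row") for i in range(500)],
    )
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT * FROM t WHERE uid = 1 ORDER BY x, y LIMIT 0, 10"
    ).fetchall()
    conn.close()
    return [row[-1] for row in rows]

plan_uid_first = plan_details(["uid", "x", "y"])  # index search, no extra sort step
plan_uid_last = plan_details(["x", "y", "uid"])   # falls back to a scan

print(plan_uid_first)
print(plan_uid_last)
```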
