SQL optimization-Database SQL optimization--using exist instead of in

Source: Internet
Author: User


1. Query optimization: avoid full table scans where possible

To optimize a query, avoid full table scans, and first consider creating indexes on the columns used in the WHERE and ORDER BY clauses. Try the following tips to keep the optimizer from mistakenly choosing a table scan:

· Use ANALYZE TABLE tbl_name to update the key distribution for the scanned table.

· Use FORCE INDEX on the scanned table to tell MySQL that a table scan is very expensive compared with using the given index.

    SELECT * FROM t1, t2 FORCE INDEX (index_for_column) WHERE t1.col_name=t2.col_name;

· Start mysqld with the --max-seeks-for-key=1000 option, or use SET max_seeks_for_key=1000, to tell the optimizer to assume that no key scan will cause more than 1,000 key seeks.

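1). Avoid testing fields for NULL in the WHERE clause where possible

Otherwise the engine may abandon the index and perform a full table scan, for example:

    select id from t where num is null

NULL needs special handling in most databases, and MySQL is no exception: it requires more code, more checks, and special index logic. Some developers do not realize that columns are nullable by default when a table is created; in most cases you should declare them NOT NULL, or use a special value such as 0 or -1 as the default. In some engines NULL values are not stored in the index, so IS NULL and IS NOT NULL conditions cannot use the index there, and indexing a column that contains NULLs does not speed up such queries. (MySQL itself can index NULL values and can use an index for IS NULL, so check the execution plan on your own system.) In this example you can set a default value of 0 on num, make sure the num column contains no NULLs, and then query like this:

    select id from t where num=0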

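2). Avoid using the != or <> operator in the WHERE clause where possible

Otherwise the engine abandons the index and performs a full table scan. MySQL uses an index only for the following operators: <, <=, =, >, >=, BETWEEN, IN, and in some cases LIKE. LIKE can use an index when its pattern does not start with a wildcard (% or _). For example:

    SELECT id FROM t WHERE col LIKE 'Mich%';  # this query will use the index
    SELECT id FROM t WHERE col LIKE '%ike';   # this query will not use the index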

3). Avoid using OR to join conditions in the WHERE clause where possible

Otherwise the engine may abandon the index and perform a full table scan, for example:

    select id from t where num=10 or num=20

You can combine two queries with UNION instead:

    select id from t where num=10 union all select id from t where num=20
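The OR-to-UNION ALL rewrite is only safe when it returns the same rows. The following is a minimal sketch that checks this on a throwaway table, assuming Python's built-in sqlite3 module as a stand-in engine (the article targets MySQL, whose optimizer behaves differently); the function name, the index name and the sample rows are made up for illustration.

```python
import sqlite3

def or_vs_union_all():
    """Return the ids found by the OR form and by the UNION ALL form."""
    conn = sqlite3.connect(":memory:")  # in-memory SQLite, not MySQL
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
    conn.execute("CREATE INDEX idx_t_num ON t (num)")
    conn.executemany("INSERT INTO t (num) VALUES (?)",
                     [(10,), (20,), (30,), (10,)])

    or_ids = [r[0] for r in conn.execute(
        "SELECT id FROM t WHERE num = 10 OR num = 20")]
    union_ids = [r[0] for r in conn.execute(
        "SELECT id FROM t WHERE num = 10 "
        "UNION ALL "
        "SELECT id FROM t WHERE num = 20")]
    conn.close()
    # Sort both lists so that only the set of rows is compared, not the order.
    return sorted(or_ids), sorted(union_ids)
```

UNION ALL gives the same rows here because the two branches cannot match the same row (num cannot be both 10 and 20). If the branches can overlap, UNION ALL returns duplicates and UNION is needed instead.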


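In some cases a query with an OR condition can still use indexes in MySQL and avoid a full table scan:

1. If the WHERE clause contains an OR condition, a MyISAM table can use the index, whereas an InnoDB table reportedly cannot.

2. Every column used in the OR conditions must have its own separate index.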

4). Use IN and NOT IN with caution as well, otherwise they may cause a full table scan,

for example:

    select id from t where num in(1,2,3)

For consecutive values, use BETWEEN rather than IN:

    select id from t where num between 1 and 3

5). The following queries will also cause a full table scan:

    select id from t where name like '%abc%'

or

    select id from t where name like '%abc'

To improve efficiency, consider full-text search. Only the following form uses the index:

    select id from t where name like 'abc%'
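One way to see why only the prefix form can use an index: a pattern such as 'abc%' describes a contiguous range of sorted values, while '%abc' does not. The sketch below checks that the prefix LIKE and the equivalent range predicate return the same rows. It assumes Python's sqlite3 module with lowercase ASCII sample data and a prefix that contains no wildcard characters; the function name and the rows are hypothetical.

```python
import sqlite3

def prefix_like_vs_range(prefix="abc"):
    """Compare name LIKE 'abc%' with the range name >= 'abc' AND name < 'abd'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO t (name) VALUES (?)",
                     [("abc",), ("abcdef",), ("abd",), ("xabc",), ("ab",)])
    # Upper bound of the range: the prefix with its last character bumped by one.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    like_ids = [r[0] for r in conn.execute(
        "SELECT id FROM t WHERE name LIKE ? ORDER BY id", (prefix + "%",))]
    range_ids = [r[0] for r in conn.execute(
        "SELECT id FROM t WHERE name >= ? AND name < ? ORDER BY id",
        (prefix, upper))]
    conn.close()
    return like_ids, range_ids
```

A pattern with a leading wildcard has no such lower and upper bound, so every row has to be examined.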

7). Using a parameter in the WHERE clause can also cause a full table scan.

SQL resolves local variables only at run time, but the optimizer cannot defer the choice of an access plan until run time; it must choose at compile time. At compile time, however, the value of the variable is still unknown and so cannot be used as input for index selection. The following statement will perform a full table scan:

    select id from t where num=@num

You can change it to force the query to use the index:

    select id from t with(index(index_name)) where num=@num

8). Avoid expression operations on fields in the WHERE clause where possible,

as this causes the engine to abandon the index and perform a full table scan. For example:

    select id from t where num/2=100

should be changed to:

    select id from t where num=100*2
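The effect of moving the arithmetic off the column can be observed in a query plan. The following sketch uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as an illustration only; the plan text is SQLite-specific and MySQL's EXPLAIN output looks different. The helper names, the index name and the data are assumptions. Note also that `/` truncates on integers in SQLite, so `num/2=100` would match 201 as well; the sample data holds only even values so that both predicates select the same row.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail strings of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def expression_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER)")
    conn.execute("CREATE INDEX idx_t_num ON t (num)")
    conn.executemany("INSERT INTO t (num) VALUES (?)",
                     [(n,) for n in range(0, 1000, 2)])  # even values only

    on_column = "SELECT id FROM t WHERE num / 2 = 100"    # arithmetic on the column
    on_constant = "SELECT id FROM t WHERE num = 100 * 2"  # arithmetic on the constant
    result = {
        "column_plan": plan(conn, on_column),
        "constant_plan": plan(conn, on_constant),
        "column_ids": [r[0] for r in conn.execute(on_column)],
        "constant_ids": [r[0] for r in conn.execute(on_constant)],
    }
    conn.close()
    return result
```

In SQLite the first plan is reported as a SCAN and the second as a SEARCH on the index, which mirrors the full-scan versus index-lookup distinction described above.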

9). Avoid function operations on fields in the WHERE clause where possible,

as this causes the engine to abandon the index and perform a full table scan. For example:

    select id from t where substring(name,1,3)='abc'  -- ids whose name starts with abc
    select id from t where datediff(day,createdate,'2005-11-30')=0  -- ids generated on '2005-11-30'

should be changed to:

    select id from t where name like 'abc%'
    select id from t where createdate>='2005-11-30' and createdate<'2005-12-1'
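The date rewrite relies on a half-open range: from the start of the day (inclusive) to the start of the next day (exclusive). Below is a small sketch of the boundary cases, assuming Python's sqlite3 module, timestamps stored as ISO-8601 text, and SQLite's date() function standing in for the datediff call in the text; the function name and the rows are invented for the example.

```python
import sqlite3

def date_filter_demo():
    """Compare a function-wrapped date filter with a half-open range filter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, createdate TEXT)")
    conn.executemany("INSERT INTO t (createdate) VALUES (?)", [
        ("2005-11-29 23:59:59",),  # id 1: last second of the previous day
        ("2005-11-30 00:00:00",),  # id 2: first second of the target day
        ("2005-11-30 18:30:00",),  # id 3: during the target day
        ("2005-12-01 00:00:00",),  # id 4: first second of the next day
    ])
    wrapped = "SELECT id FROM t WHERE date(createdate) = '2005-11-30' ORDER BY id"
    ranged = ("SELECT id FROM t WHERE createdate >= '2005-11-30' "
              "AND createdate < '2005-12-01' ORDER BY id")
    wrapped_ids = [r[0] for r in conn.execute(wrapped)]
    ranged_ids = [r[0] for r in conn.execute(ranged)]
    conn.close()
    return wrapped_ids, ranged_ids
```

Both forms select the two rows that fall on 2005-11-30, but only the range form compares the bare column against constants, which is what lets an index on createdate be used.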

10). Do not apply functions, arithmetic operations, or other expressions to the left side of the "=" in the WHERE clause,

otherwise the system may not be able to use the index correctly.

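11). The indexed field used in the condition is not the leading column of a composite index

For example, when an indexed field is used as a condition and the index is a composite index, the first field of that index must be used in the condition to ensure that the system uses the index; otherwise the index will not be used.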
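The leading-column rule can be illustrated with a query plan. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module rather than MySQL, the table `t`, its columns and the index `idx_t_a_b` are hypothetical, and the plan wording is SQLite's own.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail strings of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def composite_index_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c TEXT)")
    conn.execute("CREATE INDEX idx_t_a_b ON t (a, b)")  # composite index: a first
    plans = {
        # Condition on the leading column: the index can be used.
        "leading": plan(conn, "SELECT c FROM t WHERE a = 1"),
        # Condition on the second column only: the index cannot be used for lookup.
        "non_leading": plan(conn, "SELECT c FROM t WHERE b = 1"),
        # Conditions on both columns: the index is used for both.
        "both": plan(conn, "SELECT c FROM t WHERE a = 1 AND b = 1"),
    }
    conn.close()
    return plans
```

If queries regularly filter on the second column alone, a separate index on that column (or a composite index with the columns in the other order) is the usual remedy.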

2. Other optimizations to note:

12). Do not write meaningless queries,

for example, to generate an empty table structure:

    select col1,col2 into #t from t where 1=0

This kind of code returns no result set but still consumes system resources. Change it to:

    create table #t(...)

13). Replacing IN with EXISTS is often a good choice:

    select num from a where num in(select num from b)

Replace it with the following statement:

    select num from a where exists(select 1 from b where b.num=a.num)
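Before swapping IN for EXISTS it helps to confirm that both forms return the same rows. The following is a minimal sketch, assuming Python's sqlite3 module and made-up contents for tables a and b; it checks result equivalence only and says nothing about which form is faster on a given engine.

```python
import sqlite3

def in_vs_exists():
    """Return the rows selected by the IN form and by the EXISTS form."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (num INTEGER)")
    conn.execute("CREATE TABLE b (num INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,), (4,)])
    conn.executemany("INSERT INTO b VALUES (?)", [(2,), (4,), (4,), (5,)])

    in_rows = [r[0] for r in conn.execute(
        "SELECT num FROM a WHERE num IN (SELECT num FROM b) ORDER BY num")]
    exists_rows = [r[0] for r in conn.execute(
        "SELECT num FROM a WHERE EXISTS "
        "(SELECT 1 FROM b WHERE b.num = a.num) ORDER BY num")]
    conn.close()
    return in_rows, exists_rows
```

The duplicate value in b does not duplicate rows in either result, because both forms only test whether a match exists. The same equivalence does not hold for NOT IN versus NOT EXISTS when the subquery can return NULL.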

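14). Not every index is useful for a query.

SQL optimizes a query based on the data in the table. When an indexed column contains many duplicate values, the query may not use the index. For example, if a table has a field sex whose values are roughly half male and half female, an index on sex does nothing for query efficiency.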
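A rough way to judge whether a column is worth indexing is its selectivity: the number of distinct values divided by the number of rows. The sketch below is illustrative only, assuming Python's sqlite3 module; `build_person_table` and `column_selectivity` are hypothetical helpers, and real optimizers rely on their own statistics rather than this exact ratio.

```python
import sqlite3

def build_person_table(rows=1000):
    """Create a sample table whose sex column is split evenly between two values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, sex TEXT)")
    conn.executemany("INSERT INTO person (sex) VALUES (?)",
                     [("male" if i % 2 else "female",) for i in range(rows)])
    return conn

def column_selectivity(conn, table, column):
    """Distinct values / total rows: near 1.0 is selective, near 0.0 is not.

    The table and column names are interpolated into the SQL text, so they
    must come from trusted code, never from user input.
    """
    total, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT {column}) FROM {table}").fetchone()
    return distinct / total if total else 0.0
```

With the sample data, `column_selectivity(conn, "person", "sex")` is 2 / 1000, while the primary key has a selectivity of 1.0, which is why an index lookup on sex saves almost no work.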

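15). More indexes are not always better,

Indexes improve the efficiency of the corresponding SELECT statements, but they also reduce the efficiency of INSERT and UPDATE, because an INSERT or UPDATE may have to rebuild the index. How to index therefore needs careful consideration, case by case. A table should preferably have no more than 6 indexes; if it has more, consider whether the indexes on rarely used columns are really necessary.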

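16). Avoid updating clustered index columns whenever possible.

The order of the clustered index columns is the physical storage order of the table's records, so changing a value in such a column forces the records of the whole table to be reordered, which consumes considerable resources. If the application needs to update the clustered index columns frequently, reconsider whether that index should be clustered at all.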

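17). Use numeric fields as much as possible,

A field that holds only numeric information should not be designed as a character type, as this reduces query and join performance and increases storage overhead. The engine compares strings character by character when processing queries and joins, whereas a numeric value needs only a single comparison.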

18). Use varchar/nvarchar instead of char/nchar as much as possible,

First, variable-length fields take less storage, which saves space; second, searching within a relatively small field is more efficient for queries.

19). It is best not to use "*" to return all columns, as in select * from t,

Replace "*" with a specific list of fields, and do not return any field you do not need.

Issues with temporary tables:
20). Try to use table variables instead of temporary tables.

If the table variable holds a large amount of data, note that its indexing options are very limited (only the primary key index).

21). Avoid frequent creation and deletion of temporary tables to reduce the consumption of system table resources.

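22). Temporary tables are not off limits.

Using them appropriately can make certain routines more efficient, for example when you need to refer repeatedly to a data set from a large or frequently used table. For one-off operations, however, a derived table is the better choice.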

23). When you create a temporary table and insert a large amount of data at one time, you can use SELECT INTO instead of CREATE TABLE to avoid generating a large volume of log records and so increase speed;

if the amount of data is small, first CREATE TABLE and then INSERT, to ease the load on the system tables.

24). If temporary tables are used, be sure to delete them all explicitly at the end of the stored procedure: TRUNCATE TABLE first, then DROP TABLE, which avoids locking the system tables for a long time.

Problems with cursors:
25). Avoid using cursors as much as possible,

because cursors are inefficient. If a cursor operates on more than 10,000 rows, consider rewriting the code.

26). Before using a cursor-based method or a temporary table method,

first look for a set-based solution to the problem; set-based methods are usually more efficient.
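The difference between row-by-row and set-based processing can be sketched outside a stored procedure as well. The example below assumes Python's sqlite3 module, with a Python loop standing in for a cursor; the `product` table and all function names are hypothetical.

```python
import sqlite3

def make_db():
    """Create a small sample table with integer prices."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product (id INTEGER PRIMARY KEY, category TEXT, price INTEGER)")
    conn.executemany("INSERT INTO product (category, price) VALUES (?, ?)",
                     [("book", 10), ("toy", 20), ("book", 30)])
    return conn

def raise_book_prices_row_by_row(conn):
    """Cursor-style approach: fetch the rows, then issue one UPDATE per row."""
    rows = conn.execute(
        "SELECT id, price FROM product WHERE category = 'book'").fetchall()
    for product_id, price in rows:
        conn.execute("UPDATE product SET price = ? WHERE id = ?",
                     (price + 1, product_id))

def raise_book_prices_set_based(conn):
    """Set-based approach: one statement describes the whole change."""
    conn.execute("UPDATE product SET price = price + 1 WHERE category = 'book'")

def prices(conn):
    return conn.execute("SELECT id, price FROM product ORDER BY id").fetchall()
```

Both functions leave the table in the same state, but the set-based version sends a single statement and lets the engine choose how to apply it, instead of paying per-row overhead.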

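27). As with temporary tables, cursors are not off limits.

Using a FAST_FORWARD cursor on a small data set is usually better than other row-by-row processing methods, especially when several tables must be referenced to obtain the required data. Routines that include "totals" in the result set are usually faster than the equivalent cursor-based code. If development time allows, try both the cursor-based and the set-based approach and see which one works better.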

28). SET NOCOUNT ON at the beginning of all stored procedures and triggers, and SET NOCOUNT OFF at the end.

There is then no need to send a DONE_IN_PROC message to the client after every statement in the stored procedure or trigger.

Issues with transactions:
29). Avoid large transactions where possible, to improve the system's concurrency.

Issues with data volume:
30). Avoid returning large amounts of data to the client. If the amount of data is too large, consider whether the requirement itself is reasonable.
COUNT optimization:
COUNT(*) is better than COUNT(1) and COUNT(primary_key)
Many people use COUNT(1) or COUNT(primary_key) instead of COUNT(*) to count records, believing that this performs better. In fact this is a myth. In some scenarios it may even perform worse, because the database applies special optimizations to the COUNT(*) operation.
COUNT(column) and COUNT(*) are not the same
This misunderstanding is common even among senior engineers and DBAs, and many people take it for granted. In fact, COUNT(column) and COUNT(*) are completely different operations with completely different meanings.
COUNT(column) returns the number of records in the result set in which column is not NULL
COUNT(*) returns the number of records in the entire result set
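The difference between the COUNT variants shows up as soon as a column contains NULL. The following is a small sketch, assuming Python's sqlite3 module and a made-up table with one NULL email; it illustrates the semantics only, not the relative speed of the variants.

```python
import sqlite3

def count_variants():
    """Return COUNT(*), COUNT(1), COUNT(id) and COUNT(email) for a sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO t (email) VALUES (?)",
                     [("a@example.com",), (None,), ("c@example.com",)])
    row = conn.execute(
        "SELECT COUNT(*), COUNT(1), COUNT(id), COUNT(email) FROM t").fetchone()
    conn.close()
    return row  # (3, 3, 3, 2): the row with a NULL email is skipped by COUNT(email)
```

COUNT(*), COUNT(1) and COUNT(id) agree because none of them can encounter a NULL here; COUNT(email) is smaller by one.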

1) The InnoDB and MyISAM engines count rows differently. MyISAM keeps a built-in row counter,

so for SELECT COUNT(*) FROM table without a query condition, MyISAM reads the value directly from the counter, whereas InnoDB must scan the whole table once to get the total.

  1. When there is a query condition, however, the two engines are equally efficient.

  2. COUNT(*) is slow when it runs on the primary key index.

    1. Optimize the ORDER BY statement
      Sort based on the index.
      One of MySQL's weaknesses is sorting. Although MySQL can query about 15,000 records in 1 second, it uses only one index at a time when querying. Therefore, if the WHERE condition already occupies the index, no index is used for the sort, which greatly reduces query speed. Consider the following SQL statement:
      SELECT * FROM SALES WHERE NAME = 'name' ORDER BY SALE_DATE DESC;
      The WHERE clause above already uses the index on the NAME field, so no index is used when sorting by SALE_DATE. To solve this problem, we can create a composite index on the SALES table:
      ALTER TABLE SALES DROP INDEX NAME, ADD INDEX (NAME, SALE_DATE)
      This greatly speeds up the SELECT statement above. Note, however, that when using this method you must make sure the WHERE clause does not contain the sort field. In the example above, you cannot filter on SALE_DATE; otherwise, although the sort is fast, the query itself will be slow because there is no separate index on the SALE_DATE field.
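The effect of the composite index on sorting can be seen in a query plan. The sketch below uses SQLite through Python's sqlite3 module as a stand-in, so the plan wording ("USE TEMP B-TREE FOR ORDER BY") is SQLite's, not MySQL's; the `sales` table layout and the index names `idx_name` and `idx_name_date` are assumptions for the example.

```python
import sqlite3

def plan(conn, sql):
    """Return the detail strings of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

def order_by_plans():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, name TEXT, "
                 "sale_date TEXT, amount INTEGER)")
    query = "SELECT * FROM sales WHERE name = 'name' ORDER BY sale_date DESC"

    # Index on name only: rows are found through the index, then sorted separately.
    conn.execute("CREATE INDEX idx_name ON sales (name)")
    single = plan(conn, query)

    # Composite index (name, sale_date): the index already delivers sorted rows.
    conn.execute("DROP INDEX idx_name")
    conn.execute("CREATE INDEX idx_name_date ON sales (name, sale_date)")
    composite = plan(conn, query)

    conn.close()
    return single, composite
```

With the single-column index, the plan contains an extra sort step; with the composite index that step disappears, because the entries for one name are already stored in sale_date order and can be read backwards for DESC.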

      In some cases, MySQL can use an index to satisfy the ORDER BY clause without an additional sort. This requires that the WHERE condition and the ORDER BY use the same index, that the ORDER BY columns appear in the same order as in the index, and that the ORDER BY fields are either all ascending or all descending. For example, the following statements can use an index:
      SELECT * FROM t1 ORDER BY key_part1, key_part2, ...;
      SELECT * FROM t1 WHERE key_part1=1 ORDER BY key_part1 DESC, key_part2 DESC;
      SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 DESC;
      However, an index is not used in the following cases:
      SELECT * FROM t1 ORDER BY key_part1 DESC, key_part2 ASC; -- the ORDER BY fields mix ASC and DESC
      SELECT * FROM t1 WHERE key2=constant ORDER BY key1; -- the key used to fetch the rows differs from the key used in ORDER BY
      SELECT * FROM t1 ORDER BY key1, key2; -- the ORDER BY uses different keys

    2. Optimize GROUP BY
      By default (in MySQL versions before 8.0), MySQL sorts every GROUP BY col1, col2, ... query as if you had also specified ORDER BY col1, col2, ... in the query. If you explicitly include an ORDER BY clause with the same columns, MySQL optimizes it away without any speed penalty, although the sort still takes place. If a query includes GROUP BY but you want to avoid the cost of sorting the result, you can specify ORDER BY NULL to suppress the sort.
      For example:
      INSERT INTO foo SELECT a, COUNT(*) FROM bar GROUP BY a ORDER BY NULL;

    3. Optimize OR
      1. If the WHERE clause contains an OR condition, a MyISAM table can use the index, whereas an InnoDB table reportedly cannot.



