Comparison of efficiency under query commands using or, in and union in Mysql _mysql

Source: Internet
Author: User
Tags mysql version

What is the efficiency of the or, in, and union ALL queries?
A lot of the sound on the web is that union all is faster than or, in, because or, in can cause full table scans, they give a lot of examples.
But really union all is really faster than or, in?

EXPLAIN SELECT * FROM employees where Employees.first_name = ' Georgi ' UNION all SELECT * FROM Employees where Employees.fir St_name = ' Bezalel '

This statement executes the result 481, the execution time is 0.35s

PRIMARY employees all 300141 the using where

union employees all 300141 the using where

 Union result <union1,2> all< C5/>explain SELECT * FROM Employees WHERE Employees.first_name in (' Georgi ', ' Bezalel ')

The execution result time of this statement is 0.186s

Simple  Employees  all  300141 the  Using where

explain SELECT * FROM employees where employees.first_ name = ' Georgi ' or employees.first_name= ' Bezalel '

The execution of this statement is the same as the result in

Is it wrong on the internet? Does it have something to do with the index? An index was established on the FirstName

Re-executing

The execution plan of Union is as follows, the execution time is 0.004s

PRIMARY  Employees  ref  Index_firstname  index_firstname  -  const  253  Using where
UNION  Employees  ref  Index_firstname  index_firstname  -  const  228  Using where
UNION result  <union1,2>  all  

The execution plan in is as follows, the execution time is also 0.004s

Simple  Employees  range  index_firstname  index_firstname  481  Using where

The execution plan for or is as follows, and the execution time is 0.004s

Simple  Employees  range  index_firstname  index_firstname  481  Using where

Feel the same performance. But note that the type,ref in the execution plan is better than range oh (ref is a non unique index scan, range is an index range scan)
Suddenly feel like talking to the Internet, but the first statement to walk two ref scan will not be more efficient than the scan to go a range of low.

Or I try the primary key again, this is the only, will not and online effect has been?

EXPLAIN SELECT * FROM Employees WHERE employees. emp_no=100001 UNION All SELECT * Employees WHERE employees. emp_no=101100

The Union's execution plan is as follows

PRIMARY  Employees  Const  PRIMARY  PRIMARY  4  const  1  
UNION  Employees  Const  PRIMARY  PRIMARY  4  const  1  
UNION result  <union1,2>  

All EXPLAIN SELECT * FROM Employees WHERE employees. Emp_no in (100001, 101100)

The implementation plan in IS as follows

Simple  Employees  range  PRIMARY  PRIMARY  4  2  Using where

EXPLAIN SELECT * From Employees WHERE employees. emp_no=100001 OR emp_no=101100

The execution plan for or is as follows

Simple  Employees  range  PRIMARY  PRIMARY  4  2  Using where

The sensory results are similar to the second experiment.


Here's an example of how the efficiency comparisons between them are based on actual query commands.
1: Create a table, insert data, the amount of data is 10 million "or effect is not obvious."

drop table if EXISTS BT; 
CREATE TABLE BT ( 
  ID int () NOT NULL, 
  vname varchar DEFAULT "NOT NULL, 
  PRIMARY key (ID) 
) engine=inn ODB; 

The table has only two field IDs as the primary Key "index page is similar" and one is a normal field. (Lazy to use simple table structure?)
Insert 10 million data into the BT table
Here I wrote a simple stored procedure "so your MySQL version is at least greater than 5.0, my version is 5.1", the code is as follows.
Note: the best

  INSERT into BT (id,vname) VALUES (i, CONCAT (' M ', i));---1

Amended to

  INSERT into BT (id,vname) VALUES (i, CONCAT (' M ', I, ' TT '));---2

Reason for modification in
Non-indexed columns and VName use a federated for full scan use 1.
Non-indexed columns and vname use a full table scan please use 2.

DROP PROCEDURE IF EXISTS test_proc; 
CREATE PROCEDURE Test_proc () 
BEGIN 
declare i int default 0; 
Set autocommit = 0; 
While i<10000000 does 
INSERT into BT (id,vname) VALUES (i, CONCAT (' M ', i)); 
Set i = i+1; 
If i%2000 = 0 then 
commit; 
End If; 
End While; 
End; 

Don't write notes, it's quite simple.
Stored procedures are best set under the Innob related parameters "primary and log, write cache related to this faster insertion", I did not set insert 10 million data inserted 6 minutes.
Some of the data are as follows: 10 million data similar

2: Actual combat
2.1: Use or, in, union All, respectively, on the index column
The table we created has only primary key indexes, so we can only query with IDs. We look for three data with ID 98,85220,9888589 as follows:

Time is 0.00, how can this be, hehe all queries are at the millisecond level.
I use other tools--ems SQL Manager for MySQL
Query shows time as
MS, 94ms,93 MS, the time difference between how much can almost ignore.
And we're looking at our respective execution plans.

The field type and ref field to note here
We find that the type "type" used by union all is the same as the "ref" or and in "range" ref connection type for the display connection is better than range, but not much, but the query rows are the same. "See Rows fields are all 3".
From the whole process, there is not much difference between using the constant or and in and union all queries in an indexed column.
But why is it that in some complex queries, the indexed column uses or and in more slowly than the union all, which may be that your query is not written reasonably well enough to allow MySQL to discard the index and perform a full table scan.
2.2: Use or, in, and union all in a non indexed column.
We check VName for m98,m85220,m9888589 three data each time is as follows:

We find out why union ALL query time is almost three times times that of OR and in.
This is why, we do not say, first look at three of the query plan.

Here we find that the plan is almost the same.
But we need to be aware that the scan at this time for or and in the table scan only once that rows is listed as 9664782.
For union all, the table is scanned three times, and the rows are 9664782*3.
That's why we see the union all as being almost three times times the reason.
Note: If you use a stored procedure to use the second SQL for all of the type listed in this execution plan, this is what I would like to demonstrate, but it is now almost finished to find the problem mistake.

3: summary
3.1: Do not superstition union all is faster than or and in, to combine the actual situation to analyze what kind of situation to use.
3.2: It is best to use union ALL for indexed columns, because complex query "include operations" will cause or, in discard index and full table scan, unless you can determine if or, in will use the index.
3.3: For a non-indexed field to say you are honest with or or in, because the non-indexed fields would have to be full table scan and union ALL only multiply the number of table scans.
3.4: For and Indexed Fields "indexed fields are valid" and contains non-indexed fields, you may also use or, in, or union ALL to
But I recommend using or, in.
such as the following query:

SELECT * from BT where BT. VName = ' M98 ' or bt.id = ' 9888589 ' 
 
select * from BT where BT. VName = ' M98 ' 
UNION all 
select * from BT where bt.id = ' 9888589 ' 

The difference in the speed of the two queries depends largely on the length of the indexed column query, such as if the index column query is too long, and you use either or in instead.
3.5: The above is mainly for a single table, and multiple table joint query, consider the place is more, such as the connection mode, query table data distribution, index, and then combined with a single table strategy to choose the right keyword.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.