Comparison of the efficiency of using the or, in, and unionall query commands in MySQL _ MySQL

Source: Internet
Author: User
This article mainly introduces the comparison of the efficiency of using or, in, and unionall in the query command in MySQL, and demonstrates that unionall is not necessarily faster than or and in, if you need a friend, refer to OR, in, and union all queries. which one is faster?
Many voices on the Internet say that union all is faster than or and in, because or and in will cause full table scans, and they provide many examples.
But is union all really faster than or in?

EXPLAIN SELECT * from employees where employees.first_NAME ='Georgi' UNION ALL SELECT * from employees where employees.first_NAME ='Bezalel'

The statement has 481 execution results and the execution time is 0.35 s.

PRIMARY employees ALL 300141 Using whereUNION employees ALL 300141 Using where UNION RESULTALL explain SELECT * FROM employees WHERE employees.first_name IN ('Georgi','Bezalel')

The execution result of this statement is 0.186 s.

SIMPLE  employees  ALL  300141  Using whereexplain SELECT * FROM employees WHERE employees.first_name ='Georgi' or employees.first_name='Bezalel'

The execution result of this statement is similar to that of in.

Is the online statement incorrect? Is it related to indexes? An index is created on firstname.

Re-execute

The execution Plan of union is as follows, and the execution time is 0.004 s.

PRIMARY  employees  ref  index_firstname  index_firstname  44  const  253  Using whereUNION  employees  ref  index_firstname  index_firstname  44  const  228  Using whereUNION RESULTALL

The execution plan of in is as follows, and the execution time is also 0.004 s.

SIMPLE  employees  range  index_firstname  index_firstname  44  481  Using where

Or, the execution plan is as follows, and the execution time is also 0.004 s.

SIMPLE  employees  range  index_firstname  index_firstname  44  481  Using where

The performance is almost the same. But pay attention to the type in the execution plan. ref is better than range (ref is a non-unique index scan, and range is an index range scan)
Suddenly it seems like it is similar to what I said on the Internet, but will the first statement be less efficient than a range scan after two ref scans.

Otherwise, I will try the primary key again. this is the only one. Will it continue to work with the Internet?

EXPLAIN SELECT * FROM employees WHERE employees.EMP_NO=100001 UNION ALL SELECT * FROM employees WHERE employees.EMP_NO=101100

The execution Plan of union is as follows:

PRIMARY  employees  const  PRIMARY  PRIMARY  4  const  1  UNION  employees  const  PRIMARY  PRIMARY  4  const  1  UNION RESULTALL  EXPLAIN SELECT * FROM employees WHERE employees.EMP_NO IN (100001 ,101100)

The execution plan of in is as follows:

SIMPLE  employees  range  PRIMARY  PRIMARY  4  2  Using whereEXPLAIN SELECT * FROM employees WHERE employees.EMP_NO=100001 OR emp_no=101100

Or's execution plan is as follows:

SIMPLE  employees  range  PRIMARY  PRIMARY  4  2  Using where

The result is similar to that of the second experiment.


Next, we will use examples to discuss how to compare the efficiency of the query commands.
1: Create a table, insert data, the data volume is 10 million [otherwise the effect is not obvious ].

drop table if EXISTS BT; create table BT(   ID int(10) NOT NUll,   VName varchar(20) DEFAULT '' NOT NULL,   PRIMARY key( ID ) )ENGINE=INNODB;

This table only has two field IDs as the primary key [similar to the index page], and one is a common field. (Just use a simple table structure when you are lazy)
Insert 10 million data records into the BT table
Here I wrote a simple stored procedure [so your mysql version must be at least 5.0, and your mysql version is 5.1]. the code is as follows.
Note: Best

  INSERT INTO BT ( ID,VNAME ) VALUES( i, CONCAT( 'M', i ) );---1

Change

  INSERT INTO BT ( ID,VNAME ) VALUES( i, CONCAT( 'M', i, 'TT' ) );---2

The reason for modification is
Use 1 for full scanning of non-index columns and vnames.
If the non-index column and VNAME use full table scan, use 2.

DROP PROCEDURE IF EXISTS test_proc; CREATE PROCEDURE test_proc() BEGIN declare i int default 0; set autocommit = 0; while i<10000000 do INSERT INTO BT ( ID,VNAME ) VALUES( i, CONCAT( 'M', i ) ); set i = i+1; if i%2000 = 0 then commit; end if; end while; END;

It's easy to leave comments alone.
The stored procedure is best to set innob related parameters [mainly related to logs and write cache to speed up insertion]. I didn't set to insert 10 million pieces of data for 6 minutes.
Some data is as follows: 10 million data is similar

2: Practice
2.1: use or, in, union all
The table we created only has primary key indexes, so we can only use ID for queries. The time consumed for the three data IDs: 98,85220 and 9888589 is as follows:

The time is 0.00. how can this happen? all queries are in milliseconds.
I use other tools-EMS SQL Manager for mysql
The query display time is
93 ms, 94 ms, 93 ms, time difference how much can be ignored.
Then we are looking at our respective execution plans.

The field type and ref must be noted here.
We found that the type used by union all [type is used to display the type used for the connection] is ref, and the or and in are range [ref connection type is better than range, there is not much difference ], the number of rows to be queried is the same. [the rows field is 3 ].
From the whole process, the use of constants or and in and union all queries in the index column is not much different.
However, in some complex queries, the use of or and in for index columns is much slower than that for union all. This may be because your query is not properly written, let mysql discard the index and perform a full table scan.
2.2: use or, in, and union all in non-indexed columns.
We can check the time consumption of the three data with VNAME M98, M85220, and M9888589 as follows:

We found that the query time for union all is almost three times that of or and in.
Why? let's not talk about it first. let's take a look at the three query plans.

The plan is almost the same here.
However, we should note that at this time, for or and in, only one table scan is performed, that is, the rows column is 9664782.
For union all, the table is scanned three times, that is, the sum of rows is 9664782*3.
This is why we can see that union all is almost three times.
Note: If you use the stored procedure to use the second SQL statement, all the type columns of the execution plan are all. In fact, this is what I want to demonstrate most, but now I have finished writing it, and I will find the problem will be wrong.

3: Summary
3.1: do not believe that union all is faster than or in. analyze which situation is used based on the actual situation.
3.2: it is best to use union all for index columns. because complex queries such as [include operations] will cause or and in to discard the index and scan the entire table, unless you are sure that or and in will use indexes.
3.3: for non-index fields only, you can honestly use or in, because non-index fields would have to be scanned in the entire table, and union all only doubles the number of table scans.
3.4: You can also use or, in, or union all for indexed and indexed fields that contain non-indexed fields,
However, I recommend or in.
Perform the following query:

select * from bt where bt.VName = 'M98' or bt.id ='9888589'  select * from bt where bt.VName = 'M98' UNION ALL select * from bt where bt.id = '9888589'

The difference between the two queries mainly depends on the query duration of the index column. if the query time of the index column is too long, use or in instead.
3.5: The above is mainly for a single table, while for multi-table joint queries, there are a lot of considerations, such as the connection method, query table data volume distribution, index, etc, select the appropriate keywords based on the single-table policy.

The above section compares the efficiency of using the or, in, and union all query commands in MySQL _ MySQL content. For more information, see PHP Chinese network (www.php1.cn )!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.