MySQL under distinct and group by difference contrast

Last Update:2017-01-13 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

The bibliography used in the user verification is recorded in the datasheet, and now I want to take out all the bibliographies, using distinct and group by to get the results I want, but I find that the return results are arranged in a different order, and the distinct will be displayed in the sequence of data, and group By Will do a sort (typically ASC).

DISTINCT is actually very similar to the implementation of a group by operation, except that only one record is taken out of each group by after the groups by. Therefore, the implementation of the DISTINCT and GROUP by implementation is almost the same, there is not much difference, the same can be done through a loose index scan or a compact index scan.

which distinct and group by which efficiency is higher?

The distinct operation only needs to find out all the different values. The group by operation also prepares for other aggregate functions. From this point on, the GROUP by operation should do more work than distinct.
But in fact, GROUP by efficiency will be higher, why? For the distinct operation, it reads all the records, and group by needs to read as many records as there are groups, which means a lot less than the actual number of records.

Example AA Table A B
123 10
123 12
1234 11
1234 14
First group is used to group, not to filter duplicates. Duplicate DELETE statement distinct use this. Select DISTINCT (a) from AA
The result is a.
123
1234

Group BY for grouping

Select a, sum (b) from AA GROUP by a

Sum means the sum. The result is
A b
123 22
1234 25
The purpose of the statement is to use a as a target to know the number of items in column B that have the same name in total

Select A,count (b) from AA GROUP by a

Count meaning line sum result is
A b
123 2
1234 2

The object of the statement is a few lines with the same name

Distinct and group by performance comparisons in MySQL

Test process:

Prepare a test sheet

CREATE TABLE ' Test_test ' (
' id ' int (one) not NULL auto_increment,
' num ' int (one) not NULL default ' 0 ',
PRIMARY KEY (' id ')
) Engine=myisam DEFAULT Charset=utf8 auto_increment=1;

Create a stored procedure to insert 10W data into a table

CREATE PROCEDURE p_test (PA int (11))
Begin

DECLARE max_num int (one) default 100000;
declare i int default 0;
declare rand_num int;

Select COUNT (ID) into the max_num from Test_test;

While I < PA do
If Max_num < 100000 then
Select CAST (rand () *100 as unsigned) into rand_num;
INSERT into test_test (num) values (rand_num);
End If;
Set i = i +1;
End while;
End

Call stored Procedure Insert data

1 call P_test (100000);

Start test: (without index)

SELECT distinct num from test_test;
Select num from test_test GROUP by NUM;

[SQL] SELECT distinct num from test_test;
Affected rows: 0
Time: 0.078ms

Select num from test_test GROUP by NUM;
Affected rows: 0
Time: 0.031ms

Two, the NUM field to create the index

ALTER TABLE ' test_test ' ADD INDEX ' num_index ' (' num ');

Query again

SELECT distinct num from test_test;
Select num from test_test GROUP by NUM;
[SQL] SELECT distinct num from test_test;
Affected rows: 0
Time: 0.000ms

Select num from test_test GROUP by NUM;
Affected rows: 0
Time: 0.000ms

At this point we found that the time is too small for 0.000 seconds can not be accurate.

We go to the command line to test

Mysql> set profiling=1;
Mysql> SELECT DISTINCT (num) from Test_test;
Mysql> Select num from test_test GROUP by NUM;
Mysql> Show Profiles;
+----------+------------+----------------------------------------+
| query_id | Duration | Query |
+----------+------------+----------------------------------------+
| 1 | 0.00072550 | SELECT DISTINCT (num) from Test_test |
| 2 | 0.00071650 | Select num from test_test GROUP by num |
+----------+------------+----------------------------------------+

After indexing, distinct is 107 times times faster than the distinct without index.

GROUP BY IS 43 times times faster than group by without index after indexing.

Compare: Distinct and GROUP by

No index group BY is faster than distinct. Therefore, it is recommended to select group by when using

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

MySQL under distinct and group by difference contrast

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

MySQL under distinct and group by difference contrast

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support