The two should not be comparable in nature, distinct take out the unique column, group by is a grouping, but sometimes in optimization, when there is no aggregation function, they find the same results.
For example, it might be a little easier.
Table A
ID num
A 1
B 2
C 3
A 4
C 7
D 3
E 5
If only the ID column is selected, the same as distinct and group by.
SELECT DISTINCT (ID) from A;
Id
A
B
C
D
E
Select ID from the A Group by ID;
Id
A
B
C
D
E
The difference may be that group by has a sort function.
But if you need to add another column of num, the result is different.
Group BY is a grouping statement, if
Select Id,num from A Group by Id,num;
The result in this case is the same as no group BY, because Num is different.
But if
Select Id,num from A Group by ID;
Note that the statement is an error statement because NUM does not use a clustered function, for example: sum (SUM), AVG (averaging)
Select Id,sum (num) from A Group by ID;
ID sum (num)
A 5
B 2
C 10
D 3
E 5
Duplicate rows are not displayed with distinct.
In this example
Select distinct id,num from A; The results are also consistent with the non-distinct.
Because Id,num does not have duplicate rows, instead of just looking at the ID.
The group by feature is more powerful, and group by is also recommended.
Because distinct can cause a full table scan, and group by if the index is built
If appropriate, there will be a performance improvement.
which distinct and group by which is more efficient?
The distinct operation only needs to find out all the different values. The group by operation also prepares the other aggregation functions. From this point on, the GROUP by operation should do more work than distinct.
But in fact, GROUP by efficiency will be higher, why? For the distinct operation, it reads all the records, and the number of records that the group by needs to read is as many as the number of groups grouped, that is, much less than the number of records that actually exist.
(go) Differences between database distinct and GROUP by