This section describes collection operations in MySQL and MySQL collection operations.
What is a set operation?
Generally, the join operation is regarded as a horizontal operation between tables because the virtual table generated by this operation contains columns in two tables. The Set Operations I have summarized here generally regard these operations as vertical operations. The MySQL database supports two set operations: union distinct and union all.
Like join operations, set operations operate on two inputs and generate a virtual table. In join operations, the input table is generally referred to as left input and right input. The two inputs of the set operation must have the same number of columns. If the data type is different, MySQL automatically performs implicit conversion. At the same time, the name of the result column is determined by the Left input.
Preparations
Prepare test tables table1 and table2:
create table table1 (aid int not null auto_increment, title varchar(20), tag varchar(10), primary key(aid)) engine=innodb default charset=utf8;create table table2 (bid int not null auto_increment, title varchar(20), tag varchar(10), primary key(bid)) engine=innodb default charset=utf8;
Insert the following test data:
insert into table1(aid, title, tag) values(1, 'article1', 'MySQL');insert into table1(aid, title, tag) values(2, 'article2', 'PHP');insert into table1(aid, title, tag) values(3, 'article3', 'CPP');insert into table2(bid, title, tag) values(1, 'article1', 'MySQL');insert into table2(bid, title, tag) values(2, 'article2', 'CPP');insert into table2(bid, title, tag) values(3, 'article3', 'C');
UNION DISTINCT
Union distinct combines two inputs and uses DISTINCT to filter duplicate items. Generally, the DISTINCT keyword can be omitted and UNION is used directly.
The UNION syntax is as follows:
SELECT column,... FROM table1 UNION [ALL]SELECT column,... FROM table2...
In multiple SELECT statements, the corresponding column should have the same field attribute, and the field name used in the first SELECT statement is also used for the result field name.
Run the following SQL statement:
(select * from table1) union (select * from table2);
The following result is displayed:
+-----+----------+-------+| aid | title | tag |+-----+----------+-------+| 1 | article1 | MySQL || 2 | article2 | PHP || 3 | article3 | CPP || 2 | article2 | CPP || 3 | article3 | C |+-----+----------+-------+
We found that duplicate data items in table 1 and table 2:
| 1 | article1 | MySQL |
It only appears once, which is the effect of UNION.
Currently, the implementation of union distinct in MySQL is as follows:
- Create a temporary table, that is, a virtual table;
- Add a unique index to the column of this temporary table;
- Insert the input data into a temporary table;
- Returns the virtual table.
Because a unique index is added, duplicate data items in the set can be filtered out. Here, repetition means that the fields selected by the SELECT statement are exactly the same, so they are counted as duplicates.
UNION ALL
Union all means that repeated data items are not excluded. For example, I run the following SQL statement:
(select * from table1) union all (select * from table2);
You will get the following results:
+-----+----------+-------+| aid | title | tag |+-----+----------+-------+| 1 | article1 | MySQL || 2 | article2 | PHP || 3 | article3 | CPP || 1 | article1 | MySQL || 2 | article2 | CPP || 3 | article3 | C |+-----+----------+-------+
Duplicate data is not filtered out.
When union distinct is used, the insert speed is obviously affected because a unique index is added to the temporary table. If you confirm that there are no repeated options in the UNION operation, the most effective method is to use union all.