An analysis of the differences between UNION and JOIN statements in MySQL

Source: Internet
Author: User

UNION and JOIN are the two keywords you reach for when you need to combine more than one table. I will not go through the formal definitions here; they are easy to find online, and I do not remember them precisely myself.
First, the differences. UNION combines the rows of two tables, which amounts to stacking them vertically, so both tables must have the same columns (the schemas on both sides of a UNION should match). In other words, if table A has three rows and table B has two, A UNION ALL B has five rows, and A UNION B has five only when no row is duplicated. That is the difference between UNION and UNION ALL: UNION merges identical records into one, while UNION ALL keeps every record from both sides. For example, execute the following statements under MySQL:

SELECT * FROM tmp_libingxue_a;
name       number
libingxue  1001
yuwen      1002

SELECT * FROM tmp_libingxue_b;
name       number
libingxue  1001
feiyao     1003

SELECT * FROM tmp_libingxue_a UNION SELECT * FROM tmp_libingxue_b;
libingxue  1001
yuwen      1002
feiyao     1003

SELECT * FROM tmp_libingxue_a UNION ALL SELECT * FROM tmp_libingxue_b;
libingxue  1001
yuwen      1002
libingxue  1001
feiyao     1003
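The row counts above can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made so that it runs without a server), and the column types are guessed from the sample data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability);
# UNION and UNION ALL treat duplicate rows the same way in both engines.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tmp_libingxue_a (name TEXT, number INTEGER);
    CREATE TABLE tmp_libingxue_b (name TEXT, number INTEGER);
    INSERT INTO tmp_libingxue_a VALUES ('libingxue', 1001), ('yuwen', 1002);
    INSERT INTO tmp_libingxue_b VALUES ('libingxue', 1001), ('feiyao', 1003);
""")

# UNION removes the duplicated ('libingxue', 1001) row.
union_rows = conn.execute(
    "SELECT * FROM tmp_libingxue_a UNION SELECT * FROM tmp_libingxue_b"
).fetchall()

# UNION ALL keeps every row from both tables.
union_all_rows = conn.execute(
    "SELECT * FROM tmp_libingxue_a UNION ALL SELECT * FROM tmp_libingxue_b"
).fetchall()

print(len(union_rows))      # 3
print(len(union_all_rows))  # 4
```

The only row the two results disagree on is the one that appears in both tables.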

   

However, this does not work in Hive, at least in the older release used here: executing SELECT * FROM tmp_libingxue_a UNION ALL SELECT * FROM tmp_libingxue_b; fails, because Hive requires the UNION to be inside a subquery. For example:

SELECT * FROM (SELECT * FROM tmp_yuwen_a UNION ALL SELECT * FROM tmp_yuwen_b) t1;

Note that it must be UNION ALL; a bare UNION produces an error saying that ALL is missing. The alias t1 after the subquery is also required. You can call it a or b instead, but leaving it out is an error.
JOIN, by contrast, is a horizontal combination. A join is looser than a union: the two tables do not need to have the same columns. A join with no condition is equal to the Cartesian product of the two tables, so a join should normally be constrained, and a constrained join widens each row horizontally. Row pairs that satisfy the condition are kept, and those that do not are filtered out. The usage can be very flexible; here are two simple examples:

SELECT * FROM (SELECT * FROM tmp_yuwen_a) t1 JOIN (SELECT * FROM tmp_yuwen_b) t2;
SELECT * FROM tmp_yuwen_a t1 JOIN (SELECT * FROM tmp_yuwen_b) t2;
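To make the Cartesian-product point concrete, here is a rough sketch using Python's sqlite3 module as a stand-in engine (an assumption; it is not Hive or MySQL). The column layout of tmp_yuwen_a and tmp_yuwen_b and all the rows are invented for illustration, since the article does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas and data: the two tables deliberately have
# different columns, which a join allows and a union would not.
conn.executescript("""
    CREATE TABLE tmp_yuwen_a (name TEXT, number INTEGER);
    CREATE TABLE tmp_yuwen_b (number INTEGER, city TEXT);
    INSERT INTO tmp_yuwen_a VALUES
        ('libingxue', 1001), ('yuwen', 1002), ('feiyao', 1003);
    INSERT INTO tmp_yuwen_b VALUES (1001, 'hangzhou'), (1003, 'beijing');
""")

# No join condition: every row of a is paired with every row of b.
cross = conn.execute(
    "SELECT * FROM tmp_yuwen_a t1 JOIN tmp_yuwen_b t2"
).fetchall()

# With a condition, only the matching pairs survive.
matched = conn.execute("""
    SELECT t1.name, t1.number, t2.city
    FROM tmp_yuwen_a t1
    JOIN tmp_yuwen_b t2 ON t1.number = t2.number
    ORDER BY t1.number
""").fetchall()

print(len(cross))  # 6, i.e. 3 rows x 2 rows
print(matched)     # [('libingxue', 1001, 'hangzhou'), ('feiyao', 1003, 'beijing')]
```

The row for yuwen disappears from the constrained join because no row in tmp_yuwen_b carries number 1002.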

LEFT OUTER JOIN and RIGHT OUTER JOIN are used in a similar way. A left outer join keeps every row of the left-hand table; the columns from the right-hand table are filled in where the condition is satisfied and are left NULL where it is not. In other words, the left-hand table is the reference. A right outer join is the same with the right-hand table as the reference. The differences among these three joins have been explained many times, and there are more detailed explanations online, so I will not repeat them here.
The similarity: in certain cases you can use JOIN to get the same result as UNION ALL. This only holds under specific conditions. When it does, choose between UNION ALL with GROUP BY and JOIN according to the situation, or according to the cost of each. SQL has only a few keywords, but it is flexible and powerful; as long as you get the result you want, use whichever form you prefer. A simple reproduction of the requirement in SQL follows:
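As an illustration of which side acts as the reference, the following sketch again uses Python's sqlite3 module as a stand-in engine, with invented tables and rows (all assumptions). Because older SQLite versions lack RIGHT OUTER JOIN, the right-hand case is written as a left outer join with the tables swapped, which keeps the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: 1002 exists only in a, 1004 exists only in b.
conn.executescript("""
    CREATE TABLE tmp_yuwen_a (name TEXT, number INTEGER);
    CREATE TABLE tmp_yuwen_b (number INTEGER, city TEXT);
    INSERT INTO tmp_yuwen_a VALUES
        ('libingxue', 1001), ('yuwen', 1002), ('feiyao', 1003);
    INSERT INTO tmp_yuwen_b VALUES
        (1001, 'hangzhou'), (1003, 'beijing'), (1004, 'shanghai');
""")

# Left table is the reference: all rows of a, NULL where b has no match.
left_rows = conn.execute("""
    SELECT t1.name, t1.number, t2.city
    FROM tmp_yuwen_a t1
    LEFT OUTER JOIN tmp_yuwen_b t2 ON t1.number = t2.number
    ORDER BY t1.number
""").fetchall()

# b as the reference (same rows as "a RIGHT OUTER JOIN b").
right_rows = conn.execute("""
    SELECT t1.name, t2.number, t2.city
    FROM tmp_yuwen_b t2
    LEFT OUTER JOIN tmp_yuwen_a t1 ON t1.number = t2.number
    ORDER BY t2.number
""").fetchall()

print(left_rows)
print(right_rows)
```

In left_rows the unmatched row is ('yuwen', 1002, None); in right_rows it is (None, 1004, 'shanghai').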

DROP TABLE tmp_libingxue_resource;
CREATE EXTERNAL TABLE IF NOT EXISTS tmp_libingxue_resource (
  user_id string, shop_id string, auction_id string, search_time string)
PARTITIONED BY (pt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
STORED AS SEQUENCEFILE;

DROP TABLE tmp_libingxue_result;
CREATE EXTERNAL TABLE IF NOT EXISTS tmp_libingxue_result (
  user_id string, shop_id string, auction_id string, search_time string)
PARTITIONED BY (pt string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'
STORED AS SEQUENCEFILE;

INSERT OVERWRITE TABLE tmp_libingxue_result PARTITION (pt=20041104) SELECT user_id, shop_id, auction_id, search_time FROM tmp_libingxue_resource;
sudo -u taobao hadoop dfs -rmr /group/tbads/warehouse/tmp_libingxue_result/pt=20041104
sudo -u taobao hadoop jar /home/taobao/dataqa/framework/dailyreport.jar com.alimama.loganalyzer.tool.SeqFileLoader tmp_libingxue_resource.txt hdfs://v039182.sqa.cm4:54310/group/tbads/warehouse/tmp_libingxue_result/pt=20041104/part-00000

hive> SELECT * FROM tmp_libingxue_resource;
OK
2001 0  20041104
2002 0  102  20041104
hive> SELECT * FROM tmp_libingxue_result;
OK
2001 0  20041104
2002 0  20041104

SELECT user_id, shop_id, MAX(auction_id), MAX(search_time)
FROM
(SELECT * FROM tmp_libingxue_resource
UNION ALL
SELECT * FROM tmp_libingxue_result) t1
GROUP BY user_id, shop_id;
2001 0
2002 0  104

SELECT t1.user_id, t1.shop_id, t2.auction_id, t2.search_time
FROM
(SELECT * FROM tmp_libingxue_resource) t1
JOIN
(SELECT * FROM tmp_libingxue_result) t2
ON t1.user_id = t2.user_id AND t1.shop_id = t2.shop_id;
2001 0
2002 0  104
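The two queries agree only under certain conditions. The sketch below tries to show one such condition and one way it breaks, using Python's sqlite3 module as a stand-in for Hive (an assumption). The partition column is dropped for brevity, and the rows are invented sample data, not the data from the session above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: each (user_id, shop_id) key appears exactly once in
# each table, and only the result table carries an auction_id.
conn.executescript("""
    CREATE TABLE tmp_libingxue_resource
        (user_id TEXT, shop_id TEXT, auction_id TEXT, search_time TEXT);
    CREATE TABLE tmp_libingxue_result
        (user_id TEXT, shop_id TEXT, auction_id TEXT, search_time TEXT);
    INSERT INTO tmp_libingxue_resource VALUES
        ('2001', '0', NULL, NULL), ('2002', '0', NULL, NULL);
    INSERT INTO tmp_libingxue_result VALUES
        ('2001', '0', NULL, NULL), ('2002', '0', '104', NULL);
""")

UNION_GROUP_QUERY = """
    SELECT user_id, shop_id, MAX(auction_id), MAX(search_time)
    FROM (SELECT * FROM tmp_libingxue_resource
          UNION ALL
          SELECT * FROM tmp_libingxue_result) t1
    GROUP BY user_id, shop_id
    ORDER BY user_id
"""
JOIN_QUERY = """
    SELECT t1.user_id, t1.shop_id, t2.auction_id, t2.search_time
    FROM tmp_libingxue_resource t1
    JOIN tmp_libingxue_result t2
      ON t1.user_id = t2.user_id AND t1.shop_id = t2.shop_id
    ORDER BY t1.user_id
"""

# Same keys on both sides: the two forms return identical rows.
same_keys = (conn.execute(UNION_GROUP_QUERY).fetchall(),
             conn.execute(JOIN_QUERY).fetchall())

# A key present in only one table: UNION ALL keeps it, the join drops it.
conn.execute(
    "INSERT INTO tmp_libingxue_resource VALUES ('2003', '0', NULL, NULL)")
extra_key = (conn.execute(UNION_GROUP_QUERY).fetchall(),
             conn.execute(JOIN_QUERY).fetchall())

print(same_keys[0] == same_keys[1])            # True
print(len(extra_key[0]), len(extra_key[1]))    # 3 2
```

So the substitution looks safe only when both tables hold the same set of keys, once each, and the values you want come from the joined table; otherwise the two queries answer different questions.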



As the introduction above shows, UNION works on result sets by stacking their rows, while JOIN connects multiple tables side by side; the two are fundamentally different.
An example of using the UNION operator to combine the records of two tables is given below.
A typical UNION operation on the records of two tables

Assume that there are two tables, Table1 and Table2, and that the columns and data they contain are shown below.

Table1 database table

Table2 database table

Table1 and Table2 have the same column structure, so you can use the UNION operator to combine the record sets of the two tables. The result is shown in the following table.

Using UNION to combine the records of Table1 and Table2

The operation above can be written as follows:

SELECT * FROM Table1
UNION
SELECT * FROM Table2;
