Hive Join Strategies

Source: Internet
Author: User

Common Join

The common join is the most general strategy: it works regardless of data size, is also called the reduce-side join, and is the least efficient way to join. It is executed as a MapReduce job.

First, both the large table and the small table go through a map operation. In the map shuffle phase, each map output key becomes Table_name_tag_prefix + Join_column_value, but partitioning still hashes only on Join_column_value.

Each reducer receives all the map output sent to it; in the reduce shuffle phase, the table_name_tag_prefix at the front of each map output key is compared and then discarded. Because the number of reducers can be chosen based on the size of the small table, each reducer must be able to load the small table's split into an in-memory hashtable; each record of the large table is then matched against it row by row.

The real join happens in the reduce phase.
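As a minimal sketch (the table and column names are illustrative, not from the original), this is the kind of query Hive executes as a common join when map-join conversion does not apply; EXPLAIN shows the Join Operator inside the reduce stage:

```sql
-- Hypothetical tables: orders (large) and users (small).
-- With automatic map-join conversion disabled, Hive runs this as a
-- reduce-side (common) join: mappers tag each key with its table of
-- origin, and the actual join work happens in the reducers.
SET hive.auto.convert.join = false;

EXPLAIN
SELECT o.order_id, u.user_name
FROM orders o
JOIN users u ON o.user_id = u.user_id;
```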

Mapjoin

A map join runs in two steps: the small table's data is turned into a hashtable and broadcast to every map task, while the large table's data is split appropriately; then, in the map stage, the large table's data probes the small table's hashtable. When the join keys are equal, the result is written to HDFS.

It is called a map join because all of its work is computed on the map side.

Hive has made several optimizations on the map join:

In Hive 0.6, the default was that the large table comes after the SELECT and the small table in front, or you set it with a /*+ MAPJOIN(map_table) */ hint. In Hive 0.7 this decision is automated: Hive first determines which table is small and which is large, controlled by the parameter hive.auto.convert.join=true. The small-table size threshold is controlled by hive.smalltable.filesize=25000000L (the default, about 25 MB); when the small table exceeds this size, Hive falls back to a common join by default. See HIVE-1642.
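For example (table names are illustrative, not from the original), the hint form and the automatic form look like this:

```sql
-- Hive 0.6 style: mark the small table explicitly with a hint.
SELECT /*+ MAPJOIN(u) */ o.order_id, u.user_name
FROM orders o
JOIN users u ON o.user_id = u.user_id;

-- Hive 0.7+ style: let Hive decide automatically.
SET hive.auto.convert.join = true;
SET hive.smalltable.filesize = 25000000;  -- ~25 MB small-table threshold
SELECT o.order_id, u.user_name
FROM orders o
JOIN users u ON o.user_id = u.user_id;
```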

First, the map phase of the small table is turned into a MapReduce local task: it reads all of the small table's data from HDFS, converts it into a hashtable file, and compresses and packages it into DistributedCache.

The current Hive map join has several limitations. One plan is to use a BloomFilter to implement the hashtable; a BloomFilter saves roughly 8-10x memory compared with a hashtable, but its size is harder to control.

Currently the hashtable in DistributedCache defaults to 3 replicas; for a large table with 1000 map tasks this number is too small, and most map tasks end up waiting for DistributedCache replication.

Optimized Map-join

Converting a Common Join into a Map Join

Determine which table is large and which is small (a table counts as small when its size is below hive.mapjoin.smalltable.filesize).

During compilation, Hive generates a conditional task for each common join. For each table in the join, assuming that table is the large one, a map-join task is generated, and these map-join tasks are placed into the conditional task (List<Task<? extends Serializable>> resTasks), which also maps each large-table alias to its corresponding map-join task. At runtime, the resolver reads the input file size for each table alias; if the small table's file size is below the threshold (hive.mapjoin.smalltable.filesize, default 25 MB), the converted map-join task is executed. Each map-join task also has a backup task, the original common-join task, which is enabled if the map-join task fails.

Performance bottlenecks

1. DistributedCache is the potential performance bottleneck:

a. A large hashtable file slows down its propagation through DistributedCache.

b. Mappers wait for the hashtable files from DistributedCache (by default each cached hashtable has only three replicas, so when there are many mappers, most of them queue waiting for one).

2. Mitigation: compress and archive all the hashtable files into one tar file.

Bucket Map Join

Why:

The total table/partition size is big, so a plain map join is not suitable.

How:

Set hive.optimize.bucketmapjoin = true;

1. Works together with map join

2. All join tables are bucketized, and each small table's bucket number is divisible by the big table's bucket number (i.e., the small table's bucket count is an integer multiple of the large table's).

3. Bucket columns == Join columns

Hive supports hash partitioning (bucketing) by specifying CLUSTERED BY (col_name, ...) INTO number_buckets BUCKETS when creating a table.

When the join key of the two tables being joined is the bucket column, you can set

hive.optimize.bucketmapjoin = true

to make Hive execute a bucket map join. Note that the number_buckets of the small table must be a multiple of the large table's; this condition must hold no matter how many tables are joined. (In fact, if every table's bucket count is a power of two, the large table's could also be a multiple of the small table's, but that requires extra computation in between; it works for int and long, and whether it works for string is unclear.)
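Putting the pieces together, here is a sketch with hypothetical tables; giving both the same bucket count (32) trivially satisfies the multiple-of requirement described above:

```sql
-- Hypothetical tables, both bucketed on the join key user_id.
CREATE TABLE orders (order_id BIGINT, user_id INT)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

CREATE TABLE users (user_id INT, user_name STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

SET hive.optimize.bucketmapjoin = true;

-- Only one small-table bucket's hashtable needs to be in memory
-- at a time while the matching large-table bucket is probed.
SELECT /*+ MAPJOIN(u) */ o.order_id, u.user_name
FROM orders o
JOIN users u ON o.user_id = u.user_id;
```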

A bucket map join executes in two steps. First, a map operation turns the small table's buckets into hashtables and broadcasts them to all map tasks of the large table. Each large-table map task receives number_buckets small-table hashtables but does not need to merge them into one big hashtable; it maps directly. The map operation produces number_buckets splits, each carrying the same tag as the corresponding small-table hashtable. When probing, only one small-table hashtable needs to be loaded into memory at a time, and the corresponding split of the large table is read and matched against it, so memory use is bounded by the size of the largest hashtable among the small table's buckets.

The bucket map join is also an implementation of the map-side join: all computation is done on the map side with no reduce, hence "map-side join". Bucketing is just Hive's hash-partitioning implementation; the other kind, of course, is value partitioning:

CREATE TABLE a (xxx) PARTITIONED BY (col_name)

In general, however, two tables in Hive do not necessarily share the same partition key, and even when they do, it is not necessarily the join key. So Hive has no value-based map-side join; list partitions in Hive are mainly used to filter data (partition pruning) rather than to join. The two main parameters are hive.optimize.cp=true and hive.optimize.pruner=true.

Hadoop's source code provides a default implementation of the map-side join; you can find the related classes under the src/contrib/data_join/src directory of the Hadoop source tree. There, TaggedMapOutput can be implemented with a hash or with a list, depending on how you partition. Chapter 8 of Hadoop: The Definitive Guide, on map-side joins and side data distribution, also has an example of implementing a map-side join for value partitioning.

Explanation: table b is the large table; a and c are small tables whose bucket counts are integer multiples of b's. The buckets of a and c are loaded into memory and joined first, and then matched against each map of table b.

Sort Merge Bucket Map Join

Why:

No limit on file/partition/table size.

How:

set hive.optimize.bucketmapjoin = true;

set hive.optimize.bucketmapjoin.sortedmerge = true;

set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

1. Works together with bucket map join

2. Bucket columns == Join columns == Sort columns

The bucket map join does not remove the map join's restriction that the small table must be fully loaded into memory. If you want neither the large table nor the small table on a node to have to be fully loaded into memory, the two tables must be sorted by the join key (or have an index on it). In practice, both sides are hash-bucketed and each bucket is sorted internally. Then, when two corresponding buckets do a local join, a merge operation similar to merge sort can simply traverse both buckets sequentially; not even one whole bucket needs to be loaded into a hashtable, which is a great help for performance.

set hive.optimize.bucketmapjoin = true;

set hive.optimize.bucketmapjoin.sortedmerge = true;

set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

Bucket columns == Join columns == Sort columns

With these conditions met, the small table's data can be read one portion at a time and then matched row by row against the large table, so the join is no longer limited by memory size, and a full outer join can also be performed.
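A complete sketch with hypothetical tables, bucketed and sorted on the join key with equal bucket counts as the sort-merge bucket join requires:

```sql
-- Hypothetical tables bucketed AND sorted on the join key user_id.
CREATE TABLE orders (order_id BIGINT, user_id INT)
CLUSTERED BY (user_id) SORTED BY (user_id) INTO 32 BUCKETS;

CREATE TABLE users (user_id INT, user_name STRING)
CLUSTERED BY (user_id) SORTED BY (user_id) INTO 32 BUCKETS;

SET hive.optimize.bucketmapjoin = true;
SET hive.optimize.bucketmapjoin.sortedmerge = true;
SET hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

-- Corresponding sorted buckets are merge-joined by sequential
-- traversal, so neither side must fit in memory; a FULL OUTER
-- JOIN is also possible under this strategy.
SELECT o.order_id, u.user_name
FROM orders o
FULL OUTER JOIN users u ON o.user_id = u.user_id;
```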

Skew Join

The join is bottlenecked on the reducer that receives the skewed key.

set hive.optimize.skewjoin = true;

set hive.skewjoin.key = skew_key_threshold;
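As a concrete sketch (the threshold value and table names are illustrative; 100000 is the commonly cited default for hive.skewjoin.key):

```sql
-- Enable runtime skew-join handling: join keys whose row count
-- exceeds the threshold are set aside and processed in a
-- follow-up map-join job instead of overloading one reducer.
SET hive.optimize.skewjoin = true;
SET hive.skewjoin.key = 100000;

SELECT o.order_id, u.user_name
FROM orders o
JOIN users u ON o.user_id = u.user_id;
```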

