JOINs are essential operations in relational databases. They create a link between rows based on common values and allow the meaningful combination of these rows. CrateDB supports joins and, due to its distributed nature, allows you to work with large amounts of data.
In this document we present the following topics: first, an overview of the existing types of joins and the algorithms provided; then, a description of how CrateDB implements them, along with the optimizations that allow it to work with huge datasets.
Types of Join
A join is a relational operation that merges two data sets based on certain properties. The figure Join Types (inspired by this article) shows which elements appear in which join.
Join Types
From left to right, top to bottom: left join, right join, inner join, outer join, and cross join of a set L and R.
Cross Join
A cross join returns the Cartesian product of two or more relations. The result of the Cartesian product on the relations L and R consists of all possible combinations of each tuple of the relation L with every tuple of the relation R.
Inner Join
An inner join is a join of two or more relations that returns only the tuples that satisfy the join condition.
Equi Join
An equi join is a subset of an inner join and a comparison-based join that uses equality comparisons in the join condition. The equi join of the relations L and R combines a tuple l of the relation L with a tuple r of the relation R if the join attributes of both tuples are identical.
Outer Join
An outer join returns a relation consisting of the tuples that satisfy the join condition and the dangling tuples from one or both of the relations, depending on the outer join type.
An outer join has the following types:
- Left outer join returns the tuples of the relation L matching tuples of the relation R, and the dangling tuples of the relation L with the attributes of R padded with null values.
- Right outer join returns the tuples of the relation R matching tuples of the relation L, and the dangling tuples of the relation R with the attributes of L padded with null values.
- Full outer join returns the matching tuples of both relations and the dangling tuples produced by the left and right outer joins.
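The join types described above can be illustrated with a small Python sketch. The relations `L` and `R`, their contents, and all function names are hypothetical and chosen only for illustration; this shows the semantics of each join type, not how CrateDB executes them.

```python
from itertools import product

# Hypothetical relations: lists of (id, value) tuples.
L = [(1, "a"), (2, "b"), (3, "c")]
R = [(2, "x"), (3, "y"), (4, "z")]


def cross_join(left, right):
    """Cartesian product: every left tuple paired with every right tuple."""
    return [(l, r) for l, r in product(left, right)]


def inner_join(left, right, condition):
    """Only the pairs that satisfy the join condition."""
    return [(l, r) for l, r in product(left, right) if condition(l, r)]


def left_outer_join(left, right, condition):
    """Matching pairs plus dangling left tuples padded with None."""
    result = []
    for l in left:
        matches = [(l, r) for r in right if condition(l, r)]
        result.extend(matches if matches else [(l, None)])
    return result


def right_outer_join(left, right, condition):
    """Mirror image of the left outer join."""
    swapped = left_outer_join(right, left, lambda r, l: condition(l, r))
    return [(l, r) for r, l in swapped]


def full_outer_join(left, right, condition):
    """Matching pairs plus dangling tuples from both sides."""
    result = left_outer_join(left, right, condition)
    matched_right = {r for _, r in result if r is not None}
    result.extend((None, r) for r in right if r not in matched_right)
    return result


def same_id(l, r):
    """Equi join condition on the first attribute."""
    return l[0] == r[0]
```

Calling `inner_join(L, R, same_id)` with an equality condition is an equi join; `None` plays the role of the null padding for dangling tuples.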
Joins in CrateDB
CrateDB supports (a) cross joins, (b) inner joins, (c) equi joins, (d) left joins, (e) right joins and (f) full joins. All of these join types are executed using the nested loop join algorithm, except for equi joins, which are executed using the hash join algorithm. Special optimizations, according to the specific use case, are applied to improve execution performance.
Nested Loop Join
The nested loop join is the simplest join algorithm. One of the relations is nominated as the inner relation and the other as the outer relation. Each tuple of the outer relation is compared with each tuple of the inner relation and, if the join condition is satisfied, the tuples of the relations L and R are concatenated and added to the returned virtual relation:
for each tuple l ∈ L do
    for each tuple r ∈ R do
        if l.a Θ r.b
            put tuple(l, r) in Q
Listing 1. Nested loop join algorithm.
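The pseudocode in Listing 1 can be turned into a short runnable sketch. The function name `nested_loop_join`, the sample relations, and the tuple layout are assumptions made for this example only.

```python
def nested_loop_join(outer, inner, theta):
    """Compare every outer tuple with every inner tuple (O(n*m) comparisons)."""
    q = []
    for l in outer:
        for r in inner:
            if theta(l, r):
                q.append(l + r)  # concatenate the two tuples
    return q


# Hypothetical relations L(a, name) and R(b, city).
L = [(1, "ann"), (2, "bob")]
R = [(2, "oslo"), (1, "rome"), (2, "bern")]

rows = nested_loop_join(L, R, lambda l, r: l[0] == r[0])
```

Because `theta` is an arbitrary predicate, the same loop also handles non-equality conditions such as `l[0] < r[0]`, which is why the nested loop serves as the general fallback.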
Primitive Nested Loop
For joins on some relations, the nested loop operation can be executed directly on the handler node. Specifically, for queries involving a CROSS JOIN or joins on system tables / information_schema, each shard sends its data to the handler node. Afterwards, this node runs the nested loop, applies limits, etc., and ultimately returns the results. Similarly, joins can be nested, so instead of being collected from shards, the rows can be the result of a previous join or table function.
Distributed Nested Loop
Relations are usually distributed across different nodes, which requires the nested loop to acquire the data before being able to join. After finding the locations of the required shards (which is done in the planning stage), the smaller data set (based on the row count) is broadcast amongst all the nodes holding the shards it is joined with. After this, each of the receiving nodes can start running a nested loop on the subset it has just received. Finally, these intermediate results are pushed to the original (handler) node, which merges them and returns the results to the requesting client (see the figure below).
Nodes that are holding the smaller shards broadcast the data to the processing nodes, which then return the results to the requesting node.
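The broadcast step can be simulated in a single process. In the sketch below, a dictionary mapping a node name to its rows stands in for the shards of a relation; the function name, the node names, and this data layout are assumptions for illustration and do not reflect CrateDB internals.

```python
def distributed_nested_loop(left_shards, right_shards, theta):
    """left_shards / right_shards map a node name to the rows it holds."""
    left_rows = sum(len(rows) for rows in left_shards.values())
    right_rows = sum(len(rows) for rows in right_shards.values())
    broadcast_left = left_rows <= right_rows

    if broadcast_left:
        small, large = left_shards, right_shards
    else:
        small, large = right_shards, left_shards

    # Broadcast: every node holding the larger relation receives all rows
    # of the smaller relation.
    broadcast = [row for rows in small.values() for row in rows]

    merged = []  # intermediate results collected on the handler node
    for node, local_rows in large.items():
        # Each node runs a nested loop on its local subset.
        for big in local_rows:
            for sm in broadcast:
                l, r = (sm, big) if broadcast_left else (big, sm)
                if theta(l, r):
                    merged.append((l, r))
    return merged


left = {"n1": [(1,), (2,)], "n2": [(3,)]}
right = {"n1": [(2,)], "n3": [(3,)]}
result = distributed_nested_loop(left, right, lambda l, r: l[0] == r[0])
```

In a real cluster the per-node loops run in parallel; here they run one after another, which changes the timing but not the result.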
Pre-Ordering and Limits Optimization
Queries can be optimized if they contain (a) ORDER BY, (b) LIMIT, or (c) an INNER/EQUI JOIN. In any of these cases, the nested loop can be terminated earlier:
- Ordering allows determining whether there are records left
- Limit states the maximum number of rows that are returned
Consequently, the number of rows is significantly reduced, allowing the operation to complete much faster.
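A minimal sketch of the early-termination idea, assuming both inputs are already sorted so that the first matches produced are also the first rows of the final result. The generator `nested_loop` and the `stats` counter are hypothetical helpers used only to make the saved work visible.

```python
from itertools import islice


def nested_loop(outer, inner, theta, stats):
    """Lazily yield matching pairs, counting every comparison made."""
    for l in outer:
        for r in inner:
            stats["comparisons"] += 1
            if theta(l, r):
                yield (l, r)


outer = list(range(1000))  # assumed to be pre-ordered
inner = list(range(1000))
stats = {"comparisons": 0}

# LIMIT 3: stop pulling rows from the loop once three matches are found.
first_three = list(islice(nested_loop(outer, inner, lambda l, r: l == r, stats), 3))
```

Since the loop is a generator, it stops as soon as the limit is reached, so only a few thousand of the one million possible comparisons are performed.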
Hash Join
The Hash Join algorithm is used to execute certain types of joins in a more performant way than the Nested Loop.
Basic algorithm
The operation takes place on one node (the handler node to which the client is connected). The rows of the left relation of the join are read, and a hashing algorithm is applied to the fields of the relation that participate in the join condition. The hashing algorithm generates a hash value, which is used to store every row of the left relation in the proper position in a hash table.
Then the rows of the right relation are read one by one, and the same hashing algorithm is applied to the fields that participate in the join condition. The generated hash value is used to make a lookup in the hash table. If no entry is found, the row is skipped and processing continues with the next row from the right relation. If an entry is found, the join condition is validated (handling hash collisions) and, on successful validation, the combined tuple of the left and right relation is returned.
Basic Hash Join algorithm
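The build and probe phases described above can be sketched as follows. The function name, the key-extractor parameters, and the use of Python's built-in `hash` are assumptions for this example; the hashing algorithm CrateDB uses is not specified here.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Build a hash table from the left relation, then probe it with the right."""
    # Build phase: store every left row under the hash of its join fields.
    table = defaultdict(list)
    for l in left:
        table[hash(left_key(l))].append(l)

    # Probe phase: look up each right row by the same hash.
    result = []
    for r in right:
        bucket = table.get(hash(right_key(r)))
        if bucket is None:
            continue  # no entry found, skip this row
        for l in bucket:
            # Validate the join condition; equal hashes may be a collision.
            if left_key(l) == right_key(r):
                result.append((l, r))
    return result


left = [(1, "a"), (2, "b"), (2, "c")]
right = [(2, "x"), (3, "y")]
rows = hash_join(left, right, lambda l: l[0], lambda r: r[0])
```

The explicit equality check in the probe phase is what keeps two different keys with the same hash value from being joined by mistake.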
Block Hash Join
The Hash Join algorithm requires a hash table containing all the rows of the left relation to be stored in memory. Therefore, depending on the size of the relation (number of rows) and the size of each row, the size of this hash table might exceed the available memory of the node executing the hash join. To resolve this limitation, the rows of the left relation are loaded into the hash table in blocks.
On every iteration, the maximum available size of the hash table is calculated based on the number of rows and the size of each row of the table, while also taking into account the memory available for query execution on the node. Once this block size is calculated, the rows of the left relation are processed and inserted into the hash table until the block size is reached. The operation then starts reading the rows of the right relation, processes them one by one, and performs the lookup and the join condition validation. Once all rows from the right relation are processed, the hash table is re-initialized based on a new calculation of the block size, and a new iteration starts, until all rows of the left relation are processed.
With this algorithm, the memory limitation is handled at the expense of having to iterate over the rows of the right table multiple times. It is the default algorithm used for hash join execution by CrateDB.
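The block-wise iteration can be sketched as below. For simplicity this example assumes a fixed, caller-supplied `block_size` instead of recalculating it from available memory on every iteration, and it uses a Python dict as the hash table; both are simplifications, and the names are hypothetical.

```python
def block_hash_join(left, right, left_key, right_key, block_size):
    """Hash join that loads the left relation into the hash table in blocks."""
    result = []
    right_passes = 0
    for start in range(0, len(left), block_size):
        # Re-initialize the hash table with the next block of left rows.
        table = {}
        for l in left[start:start + block_size]:
            table.setdefault(left_key(l), []).append(l)

        # Scan the whole right relation once for this block.
        right_passes += 1
        for r in right:
            for l in table.get(right_key(r), []):
                result.append((l, r))
    return result, right_passes


left = [(i,) for i in range(10)]
right = [(3,), (7,), (11,)]
rows, passes = block_hash_join(left, right, lambda l: l[0], lambda r: r[0], 4)
```

With ten left rows and a block size of four, the right relation is scanned three times, which is the cost paid for bounding the memory used by the hash table.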
Switch Tables Optimization
Since the right table can be processed multiple times (number of rows from left / block size), the right table should be the smaller (in number of rows) of the two relations participating in the join. Therefore, if the right relation is originally larger than the left, the query planner performs a switch to take advantage of this detail and execute the hash join with better performance.
Distributed Block Hash Join
Since CrateDB is a distributed database and a standard deployment consists of at least three nodes or more, the execution of the Hash Join algorithm can be further optimized (performance-wise) by running it in a distributed manner across the CrateDB cluster.
The idea is to have the hash join operation execute on multiple nodes of the cluster in parallel and then merge the intermediate results before returning them to the client.
A hashing algorithm is applied to every row of both the left and right relations. A modulo by the number of nodes in the cluster is applied to the integer value generated by this hash, and the resulting number defines the node to which this row should be sent. As a result, each node of the cluster receives a subset of the whole data set, which is ensured (by the hashing and modulo) to contain all candidate matching rows. Each node in turn performs a Block Hash Join on this subset and sends its result tuples to the handler node (where the client issued the query). Finally, the handler node receives those intermediate results, merges them, applies any pending ORDER BY, LIMIT and OFFSET, and sends the final result to the client.
This algorithm is used by CrateDB for most cases of hash join execution, except for joins on complex subqueries that contain LIMIT and/or OFFSET.
Distributed Hash Join algorithm
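The hash-and-modulo routing can be simulated in one process. In this sketch, list indices stand in for cluster nodes, Python's built-in `hash` stands in for the hashing algorithm, and each node runs a plain in-memory hash join rather than a Block Hash Join; all names are hypothetical.

```python
def partition(rows, key, num_nodes):
    """Route each row to the node given by hash(join key) modulo node count."""
    parts = [[] for _ in range(num_nodes)]
    for row in rows:
        parts[hash(key(row)) % num_nodes].append(row)
    return parts


def local_hash_join(left, right, left_key, right_key):
    """Plain in-memory hash join executed by a single node."""
    table = {}
    for l in left:
        table.setdefault(left_key(l), []).append(l)
    return [(l, r) for r in right for l in table.get(right_key(r), [])]


def distributed_hash_join(left, right, left_key, right_key, num_nodes):
    left_parts = partition(left, left_key, num_nodes)
    right_parts = partition(right, right_key, num_nodes)
    merged = []  # the handler node merges the intermediate results
    for node in range(num_nodes):
        merged.extend(
            local_hash_join(left_parts[node], right_parts[node], left_key, right_key)
        )
    return merged


left = [(i, f"l{i}") for i in range(20)]
right = [(i, f"r{i}") for i in range(10, 30)]
rows = distributed_hash_join(left, right, lambda l: l[0], lambda r: r[0], 3)
```

Because rows with equal join keys always hash to the same node, no node needs data held by another node to find its matches.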
Optimizations
Query Then Fetch
Join operations on large relations can be extremely slow, especially if the join is executed with a Nested Loop, which means that the runtime complexity grows quadratically (O(n*m)). Specifically for cross joins, this results in large amounts of data being sent over the network and loaded into memory at the handler node. CrateDB reduces the volume of data transferred by employing Query Then Fetch: first, filtering and ordering are applied (if possible, where the data is located) to obtain the required document IDs. Next, as soon as the final data set is ready, CrateDB fetches the selected fields and returns the data to the client.
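The two phases can be sketched on a single relation. The shard layout (a dict of document ID to document per shard), the function names, and the field names are assumptions made for this example and do not describe CrateDB's actual data structures.

```python
def query_phase(shards, predicate, sort_key, limit):
    """Filter and order where the data lives; return only small references."""
    candidates = []
    for shard_name, docs in shards.items():
        local = [
            (doc[sort_key], shard_name, doc_id)
            for doc_id, doc in docs.items()
            if predicate(doc)
        ]
        local.sort()
        candidates.extend(local[:limit])  # a shard sends at most `limit` IDs
    candidates.sort()
    return candidates[:limit]


def fetch_phase(shards, hits, fields):
    """Fetch the selected fields only for the documents in the final set."""
    return [
        {field: shards[shard][doc_id][field] for field in fields}
        for _, shard, doc_id in hits
    ]


shards = {
    "s0": {1: {"a": 5, "name": "e"}, 2: {"a": 1, "name": "a"}},
    "s1": {3: {"a": 3, "name": "c"}, 4: {"a": 9, "name": "z"}},
}
hits = query_phase(shards, lambda doc: doc["a"] < 9, "a", 2)
rows = fetch_phase(shards, hits, ["name"])
```

Only the sort value and the document reference travel in the first phase; the full field values are read once, and only for the rows that made it into the final result.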
Push-down Query Optimization
Complex queries such as Listing 2 require the planner to decide when to filter, sort, and merge in order to execute the plan efficiently. In this case, the query is split internally into subqueries before running the join. As shown in Figure 5, filtering (and ordering) is first applied to the relations L and R on their shards, and then the result is broadcast directly to the nodes running the join. Not only does this behavior reduce the number of rows to work with, it also distributes the workload among the nodes so that the (expensive) join operation can run faster.
SELECT L.a, R.x
FROM L, R
WHERE L.ID = R.ID
  AND L.b > -
  AND R.y < 10
ORDER BY L.a
Listing 2. An INNER JOIN on IDs (effectively an EQUI JOIN) which can be optimized.
Figure 5
Complex queries are broken down into subqueries that are run on their shards before joining.
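A minimal sketch of the push-down idea for a query shaped like Listing 2: each relation is filtered on its own shards before any row reaches the join. The shard lists, the row dictionaries, and the filter thresholds used here are hypothetical values chosen for the example.

```python
def pushed_down_join(l_shards, r_shards, l_filter, r_filter):
    """Filter each relation per shard, then equi join on "id" and sort by L.a."""
    # Subqueries: filtering runs per shard, before rows are shipped anywhere.
    l_rows = [row for shard in l_shards for row in shard if l_filter(row)]
    r_rows = [row for shard in r_shards for row in shard if r_filter(row)]

    # Join only the rows that survived the filters.
    by_id = {}
    for r in r_rows:
        by_id.setdefault(r["id"], []).append(r)
    joined = [(l["a"], r["x"]) for l in l_rows for r in by_id.get(l["id"], [])]

    return sorted(joined, key=lambda pair: pair[0]), len(l_rows), len(r_rows)


l_shards = [
    [{"id": 1, "a": "k", "b": 60}, {"id": 2, "a": "j", "b": 10}],
    [{"id": 3, "a": "i", "b": 70}],
]
r_shards = [
    [{"id": 1, "x": "p", "y": 5}, {"id": 3, "x": "q", "y": 50}],
    [{"id": 2, "x": "r", "y": 1}],
]
# Hypothetical thresholds standing in for the filter constants of the query.
rows, l_count, r_count = pushed_down_join(
    l_shards, r_shards, lambda row: row["b"] > 50, lambda row: row["y"] < 10
)
```

In this example each relation shrinks from three rows to two before the join, so the join sees four candidate pairs instead of nine.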
CrateDB joins principle (official document)