[Hive-languagemanual] Hive Concurrency Model (pending)

Source: Internet
Author: User

Hive Concurrency Model

    • Hive Concurrency Model
      • Use Cases
      • Turn Off Concurrency
      • Debugging
      • Configuration
    • Locking in Hive transactions

Use Cases

Concurrency support (http://issues.apache.org/jira/browse/HIVE-1293) is a must in databases, and the use cases are well understood. At a minimum, we want to support concurrent readers and writers whenever possible. It would be useful to add a mechanism to discover the locks that have currently been acquired. There is no immediate requirement to add an API to explicitly acquire locks, so all locks will be acquired implicitly.

The following lock modes will be defined in Hive (note that an Intent lock is not needed).

    • Shared (S)
    • Exclusive (X)

As the names suggest, multiple shared locks can be acquired at the same time, whereas an X lock blocks all other locks.

The compatibility matrix is as follows:

    Lock compatibility    Existing lock: S    Existing lock: X
    Requested lock: S     True                False
    Requested lock: X     False               False
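
The compatibility matrix can be read as a lookup keyed by the requested mode and the existing mode. The following is a minimal sketch of that lookup, not Hive's implementation; the names COMPATIBLE and can_grant are hypothetical and chosen only for illustration.

```python
# Lock modes from the text: Shared (S) and Exclusive (X).
# COMPATIBLE[requested][existing] mirrors the compatibility matrix.
COMPATIBLE = {
    "S": {"S": True, "X": False},
    "X": {"S": False, "X": False},
}


def can_grant(requested, existing_locks):
    """Return True if `requested` is compatible with every lock already held.

    `existing_locks` is a list of modes ("S" or "X") held on the same object.
    (Hypothetical helper, for illustration only.)
    """
    return all(COMPATIBLE[requested][held] for held in existing_locks)
```

For example, can_grant("S", ["S", "S"]) is true because shared locks coexist, while any request against an object that already carries an X lock is refused.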

For some operations, locks are hierarchical in nature. For example, for some partition operations, the table is also locked (to make sure that the table cannot be dropped while a new partition is being created).

The rationale behind the lock mode to acquire is as follows:

For a non-partitioned table, the lock modes are pretty intuitive. When the table is being read, an S lock is acquired, whereas an X lock is acquired for all other operations (INSERT INTO the table, ALTER TABLE of any kind, etc.).

For a partitioned table, the idea is as follows:

An 'S' lock on the table and the relevant partition is acquired when a read is being performed. For all other operations, an 'X' lock is taken on the partition. However, if the change is only applicable to the newer partitions, an 'S' lock is acquired on the table, whereas if the change is applicable to all partitions, an 'X' lock is acquired on the table. Thus, older partitions can be read and written into while the newer partitions are being converted to RCFile. Whenever a partition is being locked in any mode, all of its parents are locked in 'S' mode.

Based on this, the locks acquired for an operation are as follows:

    Hive Command                                                Locks Acquired
    SELECT .. T1 PARTITION P1                                   S on T1, T1.P1
    INSERT INTO T2 (PARTITION P2) SELECT .. T1 PARTITION P1     S on T2, T1, T1.P1 and X on T2.P2
    INSERT INTO T2 (PARTITION P.Q) SELECT .. T1 PARTITION P1    S on T2, T2.P, T1, T1.P1 and X on T2.P.Q
    ALTER TABLE T1 RENAME T2                                    X on T1
    ALTER TABLE T1 ADD COLS                                     X on T1
    ALTER TABLE T1 REPLACE COLS                                 X on T1
    ALTER TABLE T1 CHANGE COLS                                  X on T1
    ALTER TABLE T1 ADD PARTITION P1                             S on T1, X on T1.P1
    ALTER TABLE T1 DROP PARTITION P1                            S on T1, X on T1.P1
    ALTER TABLE T1 TOUCH PARTITION P1                           S on T1, X on T1.P1
    ALTER TABLE T1 SET SERDEPROPERTIES                          S on T1
    ALTER TABLE T1 SET SERIALIZER                               S on T1
    ALTER TABLE T1 SET FILE FORMAT                              S on T1
    ALTER TABLE T1 SET TBLPROPERTIES                            X on T1
    DROP TABLE T1                                               X on T1
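
The parent rule (every parent of a locked object is locked in 'S' mode) can be sketched as a small function. This is an illustration under stated assumptions, not Hive's code: the function name hierarchical_locks and the tuple representation of object paths are hypothetical, and the sketch covers only the parent rule, not the per-command choice between S and X on the table itself.

```python
def hierarchical_locks(reads, writes):
    """Compute the lock set for one statement (hypothetical helper).

    `reads` and `writes` are lists of object paths, where a path is a tuple
    such as ("T1",) for a table or ("T2", "P", "Q") for a nested partition.
    Returns a dict mapping dotted object names to "S" or "X".
    """
    locks = {}

    def request(path, mode):
        name = ".".join(path)
        # An X request overrides an S request on the same object.
        if mode == "X" or name not in locks:
            locks[name] = mode

    for path in reads + writes:
        # Every parent of a locked object is locked in S mode.
        for depth in range(1, len(path)):
            request(path[:depth], "S")
    for path in reads:
        request(path, "S")
    for path in writes:
        request(path, "X")
    return locks
```

Calling hierarchical_locks([("T1", "P1")], [("T2", "P", "Q")]) yields S on T1, T1.P1, T2 and T2.P, and X on T2.P.Q, which agrees with the insert into a nested partition listed among the commands.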

In order to avoid deadlocks, a very simple scheme is proposed here. All the objects to be locked are sorted lexicographically, and then the lock is acquired in the required mode. Note that in some cases, the list of objects may not be known. For example, in the case of dynamic partitions, the list of partitions being modified is not known at compile time, so the list is generated conservatively. Since the number of partitions may not be known, an exclusive lock is taken on the table, or on the prefix that is known.
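
A minimal sketch of the ordering step, assuming lock requests are kept as a mapping from dotted object names to modes (the function name acquisition_order is hypothetical):

```python
def acquisition_order(requests):
    """Return (object_name, mode) pairs sorted lexicographically by name.

    `requests` maps an object name such as "T1.P1" to "S" or "X".
    Because every statement sorts its objects the same way, two statements
    that need the same objects always ask for them in the same order.
    """
    return sorted(requests.items())
```

With dotted names, a table sorts before its own partitions because the table name is a prefix of the partition name, so a parent is always requested before its children.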

New configurable parameters will be added to decide the number of retries for the lock and the wait time between retries. If the number of retries is really high, it can lead to a livelock. Look at the ZooKeeper recipes (http://hadoop.apache.org/zookeeper/docs/r3.1.2/recipes.html#sc_recipes_Locks) to see how read/write locks can be implemented using the ZooKeeper APIs. Note that instead of waiting, the lock request will be denied. The existing locks will be released, and all of them will be retried after the retry interval.
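
The deny, release and retry behavior can be sketched as follows. This is an assumption-laden illustration rather than Hive's lock manager: try_acquire_all, try_lock, unlock, retries and wait_seconds are all hypothetical names, and the two callbacks stand in for whatever lock service is actually used.

```python
import time


def try_acquire_all(requests, try_lock, unlock, retries=3, wait_seconds=0.0):
    """Try to take every lock in `requests` without blocking on any of them.

    `requests` is a list of (object_name, mode) pairs.
    `try_lock(name, mode)` must return True or False immediately.
    `unlock(name, mode)` releases a lock taken earlier.
    Returns True once all locks are held, False when retries run out.
    """
    for attempt in range(retries + 1):
        acquired = []
        for name, mode in sorted(requests):
            if try_lock(name, mode):
                acquired.append((name, mode))
            else:
                # Denied: release everything taken in this attempt.
                for held in reversed(acquired):
                    unlock(*held)
                acquired = None
                break
        if acquired is not None:
            return True
        if attempt < retries:
            time.sleep(wait_seconds)  # wait before retrying all locks
    return False
```

Because a denied statement gives back every lock it holds before sleeping, it never waits while holding a lock; the cost is that a large retry count can keep statements cycling without progress.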

The recipe listed above won't work as specified, because of the hierarchical nature of locks.

The 'S' lock for table T is specified as follows:

    • Call create() to create a node with pathname "/warehouse/t/read-". This is the lock node used later in the protocol. Make sure to set the sequence and ephemeral flags.
    • Call getChildren() on the lock node without setting the watch flag.
    • If there is a child with a pathname starting with "write-" and a lower sequence number than the one obtained, the lock cannot be acquired. Delete the node created in the first step and return.
    • Otherwise the lock is granted.

The 'X' lock for table T is specified as follows:

    • Call create() to create a node with pathname "/warehouse/t/write-". This is the lock node used later in the protocol. Make sure to set the sequence and ephemeral flags.
    • Call getChildren() on the lock node without setting the watch flag.
    • If there is a child with a pathname starting with "read-" or "write-" and a lower sequence number than the one obtained, the lock cannot be acquired. Delete the node created in the first step and return.
    • Otherwise the lock is granted.
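
The two protocols can be traced with a small in-memory simulation. This sketch does not talk to ZooKeeper: FakeLockDir is a hypothetical stand-in for the children of a single directory such as "/warehouse/t", and it models only sequential child names, listing and deletion. The ephemeral flag is not modeled.

```python
import itertools


class FakeLockDir:
    """In-memory stand-in for one lock directory (not a ZooKeeper client)."""

    def __init__(self):
        self._seq = itertools.count()
        self.children = []

    def create(self, prefix):
        # Mimics a sequential node: a growing, zero-padded suffix is appended.
        name = "%s%010d" % (prefix, next(self._seq))
        self.children.append(name)
        return name

    def delete(self, name):
        self.children.remove(name)


def sequence_of(name):
    """Extract the sequence number from a name such as "read-0000000003"."""
    return int(name.rsplit("-", 1)[1])


def try_lock(lock_dir, mode):
    """Return the lock node name if the lock is granted, else None."""
    prefix = "read-" if mode == "S" else "write-"
    # S is blocked only by earlier writers; X is blocked by anything earlier.
    blockers = ("write-",) if mode == "S" else ("read-", "write-")
    node = lock_dir.create(prefix)
    for child in list(lock_dir.children):  # one listing, no watch
        if child.startswith(blockers) and sequence_of(child) < sequence_of(node):
            lock_dir.delete(node)  # denied: remove our own node and return
            return None
    return node
```

In this sketch, two 'S' requests on an empty directory both succeed, and a following 'X' request is denied and leaves no node behind. An 'X' request keeps being denied for as long as any earlier read node remains.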

The proposed scheme favors readers over writers. With long-running readers, it may lead to starvation of writers.

The default Hive behavior will not be changed, and concurrency will not be supported.

Turn Off Concurrency

You can turn off concurrency by setting the following variable to false: hive.support.concurrency.

Debugging

You can see the locks on a table by issuing the following commands:

    • SHOW LOCKS <TABLE_NAME>;
    • SHOW LOCKS <TABLE_NAME> EXTENDED;
    • SHOW LOCKS <TABLE_NAME> PARTITION (<PARTITION_DESC>);
    • SHOW LOCKS <TABLE_NAME> PARTITION (<PARTITION_DESC>) EXTENDED;

Configuration

Configuration properties for Hive locking are described in Locking.

Locking in Hive transactions

Hive 0.13.0 adds transactions with row-level ACID semantics, using a new lock manager. For more information, see:

    • ACID and transactions in Hive
    • Lock Manager
