Hive locking (translated from the Hive wiki)

Use cases of the Hive concurrency model

Concurrency support (http://issues.apache.org/jira/browse/HIVE-1293) is a must for databases, and its use cases are well understood. At a minimum, Hive should support concurrent readers and writers wherever possible. It would also be useful to have a way to discover which locks are currently held. There is no immediate requirement for an API that acquires locks explicitly, so all locks are acquired implicitly.

Hive defines the following lock modes (note that no intention lock is needed):

    • Shared (S)
    • Exclusive (X)

As the names suggest, multiple shared locks can be held at the same time, whereas an exclusive lock blocks all other locks.

The compatibility matrix is as follows:

                  Existing lock
Requested lock    S        X
S                 True     False
X                 False    False
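
The matrix can be read as a simple lookup. The Python sketch below is only an illustration of that lookup; the names COMPATIBLE and can_grant are hypothetical and are not part of Hive.

```python
# Lock modes from the text: shared (S) and exclusive (X).
S, X = "S", "X"

# COMPATIBLE[(existing, requested)] mirrors the compatibility matrix.
COMPATIBLE = {
    (S, S): True,
    (S, X): False,
    (X, S): False,
    (X, X): False,
}


def can_grant(existing_modes, requested):
    """Return True if `requested` is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in existing_modes)
```

With no existing locks, any request is granted; a single existing X lock rejects every later request.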

For some operations, locks are hierarchical. For example, some partition operations also lock the table, so that the table cannot be dropped while a new partition is being created.

The rationale behind which lock mode is acquired is as follows:

For a non-partitioned table, the lock modes are intuitive. When the table is being read, an S lock is acquired. Every other operation (inserting into the table, altering the table in any way) acquires an X lock.

For a partitioned table, the idea is as follows:

When a partition is read, an S lock is acquired on the table and on the partition being read. For all other operations, an X lock is acquired on the partition. However, if the change applies only to newer partitions, an S lock is acquired on the table, whereas if the change applies to all partitions, an X lock is acquired on the table.

Thus, older partitions can still be read and written while newer partitions are being converted to RCFile.

Whenever a partition is locked in any mode, all of its parents are locked in S mode.
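
The parent rule can be sketched in a few lines of Python. This is an illustrative sketch, not Hive code: the function name locks_for_partition and the representation of a partition as a list of levels are assumptions made for the example.

```python
def locks_for_partition(table, partition_path, mode):
    """Return the locks needed to lock one partition in `mode` ("S" or "X").

    `partition_path` lists the partition levels, e.g. ["P", "Q"] for T.P.Q.
    Every parent (the table and each partition prefix) is locked in S mode.
    """
    names = [table]
    for level in partition_path:
        names.append(names[-1] + "." + level)

    locks = {name: "S" for name in names[:-1]}  # all parents get S
    locks[names[-1]] = mode                     # the object itself gets `mode`
    return locks
```

For example, locks_for_partition("T2", ["P", "Q"], "X") yields S on T2, S on T2.P, and X on T2.P.Q.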

Based on this, the locks acquired by each operation are as follows:

Hive command                                                Locks acquired
select ... T1 partition P1                                  S on T1, T1.P1
insert into T2 (partition P2) select ... T1 partition P1    S on T2, T1, T1.P1 and X on T2.P2
insert into T2 (partition P.Q) select ... T1 partition P1   S on T2, T2.P, T1, T1.P1 and X on T2.P.Q
alter table T1 rename T2                                    X on T1
alter table T1 add cols                                     X on T1
alter table T1 replace cols                                 X on T1
alter table T1 change cols                                  X on T1
alter table T1 add partition P1                             S on T1, X on T1.P1
alter table T1 drop partition P1                            S on T1, X on T1.P1
alter table T1 touch partition P1                           S on T1, X on T1.P1
alter table T1 set serdeproperties                          S on T1
alter table T1 set serializer                               S on T1
alter table T1 set file format                              S on T1
alter table T1 set tblproperties                            X on T1
drop table T1                                               X on T1

To avoid deadlocks, a very simple scheme is proposed. All objects to be locked are sorted lexicographically, and each is then locked in the required mode. Note that in some cases the list of objects may not be known. For example, with dynamic partitions, the list of partitions being modified is not known at compile time. The list is therefore generated conservatively: because the set of partitions may not be known, an exclusive lock is acquired on the table, or on the prefix that is known.
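
As a minimal sketch of the ordering step, assuming lock requests are held as a mapping from object name to mode (the function name acquisition_order is hypothetical), the sort could look like this in Python. Note that Python compares strings by code point, which may differ from the exact ordering Hive uses.

```python
def acquisition_order(requests):
    """Return (object, mode) pairs sorted lexicographically by object name.

    `requests` maps an object name such as "T1" or "T1.P1" to "S" or "X".
    """
    return sorted(requests.items(), key=lambda item: item[0])


# Locks for: insert into T2 (partition P2) select ... T1 partition P1
requests = {"T2.P2": "X", "T1": "S", "T2": "S", "T1.P1": "S"}
for name, mode in acquisition_order(requests):
    print(mode, "on", name)
```

The loop visits T1, T1.P1, T2, and T2.P2 in that order. Because every query requests its locks in the same global order, two queries cannot each hold a lock the other is about to request.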

Two configurable parameters are added to set the number of lock retries and the wait time between retries. If the number of retries is very high, it can lead to a livelock. See the ZooKeeper recipes (http://hadoop.apache.org/zookeeper/docs/r3.1.2/recipes.html#sc_recipes_Locks) for how read/write locks can be implemented with the ZooKeeper APIs. Note that instead of waiting, a lock request that cannot be granted is denied. The locks already held are released, and all of them are retried after the retry interval.
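
The deny, release, and retry behavior can be sketched as follows. This Python example is an assumption-laden illustration: acquire_all, try_lock, release, retries, and wait_seconds are hypothetical names, and the two callbacks stand in for whatever lock manager is actually used.

```python
import time


def acquire_all(requests, try_lock, release, retries=3, wait_seconds=1.0,
                sleep=time.sleep):
    """Try to take every lock; on a denial, release held locks and retry later.

    requests: list of (object_name, mode) pairs, already in acquisition order.
    try_lock(name, mode) -> bool and release(name, mode) are caller-supplied.
    Returns True if all locks were granted within `retries` attempts.
    """
    for attempt in range(retries):
        held = []
        for name, mode in requests:
            if not try_lock(name, mode):
                # Denied: never wait while holding locks, give them all back.
                for held_name, held_mode in reversed(held):
                    release(held_name, held_mode)
                break
            held.append((name, mode))
        else:
            return True  # every request was granted
        if attempt < retries - 1:
            sleep(wait_seconds)  # wait before retrying the whole set
    return False
```

Bounding the number of attempts is what keeps a contended query from retrying forever.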

Because locks are hierarchical, the recipe referenced above does not work as specified. The rules used instead are described below.

The S lock for table T is acquired as follows:

1. Call create() to create a node with the path name "/warehouse/T/read-". This is the lock node used later in the protocol. Set the sequence and ephemeral flags.

2. Call getChildren() on the lock node without setting the watch flag.

3. If there is a child whose path name starts with "write-" and whose sequence number is lower than the one obtained in step 1, the lock cannot be acquired. Delete the node created in step 1 and return.

4. Otherwise, the lock is granted.

The X lock for table T is acquired as follows:

1. Call create() to create a node with the path name "/warehouse/T/write-". This is the lock node used later in the protocol. Set the sequence and ephemeral flags.

2. Call getChildren() on the lock node without setting the watch flag.

3. If there is a child whose path name starts with "read-" or "write-" and whose sequence number is lower than the one obtained in step 1, the lock cannot be acquired. Delete the node created in step 1 and return.

4. Otherwise, the lock is granted.
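
To make the two sets of rules concrete, here is a small in-memory simulation in Python. It does not talk to ZooKeeper: the LockNode class is a hypothetical stand-in for a node such as /warehouse/T, its create() only imitates the sequence flag, and ephemeral-node cleanup is not modeled.

```python
import itertools


class LockNode:
    """In-memory stand-in for one ZooKeeper node such as /warehouse/T."""

    def __init__(self):
        self._seq = itertools.count()
        self.children = {}  # child name -> sequence number

    def create(self, prefix):
        # Imitates create() with the sequence flag: appends a growing counter.
        seq = next(self._seq)
        name = f"{prefix}{seq:010d}"
        self.children[name] = seq
        return name

    def delete(self, name):
        del self.children[name]


def try_lock(node, mode):
    """Return the created child name if the lock is granted, else None."""
    prefix = "read-" if mode == "S" else "write-"
    # S is blocked only by earlier writers; X is blocked by any earlier child.
    blockers = ("write-",) if mode == "S" else ("read-", "write-")

    mine = node.create(prefix)                        # step 1
    my_seq = node.children[mine]
    for child, seq in list(node.children.items()):    # step 2
        if seq < my_seq and child.startswith(blockers):
            node.delete(mine)                         # step 3: denied
            return None
    return mine                                       # step 4: granted
```

In this sketch, two S requests on the same node both succeed, a following X request is denied and leaves no child behind, and once the readers delete their children a new X request is granted.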

The proposed scheme favors readers at the expense of writers: long-running reads can starve writers.

The default Hive behavior will not change, and concurrency will not be supported by default.

Disable concurrency

To disable concurrency, set the following variable to false: hive.support.concurrency.

Debugging

To view the locks on a table, run the following commands:

    SHOW LOCKS <table_name>;
    SHOW LOCKS <table_name> EXTENDED;
    SHOW LOCKS <table_name> PARTITION (<partition_desc>);
    SHOW LOCKS <table_name> PARTITION (<partition_desc>) EXTENDED;

Translated from https://cwiki.apache.org/confluence/display/Hive/Locking
