HBase provides row-level atomicity guarantees for data operations.
That is, a mutation on a single row (whether it touches one column, multiple columns, or multiple column families) either succeeds completely or fails completely; no intermediate state is ever visible.
Example:
Client A writes to the row with rowkey=10: dim1:a=1, dim2:b=1
Client B writes to the row with rowkey=10: dim1:a=2, dim2:b=2
Here dim1 and dim2 are column families, and a and b are columns.
If client A and client B issue their requests at the same time, the row with rowkey=10 may end up as dim1:a=1, dim2:b=1, or as dim1:a=2, dim2:b=2,
but it will never be a mix such as dim1:a=1, dim2:b=2.
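This guarantee can be illustrated with a small, self-contained simulation (this is not HBase code; the class, column names, and values below are made up for illustration): two writer threads each update both columns of the same row while holding a single per-row lock, so a reader only ever observes one of the two complete states.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RowAtomicityDemo {
    // One lock guards the whole row, so a multi-column update is atomic.
    static final Object rowLock = new Object();
    static final Map<String, String> row = new HashMap<>();

    /** Write both columns of the row under the row lock. */
    static void putRow(String a, String b) {
        synchronized (rowLock) {
            row.put("dim1:a", a);
            row.put("dim2:b", b);
        }
    }

    /** Read both columns atomically under the same lock. */
    static String[] getRow() {
        synchronized (rowLock) {
            return new String[] { row.get("dim1:a"), row.get("dim2:b") };
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Client A and client B race on the same row.
        pool.submit(() -> putRow("1", "1"));
        pool.submit(() -> putRow("2", "2"));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        String[] r = getRow();
        // Both columns always come from the same writer: never a=1, b=2.
        System.out.println(r[0].equals(r[1]) ? "consistent" : "mixed");
    }
}
```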
HBase implements the atomicity of single-row operations with row locks. See the HRegion.put code (based on HBase 0.94.20):

org.apache.hadoop.hbase.regionserver.HRegion:
```java
/**
 * @param put
 * @param lockid
 * @param writeToWAL
 * @throws IOException
 * @deprecated row locks (lockId) held outside the extent of the operation are deprecated.
 */
public void put(Put put, Integer lockid, boolean writeToWAL) throws IOException {
  checkReadOnly();

  // Do a rough check that we have resources to accept a write. The check is
  // 'rough' in that between the resource check and the call to obtain a
  // read lock, resources may run out. For now, the thought is that this
  // will be extremely rare; we'll deal with it when it happens.
  checkResources();
  startRegionOperation();
  this.writeRequestsCount.increment();
  this.opMetrics.setWriteRequestCountMetrics(this.writeRequestsCount.get());
  try {
    // We obtain a per-row lock, so other clients will block while one client
    // performs an update. The read lock is released by the client calling
    // #commit or #abort or if the HRegionServer lease on the lock expires.
    // See HRegionServer#RegionListener for how the expire on HRegionServer
    // invokes a HRegion#abort.
    byte [] row = put.getRow();
    // If we did not pass an existing row lock, obtain a new one
    Integer lid = getLock(lockid, row, true);

    try {
      // All edits for the given row (across all column families) must happen atomically.
      internalPut(put, put.getClusterId(), writeToWAL);
    } finally {
      if (lockid == null) releaseRowLock(lid);
    }
  } finally {
    closeRegionOperation();
  }
}
```
getLock() in turn calls internalObtainRowLock():
```java
private Integer internalObtainRowLock(final HashedBytes rowKey, boolean waitForLock)
    throws IOException {
  checkRow(rowKey.getBytes(), "row lock");
  startRegionOperation();
  try {
    CountDownLatch rowLatch = new CountDownLatch(1);

    // loop until we acquire the row lock (unless !waitForLock)
    while (true) {
      CountDownLatch existingLatch = lockedRows.putIfAbsent(rowKey, rowLatch);
      if (existingLatch == null) {
        break;
      } else {
        // row already locked
        if (!waitForLock) {
          return null;
        }
        try {
          if (!existingLatch.await(this.rowLockWaitDuration, TimeUnit.MILLISECONDS)) {
            throw new IOException("Timed out on getting lock for row=" + rowKey);
          }
        } catch (InterruptedException ie) {
          // Empty
        }
      }
    }

    // loop until we generate an unused lock id
    while (true) {
      Integer lockId = lockIdGenerator.incrementAndGet();
      HashedBytes existingRowKey = lockIds.putIfAbsent(lockId, rowKey);
      if (existingRowKey == null) {
        return lockId;
      } else {
        // lockId already in use, jump generator to a new spot
        lockIdGenerator.set(rand.nextInt());
      }
    }
  } finally {
    closeRegionOperation();
  }
}
```
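The locking scheme above — a ConcurrentHashMap keyed by row, with a CountDownLatch as the lock token that waiters block on — can be reproduced in a minimal standalone form. The class and method names below are illustrative, not HBase's; this sketch only keeps the core putIfAbsent/await/remove protocol:

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class MiniRowLock {
    private final ConcurrentHashMap<String, CountDownLatch> lockedRows =
            new ConcurrentHashMap<>();
    private final long waitMillis;

    public MiniRowLock(long waitMillis) {
        this.waitMillis = waitMillis;
    }

    /** Acquire the lock for rowKey, waiting up to waitMillis for the holder. */
    public void lock(String rowKey) throws IOException {
        CountDownLatch myLatch = new CountDownLatch(1);
        while (true) {
            // Atomically try to install our latch as the lock for this row.
            CountDownLatch existing = lockedRows.putIfAbsent(rowKey, myLatch);
            if (existing == null) {
                return; // we installed our latch: lock acquired
            }
            try {
                // Row already locked: wait for the current holder to release.
                if (!existing.await(waitMillis, TimeUnit.MILLISECONDS)) {
                    throw new IOException("Timed out on getting lock for row=" + rowKey);
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
            // Holder released; loop and race to install our latch again.
        }
    }

    /** Release the lock and wake all waiters. */
    public void unlock(String rowKey) {
        CountDownLatch latch = lockedRows.remove(rowKey);
        if (latch != null) {
            latch.countDown();
        }
    }
}
```

The real implementation additionally maps an integer lock id to the row key so a lock can be handed back across calls; that bookkeeping is omitted here.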
Recommended reading on the implementation details of HBase row locks: HBase source code analysis: row locks.
HBase also provides an API (lockRow/unlockRow) for taking row locks explicitly, but its use is not recommended. The reason: if two clients each hold a lock the other needs and then request the lock the other holds, a deadlock occurs. Until the locks time out, both blocked clients each occupy a server handler thread, and server threads are a very scarce resource.
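The deadlock described here is the classic lock-ordering problem. A standard client-side mitigation (not something HBase does for you) is to always acquire row locks in a fixed global order, for example sorted by row key, so a circular wait cannot form. A minimal sketch with illustrative names:

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class OrderedRowLocks {
    // Illustrative sketch: one ReentrantLock per row key.
    private final ConcurrentHashMap<String, ReentrantLock> locks =
            new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String row) {
        return locks.computeIfAbsent(row, k -> new ReentrantLock());
    }

    /** Lock several rows deadlock-free by always locking in sorted key order. */
    public void lockAll(String... rows) {
        String[] sorted = rows.clone();
        Arrays.sort(sorted); // fixed global order prevents circular waits
        for (String r : sorted) {
            lockFor(r).lock();
        }
    }

    /** Release all the given row locks. */
    public void unlockAll(String... rows) {
        for (String r : rows) {
            lockFor(r).unlock();
        }
    }
}
```

With this discipline, two clients that both need rows "a" and "b" always contend on "a" first, so neither can hold "b" while waiting for "a".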
HBase also provides several dedicated atomic operation interfaces:
checkAndPut / checkAndDelete / increment / append. These interfaces are very useful, and internally they are likewise implemented on top of row locks.
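increment follows the same internal pattern: under the row lock, the current value is read, the delta applied, and the result written back. A minimal standalone sketch of that read-modify-write step (illustrative names, not HBase code; a plain in-memory map stands in for the store):

```java
import java.util.concurrent.ConcurrentHashMap;

public class MiniIncrement {
    private final ConcurrentHashMap<String, Long> store = new ConcurrentHashMap<>();
    private final Object rowLock = new Object(); // stands in for the per-row lock

    /** Atomically add delta to the counter at key and return the new value. */
    public long increment(String key, long delta) {
        synchronized (rowLock) {                        // lock row
            long current = store.getOrDefault(key, 0L); // read current value
            long updated = current + delta;             // apply delta
            store.put(key, updated);                    // write back
            return updated;
        }                                               // release lock
    }
}
```

Without the lock, two concurrent increments could both read the same current value and one update would be lost.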
A code snippet from the internal implementation of checkAndPut/checkAndDelete:
```java
// Lock row
Integer lid = getLock(lockId, get.getRow(), true);
......
// get and compare
try {
  result = get(get, false);
  ......
  // If matches, put the new put or delete the new delete
  if (matches) {
    if (isPut) {
      internalPut(((Put) w), HConstants.DEFAULT_CLUSTER_ID, writeToWAL);
    } else {
      Delete d = (Delete) w;
      prepareDelete(d);
      internalDelete(d, HConstants.DEFAULT_CLUSTER_ID, writeToWAL);
    }
    return true;
  }
  return false;
} finally {
  // release lock
  if (lockId == null) releaseRowLock(lid);
}
```
The implementation logic is: lock => get => compare => put/delete.
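That lock => get => compare => put sequence can be sketched as a standalone check-and-put over an in-memory row map (the names here are illustrative; HBase performs the same steps against its store while holding the row lock):

```java
import java.util.concurrent.ConcurrentHashMap;

public class MiniCheckAndPut {
    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();
    private final Object rowLock = new Object(); // stands in for the per-row lock

    /**
     * Put newValue for key only if the current value matches expected
     * (expected == null means "only if the cell does not exist yet").
     * Returns true if the put was applied.
     */
    public boolean checkAndPut(String key, String expected, String newValue) {
        synchronized (rowLock) {                // 1. lock row
            String current = store.get(key);    // 2. get
            boolean matches = (expected == null)
                    ? current == null
                    : expected.equals(current); // 3. compare
            if (matches) {
                store.put(key, newValue);       // 4. put
            }
            return matches;
        }                                       // 5. release lock
    }

    public String get(String key) {
        return store.get(key);
    }
}
```

Because get, compare, and put all happen under the same lock, no other writer can slip in between the comparison and the write.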
checkAndPut is very valuable in practice. In our online dpid-generation project, multiple clients generate dpids in parallel; once one client has generated the dpid for a given key, the other clients must not generate a new one for it.
Code snippet:
```java
ret = hbaseUse.checkAndPut("bi.dpdim_mac_dpid_mapping", mac, "dim", "dpid", null, dpid);
if (false == ret) {
  String retDpid = hbaseUse.query("bi.dpdim_mac_dpid_mapping", mac, "dim", "dpid");
  if (!retDpid.equals(ABNORMAL)) {
    return retDpid;
  }
} else {
  columnList.add("mac");
  valueList.add(mac);
}
```
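The "first writer wins, losers read back the winner's value" pattern this snippet relies on can be demonstrated without HBase: ConcurrentHashMap.putIfAbsent gives the same compare-against-absent semantics as checkAndPut with a null expected value. The class, table key, and dpid values below are made up for illustration; under a concurrent race exactly one client's dpid survives:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DpidRaceDemo {
    /** Race `clients` threads to store a dpid for one mac; return the stored dpid. */
    public static String winner(int clients) throws Exception {
        ConcurrentHashMap<String, String> table = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < clients; i++) {
            final String dpid = "dpid-" + i;
            futures.add(pool.submit(() -> {
                // Analogue of checkAndPut(mac, expected = null, dpid):
                // succeeds only if no dpid exists yet for this mac.
                String prev = table.putIfAbsent("mac-00:11:22", dpid);
                if (prev != null) {
                    // Lost the race: read back the dpid another client stored,
                    // mirroring the query() fallback in the snippet above.
                    String existing = table.get("mac-00:11:22");
                }
            }));
        }
        for (Future<?> f : futures) {
            f.get(); // propagate any failure from the workers
        }
        pool.shutdown();
        return table.get("mac-00:11:22"); // exactly one dpid was stored
    }
}
```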
For more information about checkAndPut, see hbaseeveryday_atomic_compare_and_set.
References:
- HBase – Apache HBase (TM) ACID Properties
- HBase source code analysis: row locks
- HBase: The Definitive Guide
- hbaseeveryday_atomic_compare_and_set