I. Split trigger conditions
1. The size of any HFile exceeds hbase.hregion.max.filesize (default 10 GB).
2. Once the configured split limit is reached, no further split is performed. The default limit is Integer.MAX_VALUE, so by default the limit never blocks a split.
3. During compaction, the size of the compacted store exceeds the limit, so a split is requested.
4. Before a flush, the system checks whether the number of HStoreFiles in the region exceeds hbase.hstore.blockingStoreFiles. If it does and the blocking wait has not timed out, CompactSplitThread.requestSplit(HRegion) is called.
5. After a flush, HRegion.checkSplit() is called to check whether a split is required. If so, CompactSplitThread.requestSplit(HRegion) is called.
6. Manual triggering
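The automatic checks above can be sketched in a few lines of Java. This is only an illustration, not HBase's own code: the class and method names are hypothetical, the split limit in condition 2 is assumed to be a cap on the number of regions (the list does not say what it counts), and the wait times are passed in as plain numbers.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the split trigger checks; not HBase's actual classes. */
final class SplitTriggerSketch {
    // Size threshold from condition 1 (10 GB).
    static final long MAX_FILE_SIZE_BYTES = 10L * 1024 * 1024 * 1024;

    /**
     * Conditions 1 and 2: split when some store file is over the threshold,
     * unless the split limit has been reached.
     * ASSUMPTION: the limit is compared against a region count.
     */
    static boolean shouldSplitBySize(List<Long> storeFileSizes, int regionCount, int splitLimit) {
        if (regionCount >= splitLimit) {
            return false; // limit reached: no split is performed
        }
        for (long size : storeFileSizes) {
            if (size > MAX_FILE_SIZE_BYTES) {
                return true;
            }
        }
        return false;
    }

    /**
     * Condition 4: before a flush, request a split when there are too many
     * store files and the blocking wait has not yet timed out.
     */
    static boolean shouldRequestSplitBeforeFlush(int storeFileCount, int blockingStoreFiles,
                                                 long waitedMs, long blockingWaitMs) {
        return storeFileCount > blockingStoreFiles && waitedMs < blockingWaitMs;
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(2L << 30, 11L << 30); // 2 GB and 11 GB
        System.out.println(shouldSplitBySize(sizes, 5, Integer.MAX_VALUE));
        System.out.println(shouldRequestSplitBeforeFlush(8, 7, 1_000L, 90_000L));
    }
}
```

With the default limit of Integer.MAX_VALUE the first check effectively depends only on file size, which matches the note in condition 2 that the limit does not block splitting by default.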
II. Split process
1. Start a CompactSplitThread thread
2. SplitRequest.run()
First, instantiate a SplitTransaction: create a .splits directory under the parent region's directory on HDFS.
Second, instantiate two HRegionInfo objects, hri_a and hri_b, and assign their start and end keys according to the given split key.
Then, call the execute method:
(1) createDaughters
Each reference file covers half of the data in the original file. The name of a reference file is an ID with the hash of the referenced region's name appended as a suffix, for example 1278437856009925445.3323223323. The reference file itself holds very little information: the key at which the original region was split, and whether the file covers the first half or the second half. HBase uses the HalfHFileReader class to access a reference file and read the data from the original data file.
Offline the parent in .META.: put the split information into the .META. table.
(2) openDaughters -- DaughterOpener.run() -- openDaughterRegion -- openHRegion
openDaughters -- postOpenDeployTasks
addToOnlineRegions // should add it to the online regions
(3) transitionZKNode: finish off the SplitTransaction by transitioning the ZooKeeper node and updating the split status. The master then processes the change, and CatalogJanitor cleans up the folders that are no longer needed.
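The key-range assignment for the two daughters and the idea behind a reference file can be illustrated with a small, self-contained Java sketch. Everything in it is hypothetical: SplitSketch, KeyRange, daughters and halfView are names invented for this example, row keys are plain strings rather than byte arrays, and a TreeMap stands in for the parent's data file. It only shows that a split key divides the parent range into two half-open ranges and that each half can be read as a view over the original data without copying it.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical sketch of how a split key divides a parent region; not HBase code. */
final class SplitSketch {
    /** Key range [startKey, endKey); an empty endKey means "no upper bound". */
    static final class KeyRange {
        final String startKey;
        final String endKey;

        KeyRange(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(String row) {
            return row.compareTo(startKey) >= 0
                    && (endKey.isEmpty() || row.compareTo(endKey) < 0);
        }
    }

    /** Daughter A gets [parentStart, splitKey); daughter B gets [splitKey, parentEnd). */
    static KeyRange[] daughters(KeyRange parent, String splitKey) {
        if (!parent.contains(splitKey) || splitKey.equals(parent.startKey)) {
            throw new IllegalArgumentException("split key must fall strictly inside the parent range");
        }
        return new KeyRange[] {
            new KeyRange(parent.startKey, splitKey),
            new KeyRange(splitKey, parent.endKey)
        };
    }

    /**
     * Reference-style view over the parent's data: stores nothing itself,
     * only exposes the first half (keys below the split key) or the second
     * half (keys at or above it) of the original map.
     */
    static SortedMap<String, String> halfView(TreeMap<String, String> parentFile,
                                              String splitKey, boolean secondHalf) {
        return secondHalf ? parentFile.tailMap(splitKey, true)
                          : parentFile.headMap(splitKey, false);
    }

    public static void main(String[] args) {
        KeyRange[] d = daughters(new KeyRange("a", "z"), "m");
        System.out.println(d[0].startKey + ".." + d[0].endKey);
        System.out.println(d[1].startKey + ".." + d[1].endKey);
    }
}
```

Because the views are backed by the original map, no data is moved at split time, which mirrors why a reference file only needs to record the split key and which half it covers.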