HBase read/write optimization for real-time systems: heavy writes without blocking

Source: Internet
Author: User
Tags: compact

While using HBase, we found that when the volume of data being written is very large, HBase often becomes unable to accept writes. Our HBase-based application has strict real-time requirements, so any period in which HBase cannot be read or written seriously affects the system. The steps we took to optimize HBase writes are recorded below.


1. Disable automatic major compaction

When HBase runs a major compaction, the region merges all of its StoreFiles, so the entire region is unreadable and every query to that region blocks. By default HBase runs a major compaction roughly once a day. We disable automatic major compaction and instead use a cron script to run a major compaction on all tables every day while the system is idle.


Major compaction configuration:

<property>
<name>hbase.hregion.majorcompaction</name>
<value>0</value>
</property>

The default is 1 day. When a region is created, it initializes its major compaction timestamp (regionmajorcompactiontime) to the current time and schedules the next major compaction 1 ± 0.2 days later. Setting this value to 0 in the configuration disables automatic major compaction.
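To make the 1 ± 0.2 day window concrete, here is a minimal sketch of the arithmetic. The class and method names are hypothetical and this is not HBase's own scheduling code; it only assumes that a period of 0 means "disabled" and that the jitter is 20% of the period, as described above.

```java
// Hypothetical illustration of the scheduling window; not HBase source code.
class MajorCompactionWindow {
    /**
     * Returns {earliestMs, latestMs}, the range of delays before the next
     * automatic major compaction, or null when periodMs is 0 (disabled).
     */
    static long[] nextCompactionWindow(long periodMs, int jitterPercent) {
        if (periodMs == 0) {
            return null; // hbase.hregion.majorcompaction = 0 turns the timer off
        }
        long delta = periodMs * jitterPercent / 100;
        return new long[] { periodMs - delta, periodMs + delta };
    }

    public static void main(String[] args) {
        long oneDayMs = 24L * 60 * 60 * 1000;
        long[] window = nextCompactionWindow(oneDayMs, 20);
        System.out.println("earliest delay (ms): " + window[0]);
        System.out.println("latest delay (ms):   " + window[1]);
        System.out.println("disabled when 0:     " + (nextCompactionWindow(0, 20) == null));
    }
}
```

With a one-day period the window runs from 19.2 to 28.8 hours after the region is created, so without the setting above a compaction can start at any time of day, including peak hours.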


Major compaction script: list all tables and run major_compact on each:

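TMP_FILE=tmp_tables
TABLES_FILE=tables.txt

# Dump the output of the shell's "list" command
echo "list" | hbase shell > $TMP_FILE
sleep 2
# Drop the 6 header lines and the 2 trailing lines, leaving one table name per line
# (these offsets depend on the output format of the HBase shell version in use)
sed '1,6d' $TMP_FILE | tac | sed '1,2d' | tac > $TABLES_FILE
sleep 2

for table in $(cat $TABLES_FILE); do
        echo "major_compact '$table'" | hbase shell
        # Pause between tables; the interval in the original script was lost, 2 is assumed
        sleep 2
done
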
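The sed and tac pipeline in the script is easy to misread, so here is a minimal sketch of the same filtering in Java. The class and method names are hypothetical, the sample lines in `main` are placeholders rather than real HBase shell output, and the header and footer counts are simply the ones the script uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the script's filtering; not part of the original tool.
class TableListSketch {
    /** Drops the first `header` and last `footer` lines, like sed '1,6d' | tac | sed '1,2d' | tac. */
    static List<String> tableNames(List<String> shellOutput, int header, int footer) {
        int end = shellOutput.size() - footer;
        if (end <= header) {
            return new ArrayList<>(); // nothing left between the banner and the summary
        }
        return new ArrayList<>(shellOutput.subList(header, end));
    }

    /** Builds the command the loop pipes into `hbase shell` for each table. */
    static List<String> compactCommands(List<String> tables) {
        List<String> commands = new ArrayList<>();
        for (String table : tables) {
            commands.add("major_compact '" + table + "'");
        }
        return commands;
    }

    public static void main(String[] args) {
        // Placeholder lines: six banner lines, two table names, two summary lines.
        List<String> output = Arrays.asList(
            "h1", "h2", "h3", "h4", "h5", "h6", "orders", "users", "f1", "f2");
        for (String command : compactCommands(tableNames(output, 6, 2))) {
            System.out.println(command);
        }
    }
}
```

If the shell's banner or summary changes length between versions, the two counts are the only values that need adjusting.
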

2. Disable automatic splits

HBase shards data horizontally by splitting regions, but during a split the old region goes offline and the new regions run compactions, so for a period of time a large amount of data can be neither read nor written. Our online system cannot tolerate this. We therefore also disable automatic splits and run our own SplitTool to split regions manually at night, when the system is idle.


Configuration to disable automatic splits:

<property>
<name>hbase.hregion.max.filesize</name>
<value>536870912000</value>
</property>

This setting means that HBase starts to split a region once the region's size exceeds the configured value. We set it to 500 GB (536870912000 bytes) on the assumption that no region will grow beyond this size during the daytime peak; SplitTool then splits the regions at night.
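As a quick check of the configured number, the following sketch converts 500 GB (counted in units of 1024^3 bytes) to bytes. The class and method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper for checking the configured threshold; not HBase code.
class MaxFileSizeSketch {
    /** Converts binary gigabytes to bytes using long arithmetic. */
    static long gibToBytes(long gib) {
        return gib * 1024L * 1024L * 1024L;
    }

    public static void main(String[] args) {
        // Prints the value to place in hbase.hregion.max.filesize for 500 GB.
        System.out.println(gibToBytes(500));
    }
}
```

The arithmetic is done in `long` because the result is far larger than the maximum `int` value.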


The logic of SplitTool is fairly simple. It traverses the information for all regions and splits every region whose size exceeds a threshold (for example 1 GB); this is one round of splitting. If no region exceeds the threshold after a round, the tool finishes; otherwise it starts another round, and it repeats until no region is larger than the threshold. To decide whether a split has completed, the tool checks whether the old region's directory on HDFS has been removed.
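The round-based loop can be sketched as a small simulation. This is not the SplitTool source, which is not shown here: the class and method names are hypothetical, regions are modelled only by their sizes, each split is assumed to cut a region roughly in half, and the HDFS completion check is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation; a real tool would call HBase and inspect HDFS.
class SplitRoundsSketch {
    /** Repeats split rounds until no region is larger than thresholdBytes. */
    static List<Long> splitUntilSmall(List<Long> regionSizes, long thresholdBytes) {
        if (thresholdBytes < 1) {
            throw new IllegalArgumentException("threshold must be at least 1 byte");
        }
        List<Long> current = new ArrayList<>(regionSizes);
        boolean splitAny = true;
        while (splitAny) {                    // one iteration is one round
            splitAny = false;
            List<Long> next = new ArrayList<>();
            for (long size : current) {
                if (size > thresholdBytes) {  // too large: replace it with two halves
                    long left = size / 2;
                    next.add(left);
                    next.add(size - left);
                    splitAny = true;
                } else {
                    next.add(size);
                }
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        List<Long> result = splitUntilSmall(Arrays.asList(5 * gib, gib / 2), gib);
        System.out.println("regions after splitting: " + result.size());
    }
}
```

In this model a 5 GB region needs three rounds to fall under a 1 GB threshold, which is why the tool loops instead of making a single pass.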


3. Set blockingStoreFiles

We discovered the importance of this parameter during performance testing. With major compaction and splits disabled, nothing should in theory obstruct writes, yet in testing we found that writing to a single region at more than 10 MB/s could still leave it unwritable for long periods. In the log we found the line "waited 90314ms on a compaction to clean up 'too many store files'", and reading the code showed that the blockingStoreFiles parameter was to blame.


flushRegion checks whether the number of HFiles in the current store is greater than this value. If it is, the write is blocked while other threads compact the HFiles. So if the write rate exceeds the compaction rate, HBase blocks writes to the region.

private boolean flushRegion(final FlushRegionEntry fqe) {
    HRegion region = fqe.region;
    if (!fqe.region.getRegionInfo().isMetaRegion() &&
        isTooManyStoreFiles(region)) {
      if (fqe.isMaximumWait(this.blockingWaitTime)) {
        LOG.info("waited " + (System.currentTimeMillis() - fqe.createTime) +
          "ms on a compaction to clean up 'too many store files'; waited " +
          "long enough... proceeding with flush of " +
          region.getRegionNameAsString());
      // ... (the excerpt ends here)

The default value is 7:

this.blockingStoreFilesNumber =
      conf.getInt("hbase.hstore.blockingStoreFiles", 7);
    if (this.blockingStoreFilesNumber == -1) {
      this.blockingStoreFilesNumber = 1 +
        conf.getInt("hbase.hstore.compactionThreshold", 3);
    }
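The decision made by this check can be summarised in a minimal standalone sketch. It is a simplification of the excerpt above, not the HBase implementation: the class, enum, and method names are hypothetical, and the 90000 ms wait limit used in `main` is an assumed value chosen to be consistent with the 90314 ms in the log line.

```java
// Hypothetical simplification of the flush gate; not HBase source code.
class FlushGateSketch {
    enum Decision { FLUSH_NOW, WAIT_FOR_COMPACTION, FLUSH_AFTER_MAX_WAIT }

    static Decision decide(int storeFiles, int blockingStoreFiles,
                           long waitedMs, long blockingWaitMs) {
        if (storeFiles <= blockingStoreFiles) {
            return Decision.FLUSH_NOW;            // not too many store files
        }
        if (waitedMs >= blockingWaitMs) {
            return Decision.FLUSH_AFTER_MAX_WAIT; // "waited long enough" branch
        }
        return Decision.WAIT_FOR_COMPACTION;      // flush is put off, writes back up
    }

    public static void main(String[] args) {
        // Default limit of 7 with 8 store files: the flush has to wait.
        System.out.println(decide(8, 7, 1_000, 90_000));
        // Same store after 90314 ms: the flush proceeds anyway.
        System.out.println(decide(8, 7, 90_314, 90_000));
        // With a very large limit the gate never closes.
        System.out.println(decide(8, 2_100_000_000, 0, 90_000));
    }
}
```

The third call shows why raising the limit removes the stall: the store file count can no longer exceed it in practice.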


We set this parameter to a very large value so that this check never blocks our writes.

<property>
<name>hbase.hstore.blockingStoreFiles</name>
<value>2100000000</value>
</property>



