How to pre-split regions when creating an HBase table

Source: Internet
Author: User

If you know how the row keys are distributed in an HBase table, you can pre-split the table into regions when you create it. This helps avoid write hotspots during large data loads and improves insert throughput.


Steps:

1. Plan the HBase pre-split


First, work out how the row keys are distributed. Then decide how many regions to create and the start key and end key of each region, and write the planned split keys to a file. For example, if every row key begins with a number from 0001 to 0010, the table can be divided into 10 regions, and the split file looks like this:

0001|
0002|
0003|
0004|
0005|
0006|
0007|
0008|
0009|
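
The split file above can also be generated rather than typed by hand. The following is a minimal Python sketch, assuming zero-padded numeric prefixes as in the example; the function names build_split_keys, render_split_file and write_split_file are hypothetical helpers, not part of HBase.

```python
from pathlib import Path


def build_split_keys(first: int, last: int, width: int = 4, suffix: str = "|") -> list[str]:
    """Return the split points for zero-padded prefixes first..last.

    N prefixes need only N-1 split points, so the last prefix gets no entry.
    """
    return [f"{n:0{width}d}{suffix}" for n in range(first, last)]


def render_split_file(keys: list[str]) -> str:
    """Format the split points as file content: one split point per line."""
    return "\n".join(keys) + "\n"


def write_split_file(path: str, keys: list[str]) -> None:
    """Write the split points to a text file (ASCII keys assumed)."""
    Path(path).write_text(render_split_file(keys), encoding="ascii")
```

Calling write_split_file("region_split_info.txt", build_split_keys(1, 10)) would produce the nine lines listed above, from 0001| through 0009|.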

Why is each key followed by "|"? In ASCII, "|" has the value 124, which is greater than every digit and letter, so any row key that starts with 0001 followed by digits or letters sorts before the split point 0001|. You can also use "~" (ASCII 126). The first line of the file is the stop key of the first region, and so on for each following line; the last line is both the stop key of the penultimate region and the start key of the last region. In other words, the split file contains only the split points between regions.
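
To check which region a given row key would land in, the ordering described above can be simulated offline. This is a sketch under stated assumptions, not HBase code: it assumes ASCII row keys, relies on HBase ordering row keys byte by byte, and uses a hypothetical helper named region_index.

```python
from bisect import bisect_right

# The nine split points from the example split file: 0001| .. 0009|
SPLIT_POINTS = [f"{n:04d}|" for n in range(1, 10)]


def region_index(row_key: str, split_points: list[str] = SPLIT_POINTS) -> int:
    """Return the 0-based index of the region whose [start, stop) range holds row_key."""
    # Compare as bytes, since row keys are ordered byte by byte.
    points = [p.encode("ascii") for p in split_points]
    # The number of split points <= row_key equals the region index,
    # because each region includes its start key and excludes its stop key.
    return bisect_right(points, row_key.encode("ascii"))
```

With these split points, region_index("0001abc") is 0 and region_index("0010xyz") is 9. A key such as "0001~" lands in region 1 rather than region 0, because "~" sorts after "|"; this is why the separator should be larger than any character that can follow the prefix in real keys.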


2. Create the table in the HBase shell, specifying the split file


Enter create with no arguments in the HBase shell and you will see the following help text:

Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

Table configuration options can be put at the end.
Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => {'mykey' => 'myvalue'}
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'if1', LOCAL_INDEX=>'COMBINE_INDEX|INDEXED=f1:q1:8|rowKey:rowKey:10,UPDATE=true'}


You can point to the split file with SPLITS_FILE, or, if there are only a few split points, list them inline with SPLITS. The following command creates a pre-split table using the split file generated in the first step:

create 'split_table_test', 'cf', {SPLITS_FILE => 'region_split_info.txt'}
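
For a short list of split points, the inline SPLITS form may be more convenient than a file. The sketch below builds such a statement as a string to paste into the HBase shell; inline_create_statement is a hypothetical helper, and it assumes that neither the names nor the split points contain a single quote.

```python
def inline_create_statement(table: str, family: str, split_points: list[str]) -> str:
    """Build an HBase shell create statement that lists the split points inline."""
    # Each split point becomes a single-quoted string in the SPLITS array.
    quoted = ", ".join(f"'{point}'" for point in split_points)
    return f"create '{table}', '{family}', SPLITS => [{quoted}]"


if __name__ == "__main__":
    # Print a statement for the first three split points of the example.
    print(inline_create_statement("split_table_test", "cf", ["0001|", "0002|", "0003|"]))
```

The output is text only; nothing is sent to HBase, so the statement can be reviewed before it is run.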


If you also want Snappy compression on the HBase table, how should the command be written?

create 'split_table_test', {NAME => 'cf', COMPRESSION => 'SNAPPY'}, {SPLITS_FILE => 'region_split_info.txt'}
Note that the split parameters must be given in their own pair of curly braces, because pre-splitting applies to the whole table, not to a single column family.


Next, log in to the HBase Master web UI and look at the regions of the new table.


We can see that the first region has no start key and the last region has no stop key.
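
The same region boundaries can be derived from the split points without a running cluster. The following is a minimal sketch in which an empty string stands for an open-ended boundary; region_ranges is a hypothetical helper, not an HBase API.

```python
def region_ranges(split_points: list[str]) -> list[tuple[str, str]]:
    """Return (start_key, stop_key) for each region implied by the split points.

    An empty string marks an open boundary: the first region has no start key
    and the last region has no stop key.
    """
    bounds = [""] + list(split_points) + [""]
    # Pair each boundary with the next one: N split points give N+1 regions.
    return list(zip(bounds[:-1], bounds[1:]))


if __name__ == "__main__":
    points = [f"{n:04d}|" for n in range(1, 10)]
    for start, stop in region_ranges(points):
        print(repr(start), "->", repr(stop))
```

For the nine split points in the example this yields ten regions, starting with ('', '0001|') and ending with ('0009|', '').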









