http://hbase.apache.org/book.html#ops.capacity.regions.count
In production scenarios, where you have a lot of data, you are normally concerned with the maximum number of regions you can have per server. The "too many regions" section of the HBase reference guide has a technical discussion of the subject. Basically, the maximum number of regions is mostly determined by memstore memory usage. Each region has its own memstores; these grow up to a configurable size, usually in the 128-256 MB range (see hbase.hregion.memstore.flush.size). One memstore exists per column family, so there is only one per region if the table has a single CF. The RegionServer (RS) dedicates some fraction of its total memory to its memstores (see hbase.regionserver.global.memstore.size). If this memory is exceeded (too much memstore usage), it can cause undesirable consequences such as an unresponsive server or compaction storms. A good starting point for the number of regions per RS (assuming one table) is:
((RS memory) * (total memstore fraction)) / ((memstore size)*(# column families))
This formula is pseudo-code. Here are two formulas using the actual tunable parameters, the first for HBase 0.98+ and the second for HBase 0.94.x.
- HBase 0.98.x
  ((RS Xmx) * hbase.regionserver.global.memstore.size) / (hbase.hregion.memstore.flush.size * (# column families))
- HBase 0.94.x
  ((RS Xmx) * hbase.regionserver.global.memstore.upperLimit) / (hbase.hregion.memstore.flush.size * (# column families))
If a given RegionServer has 16 GB of RAM, then with default settings the formula works out to 16384*0.4/128 ~ 51 regions per RS as a starting point. The formula can be extended to multiple tables; if they all have the same configuration, just use the total number of column families.
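The arithmetic above can be sketched as a small Python helper. This is only an illustration of the formula, not part of HBase: the function name regions_per_rs and its parameter names are hypothetical, and the inputs are assumed to be the RS heap (Xmx) in MB, the global memstore fraction, the memstore flush size in MB, and the total number of column families.

```python
def regions_per_rs(heap_mb, memstore_fraction, flush_size_mb, column_families=1):
    """Estimate a starting region count per RegionServer.

    Hypothetical helper mirroring the formula in the text:
    (RS Xmx * global memstore fraction) / (flush size * # column families)
    """
    if flush_size_mb <= 0 or column_families <= 0:
        raise ValueError("flush size and column family count must be positive")
    return (heap_mb * memstore_fraction) / (flush_size_mb * column_families)


# Values from the text: 16 GB heap, 0.4 memstore fraction, 128 MB flush size, one CF.
estimate = regions_per_rs(16384, 0.4, 128)
print(round(estimate))  # 51

# Same server, but the tables hold three column families in total.
estimate_three_cf = regions_per_rs(16384, 0.4, 128, column_families=3)
print(round(estimate_three_cf))  # 17
```

The result is a rough starting point rather than a hard limit, so it is rounded only for display.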
HBase Region-to-memory relationship