HBase and ZooKeeper Performance Optimization: Parameter Settings


zookeeper.session.timeout
Default value: 3 minutes (180000 ms)
Description: The connection timeout between a regionserver and ZooKeeper. Once the timeout expires, ZooKeeper removes the regionserver from the RS cluster list. When the HMaster receives the removal notification, it rebalances the regions that server was responsible for onto the surviving regionservers.
Optimization:
This timeout determines how promptly a regionserver can fail over. Setting it to 1 minute or lower reduces the failover delay spent waiting for the timeout.
However, note that for some online applications a regionserver recovers very quickly after going down (a transient network disconnection or a crash that operations staff can fix immediately). In such cases, lowering the timeout may not be worth it. The reason: once the regionserver is formally removed from the RS cluster, the HMaster starts rebalancing (so other RSes can recover from the WAL logs written by the failed machine). When the failed RS is then brought back manually, that rebalance turns out to be pointless; it leaves the load unevenly distributed and puts extra burden on the RSes. This matters especially in deployments where regions are assigned to fixed servers.
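A minimal sketch of lowering the timeout in hbase-site.xml (60000 ms is an illustrative value, not a recommendation):

  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value> <!-- 1 minute; pick a value matching your failover expectations -->
  </property>

Note that the effective session timeout is also bounded by the ZooKeeper server's own limits (derived from tickTime), so a value set only on the HBase side may be silently capped.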

hbase.regionserver.handler.count
Default value: 10
Description: The number of I/O threads the regionserver uses to process requests.
Optimization:
Tuning this parameter is closely tied to memory.
A low thread count suits scenarios where a single request is memory-hungry (a large single put, or a scan configured with a big cache), or where the regionserver's memory is tight.
A high thread count suits scenarios where single requests consume little memory and high TPS is required. When setting this value, the main reference is memory monitoring.
Note that if a server hosts few regions and a large share of requests land on the same region, the read/write lock caused by the memstore rapidly filling and triggering flushes will limit global TPS; a higher I/O thread count is not necessarily better.
During stress testing, enable RPC-level logging so you can monitor the memory consumption and GC behavior of each request, then settle on the thread count through repeated stress-test runs.
Here is a case study, Hadoop and HBase Optimization for Read Intensive Search Applications, in which the author sets the number of I/O threads to 100 on SSD-backed machines. For reference only.
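For illustration, the handler count goes in hbase-site.xml; the value 50 below is an assumption to be validated against your own memory monitoring and stress tests:

  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>50</value> <!-- example only; raise for small-request/high-TPS, lower for big puts/scans -->
  </property>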

hbase.hregion.max.filesize
Default value: 256 MB
Description: The maximum storage size of a single region on the current regionserver. When a region exceeds this value, it is automatically split into two smaller regions.
Optimization:
Small regions are friendly to split and compaction, because splitting the storefiles in a region or compacting a small region is fast and consumes little memory. The disadvantage is that splits and compactions become frequent.
In particular, a crowd of small regions constantly splitting and compacting can cause large swings in cluster response time. A huge number of regions is not only a management headache; it can even trigger HBase bugs.
Generally, anything below 512 MB counts as a small region.

Large regions are poorly suited to frequent split and compaction, because a single compaction or split can cause a long pause, with a big impact on the application's read/write performance. In addition, a large region implies large storefiles, so compaction is also a challenge for memory.
Of course, large regions have their place. If your application's traffic is low at certain times of day, compactions and splits can complete during those windows, leaving read/write performance stable the rest of the time.

Since split and compaction affect performance, is there a way to remove them?
Compaction cannot be avoided, but split can be switched from automatic to manual.
By raising this parameter to a value that is hard to reach, such as 100 GB, auto-split can be effectively disabled (the regionserver will not split a region that has not reached 100 GB).
Then, combined with the RegionSplitter tool, perform a manual split whenever one is needed.
Manual split is much more flexible and stable than automatic split, and the added management cost is small. It is recommended for online real-time systems.
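A sketch of this manual-split setup, using the 100 GB figure from above (the table name in the shell command is hypothetical):

  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>107374182400</value> <!-- 100 GB: effectively disables auto-split -->
  </property>

When a split is actually required, issue it by hand, for example with the shell's split command (split 'my_table') or through the RegionSplitter utility.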

On the memory side, small regions leave flexibility in choosing the memstore size, whereas with large regions the value is always either too big or too small. Set it too big and the application's I/O wait rises during flushes; set it too small and read performance suffers because there are too many storefiles.

hbase.regionserver.global.memstore.upperLimit / lowerLimit

Default value: 0.4 / 0.35
upperLimit description: hbase.hregion.memstore.flush.size flushes all memstores of a region when the total memstore size inside that single region exceeds the specified value. Regionserver flushes are processed asynchronously by placing requests on a queue, in a producer-consumer pattern. This creates a problem: when the queue cannot be drained fast enough and a large backlog of requests builds up, memory can rise sharply, in the worst case triggering an OOM.
This parameter guards against such runaway memory usage. When the total memory occupied by all region memstores on a regionserver reaches 40% of the heap, HBase forcibly blocks all updates and flushes these regions to release the memory held by the memstores.
lowerLimit description: same idea as upperLimit, except that when all region memstores occupy 35% of the heap, it does not flush every memstore. Instead it finds the region whose memstore uses the most memory and flushes only that one; write updates are still blocked while this happens. lowerLimit is a remedy applied before a forced flush of all regions degrades performance. In the logs it shows up as "** Flush thread woke up with memory above low water."
Optimization: This is a heap-protection parameter, and the default suits most scenarios.
Adjusting it affects both reads and writes. If write pressure is high and the threshold is frequently exceeded, shrink the read cache hfile.block.cache.size and raise this threshold; or, if there is ample heap headroom, raise the threshold without touching the read cache.
If the threshold is not exceeded even under high pressure, consider lowering it, then stress-test again to confirm it is not triggered too often. With the extra heap headroom, increase hfile.block.cache.size to improve read performance.
There is one more possibility: hbase.hregion.memstore.flush.size stays unchanged, but the RS hosts too many regions. Keep in mind that the region count directly drives memory usage.
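A hedged sketch of raising the thresholds for a write-heavy, read-light workload (0.45/0.40 are illustrative assumptions; re-run your stress tests after changing them):

  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.45</value>
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.lowerLimit</name>
    <value>0.4</value>
  </property>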

hfile.block.cache.size

Default value: 0.2
Description: The percentage of the heap allocated to the storefile read cache; 0.2 means 20%. This value directly affects read performance.
Optimization: Naturally, the bigger the better. If you write far less than you read, opening it up to 0.4-0.5 is fine. If reads and writes are balanced, use about 0.3. If you write more than you read, keep the default. When setting this value, also account for hbase.regionserver.global.memstore.upperLimit, the maximum share of the heap that memstores may occupy: one parameter affects reads, the other writes. If the two values add up to more than 80-90%, there is a risk of OOM.
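For a read-heavy workload, a sketch such as the following keeps the two heap shares safely under the 80% line (0.45 + 0.3 = 0.75; the numbers are assumptions, not recommendations):

  <property>
    <name>hfile.block.cache.size</name>
    <value>0.45</value> <!-- read cache: 45% of heap -->
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.3</value> <!-- memstore ceiling: 30% of heap -->
  </property>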

hbase.hstore.blockingStoreFiles

Default value: 7
Description: At flush time, if a store (one column family) in a region contains more than 7 storefiles, all write requests to the region are blocked while a compaction runs to reduce the storefile count.
Optimization: Blocking write requests seriously affects the current regionserver's response time, but too many storefiles also hurts read performance. In practice, you can set this to a very large value to obtain a smoother response-time curve. If you can tolerate large response-time fluctuations, keep the default or adjust it to your own scenario.
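A sketch of the smooth-response-time option described above; the exact number is arbitrary, it only needs to be practically unreachable:

  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>2100000000</value> <!-- effectively never block writes on storefile count -->
  </property>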

hbase.hregion.memstore.block.multiplier

Default value: 2
Description: When the memory used by the memstores of a region exceeds hbase.hregion.memstore.flush.size by this multiple (twice, by default), all requests to that region are blocked, a flush is performed, and the memory is released.
Although we cap a region's total memstore memory at, say, 64 MB, imagine that at 63.9 MB a very large put arrives: the memstore size can instantly balloon to several times the expected hbase.hregion.memstore.flush.size. The job of this parameter is to block all requests once the memstore exceeds flush.size by the configured multiple.
Optimization: The default of this parameter is fairly reliable. If you predict that your normal application scenarios (not counting exceptional ones) will see no bursty writes, or that write volume is well controlled, keep the default. If, under normal circumstances, write request volume frequently surges to several times the norm, you should raise this multiplier and adjust the other memory parameters, such as hfile.block.cache.size and hbase.regionserver.global.memstore.upperLimit/lowerLimit, to reserve more memory and prevent the HBase server from OOMing.
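As an illustration only (a multiplier of 4 is an assumed value for a bursty write pattern, paired with the memory adjustments mentioned above):

  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value> <!-- tolerate bursts up to 4x flush.size before blocking the region -->
  </property>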

hbase.hregion.memstore.mslab.enabled

Default value: true
Description: Enables the MemStore-Local Allocation Buffer, which reduces full GCs caused by heap fragmentation and improves overall performance.
Optimization: See http://kenwublog.com/avoid-full-gc-in-hbase-using-arena-allocation for details
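The switch itself, shown only so the knob is visible (it is already on by default, and turning it off is rarely advisable):

  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value> <!-- MemStore-Local Allocation Buffer; default -->
  </property>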

Others

Enable LZO compression
Compared with HBase's default GZip, LZO offers higher performance, while GZip achieves a better compression ratio. For more information, see Using LZO Compression. For developers who want to improve HBase read/write performance, LZO is a good choice. For developers who care deeply about storage space, we recommend keeping the default.
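Compression is chosen per column family at table-creation time. A hedged HBase shell example, where the table and family names are hypothetical and the LZO native libraries are assumed to be installed on every node:

  create 'my_table', {NAME => 'cf1', COMPRESSION => 'LZO'}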

Do not define too many column families in a table

HBase currently cannot handle tables with more than two or three column families well. When one CF is flushed, its neighboring CFs are also flushed due to the association effect, which ultimately generates more I/O.

Batch Import

Before bulk-importing data into HBase, you can create regions in advance to balance the data load. For details, see Table Creation: Pre-Creating Regions.
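A minimal sketch, assuming an HBase shell version that accepts SPLITS at creation time; the table, family, and split keys are all hypothetical:

  create 'my_table', 'cf1', SPLITS => ['1000', '2000', '3000']

This pre-creates four regions, so parallel importers do not all pile onto a single region.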

Avoid CMS concurrent mode failure

HBase uses the CMS GC. By default, a CMS collection is triggered when the old generation is 90% full; the percentage is set by the -XX:CMSInitiatingOccupancyFraction=N parameter. Concurrent mode failure occurs in the following scenario:
When the old generation reaches 90%, CMS starts collecting garbage concurrently, while the young generation keeps promoting objects into the old generation at high speed. If the old generation fills up before CMS finishes its concurrent marking, the tragedy happens: with no memory left, CMS must abandon the concurrent cycle, trigger a stop-the-world pause (suspending all JVM threads), and fall back to a single-threaded full collection to clean up all the garbage objects. This process can take a very long time. To avoid concurrent mode failure, the recommendation is to trigger GC before occupancy reaches 90%.

This is done by setting -XX:CMSInitiatingOccupancyFraction=N.

The percentage can be calculated simply: if your hfile.block.cache.size and hbase.regionserver.global.memstore.upperLimit add up to 60% of the heap (the defaults), you can set N to 70-80; in general, about 10% higher than their sum.
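A hedged hbase-env.sh sketch; the value 70 follows the rule of thumb above for the default 60% sum, and the flags are standard HotSpot options:

  # Example GC settings for hbase-env.sh, not a universal recommendation
  export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC \
    -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly"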
maxClientCnxns = 300

By default, ZooKeeper grants each client IP only 10 connections, which often proves insufficient. At present the limit can only be changed in the zoo.cfg configuration file, so ZooKeeper must be restarted for the change to take effect. In zoo.cfg:

maxClientCnxns = 300

Otherwise errors like the following are reported:

09:39:44,856 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:5858:NIOServerCnxn$Factory@253] - too many connections from /172.*.*.* - max is 10

HBASE_HEAPSIZE = 3000

HBase has a special appetite for memory; if the hardware permits, give it enough. Set the heap in hbase-env.sh:

export HBASE_HEAPSIZE=3000 # the default is 1000 MB
Typical Hadoop and HBase configurations

Region server:
  • HBase region server JVM heap size: -Xmx15GB
  • Number of HBase region server handlers: hbase.regionserver.handler.count = 50 (matching the number of active regions)
  • Region size: hbase.hregion.max.filesize = 53687091200 (50 GB, to avoid automatic split)
  • Turn off auto major compaction: hbase.hregion.majorcompaction = 0

MapReduce:
  • Number of data node threads: dfs.datanode.handler.count = 100
  • Number of name node threads: dfs.namenode.handler.count = 1024
  • Name node heap size: -Xmx30GB
  • Turn off map speculative execution: mapred.map.tasks.speculative.execution = false
  • Turn off reduce speculative execution: mapred.reduce.tasks.speculative.execution = false

Client settings:
  • HBase RPC timeout: hbase.rpc.timeout = 600000 (10-minute client-side timeout)
  • HBase client pause: hbase.client.pause = 3000

HDFS:
  • Block size: dfs.block.size = 134217728 (128 MB)
  • Data node xceiver count: dfs.datanode.max.xcievers = 131072
  • Number of mappers per node: mapred.tasktracker.map.tasks.maximum = 8
  • Number of reducers per node: mapred.tasktracker.reduce.tasks.maximum = 6
  • Swap turned off
