An HBase optimization case study: problems and solutions in the Facebook Messages system

HDFS was designed to store large files (such as log files) and to serve batch processing with sequential I/O. HBase, however, was built on top of HDFS precisely to handle random reads and writes over massive data sets. How do you fit together two components whose original design goals are diametrically opposed? The layered structure keeps the architecture clean, with the HBase layer separated from the HDFS layer, but it brings potential performance degradation. The two complaints heard most often from teams running HBase in production are Java GC issues and random read/write performance. The Facebook Messages system (the FM system) was arguably the first case of HBase used for online storage ("Apache Hadoop Goes Realtime at Facebook", SIGMOD 2011). Recently the team published a paper at FAST 2014, a top-tier storage conference, "Analysis of HDFS Under HBase: A Facebook Messages Case Study", analyzing problems they ran into while using HBase and their solutions. Readers using HBase for online storage may find it a useful reference.

The paper opens with Facebook's methodology (tracing, analysis, and simulation) and with the architecture and file/data composition of the FM system, then moves on to analyzing several performance problems of the FM system and proposing solutions.

Main read/write I/O load of the FM system

Figure 2 breaks down the I/O composition at each layer. Reads dominate the FM system's external requests, but logging, compaction, replication, and caching severely amplify the writes.

HBase's design is layered: the DB logic layer, the FS logic layer, and the underlying system logic layer. The interface the DB logic layer exposes to applications consists mainly of put() and get() requests, both of which ultimately read and write HDFS, and at this level the read/write ratio exceeds 99/1 (Figure 2).
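For concreteness, here is a minimal sketch of those two entry points using the standard HBase Java client API; the table name, column family, and row key below are hypothetical, not taken from the paper:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class FmClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("messages"))) { // hypothetical table

            // put(): appended to the HLog (WAL) and buffered in the MemStore,
            // which is later flushed to an HFile on HDFS
            Put put = new Put(Bytes.toBytes("user123#thread456"));         // hypothetical row key
            put.addColumn(Bytes.toBytes("m"), Bytes.toBytes("body"),
                          Bytes.toBytes("hello"));
            table.put(put);

            // get(): served from the MemStore/block cache, or by reading HFiles on HDFS
            Get get = new Get(Bytes.toBytes("user123#thread456"));
            Result result = table.get(get);
            byte[] body = result.getValue(Bytes.toBytes("m"), Bytes.toBytes("body"));
            System.out.println(Bytes.toString(body));
        }
    }
}
```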

The DB logic layer adds logging to guarantee data durability and compaction to keep reads efficient; both operations are write-heavy, so once these two overheads are added the read/write ratio drops to 79/21 (the second bar in Figure 2).

Each put() effectively writes the data to HBase twice: once into the MemStore, which is later flushed to an HFile on HDFS, and once appended directly to the HLog on HDFS. Data accumulated in the MemStore is written out in bulk, so HFiles compress well, while the HLog's real-time appends compress poorly (HBASE-8155), resulting in more than 4x write amplification.
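A back-of-the-envelope sketch of where that factor can come from, under assumed compression ratios (the 4:1 HFile compression and the uncompressed WAL appends are illustrative assumptions, not numbers from the paper):

```java
// Illustrative arithmetic only. Each put() is persisted twice: appended to the
// HLog (WAL) in real time, and later flushed from the MemStore into an HFile.
// Bulk-written HFiles compress well; real-time WAL appends barely compress
// (HBASE-8155), so the log stream dominates the bytes actually written.
public class WriteAmplificationSketch {
    public static void main(String[] args) {
        double logicalBytes = 1.0;               // data handed to put()
        double hfileBytes   = logicalBytes / 4;  // assume 4:1 compression at flush time
        double hlogBytes    = logicalBytes / 1;  // assume ~no compression on WAL appends
        System.out.printf("HLog bytes vs. HFile bytes: %.0fx%n", hlogBytes / hfileBytes);
        // -> 4x, consistent with the "more than 4x" amplification cited above
    }
}
```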

A compaction reads small HFiles into memory, merge-sorts them into a large HFile, and writes it back out, which speeds up HBase reads. Compaction amplifies writes by more than 17x, meaning each piece of data is read and rewritten about 17 times on average, so large content such as attachments is a poor fit for HBase. Because reads make up the bulk of the FM workload, anything that speeds up reads helps the business, and the compaction strategy is therefore deliberately aggressive.
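At its core a compaction is a k-way merge of sorted runs. A minimal sketch of that idea, with sorted in-memory lists standing in for HFile scanners:

```java
import java.util.*;

public class CompactionMergeSketch {
    // Merge several sorted "small HFiles" (here: sorted lists of keys)
    // into one sorted "large HFile" output -- the core of a compaction.
    static List<String> merge(List<List<String>> sortedRuns) {
        // heap entry: {run index, position within that run}, ordered by current key
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            Comparator.comparing((int[] e) -> sortedRuns.get(e[0]).get(e[1])));
        for (int i = 0; i < sortedRuns.size(); i++)
            if (!sortedRuns.get(i).isEmpty()) heap.add(new int[]{i, 0});

        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            List<String> run = sortedRuns.get(e[0]);
            out.add(run.get(e[1]));                              // emit smallest key
            if (e[1] + 1 < run.size())
                heap.add(new int[]{e[0], e[1] + 1});             // advance that run
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(
            List.of("a", "d", "g"), List.of("b", "e"), List.of("c", "f"))));
        // -> [a, b, c, d, e, f, g]
    }
}
```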

Data reliability in HBase is delegated to the HDFS layer, i.e., HDFS's three-replica strategy, so each HDFS write above is converted into three local file writes plus two network transfers. With writes tripled (79 reads versus 3 × 21 = 63 writes), the read/write ratio measured at the local disks becomes 55/45.

However, requests to the local disks first pass through the OS page cache, so the reads that actually reach disk are only those caused by cache misses. This pushes the read/write ratio down to 36/64, amplifying the relative weight of writes even further.
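Taking the paper's ratios at face value, the whole cascade reduces to simple arithmetic; in this sketch the OS cache hit rate (~55%) is back-computed from the 36/64 figure rather than reported by the paper:

```java
public class IoRatioCascade {
    static void print(String stage, double reads, double writes) {
        double total = reads + writes;
        System.out.printf("%-24s R/W = %4.1f/%4.1f%n",
                          stage, 100 * reads / total, 100 * writes / total);
    }

    public static void main(String[] args) {
        double reads = 79, writes = 21;  // after logging + compaction (Figure 2)
        print("HDFS layer", reads, writes);

        writes *= 3;                     // 3-way replication triples the disk writes
        print("local disk, pre-cache", reads, writes);   // ~55/45

        reads *= 0.45;                   // assume ~55% of reads absorbed by the OS cache
        print("local disk, post-cache", reads, writes);  // ~36/64
    }
}
```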

In addition, Figure 3 looks at the same layers from the angle of I/O request sizes, showing how each operation changes the size distribution of the real business workload. The authors also found that roughly two thirds of the data ultimately sitting on disk is cold, which argues for storing hot and cold data separately.
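The hot/cold separation the authors call for can be driven by something as simple as a last-access-time policy. A toy sketch (the tier names, the 30-day threshold, and the BlockStat record are all hypothetical):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.*;

public class TieringSketch {
    record BlockStat(String path, Instant lastAccess) {}  // hypothetical metadata record

    // Route blocks untouched for longer than `coldAfter` to a cheaper tier.
    static Map<String, List<BlockStat>> tier(List<BlockStat> blocks, Duration coldAfter) {
        Map<String, List<BlockStat>> tiers = new HashMap<>();
        Instant cutoff = Instant.now().minus(coldAfter);
        for (BlockStat b : blocks) {
            String tier = b.lastAccess().isBefore(cutoff) ? "cold-hdd" : "hot-ssd";
            tiers.computeIfAbsent(tier, k -> new ArrayList<>()).add(b);
        }
        return tiers;
    }

    public static void main(String[] args) {
        List<BlockStat> blocks = List.of(
            new BlockStat("/hbase/data/blk1", Instant.now().minus(Duration.ofDays(90))),
            new BlockStat("/hbase/data/blk2", Instant.now()));
        System.out.println(tier(blocks, Duration.ofDays(30)));
        // blk1 lands in cold-hdd, blk2 in hot-ssd
    }
}
```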

In short, the logging, compaction, replication, and caching in the HBase/HDFS stack amplify write I/O, turning a workload that is read-dominated at the business logic layer into one that is write-dominated at the actual disks.

Main file types and sizes of the FM system
