"HBase: The Definitive Guide" Reading Note 8: Chapter 8, Architecture


8.1 Data Lookup and Storage
B+ tree: an improvement over the B tree; its leaf nodes are also laid out sequentially on disk
LSM (log-structured merge) tree: the structure HBase uses instead
Storage System Overview

HBase mainly handles two types of files: the WAL (write-ahead log) and the actual data files (HFiles). Basic lookup process:
    1. The client starts the row key lookup at ZooKeeper.
    2. From ZooKeeper it gets the name of the region server hosting the -ROOT- region.
    3. It queries the region server holding -ROOT- for the name of the .META. region server whose region contains the requested row key. Both of these results are cached, so each is queried only once.
    4. Finally, it queries that .META. server to get the name of the region server hosting the region that holds the client's row key.
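The three-step cascade above can be sketched as nested lookups with a client-side cache. This is a conceptual model only; the dictionaries, server names, and the `locate` helper are hypothetical stand-ins for ZooKeeper and the real RPC machinery.

```python
# Conceptual sketch of the (pre-0.96) HBase region lookup cascade:
# ZooKeeper -> -ROOT- -> .META. -> region server holding the row key.
# All names and data structures here are hypothetical illustrations.

zookeeper = {"/hbase/root-region-server": "rs1"}   # server hosting -ROOT-
root_table = {"rs1": "rs2"}                        # -ROOT- points at .META.'s server
meta_table = {"rs2": {"row-0000": "rs3",           # .META.: region start key -> server
                      "row-5000": "rs4"}}

cache = {}  # the client caches lookup results, so each level is queried only once

def locate(row_key):
    """Return the region server name that holds row_key."""
    if row_key in cache:
        return cache[row_key]
    root_server = zookeeper["/hbase/root-region-server"]  # step 1: ask ZooKeeper
    meta_server = root_table[root_server]                 # step 2: query -ROOT-
    regions = meta_table[meta_server]                     # step 3: query .META.
    # pick the region whose start key is the greatest one <= row_key
    start = max(k for k in regions if k <= row_key)
    cache[row_key] = regions[start]
    return cache[row_key]
```

Subsequent calls for cached row keys skip all three network round trips, which is why the cost of this cascade is paid only once per region.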
HRegionServer
    • The HRegionServer is responsible for opening regions and creating the corresponding HRegion instances. When an HRegion is opened, it creates a Store instance for each HColumnFamily of the table.
    • Each Store instance contains one or more StoreFile instances, which are lightweight wrappers around the actual data files (HFiles).
    • Each Store also has a corresponding MemStore.
    • An HRegionServer shares a single HLog instance across all of its regions.
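The ownership relationships in the bullets above can be sketched as a minimal object model. The class bodies are hypothetical placeholders, not the real HBase API; the point is only the containment structure.

```python
# Conceptual sketch of the HRegionServer object hierarchy described above:
# one shared HLog per server, one Store per column family, StoreFiles
# wrapping HFiles, and one MemStore per Store. Names are illustrative.

class HLog:        # write-ahead log, shared by all regions on the server
    pass

class MemStore:    # in-memory write buffer, one per Store
    pass

class StoreFile:   # lightweight wrapper around an on-disk HFile
    def __init__(self, path):
        self.path = path

class Store:       # one instance per column family of the table
    def __init__(self):
        self.memstore = MemStore()
        self.storefiles = []   # zero or more StoreFile instances

class HRegion:
    def __init__(self, column_families):
        self.stores = {cf: Store() for cf in column_families}

class HRegionServer:
    def __init__(self):
        self.hlog = HLog()     # single shared WAL instance
        self.regions = []

    def open_region(self, column_families):
        region = HRegion(column_families)
        self.regions.append(region)
        return region
```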
Write path — a write request goes to:
    1. The WAL (on HDFS)
    2. The MemStore
    3. An HFile (when the MemStore is flushed)
A helpful diagram circulating online (not from the book itself) is worth adding here: it makes it obvious that the WAL lives on HDFS.
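The write path above can be sketched with in-memory stand-ins for the WAL, MemStore, and HFiles. The flush threshold and data structures are hypothetical; the real MemStore flushes on a byte-size limit, not an edit count.

```python
# Conceptual sketch of the HBase write path: append to the WAL first,
# then update the MemStore; a flush turns the MemStore into an HFile.
# All structures here are hypothetical in-memory stand-ins.

wal = []          # stand-in for the WAL file on HDFS
memstore = {}     # in-memory buffer, keyed by row key
hfiles = []       # flushed, immutable, sorted files

FLUSH_THRESHOLD = 3   # flush after this many buffered edits (illustrative)

def put(row_key, value):
    wal.append((row_key, value))          # 1. durably log the edit
    memstore[row_key] = value             # 2. apply it to the MemStore
    if len(memstore) >= FLUSH_THRESHOLD:  # 3. flush to a new HFile
        hfiles.append(dict(sorted(memstore.items())))
        memstore.clear()
```

The ordering matters: because the WAL is written before the MemStore, an edit that was acknowledged can always be replayed after a crash, even if it never reached an HFile.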

Files
    • The WAL is stored in the .logs folder under the /hbase directory in HDFS.
    • A freshly created WAL file shows a size of 0. This is because the file is written to HDFS with append, and its contents only become visible to readers once a full block has been written.
    • A WAL file is rolled after hbase.regionserver.logroll.period (default: 60 minutes), after which the next new log file again starts at size 0.
    • After rolling, old logs are moved to .oldlogs and deleted once hbase.master.logcleaner.ttl (default: 10 minutes) has elapsed. The check interval is set by the hbase.master.cleaner.interval property.
Table-level files
    • In HBase, each table has its own directory under the HBase root directory. Each table directory includes a top-level file named .tableinfo.
    • This file holds the table's serialized HTableDescriptor.
Region-level files
    • . regioninfo corresponding Hregioninfo instance
    • HBCK is to use. RegionInfo to examine and generate missing entries in the metadata table
    • Region if the configured maximum value of hbase.hregion.max.filesize is exceeded, it is split and a splits directory is created
Compactions
    • When the number of store files reaches a threshold, a compaction is triggered. The process continues until the largest of the resulting files exceeds the maximum store size, at which point a region split is triggered instead.
    • Compactions come in two kinds: minor and major.
    • A minor compaction rewrites the most recently generated files into one larger file; how many files are merged is defined by hbase.hstore.compaction.min.
    • The maximum number of files a minor compaction will process defaults to 10 and can be set by the user via hbase.hstore.compaction.max.
    • A major compaction compacts all of a store's files into a single file.
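At its core, a compaction is a merge of several sorted files into one, with newer versions of a key winning. A minimal sketch, using plain sorted lists of (key, value) pairs as stand-ins for StoreFiles (the function name and structures are illustrative, not the HBase API):

```python
# Conceptual sketch of a compaction: merge several sorted StoreFiles
# (here plain sorted lists of (key, value) pairs) into one larger
# sorted file, letting newer files win on duplicate keys.

def compact(storefiles):
    """Merge sorted (key, value) lists, oldest first; newer files
    overwrite earlier values for the same key."""
    merged = {}
    for sf in storefiles:          # iterate oldest-to-newest
        for key, value in sf:
            merged[key] = value    # later (newer) file wins
    return sorted(merged.items())

old = [("a", 1), ("c", 3)]
new = [("b", 2), ("c", 30)]        # newer version of key "c"
compacted = compact([old, new])
```

A minor compaction would apply this to only the newest few files; a major compaction applies it to all of them (and, in real HBase, also drops deleted and expired cells).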
Compaction triggers:
    • The CompactionChecker class implements a check on a fixed cycle, controlled by the hbase.server.thread.wakefrequency parameter (multiplied by hbase.server.thread.wakefrequency.multiplier, set to 1000, so that it runs less often than the other thread-based tasks).
    • Unless majorCompact() was invoked explicitly, the server first checks whether the interval specified by hbase.hregion.majorcompaction (default: 24 hours) has elapsed since the last major compaction.
    • If the major compaction interval has not been reached, the system runs a minor compaction instead.
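The decision in the last two bullets reduces to a simple elapsed-time check. A sketch, with the 24-hour default from the text and a hypothetical `forced_major` flag standing in for an explicit majorCompact() call:

```python
# Conceptual sketch of the compaction-type decision described above:
# run a major compaction if it was forced or if the configured interval
# has elapsed since the last one; otherwise fall back to a minor compaction.

MAJOR_COMPACTION_INTERVAL = 24 * 60 * 60  # hbase.hregion.majorcompaction, in seconds

def choose_compaction(now, last_major_ts, forced_major=False):
    if forced_major:                       # explicit majorCompact() request
        return "major"
    if now - last_major_ts >= MAJOR_COMPACTION_INTERVAL:
        return "major"                     # interval elapsed
    return "minor"                         # default case
```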
WAL: the HLog class
    • The class that implements the WAL is called HLog.
    • The WAL is optional: disabling it can gain extra performance when a MapReduce job imports data offline in large batches, but be aware that data can be lost during such an import (disabling it is strongly discouraged).
    • The HLog instance is shared by all regions on the server.
The HLogKey class
    • The WAL uses Hadoop's SequenceFile format, which stores records as key/value pairs. The HLogKey class serves as the key and records where the data belongs: the region and table name, the write time, and the cluster ID.
The LogSyncer class: pipelined writes vs. multiway writes
    • With a pipelined sync(), the modification is sent to the first DataNode, which, after processing it, forwards it to the next, and so on until all three DataNodes have confirmed the write; only then may the client continue.
    • With multiway writes, the modification is sent to all three hosts at once; when every host has confirmed the write, the client may continue.
    • The difference: pipelined writes have higher latency but make better use of bandwidth; multiway writes have comparatively low latency because the client only needs to wait for the slowest DataNode's acknowledgment.
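The latency difference between the two sync schemes can be illustrated numerically. The per-node delays below are made-up numbers, not measurements from the book:

```python
# Conceptual comparison of pipelined vs. multiway WAL sync: a pipelined
# write traverses the DataNodes in sequence, while a multiway write sends
# to all DataNodes in parallel and waits only for the slowest ack.
# The per-node delays are illustrative, made-up values (milliseconds).

node_delays = [5, 7, 6]   # time to process + acknowledge on each of 3 DataNodes

pipeline_latency = sum(node_delays)   # each hop waits for the previous one
multiway_latency = max(node_delays)   # all hops proceed in parallel

# The pipeline takes longer per sync() but uses each link's bandwidth
# better, since every node only talks to its neighbor in the chain.
```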
Deferred log flush
    • Deferred log flush defaults to false. If set to true, modifications are first buffered on the region server, and a LogSyncer thread on the server writes them out once per second (set by hbase.regionserver.optionallogflushinterval).
The LogRoller class
A log message like the following may appear:
2011-06-15 01:45:33,323 INFO org.apache.hadoop.hbase.regionserver.HLog: Too many hlogs: logs=130, maxlogs=96; forcing flush of 8 region(s): ...
This happens because the number of log files that must be retained exceeds the configured maximum while some of their data still awaits flushing. The server enters a special mode that forces the in-memory data to be flushed, reducing the amount of log that has to be kept. Other parameters controlling log rolling are:
    • hbase.regionserver.hlog.blocksize (set to the file system's default block size, or fs.local.block.size, which defaults to 32 MB)
    • hbase.regionserver.logroll.multiplier (set to 0.95), meaning the log rolls once it reaches 95% of the block size
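The roll threshold follows directly from these two parameters. A one-line computation, using the 32 MB fs.local.block.size default cited above as an illustrative value:

```python
# Roll threshold for a WAL file, per the two parameters above:
# hbase.regionserver.hlog.blocksize * hbase.regionserver.logroll.multiplier.

blocksize = 32 * 1024 * 1024   # 32 MB, the fs.local.block.size default
multiplier = 0.95              # roll at 95% of the block size

roll_threshold = int(blocksize * multiplier)   # bytes at which the log rolls
```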
Log splitting
The reason for writing all of a server's edits into one shared log is to reduce disk seeks and improve performance, but it complicates recovery, which requires the log to be split. Logs need to be replayed in two situations:
    • when the cluster starts
    • when a server fails
Data recovery
    • When a region is opened, it checks whether the recovered.edits directory exists; if it does, it reads the files there and recovers the data.
    • Edits whose sequence ID is less than the sequence ID already persisted on disk are ignored.
Read path
Internally, a get is in fact implemented as a scan over the region.
Region life cycle — all possible states of a region:
Status          Description
Offline         the region is offline
Pending Open    a request to open the region has been sent to the server
Opening         the server has started opening the region
Open            the region is open and ready to use
Pending Close   a request to close the region has been sent to the server
Closing         the server is closing the region
Closed          the region is closed
Splitting       the server has started splitting the region
Split           the region has been split
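The table above can be captured as a small lookup table, together with a simplified transition check. The transition graph below is an illustrative reading of the state names, not an exhaustive reproduction of HBase's real state machine:

```python
# Conceptual sketch of the region life-cycle states listed above, plus a
# simplified (hypothetical, non-exhaustive) set of legal transitions.

STATES = {
    "offline":       "the region is offline",
    "pending open":  "a request to open the region was sent to the server",
    "opening":       "the server has started opening the region",
    "open":          "the region is open and ready to use",
    "pending close": "a request to close the region was sent to the server",
    "closing":       "the server is closing the region",
    "closed":        "the region is closed",
    "splitting":     "the server has started splitting the region",
    "split":         "the region has been split",
}

# simplified transition graph (illustrative only)
TRANSITIONS = {
    "offline":       {"pending open"},
    "pending open":  {"opening"},
    "opening":       {"open"},
    "open":          {"pending close", "splitting"},
    "pending close": {"closing"},
    "closing":       {"closed"},
    "closed":        {"offline"},
    "splitting":     {"split"},
    "split":         set(),
}

def can_transition(src, dst):
    """True if dst is a legal next state from src in this simplified model."""
    return dst in TRANSITIONS.get(src, set())
```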

Information stored in ZooKeeper
    • /hbase/hbaseid: can be viewed with get /hbase/hbaseid (like the other znodes); contains the cluster ID.
    • /hbase/master: contains the master's server name.
    • /hbase/replication: contains replication information.
    • /hbase/root-region-server: contains the machine name of the region server hosting the -ROOT- region; it is frequently used during region lookup.
    • /hbase/rs: the root znode for all region servers; the cluster uses it to track server failures. Each child znode is ephemeral, and its name is the region server's name.
    • /hbase/shutdown: tracks cluster state, containing the cluster's start time, or empty when the cluster is shut down.
    • /hbase/splitlog: the parent znode for coordinating log splitting.
    • /hbase/table: when a table is disabled, that information is recorded under this znode; the table name becomes a new child znode whose content is DISABLED.

Copyright notice: this article is the blogger's original work; do not reproduce it without the blogger's permission.

