HBase basics and pseudo-distributed installation and configuration

Source: Internet
Author: User

I. HBase (NoSQL) data model

1.1 Table: the unit in which HBase stores and manages data.

1.2 Row key: similar to the primary key in MySQL. Every HBase table has a row key built in, so it does not need to be specified when the table is created.

1.3 Column family: a collection of columns.

A table contains many rows, and each row key identifies one record. A column family plays a role similar to a column in MySQL, except that it is a collection of columns.

The column families of an HBase table must be specified when the table is defined; the columns inside a family are added dynamically when records are inserted.

When the data of an HBase table is written to disk, each column family is stored separately, in its own files.
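The rule above (column families fixed at table definition, columns added freely at insert time) can be modeled in a few lines of Python. This is only an in-memory sketch for illustration: `SketchTable`, its methods, the `info` family and the `email` column are hypothetical names and are not part of any HBase client API.

```python
class SketchTable:
    """Toy model of an HBase table: fixed column families, dynamic columns."""

    def __init__(self, name, families):
        self.name = name
        # Column families are declared once, when the table is defined.
        self.families = frozenset(families)
        # row key -> family -> column name -> value (bytes)
        self.rows = {}

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Columns need no declaration; they come into being on first write.
        self.rows.setdefault(row_key, {}).setdefault(family, {})[column] = value

    def columns(self, row_key, family):
        """Sorted column names currently present in one family of one row."""
        return sorted(self.rows.get(row_key, {}).get(family, {}))


table = SketchTable("users", ["info"])
table.put("1", "info", "UserName", b"jchubby")
table.put("1", "info", "email", b"jchubby@example.com")  # new column, no schema change
print(table.columns("1", "info"))
```

Writing to a family that was never declared (for example `stats`) raises `KeyError` in this sketch, which mirrors the idea that a new family requires changing the table definition while a new column does not.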

[Figure: one row of a table in HBase]

In a relational database, each column in a row can hold only one value, for example:

UserId  UserName
1       jchubby

In HBase, by contrast, one column in a row can hold multiple values, for example:

UserId  UserName
1       jchubby
        Looky

The timestamp column is omitted here. Each value carries a timestamp that identifies its version. When no timestamp is specified, a read returns the most recent value by default, so reading this row yields one value per column, just as a relational read would.
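The versioning behavior described above can be illustrated with a small in-memory model. This is a sketch, not the HBase API: `VersionedCell` is a hypothetical class, the timestamps 100 and 200 are made up, and treating "Looky" as the newer value is an assumption.

```python
import time


class VersionedCell:
    """Toy model of one HBase cell: several values, each tagged with a timestamp."""

    def __init__(self):
        self.versions = {}  # timestamp -> value (bytes)

    def put(self, value, timestamp=None):
        # Without an explicit timestamp, use the current time in milliseconds.
        if timestamp is None:
            timestamp = int(time.time() * 1000)
        self.versions[timestamp] = value

    def get(self, timestamp=None):
        if not self.versions:
            return None
        if timestamp is None:
            # No timestamp given: return the most recent version.
            return self.versions[max(self.versions)]
        return self.versions.get(timestamp)


user_name = VersionedCell()
user_name.put(b"jchubby", timestamp=100)
user_name.put(b"Looky", timestamp=200)

print(user_name.get())               # most recent version
print(user_name.get(timestamp=100))  # an older version, selected by timestamp
```

Values are stored as `bytes` here to match the point in 1.4 that HBase stores data as byte arrays.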

1.4 The stored data is a byte array.



II. Physical model of HBase

2.1 HBase is a database for running simple queries over massive data sets (for example 20 PB) with response times on the order of seconds.

2.2 The records in an HBase table are split by row key into regions.

For example, a table with 10,000 rows might be divided into regions of 2,000 rows each, with the regions stored on different nodes. Each region records the start and end of its row key range, [startKey, endKey); the start key is inclusive and the end key is exclusive.

Regions are hosted on region servers (each usually a separate physical machine), and one region server hosts many regions.

In this way, an operation on the table translates into parallel queries against multiple region servers.
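The 10,000-row example can be sketched as a sorted list of region start keys searched with binary search. Everything concrete here is an assumption made for illustration: the zero-padded row keys (`row00000` to `row09999`), the server names `rs1` to `rs3`, and the function names. HBase compares row keys as bytes, so zero padding is what keeps numeric order and key order the same.

```python
import bisect

# (region start key, region server), sorted by start key.
# A region covers [its start key, the next region's start key).
REGIONS = [
    ("", "rs1"),
    ("row02000", "rs2"),
    ("row04000", "rs3"),
    ("row06000", "rs1"),
    ("row08000", "rs2"),
]
START_KEYS = [start for start, _ in REGIONS]


def locate(row_key):
    """Return (start_key, server) of the region whose range contains row_key."""
    index = bisect.bisect_right(START_KEYS, row_key) - 1
    return REGIONS[index]


def servers_for_scan(start_row, stop_row):
    """Region servers that a scan over [start_row, stop_row) has to contact.

    Assumes a non-empty stop_row that is greater than start_row.
    """
    first = bisect.bisect_right(START_KEYS, start_row) - 1
    last = bisect.bisect_left(START_KEYS, stop_row)
    return sorted({server for _, server in REGIONS[first:last]})


print(locate("row02000"))
print(servers_for_scan("row01500", "row04500"))
```

A scan that spans several regions ends up with several servers in its result, which is the sense in which one table operation becomes parallel queries against multiple region servers.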

There are two special tables in HBase, namely -ROOT- and .META.

The .META. table records the start and end key of each region. When .META. itself grows large, it is split into regions by the same rules, and those regions are recorded in the -ROOT- table.

When querying data, the client first finds the relevant .META. region from the information recorded in the -ROOT- table, then looks up the target region in .META., and finally queries the data in that region on the node that actually hosts it.
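The two-step lookup can be sketched with two levels of sorted start-key lists. This is a simplified model under stated assumptions: the row keys, the server names `rs1` to `rs3`, and the helper names are invented, and real catalog rows carry more information (table name, region id) than a bare start key.

```python
import bisect


def find(entries, key):
    """entries: (start_key, value) pairs sorted by start_key.

    Returns the value of the entry whose key range contains `key`.
    """
    starts = [start for start, _ in entries]
    return entries[bisect.bisect_right(starts, key) - 1][1]


# .META. is itself split into regions; each maps region start keys to a server.
META_REGION_A = [("", "rs1"), ("row02000", "rs2")]
META_REGION_B = [("row04000", "rs3"), ("row06000", "rs1"), ("row08000", "rs2")]

# -ROOT- maps the first key covered by each .META. region to that region.
ROOT = [("", META_REGION_A), ("row04000", META_REGION_B)]


def locate(row_key):
    meta_region = find(ROOT, row_key)   # step 1: -ROOT- -> .META. region
    return find(meta_region, row_key)   # step 2: .META. -> region server


print(locate("row05123"))
```

The client would then send the actual read to the server returned by the second step.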



III. System architecture of HBase

3.1 HBase has a master-slave architecture, consisting of an HMaster and HRegionServers.


IV. HBase pseudo-distributed installation

HBase is installed on top of a Hadoop cluster and a ZooKeeper cluster.

Make sure that Hadoop and ZooKeeper are installed successfully and running before you install HBase.

4.1 Unpack, rename, and set environment variables

Copy hbase-0.94.2-security.tar.gz to /home/hadoop.

Unpack hbase-0.94.2-security.tar.gz and rename the extracted directory:

# cd /home/hadoop

# tar -zxvf hbase-0.94.2-security.tar.gz

# mv hbase-0.94.2-security hbase

Edit the /etc/profile file:

# vi /etc/profile

Add:

export HBASE_HOME=/home/hadoop/hbase

Change the PATH line to:

export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HBASE_HOME/bin

Save and exit, then reload the profile:

# source /etc/profile
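After reloading the profile, it can be worth confirming that the variables took effect. The following is a minimal sketch of such a check, not part of the original procedure; `check_hbase_env` is a hypothetical helper, and it only inspects the environment mapping it is given.

```python
import os


def check_hbase_env(env):
    """Return a list of problems found in the given environment mapping."""
    problems = []
    home = env.get("HBASE_HOME")
    if not home:
        problems.append("HBASE_HOME is not set")
        return problems
    # PATH is a list of directories joined by os.pathsep (":" on Linux).
    path_entries = env.get("PATH", "").split(os.pathsep)
    if os.path.join(home, "bin") not in path_entries:
        problems.append("$HBASE_HOME/bin is not on PATH")
    return problems


print(check_hbase_env(os.environ) or "environment looks fine")
```

Run it from the same shell session in which /etc/profile was sourced, since a new variable is only visible to processes started after the reload.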

4.2 Edit $HBASE_HOME/conf/hbase-env.sh and set the following:

export JAVA_HOME=/usr/java/jdk1.6.0_45

export HBASE_MANAGES_ZK=true

The first line sets the Java environment variable.

The second line tells HBase to start and use its own ZooKeeper instance on this machine.

4.2 Edit $HBASE_HOME/conf/hbase-site.xml and add the following properties:

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://master:9000/hbase</value>
</property>

<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>master</value>
</property>

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

hbase.rootdir: the path on the HDFS file system where HBase stores its data

hbase.cluster.distributed: whether HBase runs in distributed mode

hbase.zookeeper.quorum: the node (or nodes) on which ZooKeeper runs

dfs.replication: the number of replicas

Note: The host and port in hbase.rootdir must match the host and port of fs.default.name in the Hadoop configuration file core-site.xml.
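The consistency rule in the note can be checked mechanically. The sketch below parses both files with Python's standard library and compares host and port. The function names are hypothetical, and the `core-site.xml` content shown is an assumed example, not taken from the original setup.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_properties(xml_text):
    """Parse Hadoop-style <property><name>/<value> pairs into a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def rootdir_matches_namenode(hbase_site_xml, core_site_xml):
    """True if hbase.rootdir and fs.default.name use the same host and port."""
    rootdir = urlparse(read_properties(hbase_site_xml)["hbase.rootdir"])
    namenode = urlparse(read_properties(core_site_xml)["fs.default.name"])
    return (rootdir.hostname, rootdir.port) == (namenode.hostname, namenode.port)


HBASE_SITE = """<configuration>
  <property><name>hbase.rootdir</name><value>hdfs://master:9000/hbase</value></property>
</configuration>"""

# Assumed core-site.xml content; substitute your own Hadoop configuration.
CORE_SITE = """<configuration>
  <property><name>fs.default.name</name><value>hdfs://master:9000</value></property>
</configuration>"""

print(rootdir_matches_namenode(HBASE_SITE, CORE_SITE))
```

To check a real installation, read the two files from disk and pass their contents to `rootdir_matches_namenode` instead of the inline strings.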

4.3 (Optional) Set the content of the regionservers file to master. This file lists the hostname of every region server node. Because this is a pseudo-distributed installation, there is only one entry; either localhost or the host name will do.

4.4 Start HBase by running the start-hbase.sh script in the bin directory.

Before starting HBase, make sure that Hadoop is healthy and that files can be written to HDFS.

4.5 Verify that the installation succeeded:

(1) Run jps. Three new Java processes should appear: HMaster, HRegionServer, and HQuorumPeer.
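The jps check can also be scripted. This sketch only parses text in the `<pid> <name>` form that jps prints; `missing_processes` is a hypothetical helper, and the sample output, including its PIDs and the Hadoop process names, is invented for illustration.

```python
EXPECTED = {"HMaster", "HRegionServer", "HQuorumPeer"}


def missing_processes(jps_output):
    """Return the expected HBase process names that are absent from jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line is "<pid> <main class name>"; skip lines without a name.
        if len(parts) >= 2:
            running.add(parts[1])
    return EXPECTED - running


# Illustrative output only; the PIDs are made up.
SAMPLE = """\
4213 NameNode
4390 DataNode
5120 HQuorumPeer
5188 HMaster
5301 HRegionServer
5402 Jps
"""

print(missing_processes(SAMPLE) or "all three HBase processes are running")
```

On a real machine the text could be captured with `subprocess.run(["jps"], capture_output=True, text=True).stdout` and passed to the same function.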

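(2) Open http://master:16010 in a browser to reach a web management page similar to Hadoop's. Note that 16010 is the default master web UI port from HBase 1.0 onward; the 0.94 release installed here defaults to 60010, so try http://master:60010 if the first address does not respond.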
