Hadoop reference design implementation and performance: HBase application performance test method


Installing the YCSB test tool

YCSB introduction: YCSB (Yahoo! Cloud Serving Benchmark) is a general-purpose performance testing tool open-sourced by Yahoo. It can be used to test a variety of NoSQL products. For details, see https://github.com/brianfrankcooper/YCSB/wiki.

[Figure: how YCSB works]

The main modules of YCSB are the workload and the DB interface:

Workload: defined through a configuration file, which sets the read/write ratio, the data size, and so on.

DB Interface: connects to and operates on each kind of cloud serving store through a common interface, that is, the various NoSQL products, including HBase.

When running YCSB, you can configure different workloads and DB interfaces, and you can set additional parameters such as the number of threads.
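To make the idea of a workload configuration file concrete, here is a minimal Python sketch that renders one. The property names (workload, recordcount, operationcount, fieldcount, fieldlength and the *proportion keys) are YCSB core workload properties; the helper render_workload and its sanity checks are hypothetical and are not part of YCSB.

```python
# Sketch: build the text of a YCSB workload properties file.
# render_workload is a hypothetical helper, not a YCSB API.
CORE_WORKLOAD = "com.yahoo.ycsb.workloads.CoreWorkload"
OPERATIONS = ("read", "update", "scan", "insert", "readmodifywrite")


def render_workload(recordcount, operationcount, proportions,
                    fieldcount=10, fieldlength=100):
    """Return workload file text; `proportions` maps operation -> fraction."""
    unknown = set(proportions) - set(OPERATIONS)
    if unknown:
        raise ValueError(f"unknown operations: {sorted(unknown)}")
    # This check is the sketch's own convention for keeping the mix readable.
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("operation proportions should add up to 1.0")
    lines = [
        f"workload={CORE_WORKLOAD}",
        f"recordcount={recordcount}",        # number of records to load
        f"operationcount={operationcount}",  # number of operations to run
        f"fieldcount={fieldcount}",          # fields per record
        f"fieldlength={fieldlength}",        # bytes per field
    ]
    lines += [f"{op}proportion={proportions.get(op, 0)}" for op in OPERATIONS]
    return "\n".join(lines) + "\n"


# A read-only mix over 10,000 records.
print(render_workload(10000, 10000, {"read": 1.0}))
```

The returned text can be saved under any file name and passed to YCSB in place of one of the bundled workload files.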

Installation method one: download the precompiled package

Download Address: https://github.com/downloads/brianfrankcooper/YCSB/ycsb-0.1.4.tar.gz

Extract: tar xfvz ycsb-0.1.4.tar.gz

This approach is simple and easy to use. However, the precompiled package may not work with every HBase version; in that case YCSB needs to be compiled manually.

Installation method two: build from source

Download the source from GitHub: git clone https://github.com/brianfrankcooper/YCSB.git

Configure the appropriate HBase version: edit YCSB/pom.xml and update the hbase.version property.

Compile: cd YCSB, then mvn install

The build produces the YCSB package: distribution/target/ycsb-0.1.4.tar.gz

Extract: tar xfvz ycsb-0.1.4.tar.gz

1. Test steps

a) Configure the HBase connection and classpath

The easiest way to do this is to copy the HBase server's configuration file {$HBASE_HOME}/conf/hbase-site.xml directly to the YCSB directory {$YCSB_HOME}/hbase-binding/conf.

Copy the HBase jar files to {$YCSB_HOME}/hbase-binding/lib so that the jars you need are on the classpath when you execute the ycsb command.
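The two copy steps above can be scripted. The following Python sketch is only an illustration: the function name stage_hbase_binding is hypothetical, and the assumption that the HBase jars sit in the HBase home directory and its lib subdirectory is not stated in this article, so adjust jar_dirs to your installation.

```python
import shutil
from pathlib import Path


def stage_hbase_binding(hbase_home, ycsb_home, jar_dirs=(".", "lib")):
    """Copy hbase-site.xml and HBase jars into YCSB's hbase-binding directory.

    jar_dirs lists subdirectories of hbase_home to search for jars; the
    default is an assumption about the HBase layout, not a documented rule.
    Returns the names of the jars that were copied.
    """
    hbase_home, ycsb_home = Path(hbase_home), Path(ycsb_home)
    conf_dir = ycsb_home / "hbase-binding" / "conf"
    lib_dir = ycsb_home / "hbase-binding" / "lib"
    conf_dir.mkdir(parents=True, exist_ok=True)
    lib_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: reuse the server's connection settings.
    shutil.copy2(hbase_home / "conf" / "hbase-site.xml",
                 conf_dir / "hbase-site.xml")

    # Step 2: put the HBase jars where the ycsb command will find them.
    copied = []
    for sub in jar_dirs:
        src = hbase_home / sub
        if not src.is_dir():
            continue
        for jar in sorted(src.glob("*.jar")):
            shutil.copy2(jar, lib_dir / jar.name)
            copied.append(jar.name)
    return copied
```

A call such as stage_hbase_binding("/opt/hbase", "/opt/ycsb-0.1.4") would stage both pieces in one step; both paths here are placeholders.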

b) Introduction to the ycsb command

Running the ycsb command without arguments prints its usage. There are three main kinds of parameters:

Commands: which command to execute: load (load data), run (run the test), or shell (interactive mode);

Databases: which DB interface to use;

Options: property parameters and the thread count.

https://github.com/brianfrankcooper/YCSB/wiki/Core-Properties

See the link above for the core workload properties.
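As an illustration of how the three kinds of parameters fit together, the Python sketch below assembles a ycsb command line. The flags -P (workload file), -p (single property), -s (print status) and -threads are real ycsb options; the helper ycsb_command and its argument names are hypothetical.

```python
import shlex


def ycsb_command(command, db, workload_file, properties=None,
                 threads=1, status=True, ycsb_bin="bin/ycsb"):
    """Build the argument list for one ycsb invocation (hypothetical helper)."""
    if command not in ("load", "run"):
        raise ValueError("this sketch only covers the load and run commands")
    argv = [ycsb_bin, command, db, "-P", workload_file]
    for key, value in (properties or {}).items():
        argv += ["-p", f"{key}={value}"]   # one -p per property
    if status:
        argv.append("-s")                   # report progress while running
    argv += ["-threads", str(threads)]
    return argv


argv = ycsb_command("load", "hbase", "workloads/workloada",
                    {"columnfamily": "f1", "recordcount": 10000}, threads=10)
print(shlex.join(argv))
# prints: bin/ycsb load hbase -P workloads/workloada -p columnfamily=f1 -p recordcount=10000 -s -threads 10
```

Returning a list rather than a string keeps the arguments safe to hand to subprocess.run without shell quoting issues.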

c) Loading data

bin/ycsb load hbase -P workloads/workloada -p columnfamily=f1 -p recordcount=10000 -s -threads 10

This inserts 10,000 records into column family f1 of the usertable table on the HBase server and prints progress to the screen. HBase returns data as byte arrays (byte[]). Whatever the data source, whether a license plate string or a digital picture, the value read from HBase is a byte array; the only difference is the length of the array.

For a license plate such as "Beijing K12345", the length is 8 bytes; for a picture of about 5M, the length is 2326122 bytes. The length can be set through the core workload property fieldlength.
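The 8-byte figure can be reproduced under two assumptions that this article does not state explicitly: that the plate is written in its usual form 京K12345, and that it is encoded as GBK (two bytes for the Chinese character plus six single-byte characters). The Python sketch below measures the encoded length and turns it into a fieldlength setting; the helper names are hypothetical.

```python
def encoded_length(value, encoding="gbk"):
    """Number of bytes HBase would hold for `value` in the given encoding."""
    return len(value.encode(encoding))


def fieldlength_property(num_bytes):
    """Render the core workload property that sets the size of each field."""
    return f"fieldlength={num_bytes}"


plate = "京K12345"  # assumed original form of the "Beijing K12345" plate

print(encoded_length(plate, "gbk"))     # 8 (assumed encoding)
print(encoded_length(plate, "utf-8"))   # 9, so the encoding matters
print(fieldlength_property(encoded_length(plate, "gbk")))  # fieldlength=8
```

Note that fieldlength applies to each field, and YCSB writes fieldcount fields per record (10 by default), so a record consisting of a single 8-byte value would also need fieldcount=1.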

d) Performance testing

YCSB ships with six workload configuration files that simulate different load scenarios.

Among them, workloadc simulates a scenario of 100% read operations.

bin/ycsb run hbase -P workloads/workloadc -p columnfamily=f1 -s -threads 10

This runs the performance test based on workloadc.
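When a run finishes, YCSB prints summary lines of the form "[SECTION], Metric, value". The Python sketch below collects them into a dictionary for later comparison. The function name is hypothetical, the exact metric names vary between YCSB versions, and the sample text uses made-up numbers purely to show the format.

```python
import re

# Matches lines such as "[OVERALL], Throughput(ops/sec), 500.0".
_SUMMARY_LINE = re.compile(r"^\[([^\]]+)\],\s*([^,]+),\s*(\S+)$")


def parse_ycsb_summary(text):
    """Return {section: {metric: value}} for every numeric summary line."""
    results = {}
    for line in text.splitlines():
        match = _SUMMARY_LINE.match(line.strip())
        if not match:
            continue
        section, metric, raw = match.groups()
        try:
            value = float(raw)
        except ValueError:
            continue  # skip lines whose last column is not a number
        results.setdefault(section, {})[metric.strip()] = value
    return results


# Illustrative output only; these numbers are not measurements.
sample = """\
[OVERALL], RunTime(ms), 2000.0
[OVERALL], Throughput(ops/sec), 500.0
[READ], Operations, 1000
[READ], AverageLatency(us), 350.5
"""
summary = parse_ycsb_summary(sample)
print(summary["OVERALL"]["Throughput(ops/sec)"])  # 500.0
```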

2. Customizing and extending the tool

The introduction above covers YCSB's built-in functions. In some cases the test method needs to be extended or customized. Because YCSB is an open-source, pure-Java solution, it can fully accommodate such special requirements. The Java classes most relevant to extending YCSB are:

a) Workload definition: com.yahoo.ycsb.workloads.CoreWorkload

b) HBase DB interface definition: com.yahoo.ycsb.db.HBaseClient

c) Data generators: com.yahoo.ycsb.generator.*

d) YCSB main program: com.yahoo.ycsb.Client
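Before customizing a workload it helps to see the basic idea behind one: each operation is drawn at random according to the configured proportions. The Python sketch below illustrates that idea only. It is not the code of CoreWorkload or of the YCSB generators, and every name in it is hypothetical.

```python
import random


def choose_operation(proportions, rng):
    """Pick one operation name with probability proportional to its weight."""
    weighted = [(op, w) for op, w in proportions.items() if w > 0]
    if not weighted:
        raise ValueError("at least one operation needs a positive proportion")
    total = sum(w for _, w in weighted)
    threshold = rng.random() * total
    cumulative = 0.0
    for op, weight in weighted:
        cumulative += weight
        if threshold < cumulative:
            return op
    return weighted[-1][0]  # guard against floating-point rounding


rng = random.Random(42)  # fixed seed so repeated runs give the same mix
mix = {"read": 0.5, "update": 0.5}
counts = {"read": 0, "update": 0}
for _ in range(10000):
    counts[choose_operation(mix, rng)] += 1
print(counts)  # roughly half reads and half updates
```

A custom workload would replace the mix and the per-operation behavior while keeping this kind of selection loop.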

Stress test parameters and description

The test is divided into two parts: a small-record table and a large-record table. "Small" and "large" refer to the size of a single record: in the test, a single record is 8 bytes in the small table and 2 MB in the large table. Using the YCSB database testing tool, we stress-tested various database operations, including read, insert, update, scan, and read-modify-write. These tests reflect the performance of HBase on Hadoop and simulate actual operation.

Operations:

Read: reads a record. Speed depends on the I/O rate of the system: the faster the system reads, the higher the read rate.

Insert: inserts a record. Speed depends on the I/O rate of the system: the faster the system writes, the higher the insert rate.

Update: updates a record; essentially the same as the insert operation.

Scan: scans the entire table. The rate depends on the read rate and on the size of the table: the larger the table, the slower a single scan.

Test environment:

Hardware:

Software:

* Note: unless otherwise stated in this article, all other parameters use the defaults of the Intel distribution of Apache Hadoop 2.3.

YCSB test results for the Intel distribution of Apache Hadoop*

In the following tests, we simulate different load levels by configuring the YCSB client with different numbers of threads.
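A thread sweep of this kind can be automated. The Python sketch below runs the workloadc test once per thread count and records the overall throughput reported by YCSB. The function names are hypothetical, the command line follows the run command shown earlier in this article, and run_ycsb needs a working YCSB installation, so the runner is injectable and can be replaced by a stub when trying the sketch out.

```python
import re
import subprocess

_THROUGHPUT = re.compile(
    r"^\[OVERALL\],\s*Throughput\(ops/sec\),\s*(\S+)", re.MULTILINE)


def run_ycsb(argv):
    """Run one ycsb command and return its standard output."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout


def sweep_threads(thread_counts, runner=run_ycsb,
                  workload="workloads/workloadc", columnfamily="f1"):
    """Return {thread count: throughput in ops/sec, or None if not reported}."""
    results = {}
    for threads in thread_counts:
        argv = ["bin/ycsb", "run", "hbase", "-P", workload,
                "-p", f"columnfamily={columnfamily}",
                "-s", "-threads", str(threads)]
        match = _THROUGHPUT.search(runner(argv))
        results[threads] = float(match.group(1)) if match else None
    return results
```

Calling sweep_threads([1, 10, 50, 100]) from the YCSB directory would produce one throughput figure per thread count; the thread counts here are arbitrary examples, not the values used in the tests.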
