We used to use MongoDB, but as the data volume grew it proved unreliable, so we switched to HBase to support an order of magnitude more data.
HBase is a database in the Apache Hadoop ecosystem that provides random, real-time read/write access to large datasets. Its goal is to store and process big data. HBase is an open-source, distributed, multi-version, column-oriented store, and it holds sparse (loosely structured) data.
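To make the "column-oriented, sparse" description concrete, here is a minimal in-memory sketch of the data model. This is illustrative only, not real HBase code; the table layout and column names are made-up sample data:

```python
from collections import defaultdict

# A toy sketch of HBase's sparse table model: a row stores only the
# columns that were actually written, so different rows of the same
# table can hold completely different column sets.
class SparseTable:
    def __init__(self):
        # row key -> {"family:qualifier": value}
        self.rows = defaultdict(dict)

    def put(self, row, column, value):
        self.rows[row][column] = value

    def get(self, row):
        # a missing row comes back empty rather than raising
        return dict(self.rows.get(row, {}))

t = SparseTable()
t.put("1", "info:editor", "liudehua")
t.put("2", "content:header", "this is header")  # row 2 has no info: columns
```

Because absent columns simply are not stored, a table with thousands of possible qualifiers costs nothing for rows that use only a few of them.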
HBase provides a rich set of access interfaces:
HBase Shell
Java client API
Jython, Groovy DSL, Scala
REST
Thrift (Ruby, Python, Perl, C++, ...)
MapReduce
Hive/Pig
hbase(main):001:0>
# create a table
hbase(main):002:0* create 'blog', 'info', 'content'
0 row(s) in 2.0290 seconds
# list tables
hbase(main):003:0> list
TABLE
blog
test_standalone
2 row(s) in 0.0270 seconds
# insert data
hbase(main):004:0> put 'blog', '1', 'info:editor', 'liudehua'
0 row(s) in 0.1340 seconds
hbase(main):005:0> put 'blog', '1', 'info:address', 'bj'
0 row(s) in 0.0070 seconds
hbase(main):006:0> put 'blog', '1', 'content:header', 'this is header'
0 row(s) in 0.0070 seconds
hbase(main):009:0> get 'blog', '1'
COLUMN                CELL
 content:header      timestamp=1407464302384, value=this is header
 info:address        timestamp=1407464281942, value=bj
 info:editor         timestamp=1407464270098, value=liudehua
3 row(s) in 0.0360 seconds
hbase(main):010:0> get 'blog', '1', 'info'
COLUMN                CELL
 info:address        timestamp=1407464281942, value=bj
 info:editor         timestamp=1407464270098, value=liudehua
2 row(s) in 0.0120 seconds
# scan can also take conditions (e.g. STARTROW, STOPROW, FILTER)
hbase(main):012:0> scan 'blog'
ROW                   COLUMN+CELL
 1                   column=content:header, timestamp=1407464302384, value=this is header
 1                   column=info:address, timestamp=1407464281942, value=bj
 1                   column=info:editor, timestamp=1407464270098, value=liudehua
1 row(s) in 0.0490 seconds
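As a rough illustration of why scan can take range conditions: HBase keeps rows sorted by row key, so a scan is an ordered walk that can be bounded. The sketch below is hypothetical Python, not HBase internals, and the row keys and values are invented:

```python
# Rows held sorted by row key; a bounded scan mirrors the shell's
# scan 'blog', {STARTROW => ..., STOPROW => ...} (stop row exclusive).
table = {
    "1": {"info:editor": "liudehua"},
    "2": {"info:editor": "meiyanfang"},
    "3": {"info:editor": "guofucheng"},
}

def scan(table, startrow=None, stoprow=None):
    """Yield (row, columns) in sorted row-key order within [startrow, stoprow)."""
    for row in sorted(table):
        if startrow is not None and row < startrow:
            continue
        if stoprow is not None and row >= stoprow:
            break
        yield row, table[row]
```

Because the data is already ordered, bounding a scan costs a seek plus a sequential read rather than a full-table filter.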
hbase(main):014:0* put 'blog', '1', 'content:header', 'this is header2'
0 row(s) in 0.0080 seconds
hbase(main):017:0* put 'blog', '1', 'content:header', 'this is header3'
0 row(s) in 0.0050 seconds
hbase(main):018:0> scan 'blog'
ROW                   COLUMN+CELL
 1                   column=content:header, timestamp=1407464457128, value=this is header3
 1                   column=info:address, timestamp=1407464281942, value=bj
 1                   column=info:editor, timestamp=1407464270098, value=liudehua
1 row(s) in 0.0180 seconds
hbase(main):020:0> get 'blog', '1', 'content:header'
COLUMN                CELL
 content:header      timestamp=1407464457128, value=this is header3
1 row(s) in 0.0090 seconds
# view historical versions
hbase(main):022:0* get 'blog', '1', {COLUMN => 'content:header', VERSIONS => 2}
COLUMN                CELL
 content:header      timestamp=1407464457128, value=this is header3
 content:header      timestamp=1407464454648, value=this is header2
2 row(s) in 0.0100 seconds
# view historical versions
hbase(main):023:0> get 'blog', '1', {COLUMN => 'content:header', VERSIONS => 3}
COLUMN                CELL
 content:header      timestamp=1407464457128, value=this is header3
 content:header      timestamp=1407464454648, value=this is header2
 content:header      timestamp=1407464302384, value=this is header
3 row(s) in 0.0490 seconds
hbase(main):024:0>
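The multi-version behaviour above can be sketched in a few lines of Python: each cell accumulates timestamped versions, and a read with VERSIONS => n returns the n newest. This is a toy model only; real HBase also caps versions per column family and prunes old ones at compaction time:

```python
from bisect import insort

# A toy versioned cell: puts accumulate (timestamp, value) pairs and a
# read returns the newest n values, like get ... {VERSIONS => n}.
class VersionedCell:
    def __init__(self):
        self._versions = []  # (timestamp, value), kept sorted ascending

    def put(self, timestamp, value):
        insort(self._versions, (timestamp, value))

    def get(self, versions=1):
        # newest first, as the shell prints them
        return [value for _, value in reversed(self._versions)][:versions]

cell = VersionedCell()
cell.put(1407464302384, "this is header")
cell.put(1407464454648, "this is header2")
cell.put(1407464457128, "this is header3")
```

A plain get returns only the latest value, which is why the earlier scans showed just "this is header3" until VERSIONS was requested explicitly.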
Operating HBase from Java is the most direct and efficient way, but Java is not lightweight and is inconvenient to debug in some environments, and developers differ in which languages they know and how productive they are in them. Through Thrift, HBase can also be operated from Python, Ruby, C++, Perl, and other languages.
Thrift is a remote-call component similar to Google's Protobuf, developed and open-sourced by Facebook. Protobuf, however, only handles data serialization, supports only a binary protocol, and has no remote-call part; it natively supports C++, Python, and Java, with third-party implementations for Objective-C, Ruby, and other languages. Thrift implements serialization, transport, protocol definition, and remote invocation, giving it stronger cross-language capabilities. In some respects the two can substitute for each other, but each has its own scope of application.
Install Thrift and the Thrift Python module:
wget http://www.apache.org/dist//thrift/0.9.1/thrift-0.9.1.tar.gz
tar zxvf thrift-0.9.1.tar.gz
cd thrift-0.9.1
./configure --with-cpp=no
make
sudo make install
sudo pip install thrift
The following command generates the Python Thrift bindings for HBase from the Hbase.thrift definition shipped with the HBase source:
thrift --gen py /home/ubuntu/hbase-0.98.1/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
from thrift.transport import TSocket
from thrift.protocol import TBinaryProtocol
from hbase import Hbase  # generated into gen-py/hbase by `thrift --gen py`

# connect to the HBase Thrift server (default port 9090)
transport = TSocket.TSocket('localhost', 9090)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()
print(client.getTableNames())
transport.close()
The HBase 0.98 release does not seem to include the Thrift-related generated code, so I used the 0.94 version here instead.