Operating HBase from Python through the Thrift framework, plus HBase shell operations

Source: Internet
Author: User
Tags: mongodb, serialization

I used to use MongoDB, but as the data volume grew it started to look unreliable, so I switched to HBase to handle data of this magnitude.





HBase is the Apache Hadoop database. It provides random, real-time read/write access to large datasets, and its goal is to store and process big data. HBase is an open-source, distributed, multi-version, column-oriented store that holds loosely typed data.





HBase provides a rich set of access interfaces:





HBase Shell
Java Client API
Jython, Groovy DSL, Scala
REST
Thrift (Ruby, Python, Perl, C++, ...)
MapReduce
Hive/Pig





hbase(main):001:0>
# Create a table
hbase(main):002:0* create 'blog', 'info', 'content'
0 row(s) in 2.0290 seconds

# List tables
hbase(main):003:0> list
TABLE
blog
test_standalone
2 row(s) in 0.0270 seconds





# Add data
hbase(main):004:0> put 'blog', '1', 'info:editor', 'liudehua'
0 row(s) in 0.1340 seconds

hbase(main):005:0> put 'blog', '1', 'info:address', 'bj'
0 row(s) in 0.0070 seconds

hbase(main):006:0> put 'blog', '1', 'content:header', 'this is header'
0 row(s) in 0.0070 seconds

hbase(main):007:0>
hbase(main):008:0*
hbase(main):009:0* get 'blog', '1'
COLUMN                CELL
 content:header       timestamp=1407464302384, value=this is header
 info:address         timestamp=1407464281942, value=bj
 info:editor          timestamp=1407464270098, value=liudehua
3 row(s) in 0.0360 seconds





hbase(main):010:0> get 'blog', '1', 'info'
COLUMN                CELL
 info:address         timestamp=1407464281942, value=bj
 info:editor          timestamp=1407464270098, value=liudehua
2 row(s) in 0.0120 seconds
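
The two `get` calls above can be pictured with a toy in-memory model: each row key maps to "family:qualifier" columns, and a family name acts as a prefix filter. This is only a sketch for illustration. `ToyTable` and its methods are hypothetical names, not part of HBase or its clients, and the model keeps only the newest cell per column.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for one HBase table (illustration only)."""

    def __init__(self):
        # row key -> "family:qualifier" -> (timestamp, value) of the newest cell
        self._rows = defaultdict(dict)

    def put(self, row, column, value, timestamp):
        self._rows[row][column] = (timestamp, value)

    def get(self, row, family=None):
        """Return the row's cells, optionally restricted to one column family."""
        cells = self._rows.get(row, {})
        if family is not None:
            prefix = family + ":"
            cells = {c: v for c, v in cells.items() if c.startswith(prefix)}
        # The shell lists columns in sorted order, so do the same here.
        return dict(sorted(cells.items()))


blog = ToyTable()
blog.put("1", "info:editor", "liudehua", 1407464270098)
blog.put("1", "info:address", "bj", 1407464281942)
blog.put("1", "content:header", "this is header", 1407464302384)

# Equivalent in spirit to: get 'blog', '1', 'info'
for column, (timestamp, value) in blog.get("1", family="info").items():
    print(f"{column}  timestamp={timestamp}, value={value}")
```

Calling `blog.get("1")` without a family returns all three columns, mirroring the unfiltered `get`.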








# Queries here can be restricted by condition.
hbase(main):012:0* scan 'blog'
ROW                   COLUMN+CELL
 1                    column=content:header, timestamp=1407464302384, value=this is header
 1                    column=info:address, timestamp=1407464281942, value=bj
 1                    column=info:editor, timestamp=1407464270098, value=liudehua
1 row(s) in 0.0490 seconds
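
The scan above prints three cells but reports a single row, because the count is per row key, not per cell. A scan also returns rows in byte-wise row-key order, so a key such as '10' sorts before '2'. The sketch below imitates both behaviours with plain Python. The `scan` function and the extra rows '10' and '2' are made up for illustration and are not part of the original session.

```python
from itertools import groupby

# (row key, column, value) cells in arbitrary arrival order.
# Rows "10" and "2" are hypothetical; only row "1" appears in the session above.
cells = [
    ("2", "info:editor", "someone"),
    ("1", "info:editor", "liudehua"),
    ("10", "info:editor", "someone else"),
    ("1", "content:header", "this is header"),
    ("1", "info:address", "bj"),
]


def scan(cells):
    """Yield (row_key, {column: value}) pairs in byte-wise row-key order."""
    ordered = sorted(cells, key=lambda c: (c[0].encode(), c[1].encode()))
    for row_key, group in groupby(ordered, key=lambda c: c[0]):
        yield row_key, {column: value for _, column, value in group}


rows = list(scan(cells))
for row_key, columns in rows:
    print(row_key, sorted(columns))
print(f"{len(rows)} row(s)")
```

Zero-padding numeric row keys (for example '02' and '10') is one common way to make the byte-wise order match the numeric order.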








hbase(main):013:0>
hbase(main):014:0* put 'blog', '1', 'content:header', 'this is Header2'
0 row(s) in 0.0080 seconds

hbase(main):015:0>
hbase(main):016:0*
hbase(main):017:0* put 'blog', '1', 'content:header', 'this is Header3'
0 row(s) in 0.0050 seconds

hbase(main):018:0> scan 'blog'
ROW                   COLUMN+CELL
 1                    column=content:header, timestamp=1407464457128, value=this is Header3
 1                    column=info:address, timestamp=1407464281942, value=bj
 1                    column=info:editor, timestamp=1407464270098, value=liudehua
1 row(s) in 0.0180 seconds

hbase(main):020:0> get 'blog', '1', 'content:header'
COLUMN                CELL
 content:header       timestamp=1407464457128, value=this is Header3
1 row(s) in 0.0090 seconds





hbase(main):021:0>
# Historical versions are visible
hbase(main):022:0* get 'blog', '1', {COLUMN => 'content:header', VERSIONS => 2}
COLUMN                CELL
 content:header       timestamp=1407464457128, value=this is Header3
 content:header       timestamp=1407464454648, value=this is Header2
2 row(s) in 0.0100 seconds

# Historical versions are visible
hbase(main):023:0> get 'blog', '1', {COLUMN => 'content:header', VERSIONS => 3}
COLUMN                CELL
 content:header       timestamp=1407464457128, value=this is Header3
 content:header       timestamp=1407464454648, value=this is Header2
 content:header       timestamp=1407464302384, value=this is header
3 row(s) in 0.0490 seconds

hbase(main):024:0>
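
The versioned reads above can be modelled as a cell that keeps a short, newest-first history of (timestamp, value) pairs. The sketch below is a toy model, not how HBase stores data. `VersionedCell` is a hypothetical name, and the limit of three retained versions is an assumption chosen to match the session, where the real limit is a column-family setting.

```python
class VersionedCell:
    """Toy model of one HBase cell that retains a bounded version history."""

    def __init__(self, max_versions=3):
        # Assumed retention limit; in HBase this is configured per column family.
        self.max_versions = max_versions
        self._versions = []  # (timestamp, value) pairs, newest first

    def put(self, value, timestamp):
        self._versions.append((timestamp, value))
        self._versions.sort(reverse=True)       # newest timestamp first
        del self._versions[self.max_versions:]  # drop versions beyond the limit

    def get(self, versions=1):
        """Return up to `versions` entries, newest first."""
        return self._versions[:versions]


header = VersionedCell()
header.put("this is header", 1407464302384)
header.put("this is Header2", 1407464454648)
header.put("this is Header3", 1407464457128)

for timestamp, value in header.get(versions=2):
    print(f"content:header  timestamp={timestamp}, value={value}")
```

A plain `header.get()` returns only the newest entry, which is why the earlier `get` without `VERSIONS` showed just the latest header.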








Operating HBase from Java is the most convenient approach and also the most efficient. But Java is not lightweight, and it is not easy to debug in every environment. Developers also differ in which languages they know well, so their productivity differs too. Through Thrift, HBase can also be operated from Python, Ruby, C++, Perl, and other languages.








Thrift is a remote procedure call (RPC) framework, similar to Google's Protobuf, that Facebook developed and open-sourced. Protobuf, however, only handles data serialization, supports only a binary format, and has no remote call component. Protobuf natively supports C++, Python, and Java, with third-party implementations for Objective-C, Ruby, and other languages. Thrift, by contrast, implements serialization, transport, protocol definition, and remote invocation, and offers broader cross-language support. In some respects the two can substitute for each other, but each has its own scope of application.
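
To make the split between "protocol" (how values become bytes) and "transport" (where the bytes go) more concrete, here is a small sketch using only the standard library. It imitates the layout Thrift's binary protocol uses for two basic types, a big-endian 32-bit integer and a length-prefixed UTF-8 string, and it uses an in-memory buffer in place of a socket. It is not Thrift's implementation: field headers, message envelopes, and the RPC layer are omitted, and all function names are made up for this example.

```python
import io
import struct


def write_i32(buf, number):
    """Write a signed 32-bit integer in big-endian byte order."""
    buf.write(struct.pack(">i", number))


def write_string(buf, text):
    """Write a string as a 32-bit length followed by its UTF-8 bytes."""
    data = text.encode("utf-8")
    write_i32(buf, len(data))
    buf.write(data)


def read_i32(buf):
    return struct.unpack(">i", buf.read(4))[0]


def read_string(buf):
    length = read_i32(buf)
    return buf.read(length).decode("utf-8")


# "Transport": an in-memory buffer standing in for a network socket.
wire = io.BytesIO()
write_string(wire, "blog")
write_i32(wire, 9090)

# The receiving side reads the same bytes back in the same order.
wire.seek(0)
table_name = read_string(wire)
port = read_i32(wire)
print(table_name, port)
```

Because both sides agree only on the byte layout, the writer and the reader could be written in different languages, which is the property the cross-language claim rests on.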

Installing Thrift and the Thrift Python module:

wget http://www.apache.org/dist//thrift/0.9.1/thrift-0.9.1.tar.gz
tar zxvf thrift-0.9.1.tar.gz
cd thrift-0.9.1
./configure --with-cpp=no
make
sudo make install
sudo pip install thrift



Next, generate the Python modules for HBase from its Thrift definition file:

thrift --gen py /home/ubuntu/hbase-0.98.1/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

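# Assumes the HBase Thrift server is listening on localhost:9090 and that
# the generated gen-py directory is on the Python path.
from thrift.transport import TSocket
from thrift.protocol import TBinaryProtocol
from hbase import Hbase

# Socket transport to the Thrift server
transport = TSocket.TSocket('localhost', 9090)
# Binary wire protocol layered on top of the transport
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)

transport.open()
print(client.getTableNames())  # list the table names
transport.close()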
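
A transport that has been opened should also be closed, even when a call fails partway through. One way to guarantee that is a small context manager, sketched below. The helper name `opened` is hypothetical and not part of the thrift package. It relies only on the transport exposing `open()` and `close()`, which `TSocket` does, and `FakeTransport` is a stand-in so the sketch runs without an HBase Thrift server.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open the transport, hand it to the caller, and always close it."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


class FakeTransport:
    """Stand-in for TSocket so the sketch runs without a Thrift server."""

    def __init__(self):
        self.events = []

    def open(self):
        self.events.append("open")

    def close(self):
        self.events.append("close")


fake = FakeTransport()
try:
    with opened(fake):
        # Simulate a client call that fails while the transport is open.
        raise RuntimeError("simulated client failure")
except RuntimeError:
    pass

print(fake.events)
```

With the real library, the same helper would wrap `TSocket.TSocket('localhost', 9090)`, and the protocol and client would be built from that transport object inside the `with` block.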



HBase 0.98 does not seem to include the Thrift-related components, so I got this working with version 0.94 instead.
