Configure Thrift
Python package used: thrift
The Python IDE used here is PyCharm Community Edition. In the project settings, open Project Interpreter, locate the package list for the relevant project, click "+" to add a package, search for hbase-thrift (Python client for the HBase Thrift interface), and install it.
Install Thrift on the server side.
You can follow the official website, or install it locally using the terminal.
Thrift Getting Started
You can also refer to the installation method in the "python invoke hbase" example.
First, install Thrift
Download Thrift; the version used here is thrift-0.7.0-dev.tar.gz.
```shell
tar xzf thrift-0.7.0-dev.tar.gz
cd thrift-0.7.0-dev
sudo ./configure --with-cpp=no --with-ruby=no
sudo make
sudo make install
```
Then, go into the HBase source package and find
src/main/resources/org/apache/hadoop/hbase/thrift/
Run:

```shell
thrift --gen py Hbase.thrift
mv gen-py/hbase/ /usr/lib/python2.4/site-packages/  # path may differ depending on Python version
```
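As an alternative to copying the generated package into site-packages (whose path varies by Python version), you can leave the generated code where Thrift wrote it and put it on `sys.path` at runtime. A minimal sketch; the `gen-py` location below is an assumption about where you ran the `thrift` command:

```python
import os
import sys

# Hypothetical location of the Thrift-generated code; adjust to wherever
# you actually ran `thrift --gen py Hbase.thrift`.
GEN_PY_DIR = os.path.join(os.getcwd(), "gen-py")

# Prepend so the generated `hbase` package wins over any installed copy.
if GEN_PY_DIR not in sys.path:
    sys.path.insert(0, GEN_PY_DIR)

# After this, `from hbase import Hbase` resolves against gen-py
# (assuming the directory exists and contains the generated package).
```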
Get Data Sample 1
```python
# coding: utf-8
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
# from hbase.ttypes import ColumnDescriptor, Mutation, BatchMutation
from hbase.ttypes import *
import csv

def client_conn():
    # Make socket (fill in your host name, e.g. localhost, and port)
    transport = TSocket.TSocket('host name, like: localhost', port)
    # Buffering is critical. Raw sockets are very slow
    transport = TTransport.TBufferedTransport(transport)
    # Wrap in a protocol
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    # Create a client to use the protocol encoder
    client = Hbase.Client(protocol)
    # Connect!
    transport.open()
    return client

if __name__ == "__main__":
    client = client_conn()
    # r = client.getRowWithColumns('table name', 'row name', ['column name'])
    # print r[0].columns.get('column name'), type(r[0].columns.get('column name'))
    result = client.getRow("table name", "row name")
    data_simple = []
    # print result[0].columns.items()
    for k, v in result[0].columns.items():  # .keys()
        # print type(k), type(v), v.value, v.timestamp
        data_simple.append((v.timestamp, v.value))
    csvfile_simple = open("data_xy_simple.csv", "wb")
    writer_simple = csv.writer(csvfile_simple)
    writer_simple.writerow(["timestamp", "value"])
    writer_simple.writerows(data_simple)
    csvfile_simple.close()
    print "finished"
```
With basic Python knowledge you can see that result is a list, and that result[0].columns.items() yields the key-value pairs of a dict, from which the relevant information can be queried. Alternatively, print the variables and observe their values and types.
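To make the shape of that return value concrete without a running HBase: getRow returns a list of row results whose columns attribute maps a column name to a cell carrying value and timestamp. The sketch below uses simplified stand-in classes (TCell and TRowResult here are imitations of the Thrift-generated types, not imports from hbase.ttypes), then runs the same extraction and CSV-writing steps as the sample, in Python 3 syntax:

```python
import csv

class TCell(object):
    """Simplified stand-in for the Thrift-generated TCell type."""
    def __init__(self, value, timestamp):
        self.value = value
        self.timestamp = timestamp

class TRowResult(object):
    """Simplified stand-in for the Thrift-generated TRowResult type."""
    def __init__(self, row, columns):
        self.row = row
        self.columns = columns  # dict: column name -> TCell

# Fake result shaped like what client.getRow() returns: a list of rows.
result = [TRowResult("row1", {
    "cf:x": TCell("1.5", 1000),
    "cf:y": TCell("2.5", 2000),
})]

# Same extraction loop as the sample: collect (timestamp, value) pairs.
data_simple = []
for k, v in result[0].columns.items():
    data_simple.append((v.timestamp, v.value))
data_simple.sort()  # dict order is not meaningful; sort by timestamp

# Write them out the way the sample does (Python 3 file mode shown here).
with open("data_xy_simple.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "value"])
    writer.writerows(data_simple)
```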
Note: transport.open() establishes the connection in the program above; once you are done, you should also disconnect with transport.close().
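One way to guarantee the close always happens is try/finally. The sketch below uses a dummy transport object with the same open()/close() surface as TBufferedTransport, so the pattern is visible without a running HBase; DummyTransport and with_transport are illustrative names, not part of any library:

```python
class DummyTransport(object):
    """Stand-in exposing the same open()/close() methods as a Thrift transport."""
    def __init__(self):
        self.is_open = False
    def open(self):
        self.is_open = True
    def close(self):
        self.is_open = False

def with_transport(transport, work):
    """Open the transport, run `work`, and always close it afterwards."""
    transport.open()
    try:
        return work()
    finally:
        transport.close()

transport = DummyTransport()
outcome = with_transport(transport, lambda: "rows fetched")
# transport.is_open is False again here, even if `work` had raised.
```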
So far only reading data is covered; other HBase operations will be added in future updates.