Hadoop environment:
Master server: node1
Slave servers: node2, node3, node4
MySQL server: node29
Thrift is installed on node1.
Related software versions:
Hadoop version: hadoop-0.20.2
Sqoop version: sqoop-1.2.0-CDH3B4
Java version: jdk1.7.0_67
MySQL version: 5.1.65
Thrift version: thrift-0.9.0
Thrift download link: http://thrift.apache.org/download/
Python version: 2.7.3
P.S. Python 2.5 has problems when used with Thrift.
One: Preparation before testing
1) First, load the data from the MySQL database into HBase:
The MySQL data is as follows:
[Image: MySQL condition.jpg, http://s3.51cto.com/wyfs02/M02/4D/0F/wKiom1RKCu7Acn6aAAEPRQIQRa8209.jpg]
The command format for importing MySQL data into HBase is:
sqoop import --connect jdbc:mysql://mysqlserver_ip/databasename --username username --password password --table datatable --hbase-create-table --hbase-table hbase_tablename --column-family col_fam_name --hbase-row-key key_col_name
Description: databasename and datatable are the MySQL database and table names; hbase_tablename is the name of the table to create in HBase; key_col_name specifies which column of datatable becomes the row key of the new HBase table; col_fam_name is the column family name used for all columns other than the row key.
2) Load the MySQL data (node29) into HBase on node1:
sqoop import --connect jdbc:mysql://172.16.41.29/sqoop --username sqoop --password Routon --table Students --hbase-create-table --hbase-table students --column-family stuinfo --hbase-row-key ID
Verify that the load succeeded in HBase:
[Image: HBase situation.jpg, http://s3.51cto.com/wyfs02/M01/4D/10/wKioL1RKDCHRujBiAAJMEeBRwoU194.jpg]
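The sqoop import command format described in this section can also be assembled from its parts in a script. The following is only a minimal sketch: the helper name build_sqoop_import is made up for illustration and is not part of Sqoop; it uses only the options shown in the command above and the values from this article.

```python
# Hypothetical helper (not part of Sqoop): builds the argument list for
# the "sqoop import ... --hbase-table ..." command shown in this article.
def build_sqoop_import(jdbc_url, username, password, table,
                       hbase_table, column_family, row_key):
    """Return the sqoop import command as a list of arguments."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password", password,
        "--table", table,
        "--hbase-create-table",            # create the HBase table if it is missing
        "--hbase-table", hbase_table,
        "--column-family", column_family,  # family for all non-rowkey columns
        "--hbase-row-key", row_key,        # MySQL column used as the row key
    ]


# Values taken from the example command in this article.
cmd = build_sqoop_import("jdbc:mysql://172.16.41.29/sqoop", "sqoop", "Routon",
                         "Students", "students", "stuinfo", "ID")
print(" ".join(cmd))
```

A list of arguments like this can be handed to subprocess.call(cmd) on a machine where sqoop is on the PATH; keeping the arguments separate avoids the spacing mistakes that are easy to make when typing the long command by hand.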
Two: Thrift installation
Python version: 2.7.3
The steps are:
1) Install Python 2.7.3
Note: Python 2.7.3 works with Thrift without problems; Python 2.5 does not seem to.
The generated Hbase.py file uses syntax that Python 2.4, the default on RHEL 5, does not support.
tar xvjf python-2.7.3.tar.bz2
./configure --prefix=/usr/local/python2.7
make && make install
The Python 2.7.3 interpreter path is:
/usr/local/python2.7/bin/python
Change the default Python version to 2.7.
The system default Python version is 2.4, so point /usr/bin/python at Python 2.7:
rm -rf /usr/bin/python
ln -s /usr/local/python2.7/bin/python /usr/bin/python
# python -V
Python 2.7.3
2) Install Thrift
tar xvzf thrift-0.9.0.tar.gz
cd thrift-0.9.0
./configure
make && make install
thrift 0.9.0
Building C++ Library ......... : no
Building C (GLib) Library .... : yes
Building Java Library ........ : no
Building C# Library .......... : no
Building Python Library ...... : yes
Building Ruby Library ........ : no
Building Haskell Library ..... : no
Building Perl Library ........ : no
Building PHP Library ......... : no
Building Erlang Library ...... : no
Building Go Library .......... : no
Building D Library ........... : no
Python Library:
   Using Python .............. : /usr/bin/python
......
As the output shows, Thrift supports many languages; for the current needs, Python support is enough.
Check the Thrift version:
# thrift -version
Thrift version 0.9.0
3) Make Thrift support HBase
Run the following command to generate the Python bindings from the HBase Thrift definition:
thrift --gen py /usr/local/hbase-0.90.5/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
This creates a directory named gen-py in the current directory:
$ ll
total 7056
-rw-rw-r-- 1 hadoop hadoop 3045 Oct 13:55 access_log2.txt
-rw-r--r-- 1 hadoop hadoop 7118627 Feb 1 access_log.txt
-rw-rw-r-- 1 hadoop hadoop 3500 Oct 10:17 derby.log
drwxrwxr-x 3 hadoop hadoop 4096 Oct 15:28 gen-py
-rw-rw-r-- 1 hadoop hadoop 3551 Oct 11:21 pig_1413170429087.log
The gen-py directory structure is as follows:
$ tree gen-py/
gen-py/
|-- __init__.py
`-- hbase
    |-- Hbase-remote
    |-- Hbase.py
    |-- __init__.py
    |-- constants.py
    `-- ttypes.py
1 directory, 6 files
4) Copy the generated hbase package from gen-py into Python's site-packages directory:
cp -r gen-py/hbase/ /usr/local/python2.7/lib/python2.7/site-packages/.
5) Allow Python to import the thrift module:
# ln -s /usr/lib/python2.7/site-packages/thrift* /usr/local/python2.7/lib/python2.7/site-packages/.
# ls -l /usr/local/python2.7/lib/python2.7/site-packages/
total 12
drwxr-xr-x 2 root root 4096 Oct 15:32 hbase
-rw-r--r-- 1 root root 119 Oct
lrwxrwxrwx 1 root root the Oct 15:50 thrift -> /usr/lib/python2.7/site-packages/thrift
lrwxrwxrwx 1 root root the Oct 15:50 thrift-0.9.0-py2.7.egg-info -> /usr/lib/python2.7/site-packages/thrift-0.9.0-py2.7.egg-info
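One way to confirm that steps 4) and 5) worked is to ask the interpreter whether both packages can be imported. This is a small sketch rather than anything the installation requires; the helper name can_import is made up for illustration, and the module names thrift and hbase are the ones copied or linked into site-packages above.

```python
# Hypothetical helper for illustration: reports whether a module can be
# imported by the interpreter that runs this script.
def can_import(module_name):
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


# "thrift" is the library linked in step 5), "hbase" the package copied in step 4).
for name in ("thrift", "hbase"):
    print("%s importable: %s" % (name, can_import(name)))
```

If either line reports False, the usual suspects are a different python binary being first on the PATH or the package sitting in another site-packages directory.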
6) Start the Thrift service:
hbase thrift -p 9090 start
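Before writing a client, it can help to check that something is actually listening on the Thrift port. The sketch below uses only the Python standard library; the function name port_is_open and the 3 second default timeout are assumptions chosen for illustration, and the check only proves that the TCP port accepts connections, not that the service behind it is the HBase Thrift server.

```python
import socket


# Hypothetical helper for illustration: True if a TCP connection to
# host:port can be opened within the timeout, False otherwise.
def port_is_open(host, port, timeout=3.0):
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True
```

For the setup in this article, where the Thrift service was started with -p 9090, the call would be port_is_open('172.16.41.26', 9090), using the address that appears in the client script.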
7) Write a Python script on node1 to list the tables in HBase:
#!/usr/bin/env python
# coding=utf-8
import sys

# The Python files generated from Hbase.thrift are placed here
sys.path.append('/usr/local/lib/python2.7/site-packages/hbase')

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
# Types such as ColumnDescriptor are defined in hbase.ttypes
from hbase.ttypes import *

# Make the socket; change the address and port here as needed
transport = TSocket.TSocket('172.16.41.26', 9090)
# Buffering is critical; raw sockets are very slow.
# TFramedTransport can also be used and is an efficient transport as well
transport = TTransport.TBufferedTransport(transport)
# Wrap in a protocol; the protocol and the transport are separate layers, so multiple protocols can be supported
protocol = TBinaryProtocol.TBinaryProtocol(transport)
# The client represents one user
client = Hbase.Client(protocol)
# Open the connection
transport.open()
# Print the table names
print(client.getTableNames())
Run the script:
[Image: Thrift.jpg, http://s3.51cto.com/wyfs02/M01/4D/10/wKiom1RKEALwhL86AAEY9rF6T_g766.jpg]
At this point, Python can communicate with HBase through Thrift.
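Once the connection works, the same client object can be used to read the rows that were imported from MySQL. The following is a sketch under assumptions, not a tested recipe for this cluster: it assumes the three-argument scannerOpen(tableName, startRow, columns) form of the HBase 0.90 Thrift interface (later HBase releases add an attributes argument), and the function name scan_table and the batch size are made up for illustration.

```python
# Hypothetical helper for illustration. "client" is expected to be an
# Hbase.Client whose transport is already open, as in the script above.
def scan_table(client, table_name, columns, batch_size=10):
    """Yield (row_key, {column_name: value}) for every row in the table."""
    # An empty start row means "begin at the first row".
    # Assumption: 0.90-style scannerOpen with three arguments.
    scanner_id = client.scannerOpen(table_name, "", columns)
    try:
        while True:
            rows = client.scannerGetList(scanner_id, batch_size)
            if not rows:
                break  # an empty batch means the scan is finished
            for row in rows:
                values = dict((name, cell.value)
                              for name, cell in row.columns.items())
                yield row.row, values
    finally:
        # Always release the scanner on the server side.
        client.scannerClose(scanner_id)
```

With the client from the script above, a loop such as "for key, values in scan_table(client, 'students', ['stuinfo:'])" would walk the students table and return the stuinfo columns of each row; the trailing colon after the family name is an assumption about how this HBase version expects a whole column family to be named.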
This article is from the "Shine_forever blog" blog; please keep this source attribution: http://shineforever.blog.51cto.com/1429204/1567640
Use Python to access HBase (Thrift module installation and testing)