Use Jython to operate HBase

Source: Internet
Author: User

Previously, I used ply, a Python library, to write a prototype of an SQL-like compiler for HBase. So far only the lexical and syntax analyzers are roughly complete. When I started on the next stages (the preprocessor, the logical plan generator, and the physical plan generator), a problem arose: HBase and the entire Hadoop project are written in Java, so the Java API is naturally the most direct way to access them. To use HBase from other languages, the following options are available:

  • Thrift API
  • REST API
  • Other JVM-based languages (Jython, Groovy, Scala, etc.)

At present there are three ways to complete the SQL-like compiler for HBase:

  1. Keep writing the compiler front end in Python and have a Java back end operate HBase through the Java API, with the intermediate results saved in some form (such as files).
  2. Rewrite the lexical and syntax analyzers in Java, then use HBase's Java API directly.
  3. Keep writing the compiler in Python and run it with Jython.

For solution 1, the intermediate results may be complex data structures that are not easy to save. Even if they can be saved, reading and writing them will be troublesome.

Solution 2 requires finding and learning a lexer and parser generator for Java. I have looked at ANTLR, JavaCC, and others before, but most of them are heavyweight and costly to learn. Besides, my expectations for the SQL-like compiler are fairly modest, so such advanced tool libraries are unnecessary.

For solution 3, the compiler written with ply is about half done, and ply is very lightweight to use. If Jython can operate HBase well, progress is assured. I tried Jython and it feels good!

Considering various factors, solution 3 is selected for the time being.

That was a bit of a digression. Let's get to the point and see how Jython operates HBase. (Configuring Jython and HBase is not covered here.)

First, start HBase:

bin/start-hbase.sh

Then obtain the classpath used by HBase (Jython needs the Java classes on that classpath):

ps auwx | grep java | grep org.apache.hadoop.hbase.master.HMaster | perl -pi -e "s/.*classpath //"

ps lists the running processes. A process started by Java has a command line similar to:

/usr/lib/jvm/java-6-sun//bin/java -Xmx1000m ... -classpath xxx

The final perl command extracts the xxx that follows -classpath. -p loops over the input line by line and prints each line, -i edits in place without keeping a backup of the input file, and -e executes the expression s/.*classpath //. (This substitution removes "classpath" and everything before it. But is -classpath guaranteed to be the last argument? What if other, unrelated arguments follow it? In practice there are indeed unrelated arguments after it, although fortunately only a little. This is a small bug.)
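One way to avoid the trailing-argument problem is to capture only the token that follows -classpath, rather than deleting what comes before it. The sketch below is a minimal illustration in plain Python, not part of the original workflow. The function name extract_classpath and the sample command line are hypothetical, and it assumes the classpath value contains no whitespace.

```python
import re


def extract_classpath(command_line):
    """Return the value following -classpath (or -cp), or None if absent.

    Assumes the classpath value itself contains no whitespace.
    """
    match = re.search(r"(?:^|\s)-(?:classpath|cp)\s+(\S+)", command_line)
    return match.group(1) if match else None


# Hypothetical command line, shaped like the ps output described above.
sample = (
    "/usr/lib/jvm/java-6-sun/bin/java -Xmx1000m "
    "-classpath /opt/hbase/conf:/opt/hbase/hbase.jar "
    "org.apache.hadoop.hbase.master.HMaster start"
)
print(extract_classpath(sample))
```

Because only the single token after the flag is captured, arguments that follow the classpath (here the main class and "start") are left out of the result.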

Export the obtained classpath as an environment variable:

export CLASSPATH=XXX
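Before running a longer script, it can help to confirm that Jython actually sees the HBase classes. The following is a small sketch under the assumption that a missing Java package surfaces as an ImportError; the function name hbase_classes_visible is made up for illustration.

```python
def hbase_classes_visible():
    """Return True if the HBase client classes can be imported, else False."""
    try:
        # Under Jython this import resolves Java packages found on CLASSPATH.
        from org.apache.hadoop.hbase import HBaseConfiguration
        return True
    except ImportError:
        return False


if hbase_classes_visible():
    print("HBase classes found on the classpath")
else:
    print("HBase classes not found; check the CLASSPATH variable")
```

Run with the regular CPython interpreter, the check reports the classes as missing, which is expected, since only Jython can import Java packages.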

With the classpath set, Jython can run a Python script that operates HBase. The following is a simple example:

import java.lang
from org.apache.hadoop.hbase import HBaseConfiguration, HTableDescriptor, HColumnDescriptor, HConstants
from org.apache.hadoop.hbase.client import HBaseAdmin, HTable, Put, Get

conf = HBaseConfiguration()
admin = HBaseAdmin(conf)

tablename = "test_jython_hbase"
desc = HTableDescriptor(tablename)
desc.addFamily(HColumnDescriptor("content"))

# Drop and recreate if it exists
if admin.tableExists(tablename):
    admin.disableTable(tablename)
    admin.deleteTable(tablename)
admin.createTable(desc)

table = HTable(conf, tablename)

# Add content
row = 'row_x'
put_row = Put(row)
put_row.add('content', 'some_content', 'some_value')
table.put(put_row)

# Read content
get = Get(row)
data_row = table.get(get)
data = java.lang.String(data_row.value(), "UTF8")
print "The fetched row contains the value '%s'" % data

# Delete the table.
admin.disableTable(desc.getName())
admin.deleteTable(desc.getName())

The output result is as follows:

............
12/01/29 23:55:51 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is ubuntu2-vmware:60020
12/01/29 23:55:52 DEBUG client.MetaScanner: Scanning .META. starting at row=test_jython_hbase,,00000000000000 for max=2147483647 rows
12/01/29 23:55:52 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=ubuntu3-vmware:2181,ubuntu2-vmware:2181 sessionTimeout=180000 watcher=hconnection
12/01/29 23:55:52 INFO zookeeper.ClientCnxn: Opening socket connection to server ubuntu3-vmware/192.168.1.202:2181
12/01/29 23:55:52 INFO zookeeper.ClientCnxn: Socket connection established to ubuntu3-vmware/192.168.1.202:2181, initiating session
12/01/29 23:55:52 INFO zookeeper.ClientCnxn: Session establishment complete on server ubuntu3-vmware/192.168.1.202:2181, sessionid = 0x1352c9556270012, negotiated timeout = 180000
12/01/29 23:55:52 DEBUG client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@8a2006; hsa=ubuntu3-vmware:60020
12/01/29 23:55:52 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is ubuntu2-vmware:60020
12/01/29 23:55:52 DEBUG client.MetaScanner: Scanning .META. starting at row=test_jython_hbase,,00000000000000 for max=10 rows
12/01/29 23:55:52 DEBUG client.HConnectionManager$HConnectionImplementation: Cached location for test_jython_hbase,,1327910151208.09451a5e064db613641741bd8c896eb7. is ubuntu2-vmware:60020
The fetched row contains the value 'some_value'
12/01/29 23:55:52 INFO client.HBaseAdmin: Started disable of test_jython_hbase
12/01/29 23:55:52 DEBUG client.HBaseAdmin: Sleeping= 1000ms, waiting for all regions to be disabled in test_jython_hbase
12/01/29 23:55:53 DEBUG client.HBaseAdmin: Sleeping= 1000ms, waiting for all regions to be disabled in test_jython_hbase
12/01/29 23:55:54 INFO client.HBaseAdmin: Disabled test_jython_hbase
12/01/29 23:55:55 INFO client.HBaseAdmin: Deleted test_jython_hbase
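The drop-and-recreate step from the script can also be factored into a small function, which makes the order of the admin calls easy to check without a running cluster. This is only a sketch: recreate_table and RecordingAdmin are hypothetical names, and RecordingAdmin is a stand-in that records calls rather than a real HBaseAdmin. Under Jython, the same function should accept the real admin and table descriptor objects, since it uses only the methods called in the script.

```python
def recreate_table(admin, desc, tablename):
    """Drop tablename if it already exists, then create it from desc."""
    if admin.tableExists(tablename):
        # HBase requires a table to be disabled before it can be deleted.
        admin.disableTable(tablename)
        admin.deleteTable(tablename)
    admin.createTable(desc)


class RecordingAdmin(object):
    """Hypothetical stand-in for HBaseAdmin that only records calls."""

    def __init__(self, existing):
        self.existing = set(existing)
        self.calls = []

    def tableExists(self, name):
        return name in self.existing

    def disableTable(self, name):
        self.calls.append(("disable", name))

    def deleteTable(self, name):
        self.calls.append(("delete", name))
        self.existing.discard(name)

    def createTable(self, desc):
        self.calls.append(("create", desc))


# The descriptor is a placeholder string here; the real script passes
# an HTableDescriptor.
fake = RecordingAdmin(["test_jython_hbase"])
recreate_table(fake, "descriptor-placeholder", "test_jython_hbase")
print(fake.calls)
```

Disabling before deleting matches the log output, where the "Disabled" message precedes the "Deleted" one.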
