PHP uses Thrift 0.9.0 to operate HBase

Source: Internet
Author: User
A recent project needed to read and write HBase data from PHP through Thrift, so I sorted out the related classes and tested them.

Currently, you can use the following methods to operate HBase:

1. HBase Shell: after configuration, run the shell to inspect HBase data, for example count 'xxxx' and scan 'xxxx'.

2. The native Java API: wrap it in a RESTful API of your own and operate HBase through that API over HTTP.

3. Thrift: a serialization technology that supports C++, PHP, Python, and other languages, which makes it suitable for operating HBase from heterogeneous systems.

4. HBasExplorer: a graphical client I wrote earlier to operate HBase, see http://www.cnblogs.com/scotoma/archive/2012/12/18/2824311.html.

5. Hive/Pig, which I have not used yet.

This article uses the third method, Thrift. It was open-sourced by Facebook, and the official website is http://thrift.apache.org/.

 

Download, install, and start Thrift. For details, see the references.

Then check whether it is running successfully.
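One quick way to check from PHP is to try opening a TCP connection to the Thrift server. The sketch below is only an illustration: the function name thriftPortIsOpen is made up for this example, and the host 127.0.0.1 and port 9090 (the usual default for the HBase Thrift server) are assumptions about your setup. A successful connection only shows that something is listening on the port, not that it speaks Thrift.

```php
<?php
// Hypothetical helper: returns true if a TCP connection to $host:$port
// can be opened within $timeout seconds, false otherwise.
function thriftPortIsOpen($host, $port, $timeout = 2.0)
{
    $errno = 0;
    $errstr = '';
    // The @ suppresses the PHP warning that fsockopen emits on failure.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Assumed defaults: Thrift server on the local machine, port 9090.
echo thriftPortIsOpen('127.0.0.1', 9090)
    ? "Thrift port is open\n"
    : "Thrift port is closed\n";
```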

 

To operate HBase from PHP files, first generate the class files; for details, see the generation method in the referenced article. When I tested that method I hit a bug: the namespace in the generated class files was empty, whereas the class files generated from the official source code repository declare namespace Hbase. Pay attention to this difference.
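To find out which variant of the generated files has actually been loaded, you can ask PHP whether the class exists inside the Hbase namespace or only in the global namespace. This is a minimal sketch; resolveHbaseClass is a hypothetical helper written for this example and is not part of Thrift or of the generated code.

```php
<?php
// Hypothetical helper: returns the class name to instantiate for a generated
// Thrift class, preferring the namespaced variant (\Hbase\Foo) over the
// global one (Foo). Returns null if neither variant has been loaded.
function resolveHbaseClass($shortName)
{
    $namespaced = 'Hbase\\' . $shortName;
    if (class_exists($namespaced)) {
        return '\\' . $namespaced;
    }
    if (class_exists($shortName)) {
        return $shortName;
    }
    return null;
}

$class = resolveHbaseClass('HbaseClient');
echo $class === null
    ? "HbaseClient is not loaded\n"
    : "Using $class\n";
```

If the result is the un-namespaced name, code written as new \Hbase\Mutation(...) will fail with a class-not-found error, which is the symptom to look for.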

I debugged a set of driver files and put them on GitHub. You can download them as needed.

https://github.com/xinqiyang/buddy/tree/master/Vender/thrift
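With the library and the generated classes in place, a client is usually built from a socket, a transport, and a protocol. The sketch below assumes the namespaced PHP library layout shipped with Thrift 0.9.0 and generated classes in the Hbase namespace. The helper name openHbaseClient, the two directory parameters, and the timeout values are assumptions for illustration and are not taken from the driver files above.

```php
<?php
// Hypothetical helper: loads the Thrift classes and builds an HBase client.
// $libDir is assumed to contain the Thrift/ library directory, and $genDir
// the generated Hbase/ directory. The transport is returned unopened.
function openHbaseClient($libDir, $genDir, $host = 'localhost', $port = 9090)
{
    require_once $libDir . '/Thrift/ClassLoader/ThriftClassLoader.php';

    // Register autoloading for the library and for the generated classes.
    $loader = new \Thrift\ClassLoader\ThriftClassLoader();
    $loader->registerNamespace('Thrift', $libDir);
    $loader->registerDefinition('Hbase', $genDir);
    $loader->register();

    $socket = new \Thrift\Transport\TSocket($host, $port);
    $socket->setSendTimeout(10000); // milliseconds, assumed value
    $socket->setRecvTimeout(20000); // milliseconds, assumed value

    $transport = new \Thrift\Transport\TBufferedTransport($socket);
    $protocol = new \Thrift\Protocol\TBinaryProtocol($transport);
    $client = new \Hbase\HbaseClient($protocol);

    return array($transport, $client);
}

// Usage (paths are placeholders for your own layout):
// list($transport, $client) = openHbaseClient('/path/to/lib', '/path/to/gen-php');
// $transport->open();
```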

Next, for the test, I referred to the test class at http://blog.csdn.net/hguisu/article/details/7298456, wrote a test of my own, and debugged it as follows:

// NOTE: the opening lines of this script are missing from the source page.
// They loaded the Thrift classes and created $transport and $client.
$transport->open();
// echo "Time: " . $client->time();

// List the existing tables.
$tables = $client->getTableNames();
sort($tables);
foreach ($tables as $name) {
    echo $name . "\r\n";
}

// Define the column families, then create the table.
$columns = array(
    new \Hbase\ColumnDescriptor(array('name' => 'id:', 'maxVersions' => 10)),
    new \Hbase\ColumnDescriptor(array('name' => 'name:')),
    new \Hbase\ColumnDescriptor(array('name' => 'score:')),
);
$tableName = "student";
/*
try {
    $client->createTable($tableName, $columns);
} catch (\Hbase\AlreadyExists $ae) {
    var_dump("WARN: {$ae->message}\n");
}
*/

// Get the table's column descriptors.
$descriptors = $client->getColumnDescriptors($tableName);
asort($descriptors);
foreach ($descriptors as $col) {
    var_dump("  column: {$col->name}, maxVer: {$col->maxVersions}\n");
}

// Add or update column data.
$time = time();
var_dump($time);
$row = '2';
$valid = "foobar-" . $time;
$mutations = array(
    new \Hbase\Mutation(array('column' => 'score', 'value' => $valid)),
);
$mutations1 = array(
    new \Hbase\Mutation(array('column' => 'score:a', 'value' => $time)),
);
$attributes = array();

// Write a new row.
$row1 = $time;
$client->mutateRow($tableName, $row1, $mutations1, $attributes);
echo "-------write row $row1 ---\r\n";

// Update an existing row.
$client->mutateRow($tableName, $row, $mutations, $attributes);

// Get the data of one column.
$row_name = $time;
$fam_col_name = 'score:a';
$arr = $client->get($tableName, $row_name, $fam_col_name, $attributes);
// $arr is an array; each $v is a TCell.
foreach ($arr as $k => $v) {
    echo " ------ get one : value = {$v->value} , \r\n";
    echo " ------ get one : timestamp = {$v->timestamp}\r\n";
}
echo "----------\r\n";

$arr = $client->getRow($tableName, $row_name, $attributes);
// getRow returns an array; $k is the index (unused), each element is a TRowResult.
foreach ($arr as $k => $TRowResult) {
    var_dump($TRowResult);
}
echo "----------\r\n";

/****** not tested
public function scannerOpenWithScan($tableName, \Hbase\TScan $scan, $attributes);
public function scannerOpen($tableName, $startRow, $columns, $attributes);
public function scannerOpenWithStop($tableName, $startRow, $stopRow, $columns, $attributes);
public function scannerOpenWithPrefix($tableName, $startAndPrefix, $columns, $attributes);
public function scannerOpenTs($tableName, $startRow, $columns, $timestamp, $attributes);
public function scannerOpenWithStopTs($tableName, $startRow, $stopRow, $columns, $timestamp, $attributes);
public function scannerGet($id);
public function scannerGetList($id, $nbRows);
public function scannerClose($id);
*/
echo "----scanner get ------\r\n";
$startRow = '1';
$columns = array('score'); // a plain list of column names
// One scanner must be opened before reading; the alternatives stay commented out.
$scan = $client->scannerOpen($tableName, $startRow, $columns, $attributes);
//$startAndPrefix = '13686667';
//$scan = $client->scannerOpenWithPrefix($tableName, $startAndPrefix, $columns, $attributes);
//$startRow = '1';
//$stopRow = '2';
//$scan = $client->scannerOpenWithStop($tableName, $startRow, $stopRow, $columns, $attributes);
//$arr = $client->scannerGet($scan);
$nbRows = 1000;
$arr = $client->scannerGetList($scan, $nbRows);
var_dump('count of result :' . count($arr));
foreach ($arr as $k => $TRowResult) {
    // code...
    //var_dump($TRowResult);
}
$client->scannerClose($scan);

// Close the transport.
$transport->close();

  

The test covers creating a table, inserting a row, reading the table's column descriptors, updating a row, and scanning the table.
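The test reads a single batch of at most 1000 rows from the scanner. To read a whole key range, the scanner can be drained in a loop until it returns an empty list. The sketch below uses only the scanner methods listed in the test (scannerOpenWithStop, scannerGetList, scannerClose); the helper name scanRange and the batch size are assumptions made for this example.

```php
<?php
// Hypothetical helper: collects all rows from $startRow up to $stopRow by
// reading the scanner in batches. $client is any object exposing the HBase
// Thrift scanner methods used below.
function scanRange($client, $tableName, $startRow, $stopRow, $columns, $batchSize = 100)
{
    $attributes = array();
    $rows = array();
    $scannerId = $client->scannerOpenWithStop($tableName, $startRow, $stopRow, $columns, $attributes);
    try {
        while (true) {
            $batch = $client->scannerGetList($scannerId, $batchSize);
            if (count($batch) === 0) {
                break; // an empty batch means the scanner is exhausted
            }
            foreach ($batch as $rowResult) {
                $rows[] = $rowResult;
            }
        }
    } catch (\Exception $e) {
        // Release the server-side scanner before passing the error on.
        $client->scannerClose($scannerId);
        throw $e;
    }
    $client->scannerClose($scannerId);
    return $rows;
}

// Usage, assuming $client is a connected HBase Thrift client:
// $rows = scanRange($client, 'student', '1', '2', array('score'));
```

The try/catch with a rethrow is used instead of finally so that the sketch also runs on PHP 5.3.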

 

In actual operations, note the following:

1. The PHP version must support namespaces, so PHP 5.3 or later is required.

2. I installed Thrift's PHP extension, but it does not seem to actually be used; the relevant PHP files are still needed. If someone wrote a full extension, I wonder whether performance would improve.

3. For scan-related operations, I tested the start/stop scan and the prefix scan, and both work.

4. I find PHP namespaces frustrating. The backslash separator feels really unnatural.
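The first two points can be checked from PHP itself. In this sketch, thrift_protocol is the name under which the extension from the Thrift source tree registers itself, and checkThriftEnvironment is a hypothetical helper name. As far as I can tell, the generated code only calls into the extension when the client is built on TBinaryProtocolAccelerated rather than TBinaryProtocol, which may be why the extension appeared unused.

```php
<?php
// Hypothetical helper: reports whether this PHP build meets the requirements
// discussed in the notes.
function checkThriftEnvironment()
{
    return array(
        'php_version' => PHP_VERSION,
        // Namespaces were introduced in PHP 5.3.0.
        'namespaces_supported' => version_compare(PHP_VERSION, '5.3.0', '>='),
        // True only if the optional Thrift C extension is loaded.
        'thrift_extension_loaded' => extension_loaded('thrift_protocol'),
    );
}

print_r(checkThriftEnvironment());
```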

Next, when I have time, I will try several other operations, run stress tests, and deploy this to the cluster.

Thanks to hguisu for the article (see References), which helped me get started quickly.

 

Update:

2013-05-17: After starting Thrift on the cluster, I found that write operations are still unstable, with serious timeouts. The PHP operation class needs to be optimized for this; in fact, I feel the operation class is written in an overly complicated way.
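One common mitigation for intermittent timeouts is to retry the failed call with a growing delay. This is not the fix used in this article, only a generic sketch. The helper name retryWithBackoff, the attempt count, and the delays are all assumptions. Note also that a write retried after a timeout may already have been applied on the server, so it can end up stored as an extra cell version.

```php
<?php
// Hypothetical helper: calls $operation and retries it when it throws,
// doubling the delay after each failed attempt. Rethrows the last error
// once $maxAttempts is reached.
function retryWithBackoff($operation, $maxAttempts = 3, $baseDelayMs = 100)
{
    $attempt = 0;
    while (true) {
        $attempt++;
        try {
            return call_user_func($operation);
        } catch (\Exception $e) {
            if ($attempt >= $maxAttempts) {
                throw $e;
            }
            // Delay in microseconds: base, 2x base, 4x base, ...
            usleep($baseDelayMs * 1000 * pow(2, $attempt - 1));
        }
    }
}

// Usage, assuming $client, $tableName, $row, and $mutations exist as in the test:
// retryWithBackoff(function () use ($client, $tableName, $row, $mutations) {
//     return $client->mutateRow($tableName, $row, $mutations, array());
// });
```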

 

 

 

References:

http://blog.csdn.net/hguisu/article/details/7298456

 

 

 
