Recently, our project needed Thrift and PHP to read and write data in HBase, so I sorted out the relevant classes and tested them.
Currently, HBase can be operated in the following ways:
1. HBase Shell: after configuration, run shell commands to view HBase data, such as count 'xxxx' and scan 'xxxx'.
2. Native Java API: wrap it in a RESTful API yourself and operate HBase through that API over HTTP.
3. Thrift: a serialization technology that supports C++, PHP, Python, and other languages, which makes it suitable for operating HBase from heterogeneous systems.
4. HBasExplorer: a graphical client for operating HBase, described earlier at http://www.cnblogs.com/scotoma/archive/2012/12/18/2824311.html.
5. Hive/Pig: I have not used these yet.
This article uses the third way, Thrift. It was open-sourced by Facebook, and the official website is http://thrift.apache.org/.
Download, install, and start Thrift. For details, see the references.
Check that it is running successfully...
To operate HBase from PHP, first generate the class files. For details, refer to the generation method in the referenced article. However, when I tested that generation method I hit a bug: the namespace in the generated class files is empty, whereas the files in the official source repository declare namespace Hbase. Pay attention to this.
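One way to catch the empty-namespace problem early is to inspect the generated file before using it. The sketch below is only an illustration: declaresNamespace is a hypothetical helper, not part of Thrift, and the path ./lib/Hbase.php is an assumption about where the generated file lives. It expects the namespace declaration on its own line, which is how generated files usually lay it out.

```php
<?php
// Hypothetical helper (not part of Thrift): report whether PHP source text
// declares the given namespace on its own line, e.g. "namespace Hbase;".
function declaresNamespace($source, $namespace)
{
    $pattern = '/^\s*namespace\s+' . preg_quote($namespace, '/') . '\s*[;{]/mi';
    return preg_match($pattern, $source) === 1;
}

// Assumed location of the generated file; adjust to your own layout.
$file = './lib/Hbase.php';
if (is_file($file)) {
    $ok = declaresNamespace(file_get_contents($file), 'Hbase');
    echo $file . ($ok ? " declares namespace Hbase\n" : " does not declare namespace Hbase\n");
}
```

If the check fails, the generated classes will not resolve as \Hbase\HbaseClient and similar names, so regenerate the files or use the ones from the official source repository.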
I debugged a working driver and put it on GitHub. Download it if you need it.
https://github.com/xinqiyang/buddy/tree/master/Vender/thrift
Next comes the test. Referring to the test class at http://blog.csdn.net/hguisu/article/details/7298456, I wrote and debugged the following test:
<?php
/**
 * Thrift Test Class by xinqiyang
 */
// Show all errors while testing.
ini_set('display_errors', '1');
error_reporting(E_ALL);

$GLOBALS['THRIFT_ROOT'] = './lib';

/* Dependencies. In the proper order. */
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Transport/TTransport.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Transport/TSocket.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Protocol/TProtocol.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Protocol/TBinaryProtocol.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Transport/TBufferedTransport.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Type/TMessageType.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Factory/TStringFuncFactory.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/StringFunc/TStringFunc.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/StringFunc/Core.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Type/TType.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Exception/TException.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Exception/TTransportException.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Thrift/Exception/TProtocolException.php';

/* Remember these two files? */
require_once $GLOBALS['THRIFT_ROOT'].'/Types.php';
require_once $GLOBALS['THRIFT_ROOT'].'/Hbase.php';

use Thrift\Protocol\TBinaryProtocol;
use Thrift\Transport\TSocket;
use Thrift\Transport\TSocketPool;
use Thrift\Transport\TFramedTransport;
use Thrift\Transport\TBufferedTransport;
use Hbase\HbaseClient;

// define host and port
$host = '192.168.56.56';
$port = 9090;
$socket = new TSocket($host, $port);
$transport = new TBufferedTransport($socket);
$protocol = new TBinaryProtocol($transport);
// create the HBase client
$client = new HbaseClient($protocol);
$transport->open();

//echo "Time: " . $client->time();

// list all tables
$tables = $client->getTableNames();
sort($tables);
foreach ($tables as $name) {
    echo $name."\r\n";
}

// define the column families, then create a table
$columns = array(
    new \Hbase\ColumnDescriptor(array('name' => 'id:', 'maxVersions' => 10)),
    new \Hbase\ColumnDescriptor(array('name' => 'name:')),
    new \Hbase\ColumnDescriptor(array('name' => 'score:')),
);
$tableName = "student";
/*
try {
    $client->createTable($tableName, $columns);
} catch (\Hbase\AlreadyExists $ae) {
    var_dump("WARN: {$ae->message}\n");
}
*/

// get table descriptors
$descriptors = $client->getColumnDescriptors($tableName);
asort($descriptors);
foreach ($descriptors as $col) {
    var_dump(" column: {$col->name}, maxVer: {$col->maxVersions}\n");
}

// set column
// add or update column data
$time = time();
var_dump($time);
$row = '2';
$valid = "foobar-".$time;
$mutations = array(
    new \Hbase\Mutation(array('column' => 'score', 'value' => $valid)),
);
$mutations1 = array(
    new \Hbase\Mutation(array('column' => 'score:a', 'value' => $time)),
);
$attributes = array();

// add row: write a new row keyed by the current timestamp
$row1 = $time;
$client->mutateRow($tableName, $row1, $mutations1, $attributes);
echo "-------write row $row1 ---\r\n";

// update row
$client->mutateRow($tableName, $row, $mutations, $attributes);

// get column data
$row_name = $time;
$fam_col_name = 'score:a';
$arr = $client->get($tableName, $row_name, $fam_col_name, $attributes);
// $arr is an array of TCell
foreach ($arr as $k => $v) {
    // $v is a TCell
    echo " ------ get one : value = {$v->value} , <br> ";
    echo " ------ get one : timestamp = {$v->timestamp} <br>";
}
echo "----------\r\n";

$arr = $client->getRow($tableName, $row_name, $attributes);
// $client->getRow returns an array
foreach ($arr as $k => $TRowResult) {
    // $k = 0 ; not used
    // $TRowResult is a TRowResult
    var_dump($TRowResult);
}
echo "----------\r\n";

/******
//not tested
public function scannerOpenWithScan($tableName, \Hbase\TScan $scan, $attributes);
public function scannerOpen($tableName, $startRow, $columns, $attributes);
public function scannerOpenWithStop($tableName, $startRow, $stopRow, $columns, $attributes);
public function scannerOpenWithPrefix($tableName, $startAndPrefix, $columns, $attributes);
public function scannerOpenTs($tableName, $startRow, $columns, $timestamp, $attributes);
public function scannerOpenWithStopTs($tableName, $startRow, $stopRow, $columns, $timestamp, $attributes);
public function scannerGet($id);
public function scannerGetList($id, $nbRows);
public function scannerClose($id);
*/
echo "----scanner get ------\r\n";
$startRow = '1';
// columns to scan, passed as a plain list of names
$columns = array('score');
// Exactly one scannerOpen* call must be active so that $scan is defined;
// the plain scan is enabled here, the variants below are alternatives.
$scan = $client->scannerOpen($tableName, $startRow, $columns, $attributes);
//$startAndPrefix = '13686667';
//$scan = $client->scannerOpenWithPrefix($tableName, $startAndPrefix, $columns, $attributes);
//$startRow = '1';
//$stopRow = '2';
//$scan = $client->scannerOpenWithStop($tableName, $startRow, $stopRow, $columns, $attributes);
//$arr = $client->scannerGet($scan);
$nbRows = 1000;
$arr = $client->scannerGetList($scan, $nbRows);
var_dump('count of result :'.count($arr));
foreach ($arr as $k => $TRowResult) {
    // code...
    //var_dump($TRowResult);
}
$client->scannerClose($scan);

// close transport
$transport->close();
The test above covers CreateTable, Insert Row, Get Table, Update Row, and Scan Table.
In practice, note the following:
1. The PHP version must support namespaces, so PHP 5.3 or later is required.
2. I installed Thrift's PHP extension, but it does not seem to be used in practice; the PHP library files are still required. Could someone write a proper extension? I don't know whether it would improve performance.
3. For scan operations, I tested the start/stop scan and the prefix scan. Both work.
4. I find PHP namespaces frustrating. The backslash separator feels so unnatural...
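The first two notes can be turned into a small pre-flight check. This is a minimal sketch, not part of the original test: supportsNamespaces is a hypothetical helper, and the extension name thrift_protocol is an assumption about how the Thrift C extension registers itself, so verify it against your own build.

```php
<?php
// Namespaces were introduced in PHP 5.3.0, so anything older cannot load
// the namespaced Thrift and Hbase classes.
function supportsNamespaces($version)
{
    return version_compare($version, '5.3.0', '>=');
}

if (!supportsNamespaces(PHP_VERSION)) {
    echo "PHP " . PHP_VERSION . " is too old; 5.3.0 or later is required\n";
}

// Assumption: the Thrift C extension is registered as "thrift_protocol".
// Without it, the pure-PHP library files do all the work.
echo extension_loaded('thrift_protocol')
    ? "thrift_protocol extension loaded\n"
    : "thrift_protocol extension not loaded; using the pure-PHP classes\n";
```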
Next, when I have time, I will try several other operations, run stress tests, and deploy to the cluster.
Thanks to hguisu's article (see References), which helped me get started quickly.
Update:
20130517: After starting Thrift on the cluster, I found that write operations are still unstable, with serious timeouts. The PHP operation class needs to be optimized for this. In fact, I feel the operation class is too complicated.
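One possible mitigation for the write timeouts, offered as a sketch rather than as the fix that was actually applied, is to retry a failed write a few times with a growing delay. retryOnFailure below is a hypothetical helper in plain PHP. As far as I know, the Thrift PHP TSocket class also has setSendTimeout and setRecvTimeout (in milliseconds), which may be worth raising as well; check this against your library version.

```php
<?php
// Hypothetical helper: call $operation and retry on any exception, up to
// $maxAttempts attempts, sleeping $delayMs * attempt milliseconds between tries.
function retryOnFailure($operation, $maxAttempts, $delayMs)
{
    if ($maxAttempts < 1) {
        throw new \InvalidArgumentException('maxAttempts must be at least 1');
    }
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            // The attempt number is passed in so callers can log it.
            return call_user_func($operation, $attempt);
        } catch (\Exception $e) {
            $lastError = $e;
            if ($attempt < $maxAttempts) {
                usleep($delayMs * $attempt * 1000);
            }
        }
    }
    // Every attempt failed: surface the last error to the caller.
    throw $lastError;
}

// Usage with the test script's variables (shown as a comment because it
// needs a live Thrift connection):
// retryOnFailure(function () use ($client, $tableName, $row, $mutations, $attributes) {
//     $client->mutateRow($tableName, $row, $mutations, $attributes);
// }, 3, 200);
```

Note that a write which timed out may still have reached HBase, so a retry can store the same value again as a newer version. The transport may also need to be reopened after a timeout before the next attempt.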
References:
http://blog.csdn.net/hguisu/article/details/7298456