Reading notes: HBase in Action, Part II Advanced Concepts (2): Coprocessors

Source: Internet
Author: User

Coprocessors are a feature introduced in HBase 0.92.0. With coprocessors, some computational logic can be pushed down to the HBase nodes, which turns HBase from a simple storage system into a distributed data processing platform.

Coprocessors come in two types: observers and endpoints.

An observer changes the behavior of existing client operations, while an endpoint introduces a new client operation.

Observer

An observer works like a database trigger or an advice in AOP. When an observer is attached to the put operation, steps 1-2-4-6 form the normal RPC call path of a put, while steps 3 and 5 belong to the observer, which can run custom processing logic before and after the put.
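
The hook idea can be illustrated without HBase at all. The following is a minimal sketch in plain Java, assuming a hypothetical in-memory store; `PutObserver`, `ObservedStore`, and `ObserverSketch` are names made up for this illustration and are not HBase classes. It only shows that observer logic wraps the normal write path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an observer: hooks that run around a put.
interface PutObserver {
    void prePut(String row, String value, List<String> log);
    void postPut(String row, String value, List<String> log);
}

// Hypothetical in-memory "region"; this is not an HBase class.
class ObservedStore {
    private final Map<String, String> rows = new HashMap<>();
    private final List<PutObserver> observers = new ArrayList<>();
    final List<String> log = new ArrayList<>();

    void register(PutObserver observer) {
        observers.add(observer);
    }

    void put(String row, String value) {
        for (PutObserver o : observers) {
            o.prePut(row, value, log);   // observer logic before the write
        }
        rows.put(row, value);            // the normal put path
        log.add("put " + row);
        for (PutObserver o : observers) {
            o.postPut(row, value, log);  // observer logic after the write
        }
    }

    String get(String row) {
        return rows.get(row);
    }
}

class ObserverSketch {
    // Registers one observer, performs one put, and returns the recorded event order.
    static List<String> demo() {
        ObservedStore store = new ObservedStore();
        store.register(new PutObserver() {
            @Override
            public void prePut(String row, String value, List<String> log) {
                log.add("pre " + row);
            }

            @Override
            public void postPut(String row, String value, List<String> log) {
                log.add("post " + row);
            }
        });
        store.put("r1", "v1");
        return store.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The client calling `put` is unaware of the hooks, which is the property that makes observers comparable to triggers.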


There are three kinds of observer: RegionObserver (for data access and update operations; runs on the region), WALObserver (for WAL events; runs in the RegionServer context), and MasterObserver (for DDL operations; runs on the master node).

Endpoint

An endpoint works like a database stored procedure. It is implemented by extending the HBase RPC protocol to expose a new operation interface to the client.

For example, the client initiates the call and collects the results, while the server nodes perform the computation in parallel.
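
This division of labor resembles a scatter/gather pattern. A minimal sketch in plain Java, assuming each inner list stands in for the rows of one region and a thread pool stands in for the server nodes; `EndpointSketch` and `countAcrossRegions` are hypothetical names, not part of HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class EndpointSketch {
    // Each inner list stands in for the rows held by one region (hypothetical layout).
    static long countAcrossRegions(List<List<String>> regions) {
        ExecutorService servers = Executors.newFixedThreadPool(Math.max(1, regions.size()));
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (final List<String> region : regions) {
                // "Server side": each region computes its own partial count.
                Callable<Long> task = () -> (long) region.size();
                partials.add(servers.submit(task));
            }
            // "Client side": only the small partial results are collected and summed.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("partial count failed", e);
        } finally {
            servers.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> regions = Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("c"),
            Arrays.asList("d", "e", "f"));
        System.out.println(countAcrossRegions(regions));
    }
}
```

Only one number per region crosses the simulated network boundary, rather than every row.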


Hands-on example

Taking the follows table from the previous chapter as an example, an observer is used to keep the followedBy table consistent with it, and an endpoint is used to count a user's followers.

Because an insert into the follows table now also triggers an insert into the followedBy table, the usernames of both the follower and the followed user are needed, so the schema is upgraded first.


Implement Observer

Three comments in the code are worth noting:

    1. The postPut method is called after the put operation.
    2. If the observer is installed through hbase-site.xml, it applies to all tables, so the code checks whether the put targets the follows table.
    3. There is a bit of a code smell here. The observer runs on the server side, yet it calls client code in order to share code. This is for demonstration purposes only.

    package hbaseia.twitbase.coprocessors;

    // ...

    public class FollowsObserver extends BaseRegionObserver {

      private HTablePool pool = null;

      @Override
      public void start(CoprocessorEnvironment env) throws IOException {
        pool = new HTablePool(env.getConfiguration(), Integer.MAX_VALUE);
      }

      @Override
      public void stop(CoprocessorEnvironment env) throws IOException {
        pool.close();
      }

      @Override
      public void postPut(   // 1, called after a put operation
          final ObserverContext<RegionCoprocessorEnvironment> e,
          final Put put,
          final WALEdit edit,
          final boolean writeToWAL)
        throws IOException {

        byte[] table = e.getEnvironment().getRegion().getRegionInfo().getTableName();
        if (!Bytes.equals(table, FOLLOWS_TABLE_NAME))
          return;   // 2, check the table name

        KeyValue kv = put.get(RELATION_FAM, FROM).get(0);
        String from = Bytes.toString(kv.getValue());
        kv = put.get(RELATION_FAM, TO).get(0);
        String to = Bytes.toString(kv.getValue());

        RelationsDAO relations = new RelationsDAO(pool);
        relations.addFollowedBy(to, from);   // 3, insert into the followedBy table
      }
    }

An observer can be installed either by editing hbase-site.xml or by altering the table schema. The former requires restarting the HBase service, while the latter only requires taking the corresponding table offline and bringing it back online.

    $ hbase shell
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 0.92.0, r1231986, Mon Jan, 13:16:35 UTC 2012

    hbase(main):001:0> disable 'follows'
    0 row(s) in 7.0560 seconds

    hbase(main):002:0> alter 'follows', METHOD => 'table_att', 'coprocessor' => 'file:///Users/ndimiduk/repos/hbaseiatwitbase/target/twitbase-1.0.0.jar|hbaseia.twitbase.coprocessors.FollowsObserver|1001|'
    Updating all regions with the new schema...
    1/1 regions updated.
    Done.
    0 row(s) in 1.0770 seconds

    hbase(main):003:0> enable 'follows'
    0 row(s) in 2.0760 seconds

Here 1001 is the priority. When multiple observers are loaded, they execute in priority order.
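
The attribute value passed to alter is a pipe-separated string: jar path, class name, priority, and an optional trailing arguments field. A minimal sketch of splitting such a value and ordering several of them follows, in plain Java. `CoprocessorSpecSketch`, `Spec`, and the example jar paths and class names in `main` are hypothetical, and the sketch assumes that smaller priority numbers run first, which is my understanding of HBase's convention rather than something these notes state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CoprocessorSpecSketch {
    // Hypothetical holder for the parsed fields of one attribute value.
    static final class Spec {
        final String jarPath;
        final String className;
        final int priority;

        Spec(String jarPath, String className, int priority) {
            this.jarPath = jarPath;
            this.className = className;
            this.priority = priority;
        }
    }

    // Splits "jarPath|className|priority|args"; the trailing args field is ignored here.
    static Spec parse(String value) {
        String[] parts = value.split("\\|", -1);
        if (parts.length < 3) {
            throw new IllegalArgumentException("expected jar|class|priority: " + value);
        }
        return new Spec(parts[0].trim(), parts[1].trim(), Integer.parseInt(parts[2].trim()));
    }

    // Returns class names in execution order, assuming smaller numbers run first.
    static List<String> loadOrder(List<String> values) {
        List<Spec> specs = new ArrayList<>();
        for (String value : values) {
            specs.add(parse(value));
        }
        specs.sort(Comparator.comparingInt(s -> s.priority));
        List<String> names = new ArrayList<>();
        for (Spec spec : specs) {
            names.add(spec.className);
        }
        return names;
    }

    public static void main(String[] args) {
        // Both values below are made-up examples.
        System.out.println(loadOrder(Arrays.asList(
            "file:///tmp/b.jar|com.example.SecondObserver|1002|",
            "file:///tmp/a.jar|com.example.FirstObserver|1001|")));
    }
}
```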

Implement Endpoint

The follower count could also be computed with a client-side scan. Compared with the endpoint approach, a client-side scan has two drawbacks:

    1. All follower rows are transferred to the client, which causes unnecessary network I/O.
    2. Once the full result set has arrived, the client counts it by iterating in a single thread.

Implementing an endpoint involves three parts:

Defining the RPC Interface

    public interface RelationCountProtocol extends CoprocessorProtocol {
      public long followedByCount(String userId) throws IOException;
    }

Server-side implementation

Unlike a client-side scanner, InternalScanner runs on a specific region and returns raw KeyValue objects.

    package hbaseia.twitbase.coprocessors;

    // ...

    public class RelationCountImpl extends BaseEndpointCoprocessor
        implements RelationCountProtocol {

      @Override
      public long followedByCount(String userId) throws IOException {
        byte[] startkey = Md5Utils.md5sum(userId);
        Scan scan = new Scan(startkey);
        scan.setFilter(new PrefixFilter(startkey));
        scan.addColumn(RELATION_FAM, FROM);
        scan.setMaxVersions(1);

        RegionCoprocessorEnvironment env =
          (RegionCoprocessorEnvironment) getEnvironment();
        InternalScanner scanner = env.getRegion().getScanner(scan);   // 1, server-side scanner

        long sum = 0;
        List<KeyValue> results = new ArrayList<KeyValue>();
        boolean hasMore = false;
        do {
          hasMore = scanner.next(results);
          sum += results.size();
          results.clear();
        } while (hasMore);
        scanner.close();
        return sum;
      }
    }

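
The do/while shape of the counting loop matters if the final call to next can fill the list and still return false. A minimal sketch of that loop in plain Java, assuming a hypothetical `BatchScanner` interface with those semantics; it is not the HBase InternalScanner, and `ScannerLoopSketch` and `over` are names invented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ScannerLoopSketch {
    // Hypothetical scanner: fills `out` with the next batch and
    // returns true only if further batches remain.
    interface BatchScanner {
        boolean next(List<String> out);
    }

    // Builds a scanner over pre-made batches, for illustration only.
    static BatchScanner over(List<List<String>> batches) {
        final Iterator<List<String>> it = batches.iterator();
        return out -> {
            if (it.hasNext()) {
                out.addAll(it.next());
            }
            return it.hasNext();
        };
    }

    static long count(BatchScanner scanner) {
        long sum = 0;
        List<String> results = new ArrayList<>();
        boolean hasMore;
        do {
            hasMore = scanner.next(results);  // may fill results even when it returns false
            sum += results.size();            // so count before testing hasMore
            results.clear();                  // reuse the list for the next batch
        } while (hasMore);
        return sum;
    }

    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("c"));
        System.out.println(count(over(batches)));
    }
}
```

A plain while loop that tested hasMore first would skip the last batch under these assumed semantics.
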
Client code

Comments in the code worth noting:

    1. Define a Call instance.
    2. Invoke the server-side endpoint.
    3. Aggregate the results from all RegionServers.

    public long followedByCount(final String userId) throws Throwable {
      HTableInterface followed = pool.getTable(FOLLOWED_TABLE_NAME);

      final byte[] startKey = Md5Utils.md5sum(userId);
      final byte[] endKey = Arrays.copyOf(startKey, startKey.length);
      endKey[endKey.length-1]++;

      Batch.Call<RelationCountProtocol, Long> callable =
        new Batch.Call<RelationCountProtocol, Long>() {
          @Override
          public Long call(RelationCountProtocol instance) throws IOException {
            return instance.followedByCount(userId);
          }
        };   // 1, Call instance

      Map<byte[], Long> results =
        followed.coprocessorExec(
          RelationCountProtocol.class,
          startKey,
          endKey,
          callable);   // 2, invoke the endpoint

      long sum = 0;
      for (Map.Entry<byte[], Long> e : results.entrySet()) {
        sum += e.getValue().longValue();
      }   // 3, aggregate results
      return sum;
    }

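
The client code builds its key range by copying the start key and incrementing the last byte. If that byte happened to be 0xFF, the increment would wrap around to 0x00 and the end key would sort before the start key. The following is a minimal sketch of a more defensive variant in plain Java; `PrefixRangeSketch`, `stopKeyFor`, and `md5` are hypothetical helpers written for this note, not the book's Md5Utils and not an HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class PrefixRangeSketch {
    // Hashes a user id to a fixed-width 16-byte key prefix.
    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the smallest key that sorts after every key starting with `prefix`,
    // or null if no such key exists (every byte is 0xFF).
    static byte[] stopKeyFor(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                // Drop trailing 0xFF bytes and bump the last byte that can still grow.
                byte[] stop = Arrays.copyOf(prefix, i + 1);
                stop[i]++;
                return stop;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] start = md5("someUser");   // "someUser" is a made-up id
        byte[] stop = stopKeyFor(start);
        System.out.println(start.length + " byte start key, stop key "
            + (stop == null ? "unbounded" : stop.length + " bytes"));
    }
}
```

For most hashes this produces the same end key as the simple increment; the two differ only when the prefix ends in one or more 0xFF bytes.
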
An endpoint can only be deployed through the configuration file, and its jar must be added to the HBase classpath.

    <property>
      <name>hbase.coprocessor.region.classes</name>
      <value>hbaseia.twitbase.coprocessors.RelationCountImpl</value>
    </property>
