HBase MapReduce: HBase as an example of an input source | That Yi-Wipe Smile

Source: Internet
Author: User
Tags zookeeper

Blog: That Yi-Wipe Smile. CSDN blog address: http://blog.csdn.net/u012185296 itdog8 address: http://www.itdog8.com/thread-203-1-1.html
Blog post title: HBase MapReduce: HBase as an example of an input source | That Yi-Wipe Smile
Personal signature: The furthest distance in the world is not the horizon, nor the cape, but that I stand in front of you and you do not feel my presence.
Technical direction: Flume + Kafka + Storm + Redis/HBase + Hadoop + Hive + Mahout + Spark ... cloud computing technology
Reprint statement: This article may be reproduced, but the original source, author information, and this copyright statement must be indicated in the form of a hyperlink. Thank you for your cooperation!
QQ exchange group: 214293307 (looking forward to learning and progressing together). Reference: http://abloz.com/hbase/book.html#mapreduce.example


1 Official website code

The following is an example of using HBase as a read-only MapReduce source. In particular, there is only a Mapper instance and no Reducer, and the Mapper produces no output. The job is set up as shown below.
Configuration config = HBaseConfiguration.create();
Job job = new Job(config, "ExampleRead");
job.setJarByClass(MyReadJob.class);    // class that contains the mapper

Scan scan = new Scan();
scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
scan.setCacheBlocks(false);  // don't set to true for MR jobs
// set other scan attrs
...

TableMapReduceUtil.initTableMapperJob(
    tableName,        // input HBase table name
    scan,             // Scan instance to control CF and attribute selection
    MyMapper.class,   // mapper
    null,             // mapper output key
    null,             // mapper output value
    job);
job.setOutputFormatClass(NullOutputFormat.class);  // because we aren't emitting anything from the mapper

boolean b = job.waitForCompletion(true);
if (!b) {
  throw new IOException("error with job!");
}
... the mapper needs to extend TableMapper ...

public static class MyMapper extends TableMapper<Text, LongWritable> {

  public void map(ImmutableBytesWritable row, Result value, Context context)
      throws InterruptedException, IOException {
    // process data for the row from the Result instance.
  }
}

2 My reference code
package com.itdog8.cloud.hbase.mr.test;

import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
 * TestHBaseAsSourceMapReduceMainClass
 *
 * @author That Yi-Wipe Smile
 * @date 2015-07-30 18:00:21
 */
public class TestHBaseAsSourceMapReduceMainClass {

    private static final Log _log = LogFactory.getLog(TestHBaseAsSourceMapReduceMainClass.class);

    private static final String JOB_NAME = "TestHBaseAsSourceMapReduce";
    private static String tmpPath = "/tmp/com/itdog8/yting/TestHBaseAsSourceMapReduce";
    private static String hbaseInputTable = "itdog8:test_1";

    public static class ExampleSourceMapper extends TableMapper<Text, Text> {
        private Text k = new Text();
        private Text v = new Text();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            super.setup(context);
        }

        @Override
        protected void map(ImmutableBytesWritable key, Result value, Context context)
                throws IOException, InterruptedException {
            String rowKey = Bytes.toString(key.get());
            // You need to be familiar with the Result API here; the per-row business logic follows.
            try {
                // set values
                k.set("Look Baa");
                v.set("excrement You");
                // write to the reducer's input (or straight to output when there is no reducer)
                context.write(k, v);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            super.cleanup(context);
        }
    }

    public static void main(String[] args) throws Exception {
        // HBase configuration
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "a234-198.hadoop.com,a234-197.hadoop.com,a234-196.hadoop.com");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        // batching and caching
        Scan scan = new Scan();
        scan.setCaching(10000);
        scan.setCacheBlocks(false);
        scan.setMaxVersions(1);

        // set Hadoop speculative execution to false
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);

        // tmp index path
        tmpPath = args[0];
        Path tmpIndexPath = new Path(tmpPath);
        FileSystem fs = FileSystem.get(conf);
        if (fs.exists(tmpIndexPath)) {
            // fs.delete(tmpIndexPath, true); // dangerous
            // _log.info("delete tmp index path: " + tmpIndexPath.getName());
            _log.warn("The HDFS path [" + tmpPath + "] already exists, please choose another path.");
            return;
        }

        // job && conf
        Job job = new Job(conf, JOB_NAME);
        job.setJarByClass(TestHBaseAsSourceMapReduceMainClass.class);
        TableMapReduceUtil.initTableMapperJob(hbaseInputTable, scan, ExampleSourceMapper.class,
                Text.class, Text.class, job);
        // job.setReducerClass(MyReducer.class); // your own processing logic
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileOutputFormat.setOutputPath(job, tmpIndexPath);

        int success = job.waitForCompletion(true) ? 0 : 1;
        System.exit(success);
    }
}
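The fs.exists() guard in main() deliberately refuses to reuse an existing output path instead of deleting it, since a recursive delete of a mistyped path is dangerous. The same defensive pattern can be shown against the local filesystem with plain java.nio.file, no Hadoop required (the class name OutputGuardSketch and the demo path are hypothetical):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputGuardSketch {

    // Returns true only when the path is safe to use as a fresh output directory.
    // Mirrors the fs.exists(tmpIndexPath) check in main(): warn and bail out
    // rather than delete whatever is already there.
    static boolean safeToUse(Path out) {
        if (Files.exists(out)) {
            System.err.println("The path [" + out + "] already exists, please choose another path.");
            return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // hypothetical output location under the system temp directory
        Path out = Paths.get(System.getProperty("java.io.tmpdir"), "itdog8-demo-output");
        if (safeToUse(out)) {
            Files.createDirectories(out);
        }
    }
}
```

MapReduce's FileOutputFormat enforces the same rule itself: the job fails at submission time if the output directory already exists, so checking up front simply gives a clearer error message.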


Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
