The previous post covered operating a standalone HBase from Eclipse; readers who are not yet familiar with that setup can refer to it first:
Eclipse connects and operates a single version of HBase
This article walks through a MapReduce job that reads its data from HBase and counts values in a column, much like WordCount, except that the input comes from an HBase table rather than from files.
First, create the input source.
Start HBase and open the hbase shell. Note that my configuration file is no longer standalone; HDFS is now used as the file system:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
    <description>The location where the data is stored.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Set the number of replicas to 1 because this is pseudo-distributed.</description>
  </property>
</configuration>
After entering the hbase shell, create the tables:
hbase(main):007:0> create 'data_input', 'message'
0 row(s) in 1.1110 seconds
hbase(main):008:0> create 'data_output', {NAME => 'message', VERSIONS => 1}
0 row(s) in 1.0900 seconds
The data_input table stores the input data for the MapReduce job.
The data_output table stores the output data of the MapReduce job.
Next, generate random data and insert it into the data_input table. Here Eclipse is used to operate HBase and write the rows; the code is as follows:
package hbase_mapred1;

import java.util.Random;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class Importer1 {

    public static void main(String[] args) throws Exception {
        String[] pages = {"/", "/a.html", "/b.html", "/c.html"};

        Configuration hbaseConfig = HBaseConfiguration.create();
        HTable htable = new HTable(hbaseConfig, "data_input");
        // Buffer writes on the client side instead of flushing every Put
        htable.setAutoFlush(false);
        htable.setWriteBufferSize(1024 * 1024 * 12);

        int totalRecords = 100000;
        int maxID = totalRecords / 1000;
        Random rand = new Random();

        System.out.println("Importing " + totalRecords + " records ...");
        for (int i = 0; i < totalRecords; i++) {
            int userID = rand.nextInt(maxID) + 1;
            // Composite row key: 4-byte userId followed by a 4-byte counter
            byte[] rowkey = Bytes.add(Bytes.toBytes(userID), Bytes.toBytes(i));
            String randomPage = pages[rand.nextInt(pages.length)];
            Put put = new Put(rowkey);
            put.add(Bytes.toBytes("message"), Bytes.toBytes("page"), Bytes.toBytes(randomPage));
            htable.put(put);
        }
        htable.flushCommits();
        htable.close();
        System.out.println("Done");
    }
}
At this point the data has been written into the data_input table. Next, that table's rows are used
as the input to the MapReduce job.
The code is as follows:
package hbase_mapred1;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;

public class FreqCounter1 {

    static class Mapper1 extends TableMapper<ImmutableBytesWritable, IntWritable> {
        private int numRecords = 0;
        private static final IntWritable one = new IntWritable(1);

        @Override
        public void map(ImmutableBytesWritable row, Result values, Context context) throws IOException {
            // Extract the userKey from the composite key (userId + counter)
            ImmutableBytesWritable userKey =
                    new ImmutableBytesWritable(row.get(), 0, Bytes.SIZEOF_INT);
            try {
                context.write(userKey, one);
            } catch (InterruptedException e) {
                throw new IOException(e);
            }
            numRecords++;
            if ((numRecords % 10000) == 0) {
                context.setStatus("mapper processed " + numRecords + " records so far");
            }
        }
    }

    public static class Reducer1
            extends TableReducer<ImmutableBytesWritable, IntWritable, ImmutableBytesWritable> {

        public void reduce(ImmutableBytesWritable key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            // Write the per-user total into the data_output table
            Put put = new Put(key.get());
            put.add(Bytes.toBytes("message"), Bytes.toBytes("total"), Bytes.toBytes(sum));
            System.out.println(String.format("stats : key : %d, count : %d",
                    Bytes.toInt(key.get()), sum));
            context.write(key, put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "Hbase_FreqCounter1");
        job.setJarByClass(FreqCounter1.class);
        Scan scan = new Scan();
        // String columns = "details"; // comma separated
        // scan.addColumns(columns);
        scan.setFilter(new FirstKeyOnlyFilter());
        TableMapReduceUtil.initTableMapperJob("data_input", scan, Mapper1.class,
                ImmutableBytesWritable.class, IntWritable.class, job);
        TableMapReduceUtil.initTableReducerJob("data_output", Reducer1.class, job);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
The approximate logic is:
The map phase reads each row from data_input, extracts the user key, and emits it with a count of 1; anyone familiar with the MapReduce flow should find this easy to follow.
During the shuffle phase, identical keys from the data_input table are grouped together; the reduce phase then counts how many values arrived for each key, sums them, and writes that total into the data_output table.
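The composite row key (a 4-byte userId followed by a 4-byte counter) is what lets the mapper group all of one user's page hits under a single key. The following is a minimal, dependency-free sketch of that packing and extraction, using java.nio.ByteBuffer instead of HBase's Bytes utility so it runs stand-alone; the class and method names here are illustrative, not part of the project above.

```java
import java.nio.ByteBuffer;

public class CompositeKeyDemo {

    // Pack userId + counter into an 8-byte row key,
    // mirroring Bytes.add(Bytes.toBytes(userID), Bytes.toBytes(i)) in Importer1
    static byte[] makeRowKey(int userId, int counter) {
        return ByteBuffer.allocate(8).putInt(userId).putInt(counter).array();
    }

    // Read back only the leading 4 bytes (the userId),
    // mirroring new ImmutableBytesWritable(row.get(), 0, Bytes.SIZEOF_INT) in Mapper1
    static int extractUserId(byte[] rowKey) {
        return ByteBuffer.wrap(rowKey, 0, 4).getInt();
    }

    public static void main(String[] args) {
        byte[] k1 = makeRowKey(7, 0);
        byte[] k2 = makeRowKey(7, 1);
        // Different row keys, but the same leading userId,
        // so the shuffle phase delivers both to the same reduce call
        System.out.println(extractUserId(k1)); // 7
        System.out.println(extractUserId(k2)); // 7
    }
}
```

Because only the first four bytes are emitted as the map output key, every row a user wrote in Importer1 collapses to one reduce group, whose size is exactly that user's record count.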
Finally, to verify the results:
because HBase stores values as raw bytes that cannot be read directly, a small program converts the data in HBase into a readable format. The code is as follows:
package hbase_mapred1;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class PrintUserCount {

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable htable = new HTable(conf, "data_output");

        Scan scan = new Scan();
        ResultScanner scanner = htable.getScanner(scan);
        Result r;
        while ((r = scanner.next()) != null) {
            // The row key is the 4-byte userId written by the reducer
            byte[] key = r.getRow();
            int userId = Bytes.toInt(key);
            byte[] totalValue = r.getValue(Bytes.toBytes("message"), Bytes.toBytes("total"));
            int count = Bytes.toInt(totalValue);
            System.out.println("key: " + userId + ", count: " + count);
        }
        scanner.close();
        htable.close();
    }
}
key: 1, count: 1007
key: 2, count: 1034
key: 3, count: 962
key: 4, count: 1001
key: 5, count: 1024
key: 6, count: 1033
key: 7, count: 984
key: 8, count: 987
key: 9, count: 988
key: 10, count: 990
key: 11, count: 1069
key: 12, count: 965
key: 13, count: 1000
key: 14, count: 998
key: 15, count: 1002
key: 16, count: 983
...
Note that the imports above must match the project's directory structure: all three programs belong to the hbase_mapred1 package.
Hbase + Mapreduce + Eclipse instance