Using HBase filters to query quickly and efficiently

Last Update:2018-07-26 Source: Internet

Author: User


This post covers how to use HBase filters to query quickly and efficiently. I will update it gradually.

The main filter categories
1, Comparison Filters
1.1 RowFilter
1.2 FamilyFilter
1.3 QualifierFilter
1.4 ValueFilter
1.5 DependentColumnFilter
2, Dedicated Filters
2.1 SingleColumnValueFilter
2.2 SingleColumnValueExcludeFilter
2.3 PrefixFilter
2.4 PageFilter
2.5 KeyOnlyFilter
2.6 FirstKeyOnlyFilter
2.7 TimestampsFilter
2.8 RandomRowFilter
3, Decorating Filters
3.1 SkipFilter
3.2 WhileMatchFilter

A simple example of SingleColumnValueFilter

    // hbaseConfig is a Configuration field defined elsewhere in the class.
    public static void selectByFilter(String tableName, List<String> arr) throws IOException {
        HTable table = new HTable(hbaseConfig, tableName);
        FilterList filterList = new FilterList();
        Scan s1 = new Scan();
        for (String v : arr) { // the conditions are combined with AND
            // each condition has the form "family,qualifier,value"
            String[] s = v.split(",");
            filterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes(s[0]),
                    Bytes.toBytes(s[1]), CompareOp.EQUAL, Bytes.toBytes(s[2])));
            // With the following line enabled, only the specified cell is returned;
            // the other cells in the same row are not returned.
            // s1.addColumn(Bytes.toBytes(s[0]), Bytes.toBytes(s[1]));
        }
        s1.setFilter(filterList);
        ResultScanner resultScannerFilterList = table.getScanner(s1);
        for (Result rr = resultScannerFilterList.next(); rr != null;
                rr = resultScannerFilterList.next()) {
            for (KeyValue kv : rr.list()) {
                System.out.println("row:" + new String(kv.getRow()));
                System.out.println("column:" + new String(kv.getColumn()));
                System.out.println("value:" + new String(kv.getValue()));
            }
        }
    }
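To make the AND semantics concrete, here is a minimal sketch in plain Java that does not touch the HBase API. It models a row as a map from "family:qualifier" to a value and applies conditions in the same "family,qualifier,value" format as above. The names ConditionSketch and matchesAll, and the sample data, are made up for illustration. One assumption to be aware of: the sketch treats a missing column as a non-match, which in HBase corresponds to calling setFilterIfMissing(true) on the SingleColumnValueFilter; by default that filter lets rows without the column through.

```java
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; not part of the HBase API.
final class ConditionSketch {

    // Returns true only if every "family,qualifier,value" condition holds for the row.
    // The row is modeled as a map from "family:qualifier" to the cell value.
    static boolean matchesAll(Map<String, String> row, List<String> conditions) {
        for (String condition : conditions) {
            String[] parts = condition.split(",");
            String actual = row.get(parts[0] + ":" + parts[1]);
            // A missing column counts as a non-match in this sketch.
            if (actual == null || !actual.equals(parts[2])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> row = Map.of("info:city", "Hangzhou", "info:age", "30");
        // Both conditions hold, so the row passes.
        System.out.println(matchesAll(row, List.of("info,city,Hangzhou", "info,age,30")));
        // One condition fails, so the whole AND fails.
        System.out.println(matchesAll(row, List.of("info,city,Hangzhou", "info,age,31")));
    }
}
```

The first call prints true and the second prints false: a single failing condition is enough to reject the row.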

MultipleColumnPrefixFilter

The API documentation describes it as follows:

This filter is used for selecting only those keys with columns that match a particular prefix. For example, if the prefix is 'an', it will pass keys with columns like 'and', 'anti' but not keys with columns like 'ball', 'act'.
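The prefix rule in that description can be sketched in a few lines of plain Java. This is only an illustration of the matching rule, not the HBase implementation; the class and method names (ColumnPrefixSketch, startsWith, matchesAny) are hypothetical. The comparison works on raw bytes, so it is case-sensitive.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the prefix rule; not the HBase implementation.
final class ColumnPrefixSketch {

    // True if the column name starts with the given prefix, compared byte by byte.
    static boolean startsWith(byte[] column, byte[] prefix) {
        if (column.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (column[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // True if the column name starts with at least one of the prefixes.
    static boolean matchesAny(String column, String... prefixes) {
        byte[] columnBytes = column.getBytes(StandardCharsets.UTF_8);
        for (String prefix : prefixes) {
            if (startsWith(columnBytes, prefix.getBytes(StandardCharsets.UTF_8))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same examples as in the API description, with the prefix 'an'.
        for (String column : new String[] {"and", "anti", "ball", "act"}) {
            System.out.println(column + " -> " + matchesAny(column, "an"));
        }
    }
}
```

With the prefix 'an', the columns 'and' and 'anti' match, while 'ball' and 'act' do not. Passing several prefixes to matchesAny mirrors what the multiple-prefix variant is for.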

The constructor is as follows:

public MultipleColumnPrefixFilter(byte[][] prefixes)

It takes multiple prefixes.
Its source code is shown below.

    public MultipleColumnPrefixFilter(final byte[][] prefixes) {
      if (prefixes != null) {
        for (int i = 0; i < prefixes.length; i++) {
          if (!sortedPrefixes.add(prefixes[i]))
            throw new IllegalArgumentException("prefixes must be distinct");
        }
      }
    }
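The duplicate check in this constructor relies on the sorted set comparing byte arrays by content. A Java byte[] inherits identity-based equals and hashCode, so a plain HashSet would not notice two arrays holding the same bytes. The following sketch shows the idea with a TreeSet and a content-based comparator. It assumes Java 9 or later for Arrays.compareUnsigned, which stands in here for HBase's own byte-array comparator; the class name DistinctPrefixSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical stand-in for the constructor's duplicate check; not the HBase class.
final class DistinctPrefixSketch {

    // Orders byte arrays lexicographically, treating each byte as unsigned.
    private static final Comparator<byte[]> BY_CONTENT =
            (a, b) -> Arrays.compareUnsigned(a, b);

    // Collects the prefixes into a sorted set and rejects duplicates.
    static TreeSet<byte[]> collect(byte[][] prefixes) {
        TreeSet<byte[]> sorted = new TreeSet<>(BY_CONTENT);
        if (prefixes != null) {
            for (byte[] prefix : prefixes) {
                // add() returns false when an equal element is already present.
                if (!sorted.add(prefix)) {
                    throw new IllegalArgumentException("prefixes must be distinct");
                }
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Two distinct prefixes are accepted and kept in sorted order.
        System.out.println(collect(new byte[][] {{'q'}, {'p'}}).size());
        try {
            collect(new byte[][] {{'p'}, {'p'}});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```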

The sample code is as follows. I found it on the Internet, and it is rather hard to follow.

    public class TestMultipleColumnPrefixFilter {

      private final static HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();

      @Test
      public void testMultipleColumnPrefixFilter() throws IOException {
        String family = "Family";
        HTableDescriptor htd = new HTableDescriptor("TestMultipleColumnPrefixFilter");
        htd.addFamily(new HColumnDescriptor(family));
        // HRegionInfo info = new HRegionInfo(htd, null, null, false);
        HRegionInfo info = new HRegionInfo(htd.getName(), null, null, false);
        HRegion region = HRegion.createHRegion(info,
            HBaseTestingUtility.getTestDir(), TEST_UTIL.getConfiguration(), htd);

        List<String> rows = generateRandomWords(100, "row");
        List<String> columns = generateRandomWords(10000, "column");
        long maxTimestamp = 2;

        List<KeyValue> kvList = new ArrayList<KeyValue>();

        // Expected cells, grouped by the first letter of the column name.
        Map<String, List<KeyValue>> prefixMap = new HashMap<String, List<KeyValue>>();
        prefixMap.put("p", new ArrayList<KeyValue>());
        prefixMap.put("q", new ArrayList<KeyValue>());
        prefixMap.put("s", new ArrayList<KeyValue>());

        String valueString = "ValueString";

        for (String row : rows) {
          Put p = new Put(Bytes.toBytes(row));
          for (String column : columns) {
            for (long timestamp = 1; timestamp <= maxTimestamp; timestamp++) {
              KeyValue kv = KeyValueTestUtil.create(row, family, column, timestamp,
                  valueString);
              p.add(kv);
              kvList.add(kv);
              for (String s : prefixMap.keySet()) {
                if (column.startsWith(s)) {
                  prefixMap.get(s).add(kv);
                }
              }
            }
          }
          region.put(p);
        }

        // Scan with the prefixes 'p' and 'q' and compare with the expected count.
        MultipleColumnPrefixFilter filter;
        Scan scan = new Scan();
        scan.setMaxVersions();
        byte[][] filter_prefix = new byte[2][];
        filter_prefix[0] = new byte[] {'p'};
        filter_prefix[1] = new byte[] {'q'};

        filter = new MultipleColumnPrefixFilter(filter_prefix);
        scan.setFilter(filter);
        List<KeyValue> results = new ArrayList<KeyValue>();
        InternalScanner scanner = region.getScanner(scan);
        while (scanner.next(results));
        assertEquals(prefixMap.get("p").size() + prefixMap.get("q").size(), results.size());
      }

      @Test
      public void testMultipleColumnPrefixFilterWithManyFamilies() throws IOException {
        String family1 = "Family1";
        String family2 = "Family2";
        HTableDescriptor htd = new HTableDescriptor("TestMultipleColumnPrefixFilter");
        htd.addFamily(new HColumnDescriptor(family1));
        htd.addFamily(new HColumnDescriptor(family2));
        HRegionInfo info = new HRegionInfo(htd.getName(), null, null, false);
        HRegion region = HRegion.createHRegion(info,
            HBaseTestingUtility.getTestDir(), TEST_UTIL.getConfiguration(), htd);

        List<String> rows = generateRandomWords(100, "row");
        List<String> columns = generateRandomWords(10000, "column");
        long maxTimestamp = 3;

        List<KeyValue> kvList = new ArrayList<KeyValue>();

        Map<String, List<KeyValue>> prefixMap = new HashMap<String, List<KeyValue>>();
        prefixMap.put("p", new ArrayList<KeyValue>());
        prefixMap.put("q", new ArrayList<KeyValue>());
        prefixMap.put("s", new ArrayList<KeyValue>());

        String valueString = "ValueString";

        for (String row : rows) {
          Put p = new Put(Bytes.toBytes(row));
          for (String column : columns) {
            for (long timestamp = 1; timestamp <= maxTimestamp; timestamp++) {
              // Spread the cells randomly over the two families.
              double rand = Math.random();
              KeyValue kv;
              if (rand < 0.5)
                kv = KeyValueTestUtil.create(row, family1, column, timestamp,
                    valueString);
              else
                kv = KeyValueTestUtil.create(row, family2, column, timestamp,
                    valueString);
              p.add(kv);
              kvList.add(kv);
              for (String s : prefixMap.keySet()) {
                if (column.startsWith(s)) {
                  prefixMap.get(s).add(kv);
                }
              }
            }
          }
          region.put(p);
        }

        MultipleColumnPrefixFilter filter;
        Scan scan = new Scan();
        scan.setMaxVersions();
        byte[][] filter_prefix = new byte[2][];
        filter_prefix[0] = new byte[] {'p'};
        filter_prefix[1] = new byte[] {'q'};

        filter = new MultipleColumnPrefixFilter(filter_prefix);
        scan.setFilter(filter);
        List<KeyValue> results = new ArrayList<KeyValue>();
        InternalScanner scanner = region.getScanner(scan);
        while (scanner.next(results));
        assertEquals(prefixMap.get("p").size() + prefixMap.get("q").size(), results.size());
      }

      @Test
      public void testMultipleColumnPrefixFilterWithColumnPrefixFilter() throws IOException {
        String family = "Family";
        HTableDescriptor htd = new HTableDescriptor("TestMultipleColumnPrefixFilter");
        htd.addFamily(new HColumnDescriptor(family));
        HRegionInfo info = new HRegionInfo(htd.getName(), null, null, false);
        HRegion region = HRegion.createHRegion(info,
            HBaseTestingUtility.getTestDir(), TEST_UTIL.getConfiguration(), htd);

        List<String> rows = generateRandomWords(100, "row");
        List<String> columns = generateRandomWords(10000, "column");
        long maxTimestamp = 2;

        String valueString = "ValueString";

        for (String row : rows) {
          Put p = new Put(Bytes.toBytes(row));
          for (String column : columns) {
            for (long timestamp = 1; timestamp <= maxTimestamp; timestamp++) {
              KeyValue kv = KeyValueTestUtil.create(row, family, column, timestamp,
                  valueString);
              p.add(kv);
            }
          }
          region.put(p);
        }

        // A multiple-prefix filter with the single prefix 'p' ...
        MultipleColumnPrefixFilter multiplePrefixFilter;
        Scan scan1 = new Scan();
        scan1.setMaxVersions();
        byte[][] filter_prefix = new byte[1][];
        filter_prefix[0] = new byte[] {'p'};

        multiplePrefixFilter = new MultipleColumnPrefixFilter(filter_prefix);
        scan1.setFilter(multiplePrefixFilter);
        List<KeyValue> results1 = new ArrayList<KeyValue>();
        InternalScanner scanner1 = region.getScanner(scan1);
        while (scanner1.next(results1));

        // ... must return as many cells as ColumnPrefixFilter with the same prefix.
        ColumnPrefixFilter singlePrefixFilter;
        Scan scan2 = new Scan();
        scan2.setMaxVersions();
        singlePrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("p"));

        scan2.setFilter(singlePrefixFilter);
        List<KeyValue> results2 = new ArrayList<KeyValue>();
        // The listing passed scan1 here, which would compare scan1 with itself.
        InternalScanner scanner2 = region.getScanner(scan2);
        while (scanner2.next(results2));

        assertEquals(results1.size(), results2.size());
      }

      // Generates up to numberOfWords distinct words of one or two random
      // lowercase letters, each followed by the suffix (if any).
      List<String> generateRandomWords(int numberOfWords, String suffix) {
        Set<String> wordSet = new HashSet<String>();
        for (int i = 0; i < numberOfWords; i++) {
          int lengthOfWords = (int) (Math.random() * 2) + 1;
          char[] wordChar = new char[lengthOfWords];
          for (int j = 0; j < wordChar.length; j++) {
            wordChar[j] = (char) (Math.random() * 26 + 97);
          }
          String word;
          if (suffix == null) {
            word = new String(wordChar);
          } else {
            word = new String(wordChar) + suffix;
          }
          wordSet.add(word);
        }
        List<String> wordList = new ArrayList<String>(wordSet);
        return wordList;
      }
    }

ColumnPrefixFilter

public class ColumnPrefixFilter extends FilterBase

This filter is used for selecting only those keys with columns that match a particular prefix. For example, if the prefix is 'an', it will pass keys with columns like 'and', 'anti' but not keys with columns like 'ball', 'act'.

The above is the class description.
It has a single constructor, ColumnPrefixFilter(byte[] prefix).
The usage looks very simple, and I expected it to return the rows whose row key starts with the prefix. When I tried it, however, it had no effect. As the class description says, the prefix is compared with column names, not with row keys, which would explain that. If an expert has it doing what I wanted, please tell me.
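The difference between matching on the column name and matching on the row key can be seen in a small in-memory model. This is a sketch in plain Java, not HBase code: the Cell class, the two filter methods, and the qualifiers "keyword" and "count" are hypothetical, and only the prefix "3274980668:" comes from the code in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of cells; not part of the HBase API.
final class PrefixTargetSketch {

    // A cell reduced to the two fields that matter here.
    static final class Cell {
        final String row;
        final String qualifier;

        Cell(String row, String qualifier) {
            this.row = row;
            this.qualifier = qualifier;
        }
    }

    // Models a column-prefix filter: keeps cells whose qualifier starts with the prefix.
    static List<Cell> byColumnPrefix(List<Cell> cells, String prefix) {
        List<Cell> kept = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.qualifier.startsWith(prefix)) {
                kept.add(cell);
            }
        }
        return kept;
    }

    // Models a row-prefix filter: keeps cells whose row key starts with the prefix.
    static List<Cell> byRowPrefix(List<Cell> cells, String prefix) {
        List<Cell> kept = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.row.startsWith(prefix)) {
                kept.add(cell);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Cell> cells = List.of(
                new Cell("3274980668:001", "keyword"),
                new Cell("3274980668:002", "count"),
                new Cell("9999999999:001", "keyword"));
        // No qualifier starts with the row-key prefix, so nothing is kept.
        System.out.println(byColumnPrefix(cells, "3274980668:").size());
        // Two row keys start with the prefix.
        System.out.println(byRowPrefix(cells, "3274980668:").size());
    }
}
```

In this model a row-key prefix handed to the column-prefix filter keeps nothing, because no column name starts with it, while the row-prefix filter keeps the two cells of the matching rows.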

With no other option, I had to choose PrefixFilter.

PrefixFilter

Class description:

Pass results that have the same row prefix.

The constructor has the same form as that of ColumnPrefixFilter, PrefixFilter(byte[] prefix), and it is used in the same way.
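A row-prefix query can be cheap because HBase keeps rows sorted by row key, so all rows sharing a prefix sit next to each other. The sketch below uses a TreeMap as a stand-in for a sorted table: it seeks to the first key at or after the prefix and stops at the first key that no longer matches. The class name RowPrefixScanSketch and the sample keys are hypothetical. In HBase itself, a comparable effect is commonly obtained by also setting the scan's start row to the prefix, so that the scan does not begin at the first row of the table; treat that as a suggestion to check against your HBase version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table sorted by row key; not the HBase API.
final class RowPrefixScanSketch {

    // Returns the values of all rows whose key starts with the prefix.
    static List<String> scan(TreeMap<String, String> table, String prefix) {
        List<String> hits = new ArrayList<>();
        // Seek directly to the first key that is >= the prefix.
        for (Map.Entry<String, String> entry : table.tailMap(prefix, true).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            hits.add(entry.getValue());
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("1000000000:001", "a");
        table.put("3274980668:001", "b");
        table.put("3274980668:002", "c");
        table.put("9999999999:001", "d");
        System.out.println(scan(table, "3274980668:"));
    }
}
```

Only the two rows with the prefix are visited after the seek, plus the one row that ends the loop; the rest of the table is never touched.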

Those are the basic filters. I will keep updating this article over time.

The following code is my own, taken from code I use:

    // tableKeyword (the table) and log (the logger) are fields defined elsewhere in the class.
    public static String getKeywordTableRowkeyUseFilter(String filterString1, String filterString2) {
        FilterList filterList = new FilterList();
        String rowkeyValue = "";
        Scan s1 = new Scan();
        // each filter string has the form "family,qualifier,value"
        String[] sf1 = filterString1.split(",");
        filterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes(sf1[0]),
                Bytes.toBytes(sf1[1]), CompareOp.EQUAL, Bytes.toBytes(sf1[2])));
        String[] sf2 = filterString2.split(",");
        filterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes(sf2[0]),
                Bytes.toBytes(sf2[1]), CompareOp.EQUAL, Bytes.toBytes(sf2[2])));
        // ColumnPrefixFilter compares column names, not row keys, so it is left disabled.
        // filterList.addFilter(new ColumnPrefixFilter(Bytes.toBytes("3274980668:")));

        filterList.addFilter(new PrefixFilter(Bytes.toBytes("3274980668:")));
        s1.setFilter(filterList);
        ResultScanner resultScannerFilterList;
        try {
            resultScannerFilterList = tableKeyword.getScanner(s1);
            for (Result rr = resultScannerFilterList.next(); rr != null;
                    rr = resultScannerFilterList.next()) {
                String rowkeyValueTmp = new String(rr.getRow());
                // collect the matching row keys, separated by "##"
                rowkeyValue = rowkeyValue + "##" + rowkeyValueTmp;
            }
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        log.warn("rowkeyValue" + rowkeyValue);
        return rowkeyValue;
    }

PrefixFilter and ColumnPrefixFilter are used in almost the same way, but in development PrefixFilter is the one I recommend for selecting rows by row-key prefix.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
