DDL (Create and Delete Tables)
Tables can be created and deleted in HBase through either the Java API or the HBase Shell.
Create a table
In HBase, a table is created by calling createTable() on an HBaseAdmin object.
HTableDescriptor describes the table schema; use addFamily() to add a column family to it.
The following Java code creates a simple HBase table 'table1' with two column families, family1 and family2.
public class createTable {
    private static Configuration config;
    private static HBaseAdmin ha;

    public static void main(String[] args) {
        try {
            config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            ha = new HBaseAdmin(config);

            // create the table descriptor
            String tableName = "table1";
            HTableDescriptor htd = new HTableDescriptor(Bytes.toBytes(tableName));

            // create and configure the column families
            HColumnDescriptor hcd1 = new HColumnDescriptor(Bytes.toBytes("family1"));
            hcd1.setBlocksize(65536);
            hcd1.setMaxVersions(1);
            hcd1.setBloomFilterType(BloomType.ROW);
            hcd1.setCompressionType(Algorithm.SNAPPY);
            hcd1.setDataBlockEncoding(DataBlockEncoding.PREFIX);
            hcd1.setTimeToLive(36000);
            hcd1.setInMemory(false);

            HColumnDescriptor hcd2 = new HColumnDescriptor(Bytes.toBytes("family2"));
            hcd2.setBlocksize(65536);
            hcd2.setMaxVersions(1);
            hcd2.setBloomFilterType(BloomType.ROW);
            hcd2.setCompressionType(Algorithm.SNAPPY);
            hcd2.setDataBlockEncoding(DataBlockEncoding.PREFIX);
            hcd2.setTimeToLive(36000);
            hcd2.setInMemory(false);

            // add the column families to the table descriptor
            htd.addFamily(hcd1);
            htd.addFamily(hcd2);

            // create the table
            ha.createTable(htd);
            System.out.println("HBase table created.");
        } catch (TableExistsException e) {
            System.out.println("ERROR: attempting to create an existing table!");
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                ha.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
In the HBase Shell, a table is created with: create 'table name', 'column family name', ...
For example, create 'table1', 'family1', 'family2' creates the same table as the code above.
Delete table
You can also use HBaseAdmin to delete a table. Before deleting a table, you must disable it first. Disabling is a time-consuming operation, so deleting tables frequently is not recommended.
The following Java code deletes table "table1":
public class deleteTable {
    private static Configuration config;
    private static HBaseAdmin ha;

    public static void main(String[] args) {
        try {
            config = HBaseConfiguration.create();
            config.addResource("core-site.xml");
            config.addResource("hdfs-site.xml");
            config.addResource("yarn-site.xml");
            config.addResource("mapred-site.xml");
            ha = new HBaseAdmin(config);
            String tableName = "table1";

            // only an existing table can be dropped
            if (ha.tableExists(tableName)) {
                // disable the table first: reads and writes are denied
                ha.disableTable(tableName);
                ha.deleteTable(tableName);
                System.out.println("HBase table dropped!");
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                ha.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
In the HBase Shell, a table is deleted by disabling it and then dropping it: disable 'table name' followed by drop 'table name'.
For example, disable 'table1' and then drop 'table1' delete the table created above.
Data insertion
In Java, data is inserted with the put method.
The put method can take a single Put object: public void put(Put put) throws IOException. It can also insert multiple Put objects in a batch: public void put(List&lt;Put&gt; puts) throws IOException.
The following Java code inserts data into table "table1" in batches. After the data is inserted, the table contains 10000 rows. Each of the column families "family1" and "family2" contains the columns "q1" and "q2"; "family1" stores integer data (int) and "family2" stores strings (String).
ATTENTION: Although HBase can store multiple types of data, for well-optimized performance we recommend storing table values as strings. In the example, "family1:q1" is originally an integer and is converted to a string before being written to the table.
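A related reason the example formats row keys as String.format("Row%05d", j) rather than raw numbers: HBase sorts rows lexicographically by byte, so numeric keys need zero padding to sort in numeric order. A minimal plain-Java sketch (no HBase dependency; class name RowKeyPadding is illustrative) of the effect:

```java
import java.util.Arrays;

public class RowKeyPadding {
    public static void main(String[] args) {
        // Without padding, lexicographic order differs from numeric order:
        String[] unpadded = {"Row2", "Row10", "Row1"};
        Arrays.sort(unpadded); // byte-wise order, like HBase row keys
        System.out.println(Arrays.toString(unpadded)); // [Row1, Row10, Row2]

        // With zero padding, lexicographic order matches numeric order:
        String[] padded = {String.format("Row%05d", 2),
                           String.format("Row%05d", 10),
                           String.format("Row%05d", 1)};
        Arrays.sort(padded);
        System.out.println(Arrays.toString(padded)); // [Row00001, Row00002, Row00010]
    }
}
```

This is also why rowkey-range scans such as "Row01000" to "Row01100" behave predictably over the inserted rows.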
public class insertTable {
    private static Configuration config;

    public static void main(String[] args) throws IOException {
        config = HBaseConfiguration.create();
        config.addResource("core-site.xml");
        config.addResource("hdfs-site.xml");
        config.addResource("yarn-site.xml");
        config.addResource("mapred-site.xml");
        String tableName = "table1";
        HTable table = new HTable(config, tableName);

        // set AutoFlush
        table.setAutoFlush(true);

        int count = 10000;
        String familyName1 = "family1";
        String familyName2 = "family2";
        String qualifier1 = "q1";
        String qualifier2 = "q2";

        // data to be inserted
        String[] f1q1 = new String[count];
        String[] f1q2 = new String[count];
        String[] f2q1 = new String[count];
        String[] f2q2 = new String[count];
        for (int i = 0; i < count; i++) {
            f1q1[i] = Integer.toString(i);
            f1q2[i] = Integer.toString(i + 10000);
            f2q1[i] = String.format("f2q1%d", i);
            f2q2[i] = String.format("f2q2%d", i);
        }

        List<Put> puts = new ArrayList<Put>();
        // insert row by row
        for (int j = 0; j < count; j++) {
            // create a Put object for the specified row key
            Put p = new Put(Bytes.toBytes(String.format("Row%05d", j)));
            // fill the columns
            p.add(Bytes.toBytes(familyName1), Bytes.toBytes(qualifier1), Bytes.toBytes(f1q1[j]));
            p.add(Bytes.toBytes(familyName1), Bytes.toBytes(qualifier2), Bytes.toBytes(f1q2[j]));
            p.add(Bytes.toBytes(familyName2), Bytes.toBytes(qualifier1), Bytes.toBytes(f2q1[j]));
            p.add(Bytes.toBytes(familyName2), Bytes.toBytes(qualifier2), Bytes.toBytes(f2q2[j]));
            puts.add(p);
            // flush a batch every 100 rows
            if ((j + 1) % 100 == 0) {
                table.put(puts);
                puts.clear();
            }
        }
        table.close();
        System.out.println("Data inserted!");
    }
}
In the HBase Shell, a single value is inserted with: put 'table name', 'rowkey', 'column family name:column name', 'value'.
Data Query
Querying HBase table data can be divided into single queries and batch queries.
Single Query
A single query retrieves one row of a table by matching its rowkey. In Java, use the get() method.
The following Java code retrieves all columns of the row with the specified rowkey from table "table1":
public class getFromTable {
    private static Configuration config;

    public static void main(String[] args) throws IOException {
        String tableName = "table1";
        config = HBaseConfiguration.create();
        config.addResource("core-site.xml");
        config.addResource("hdfs-site.xml");
        config.addResource("yarn-site.xml");
        config.addResource("mapred-site.xml");
        HTable table = new HTable(config, tableName);

        Get get = new Get(Bytes.toBytes("Row01230"));
        // add the target columns for the get
        get.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
        get.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
        get.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
        get.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
        Result result = table.get(get);

        // read the results
        byte[] rowKey = result.getRow();
        byte[] val1 = result.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
        byte[] val2 = result.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
        byte[] val3 = result.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
        byte[] val4 = result.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
        System.out.println("Row key: " + Bytes.toString(rowKey));
        System.out.println("value1: " + Bytes.toString(val1));
        System.out.println("value2: " + Bytes.toString(val2));
        System.out.println("value3: " + Bytes.toString(val3));
        System.out.println("value4: " + Bytes.toString(val4));
        table.close();
    }
}
In the HBase Shell, a single row is queried with: get 'table name', 'rowkey', 'column family name:column name'.
Batch query
A batch query retrieves rows within a specified rowkey range. In Java, use a Scan object with the getScanner() method.
The following Java code retrieves all columns within a specified rowkey range from table "table1":
public class scanFromTable {
    private static Configuration config;

    public static void main(String[] args) throws IOException {
        config = HBaseConfiguration.create();
        config.addResource("core-site.xml");
        config.addResource("hdfs-site.xml");
        config.addResource("yarn-site.xml");
        config.addResource("mapred-site.xml");
        String tableName = "table1";
        HTable table = new HTable(config, tableName);

        // scan according to a rowkey range
        Scan scan = new Scan();
        // set the starting row (included); if not set, start from the first row
        scan.setStartRow(Bytes.toBytes("Row01000"));
        // set the stopping row (excluded); if not set, stop at the last row
        scan.setStopRow(Bytes.toBytes("Row01100"));
        // specify the columns to scan; if not specified, all columns are returned
        scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
        scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
        scan.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
        scan.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
        // specify the maximum versions per cell; called without arguments it returns
        // all versions, and if not called, only the latest version is returned
        scan.setMaxVersions();
        // limit the number of cells returned per call, to avoid an OutOfMemory
        // error caused by a huge amount of data in a single row
        scan.setBatch(10000);

        ResultScanner rs = table.getScanner(scan);
        for (Result r : rs) {
            byte[] rowKey = r.getRow();
            byte[] val1 = r.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
            byte[] val2 = r.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q2"));
            byte[] val3 = r.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q1"));
            byte[] val4 = r.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q2"));
            System.out.print(Bytes.toString(rowKey) + ": ");
            System.out.print(Bytes.toString(val1) + " ");
            System.out.print(Bytes.toString(val2) + " ");
            System.out.print(Bytes.toString(val3) + " ");
            System.out.println(Bytes.toString(val4));
        }
        rs.close();
        table.close();
    }
}
In the HBase Shell, a batch query uses the scan command: scan 'table name', {COLUMNS => 'column family name:column name', STARTROW => 'start rowkey', STOPROW => 'stop rowkey'}.
Filter
Filters perform filtering on the HBase server side. Common filter types include RowFilter, QualifierFilter, and ValueFilter.
Two common filters are listed here: RowFilter and SingleColumnValueFilter.
RowFilter
RowFilter uses the row key to filter data.
BinaryComparator directly compares two byte arrays. The optional comparison operators (CompareOp) include EQUAL, NOT_EQUAL, GREATER, GREATER_OR_EQUAL, LESS, and LESS_OR_EQUAL.
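BinaryComparator's byte-array comparison is lexicographic over unsigned bytes (in HBase it delegates to Bytes.compareTo). The following plain-Java sketch, with no HBase dependency and a hypothetical class name, illustrates that comparison logic rather than reproducing the real implementation:

```java
public class ByteCompareSketch {
    // Lexicographic comparison of unsigned bytes, roughly what
    // BinaryComparator does via Bytes.compareTo (a sketch, not the real code)
    static int compare(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int ua = a[i] & 0xFF; // treat each byte as unsigned
            int ub = b[i] & 0xFF;
            if (ua != ub) return ua - ub;
        }
        return a.length - b.length; // the shorter array sorts first
    }

    public static void main(String[] args) {
        byte[] k1 = "Row01234".getBytes();
        byte[] k2 = "Row01234".getBytes();
        byte[] k3 = "Row01235".getBytes();
        System.out.println(compare(k1, k2) == 0); // EQUAL would match
        System.out.println(compare(k1, k3) < 0);  // LESS would match
    }
}
```

A result of 0 corresponds to CompareOp.EQUAL matching, a negative result to LESS, and a positive result to GREATER.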
public class rowFilter {
    public static void main(String[] args) throws IOException {
        String tableName = "table1";
        Configuration config = HBaseConfiguration.create();
        config.addResource("core-site.xml");
        config.addResource("hdfs-site.xml");
        config.addResource("yarn-site.xml");
        config.addResource("mapred-site.xml");
        HTable table = new HTable(config, tableName);

        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
        // keep only the row whose key is exactly "Row01234"
        Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL,
                new BinaryComparator(Bytes.toBytes("Row01234")));
        scan.setFilter(filter);

        ResultScanner scanner = table.getScanner(scan);
        for (Result res : scanner) {
            byte[] value = res.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
            System.out.println(new String(res.getRow()) + " value is: " + Bytes.toString(value));
        }
        scanner.close();
        table.close();
    }
}
SingleColumnValueFilter
SingleColumnValueFilter filters the values of a specific column.
SubstringComparator checks whether the given string is a sub-string of the column value. The optional comparison operators (CompareOp) include EQUAL and NOT_EQUAL.
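The substring test itself amounts to a contains check on the string form of the cell value (HBase's SubstringComparator matches case-insensitively). A plain-Java sketch, with a hypothetical class name and no HBase dependency, of the row-keeping predicate that a NOT_EQUAL SubstringComparator filter applies:

```java
public class SubstringFilterSketch {
    // Sketch of the predicate: with CompareOp.NOT_EQUAL, keep rows whose
    // cell value does NOT contain the substring (case-insensitive, as
    // SubstringComparator lowercases both sides)
    static boolean keepRow(String cellValue, String substr) {
        boolean matches = cellValue.toLowerCase().contains(substr.toLowerCase());
        return !matches;
    }

    public static void main(String[] args) {
        System.out.println(keepRow("f2q144", "45")); // true: row is kept
        System.out.println(keepRow("f2q145", "45")); // false: row is filtered out
    }
}
```

With setFilterIfMissing(true), rows lacking the column entirely are also dropped rather than passed through.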
public class singleColumnValueFilter {
    public static void main(String[] args) throws IOException {
        Configuration config = HBaseConfiguration.create();
        config.addResource("core-site.xml");
        config.addResource("hdfs-site.xml");
        config.addResource("yarn-site.xml");
        config.addResource("mapred-site.xml");
        String tableName = "table1";
        HTable table = new HTable(config, tableName);

        // keep rows where family2:q1 does not contain the substring "45"
        SingleColumnValueFilter filter = new SingleColumnValueFilter(
                Bytes.toBytes("family2"),
                Bytes.toBytes("q1"),
                CompareFilter.CompareOp.NOT_EQUAL,
                new SubstringComparator("45"));
        // with setFilterIfMissing(true), rows that lack this column are filtered out
        filter.setFilterIfMissing(true);

        Scan scan = new Scan();
        scan.setFilter(filter);
        ResultScanner scanner = table.getScanner(scan);
        for (Result res : scanner) {
            byte[] val = res.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));
            System.out.println(new String(res.getRow()));
            System.out.println("value: " + Bytes.toString(val));
        }
        scanner.close();
        table.close();
    }
}