Sometimes you need to delete specific rows from a table. However, HBase: The Definitive Guide does not seem to describe a delete filter, so I thought about how to delete specified rows.
- If you already know the row keys of the rows you want to delete, you can build a List<Delete> and pass it to the table's delete method.
- If the rows to delete are contiguous in the table, that is, you want to delete every row in a given range but you do not know all the row keys in that range (for example, the range is 11-19 but the table only contains 11, 13, and 16), you can first use a Scan to read the row keys that actually exist in the range. A Scan can carry a filter, and KeyOnlyFilter fits here because we only need the keys.
Scan sc = new Scan();
Filter fil = new KeyOnlyFilter();
sc.setStartRow(startRow);
sc.setStopRow(stopRow);
// Attach the filter; without this call it has no effect.
sc.setFilter(fil);
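To make the range case concrete, here is a minimal sketch of the same idea on an in-memory sorted map. It is not HBase code: the class `RangeDeleteSketch` and the method `deleteRange` are made up for illustration, and a `TreeMap` stands in for the table because it keeps keys sorted, much like HBase orders rows by key. It assumes the usual scan bounds, start row inclusive and stop row exclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// In-memory stand-in for a table: a TreeMap keeps row keys sorted,
// loosely mirroring how HBase orders rows by key. This is NOT the HBase API.
class RangeDeleteSketch {

    // Collect the keys that actually exist in [startRow, stopRow),
    // then delete them in one pass. Returns the deleted keys.
    static List<String> deleteRange(NavigableMap<String, String> table,
                                    String startRow, String stopRow) {
        // Copy the keys first: the subMap key set is a live view of the map,
        // so removing while iterating over it directly would fail.
        List<String> keys = new ArrayList<>(
                table.subMap(startRow, true, stopRow, false).keySet());
        for (String key : keys) {
            table.remove(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String k : new String[] {"10", "11", "13", "16", "19", "20"}) {
            table.put(k, "value-" + k);
        }
        List<String> deleted = deleteRange(table, "11", "19");
        System.out.println("deleted   = " + deleted);        // [11, 13, 16]
        System.out.println("remaining = " + table.keySet()); // [10, 19, 20]
    }
}
```

Note that row 19 survives because the stop bound is exclusive. To delete it as well, the stop row would have to sort after "19".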
- More generally, as mentioned above, you can combine a Scan with a filter to read the row keys that match some condition and then delete those rows.
// Select the rows whose key matches the pyramidName regular expression.
Scan scan = new Scan();
Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator(pyramidName));
scan.setFilter(filter);
List<Delete> deletes = new ArrayList<Delete>();
// try-with-resources closes the scanner once the keys have been read.
try (ResultScanner resultScanner = tileTable.getScanner(scan)) {
    for (Result result : resultScanner) {
        deletes.add(new Delete(result.getRow()));
    }
}
// Delete all collected rows in one batch.
tileTable.delete(deletes);
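One thing to watch when the regular expression decides which rows get deleted: as far as I know, RegexStringComparator counts a row as matching when the pattern is found anywhere in the key, not only when the whole key matches. The sketch below shows that behaviour with plain `java.util.regex` and no HBase classes. The class `RowKeyRegexSketch`, the helper `literalPrefix`, and the sample row keys are all made up for illustration.

```java
import java.util.regex.Pattern;

// Plain java.util.regex demo; no HBase classes are used here.
class RowKeyRegexSketch {

    // Assumed to mirror RegexStringComparator: a row matches when the
    // pattern is found anywhere in the key, not only on a full match.
    static boolean foundAnywhere(String regex, String rowKey) {
        return Pattern.compile(regex).matcher(rowKey).find();
    }

    // Hypothetical helper: quote the name and anchor it at the start, so
    // only keys that begin with the literal name are selected.
    static String literalPrefix(String name) {
        return "^" + Pattern.quote(name);
    }

    public static void main(String[] args) {
        // The intended row matches.
        System.out.println(foundAnywhere("p1", "p1_tile_0"));
        // An unrelated row also matches, because "p1" occurs inside its key.
        System.out.println(foundAnywhere("p1", "xp10_tile_0"));
        // Anchoring and quoting the name rejects the unrelated row.
        System.out.println(foundAnywhere(literalPrefix("p1"), "xp10_tile_0"));
    }
}
```

Under that assumption, passing something like `literalPrefix(pyramidName)` to the comparator would narrow the delete to keys that start with the name. A prefix still matches longer names, so "p1" would also select "p10_tile_0"; including the separator in the name would avoid that.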
PS: The third method transfers a lot of unneeded data within the cluster, which quietly adds to network traffic, but there seems to be no way around that. The code above has not been tested; it is only meant to show the general idea.
PPS: If anyone knows a better method, please let me know.