While importing MySQL data into HBase, I found that HBase's storage usage grew very quickly.
Data that occupied 30 GB in MySQL had already grown to nearly 150 GB in HBase (the import was not finished; I stopped it manually).
With the default of 3 replicas, that amounts to roughly 450 GB across the cluster.
After some reading, I found that HBase storage really is space-hungry, and that the usual remedy is to enable a compression algorithm.
Snappy is the one recommended by Google, and CDH installs the snappy library out of the box, so I used it directly.
hbase> disable 'test'
hbase> alter 'test', {NAME => 'cf', COMPRESSION => 'SNAPPY'}
hbase> enable 'test'
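To confirm that the setting was applied, the table schema can be inspected from the same shell. This is a minimal sketch using the table name 'test' from above; the exact output layout differs between HBase versions, but the cf column family should be listed with COMPRESSION => 'SNAPPY'.

```
hbase> describe 'test'
```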
There was no immediate effect. Following some references, I then ran the major_compact command, and still saw no change right away.
This is presumably because the new setting only applies to newly written store files, and existing files are rewritten in the background during compaction.
After a while, however, the table size did change: the original 150 GB shrank to about 15 GB. (The material I consulted gives a best-case compression ratio of about 22%; I have not tested this extensively.)
The effect is quite obvious.
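For reference, the major compaction mentioned above can be requested from the HBase shell as sketched below, again assuming the table 'test' and column family 'cf'. The shell command only submits the request and returns immediately; the rewrite of existing store files happens in the background, which would explain why the size does not drop right away.

```
hbase> major_compact 'test'
hbase> major_compact 'test', 'cf'
```

The first form compacts the whole table; the second limits the compaction to a single column family.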
HBase uses compressed storage (snappy)