The recommended number of column families is one or two, and it should not exceed three <---- from the Apache HBase documentation (apache.org)
There is no limit on the number of column labels (qualifiers)
Data is stored as binary (byte arrays) in HBase. HBase is more like a data management system whose data lives in HDFS, much as DB2 and Oracle keep their relational data on disk.
So when you operate on HBase through the Java API, you need to convert values to byte arrays, for example with .getBytes()
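As a minimal sketch of that conversion, using only the Java standard library: the class and method names here (ByteConversion, toBytes, fromBytes) are made up for illustration. The HBase client library also ships org.apache.hadoop.hbase.util.Bytes with toBytes(...) helpers for the same job; it is left out here so the example runs without HBase on the classpath.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of the HBase API.
class ByteConversion {
    // Encode a string as UTF-8 bytes, the form in which it would be handed to HBase.
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes read back from HBase into a string.
    static String fromBytes(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] family = toBytes("cfone");
        byte[] qualifier = toBytes("gender");
        byte[] value = toBytes("male");
        // Lengths of the three byte arrays: 5 6 4
        System.out.println(family.length + " " + qualifier.length + " " + value.length);
        // Round trip back to the original string: male
        System.out.println(fromBytes(value));
    }
}
```

Naming the charset explicitly avoids depending on the platform default, which matters when the writer and the reader run on different machines.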
The cell is the basic storage unit. Every cell is stored together with its rowkey and timestamp, which forces designers to use simple, short rowkeys to save storage space.
At the same time, the rowkey should carry some important business information.
A cell is identified by a column family and a column label (cfone:gender) and holds a value (value = "male").
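To make the storage argument concrete, here is a rough sketch that models a cell and adds up the bytes spent on rowkeys when each cell carries its own copy of the key. The Cell class below is a hypothetical stand-in, not HBase's own Cell type, and the rowkeys and column data are invented. The real on-disk format also stores lengths, the family, and the qualifier, which this sketch ignores.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; Cell here is a hypothetical class, not HBase's Cell interface.
class CellSketch {
    static final class Cell {
        final String rowKey;
        final String family;
        final String qualifier;
        final long timestamp;
        final String value;

        Cell(String rowKey, String family, String qualifier, long timestamp, String value) {
            this.rowKey = rowKey;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }

        // Column name in the family:label form, e.g. cfone:gender.
        String column() {
            return family + ":" + qualifier;
        }
    }

    // Total bytes spent on rowkeys when every cell carries its own copy of the key.
    static int rowKeyBytes(List<Cell> cells) {
        int total = 0;
        for (Cell c : cells) {
            total += c.rowKey.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    // Build a row of `columns` cells that all share one rowkey (made-up data).
    static List<Cell> rowOf(String rowKey, int columns) {
        List<Cell> cells = new ArrayList<>();
        for (int i = 0; i < columns; i++) {
            cells.add(new Cell(rowKey, "cfone", "c" + i, 0L, "v"));
        }
        return cells;
    }

    public static void main(String[] args) {
        Cell gender = new Cell("u1", "cfone", "gender", 0L, "male");
        System.out.println(gender.column() + " = " + gender.value);
        // Same four columns, short key versus long key.
        System.out.println(rowKeyBytes(rowOf("u1", 4)));
        System.out.println(rowKeyBytes(rowOf("a-much-longer-rowkey", 4)));
    }
}
```

The cost grows with rowkey length times the number of cells, so a long key on a wide row is paid for many times over.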
Example:
scan 'tablename'
1531187321_20161230224431 column=cfone:addr, timestamp=1466343766398, value=shanghai
1531187321_20161230224431 column=cfone:phone, timestamp=1466343766398, value=153765324169
1531187321_20161230224431 column=cfone:time, timestamp=1466343766398, value=218
1531187321_20161230224431 column=cfone:type, timestamp=1466343766398, value=1
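The rowkey in this output looks like an identifier followed by an underscore and a yyyyMMddHHmmss timestamp. That reading is an assumption; the notes do not say how the key was built. Under that assumption, a short rowkey that still carries business information could be composed and parsed roughly as follows. RowKeyBuilder and its methods are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the <id>_<yyyyMMddHHmmss> layout is an assumption about the sample key.
class RowKeyBuilder {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Compose a rowkey from a business identifier and an event time.
    static String build(String id, LocalDateTime time) {
        return id + "_" + time.format(FMT);
    }

    // Recover the identifier part (everything before the last underscore).
    static String idOf(String rowKey) {
        return rowKey.substring(0, rowKey.lastIndexOf('_'));
    }

    // Recover the time part (everything after the last underscore).
    static LocalDateTime timeOf(String rowKey) {
        return LocalDateTime.parse(rowKey.substring(rowKey.lastIndexOf('_') + 1), FMT);
    }

    public static void main(String[] args) {
        String key = build("1531187321", LocalDateTime.of(2016, 12, 30, 22, 44, 31));
        System.out.println(key);          // 1531187321_20161230224431
        System.out.println(idOf(key));    // 1531187321
        System.out.println(timeOf(key));  // 2016-12-30T22:44:31
    }
}
```

Because rows are sorted by rowkey, a key laid out this way keeps all rows for one identifier adjacent and ordered by time.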
For the ideas behind HBase, the Baidu Encyclopedia entry seems to explain them well:
http://baike.baidu.com/link?url=Iy3VSkddq3HH-vzedzOIGakgwjg7qf49M5keEdCPHafH3qZEcbEvxVTH_y7wRQmrGt2L0FveKKifCsAf_cKKOq
HBase does not support joins
HBase Introduction
HBase (Hadoop Database) is a highly reliable, high-performance, scalable distributed database that supports real-time reads and writes
It uses Hadoop HDFS as its file storage system, MapReduce to process the massive data in HBase, and ZooKeeper as its distributed coordination service
It is mainly used to store unstructured and semi-structured, loosely organized data
ZooKeeper
Ensures that there is only one active Master in the cluster at any time
Stores the addressing entry point for all regions
Monitors region servers going online and offline in real time and notifies the Master
Stores the HBase schema and table metadata
Master
Assigns regions to region servers
Responsible for load balancing across region servers
Reassigns the regions of a failed region server
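Manages users' DDL operations on tables (create, alter, delete); DML reads and writes go directly to the region servers rather than through the Master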
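The assignment and reassignment duties listed above can be pictured with a toy model. This is not HBase code and does not reproduce HBase's actual balancing algorithm; ToyMaster, its methods, and the "fewest regions wins" rule are all assumptions chosen to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: not HBase code, and not HBase's real balancer.
class ToyMaster {
    // Server name -> regions currently hosted there (sorted by server name).
    private final Map<String, List<String>> assignments = new TreeMap<>();

    // A region server came online.
    void addServer(String server) {
        assignments.putIfAbsent(server, new ArrayList<>());
    }

    // Assign a region to the live server that currently hosts the fewest regions.
    String assign(String region) {
        if (assignments.isEmpty()) {
            throw new IllegalStateException("no live region servers");
        }
        String target = null;
        for (Map.Entry<String, List<String>> e : assignments.entrySet()) {
            if (target == null || e.getValue().size() < assignments.get(target).size()) {
                target = e.getKey();
            }
        }
        assignments.get(target).add(region);
        return target;
    }

    // A region server went offline: drop it and hand its regions to the survivors.
    void serverFailed(String server) {
        List<String> orphaned = assignments.remove(server);
        if (orphaned == null) {
            return;
        }
        for (String region : orphaned) {
            assign(region);
        }
    }

    // Number of regions hosted by a server (0 if the server is unknown or offline).
    int load(String server) {
        List<String> regions = assignments.get(server);
        return regions == null ? 0 : regions.size();
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster();
        master.addServer("rs1");
        master.addServer("rs2");
        master.addServer("rs3");
        for (int i = 1; i <= 6; i++) {
            master.assign("region" + i);
        }
        master.serverFailed("rs3");
        System.out.println(master.load("rs1") + " " + master.load("rs2") + " " + master.load("rs3"));
    }
}
```

In the real system the Master learns about a failed server from ZooKeeper, as described in the ZooKeeper list; here the failure is simply passed in as a method call.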
RegionServer
Maintains its regions and handles I/O requests to them
Responsible for splitting regions that grow too large while running
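Region splitting can be pictured with another toy model. It is only a sketch: ToyRegion is a hypothetical class, and it splits on row count at the middle key purely for simplicity, which is an assumption and not a description of HBase's actual split policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: a region as a sorted list of rowkeys, split by row count.
class ToyRegion {
    final List<String> sortedKeys;

    ToyRegion(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        this.sortedKeys = copy;
    }

    // The region is considered too large once it holds more than maxRows rows.
    boolean needsSplit(int maxRows) {
        return sortedKeys.size() > maxRows;
    }

    // Split at the middle key: keys below it go to the first daughter, the rest to the second.
    List<ToyRegion> split() {
        int mid = sortedKeys.size() / 2;
        ToyRegion lower = new ToyRegion(sortedKeys.subList(0, mid));
        ToyRegion upper = new ToyRegion(sortedKeys.subList(mid, sortedKeys.size()));
        return Arrays.asList(lower, upper);
    }

    public static void main(String[] args) {
        ToyRegion region = new ToyRegion(Arrays.asList("e", "a", "d", "b", "c"));
        if (region.needsSplit(4)) {
            List<ToyRegion> daughters = region.split();
            System.out.println(daughters.get(0).sortedKeys); // [a, b]
            System.out.println(daughters.get(1).sortedKeys); // [c, d, e]
        }
    }
}
```

Each daughter covers a contiguous rowkey range, so the two halves can then be served independently and moved to different region servers.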