Reposted from: http://my.oschina.net/u/189445/blog/595232
I used HBase two months ago and have already forgotten even the most basic commands, so I'm leaving this here as a reference ~
| HBase shell command | Description |
|---|---|
| alter | Modify a table's column family schema |
| count | Count the number of rows in a table |
| create | Create a table |
| describe | Show detailed information about a table |
| delete | Delete the value of a specified cell (identified by table, row, column, and optionally timestamp) |
| deleteall | Delete all cells in a specified row |
| disable | Disable (take offline) a table |
| drop | Drop (delete) a table |
| enable | Enable (bring online) a table |
| exists | Test whether a table exists |
| exit | Exit the HBase shell |
| get | Get the value of a row or cell |
| incr | Increment the value of a specified table/row/column |
| list | List all tables in HBase |
| put | Put a value into a specified cell of a table |
| tools | List the tools supported by HBase |
| scan | Scan a table and return the matching values |
| status | Return status information about the HBase cluster |
| shutdown | Shut down the HBase cluster (different from exit) |
| truncate | Disable, drop, and re-create the specified table |
| version | Return HBase version information |
Note the difference between shutdown and exit: shutdown shuts down the HBase service, and HBase must be restarted before it can be used again; exit only exits the HBase shell, which you can re-enter at any time.
HBase uses coordinates to locate data in a table: the row key is the first coordinate, and the next coordinate is the column family.
HBase is an online system; its tight integration with Hadoop MapReduce also gives it offline access capabilities.
By default, when HBase receives a write command it records the change, or throws an exception if the write fails. Writes go to two places: the write-ahead log (WAL, also called the HLog) and the MemStore; together they guarantee data durability. The MemStore is a write buffer in memory. The client does not interact directly with the underlying HFiles during writes; when the MemStore fills up, it is flushed to disk and a new HFile is generated. HFile is the underlying storage format used by HBase. The MemStore size is defined by the system-level property hbase.hregion.memstore.flush.size in the hbase-site.xml file.
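For reference, the flush threshold can also be overridden per table from the shell instead of cluster-wide in hbase-site.xml. A minimal sketch, assuming a table 't1' already exists and that 128 MB (134217728 bytes) is the desired value; on older releases the table must be disabled before altering:
# set a per-table MemStore flush size (value in bytes)
hbase(main)> disable 't1'
hbase(main)> alter 't1', METHOD => 'table_att', MEMSTORE_FLUSHSIZE => '134217728'
hbase(main)> enable 't1'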
For reads HBase uses an LRU cache (the BlockCache), which keeps frequently accessed data read from HFiles in memory so that disk reads can be avoided. Each column family has its own BlockCache. A block in the BlockCache is the unit of data HBase reads from disk in a single pass; it is the smallest unit of indexed data and the smallest unit read from disk. If a table is used mainly for random lookups, smaller blocks are better, but they make the index larger and consume more memory; if it is used mainly for sequential scans, larger blocks are better, since fewer index entries are needed and memory is saved.
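The block size is set per column family when a table is created or altered. A minimal sketch with a hypothetical table 't_random' used mostly for random reads, lowering the default 64 KB to 8 KB:
# BLOCKSIZE is given in bytes; smaller blocks favor random gets, larger blocks favor scans
hbase(main)> create 't_random', {NAME => 'f1', BLOCKSIZE => '8192'}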
LRU stands for Least Recently Used, a page-replacement algorithm used in memory management: data blocks in memory that have not been used recently are classified as LRU, and when the operating system needs to free space to load other data, it evicts the LRU blocks from memory.
Data model summary:
Table ---------- how HBase organizes data. The table name is a string made of characters that are legal in a file system path.
Row ---------- within a table, data is stored by row. A row is uniquely identified by its row key (rowkey); the row key has no data type and is always treated as a byte array byte[].
Column family ---------- the data in a row is grouped by column family, and column families also affect the physical storage of HBase data, so they must be defined up front and are not easily modified. Every row in a table has the same column families, although a row does not have to store data in every one of them. A column family name is a string made of characters that are legal in a file system path. (A column family can be added after a table is created: disable the table, run alter 't1', {NAME => 'f1', VERSIONS => 5}, then enable it again.)
Column qualifier ---------- data within a column family is addressed by its column qualifier, or column. Column qualifiers do not need to be defined up front and do not need to be consistent between rows. Like a row key, a column qualifier has no data type and is always treated as a byte array byte[].
Cell ---------- row key, column family, and column qualifier together identify a cell. The data stored in a cell is called the cell value; the value has no data type and is always treated as a byte array byte[].
Time version ---------- cell values are versioned. A time version is identified by a timestamp of type long. When a time version is not specified, the current timestamp is used for the operation. The number of time versions HBase keeps for a cell value is configured per column family; the default is 3.
HBase locates data in a table with a four-dimensional coordinate system, in order: row key, column family, column qualifier, and time version. HBase orders time versions by descending timestamp; all other mappings are ordered ascending.
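To see the per-column-family version setting in action, here is a minimal sketch with a hypothetical table 'ver_test': the same cell is written twice and then read back with multiple versions requested.
hbase(main)> create 'ver_test', {NAME => 'f1', VERSIONS => 3}
hbase(main)> put 'ver_test', 'r1', 'f1:c1', 'v1'
hbase(main)> put 'ver_test', 'r1', 'f1:c1', 'v2'
# returns up to 3 versions, newest (highest timestamp) first
hbase(main)> get 'ver_test', 'r1', {COLUMN => 'f1:c1', VERSIONS => 3}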
HBase stores data in a distributed file system that provides a single namespace. A table is made up of several smaller regions, and the server hosting a region is called a RegionServer. The maximum size of a single region is determined by the configuration parameter hbase.hregion.max.filesize; when a region grows beyond this value, it is split in two.
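The split threshold can also be set per table from the shell rather than only globally through hbase.hregion.max.filesize. A minimal sketch, assuming a table 't1' exists and a 10 GB threshold is wanted (disable/enable around the alter may be required on older releases, as noted later for schema changes):
# regions of 't1' split once they grow past roughly 10 GB (value in bytes)
hbase(main)> alter 't1', METHOD => 'table_att', MAX_FILESIZE => '10737418240'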
HBase is a database built on Hadoop; it relies on Hadoop for data access and data reliability. HBase is an online system aimed at low latency, while Hadoop is an offline system optimized for throughput. The two complement each other and can be used to build horizontally scalable data applications.
Data in HBase is physically stored by column family.
Create a table with 3 column families:
create 't1', {NAME => 'f1', VERSIONS => 1}, {NAME => 'f2', VERSIONS => 1}, {NAME => 'f3', VERSIONS => 1}
When defining a table you only need to specify the column family names; columns are specified dynamically in put.
Insert data
Insert below without specifying a column name:
put 't1', 'r1', 'f1', 'v1'
put 't1', 'r2', 'f2', 'v2'
put 't1', 'r3', 'f3', 'v3'
Insert below with the column name specified:
put 't1', 'r4', 'f1:c1', 'v1'
put 't1', 'r5', 'f2:c2', 'v2'
put 't1', 'r6', 'f3:c3', 'v3'
hbase(main):245:0> scan 't1'
ROW                  COLUMN+CELL
 r1                  column=f1:, timestamp=1335407967324, value=v1
 r2                  column=f2:, timestamp=1335408004559, value=v2
 r4                  column=f1:c1, timestamp=1335408640777, value=v1
 r5                  column=f2:c1, timestamp=1335408640822, value=v2
 r6                  column=f1:c6, timestamp=1335412392258, value=v3
 r6                  column=f2:c1, timestamp=1335412384739, value=v3
 r6                  column=f2:c2, timestamp=1335412374797, value=v3
Insert data into multiple columns of one row:
put 't1', 'r7', 'f1:c4', 'v9'
put 't1', 'r7', 'f2:c3', 'v9'
put 't1', 'r7', 'f3:c2', 'v9'
Manually flush the MemStore to an HFile:
flush 't1'
Delete all of the f3 data (by deleting row r7):
deleteall 't1', 'r7'
flush 't1'
Every flush creates a new HFile.
$ ../bin/hadoop dfs -lsr /hbase/t1
The data is stored directly under each column family (CF) directory, with 3 to 4 HFiles per CF directory:
f1
f1/098a7a13fa53415b8ff7c73d4d69c869
f1/321c6211383f48dd91e058179486587e
f1/9722a9be0d604116882115153e2e86b3
f2
f2/43561825dbde4900af4fb388040c24dd
f2/93a20c69fdec43e8beeed31da8f87b8d
f2/b2b126443bbe4b6892fef3406d6f9597
f3
f3/98352b1b34e242ecac72f5efa8f66963
f3/e76ed1b564784799affa59fea349e00d
f3/f9448a9a381942e7b785e0983a66f006
f3/fca4c36e48934f2f9aaf1a585c237d44
Although all of the f3 data has been deleted, its files still exist because no compaction has happened yet.
Manually merge (compact) the HFiles:
hbase(main):244:0> compact 't1'
0 row(s) in 0.0550 seconds
$ ../bin/hadoop dfs -lsr /hbase/t1
f1
f1/00c05ba881a14ca0bdea55ab509c2327
f2
f2/95fbe85769d64fc4b291cabe73b1ddb2
f3
There is now only one HFile under f1 and f2, and no HFile under f3 because its data was deleted.
put can only write one column at a time.
delete can only delete one column at a time.
To delete an entire row, use deleteall:
deleteall 't1', 'r1'
HBase table design:
HBase tables are flexible and can store anything in the form of byte arrays. Store everything with a similar access pattern in the same column family.
The index is built on the key part of the KeyValue object; the key consists of the row key, column qualifier, and timestamp, in that order. Tall tables may let you reduce the complexity of operations to O(1), but at the cost of atomicity.
HBase does not support cross-row transactions. Column qualifiers can themselves be used to store data. The length of a column family name affects the size of the data sent back to the client over the network (it is repeated in every KeyValue object), so keep it as short as possible.
Hashing row keys gives fixed-length keys and better data distribution, but loses the benefit of ordering (a sketch follows below). Denormalization is a viable approach to HBase schema design. From a performance point of view, normalization optimizes for writes, while denormalization optimizes for reads.
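Because the HBase shell is a JRuby interpreter, a hashed row key can be sketched directly inside it. This is only an illustration with hypothetical names (table 't_user', natural key 'user123'); MD5 is one common choice:
# hash the natural key to get a fixed-length, evenly distributed row key
hbase(main)> require 'digest'
hbase(main)> put 't_user', Digest::MD5.hexdigest('user123'), 'f1:name', 'value01'
# the trade-off: rows are no longer stored in the natural key's sort order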
Enter the HBase shell console:
$HBASE_HOME/bin/hbase shell
If Kerberos authentication is enabled, you must first authenticate with the corresponding keytab (using the kinit command) and then enter the HBase shell once authentication succeeds. You can use the whoami command to view the current user:
hbase(main)> whoami
Table management
1) List all created tables (the -ROOT- and .META. tables are filtered out) with list
hbase(main)> list
2) Create a table, where t1 is the table name and f1, f2 are t1's column families. A table in HBase has at least one column family, and the column families directly affect the physical characteristics of how HBase stores the data.
# syntax: create <table>, {NAME => <family>, VERSIONS => <versions>}
# example: create table t1 with two column families f1 and f2, each keeping 2 versions
hbase(main)> create 't1', {NAME => 'f1', VERSIONS => 2}, {NAME => 'f2', VERSIONS => 2}
3) Delete a table
Two steps: first disable, then drop
Example: delete table t1
hbase(main)> disable 't1'
hbase(main)> drop 't1'
4) View the structure of a table
# syntax: describe (or desc) <table> (shows all of the table's parameters, including defaults)
# example: view the structure of table t1
hbase(main)> describe 't1'  (or: desc 't1')
5) Modify the table structure
To modify a table's structure, you must first disable it.
# syntax: alter 't1', {NAME => 'f1'}, {NAME => 'f2', METHOD => 'delete'}
# example: set the TTL of table test1's column families to 180 days
hbase(main)> disable 'test1'
hbase(main)> alter 'test1', {NAME => 'body', TTL => '15552000'}, {NAME => 'meta', TTL => '15552000'}
hbase(main)> enable 'test1'
Permission management
1) Grant permissions
# syntax: grant <user>, <permissions>, <table>, <column family>, <column qualifier> (parameters are separated by commas)
# permissions are expressed with five letters: "RWXCA"
# READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A')
# example: grant the user 'test' read and write permission on table t1
hbase(main)> grant 'test', 'RW', 't1'
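Permissions can also be scoped to a column family or a single column, following the syntax above. A minimal sketch with hypothetical column names, giving 'test' read-only access to column f1:col1 of t1:
hbase(main)> grant 'test', 'R', 't1', 'f1', 'col1'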
2) View permissions
# syntax: user_permission <table>
# example: view the permission list for table t1
hbase(main)> user_permission 't1'
3) Revoke permissions
# similar to granting permissions; syntax: revoke <user>, <table>, <column family>, <column qualifier>
# example: revoke the test user's permissions on table t1
hbase(main)> revoke 'test', 't1'
Adding, deleting, modifying, and querying table data
1) Add data
# syntax: put <table>, <rowkey>, <family:column>, <value>, <timestamp>
# example: add a record to table t1 with rowkey rowkey001, family name f1, column name col1, value value01, and the system default timestamp
hbase(main)> put 't1', 'rowkey001', 'f1:col1', 'value01'
Its usage is fairly simple.
2) Query data
a) Query a single row
# syntax: get <table>, <rowkey>, [<family:column>, ...]
# example: query the value of f1:col1 for rowkey001 in table t1
hbase(main)> get 't1', 'rowkey001', 'f1:col1'
or
hbase(main)> get 't1', 'rowkey001', {COLUMN => 'f1:col1'}
# query all column values for rowkey001 in table t1
hbase(main)> get 't1', 'rowkey001'
b) Scan a table
# syntax: scan <table>, {COLUMNS => [<family:column>, ...], LIMIT => num}
# in addition, advanced options such as STARTROW, TIMERANGE, and FILTER can be added
# example: scan the first 5 rows of table t1
hbase(main)> scan 't1', {LIMIT => 5}
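As an illustration of the advanced options mentioned above, a minimal sketch assuming the rows written earlier (rowkey001, ...) exist:
# scan only column f1:col1, starting from rowkey001, returning at most 5 rows
hbase(main)> scan 't1', {COLUMNS => ['f1:col1'], STARTROW => 'rowkey001', LIMIT => 5}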
c) Query the number of data rows in a table
# syntax: count <table>, {INTERVAL => intervalNum, CACHE => cacheNum}
# INTERVAL sets how many rows to count before printing progress (with the current row key), default 1000; CACHE is the scanner cache size for each fetch, default 10. Tuning these parameters can speed up the count.
# example: count the rows of table t1, reporting every 100 rows, with a cache of 500
hbase(main)> count 't1', {INTERVAL => 100, CACHE => 500}
3) Delete data
a) Delete a column value from a row
# syntax: delete <table>, <rowkey>, <family:column>, <timestamp>; the column name must be specified
# example: delete the data in f1:col1 for rowkey001 in table t1
hbase(main)> delete 't1', 'rowkey001', 'f1:col1'
Note: data for all versions of the f1:col1 column in that row will be deleted.
b) Delete a row
# syntax: deleteall <table>, <rowkey>, <family:column>, <timestamp>; the column name may be omitted to delete the entire row
# example: delete the data of rowkey001 in table t1
hbase(main)> deleteall 't1', 'rowkey001'
c) Delete all data in a table
# syntax: truncate <table>
# the process is: disable the table, drop it, then re-create it
# example: delete all data from table t1
hbase(main)> truncate 't1'
Region management
1) Move a region
# syntax: move 'encodeRegionName', 'ServerName'
# encodeRegionName is the encoded portion at the end of the region name; ServerName is an entry from the region servers list on the Master status page
# example
hbase(main)> move '4343995a58be8e5bbc739af1e91cd72d', 'db-41.xxx.xxx.org,60020,1390274516739'
2) Turn the region balancer on/off
# syntax: balance_switch true|false
hbase(main)> balance_switch true
3) Manually split a region
# syntax: split 'regionName', 'splitKey'
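A minimal sketch with hypothetical values; split also accepts a table name, in which case the table's regions are split at the given key:
# split table t1 at row key 'rowkey500' (the key is illustrative)
hbase(main)> split 't1', 'rowkey500'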
4) Manually trigger a major compaction
# syntax:
# compact all regions in a table:
# hbase> major_compact 't1'
# compact an entire region:
# hbase> major_compact 'r1'
# compact a single column family within a region:
# hbase> major_compact 'r1', 'c1'
# compact a single column family within a table:
# hbase> major_compact 't1', 'c1'
Configuration management and node restart
1) Modify the HDFS configuration
HDFS configuration location: /etc/hadoop/conf
# sync the HDFS configuration to all slaves
cat /home/hadoop/slaves | xargs -i -t scp /etc/hadoop/conf/hdfs-site.xml <user>@{}:/etc/hadoop/conf/hdfs-site.xml
# stop:
cat /home/hadoop/slaves | xargs -i -t ssh <user>@{} "sudo /home/hadoop/cdh4/hadoop-2.0.0-cdh4.2.1/sbin/hadoop-daemon.sh --config /etc/hadoop/conf stop datanode"
# start:
cat /home/hadoop/slaves | xargs -i -t ssh <user>@{} "sudo /home/hadoop/cdh4/hadoop-2.0.0-cdh4.2.1/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode"
2) Modify the HBase configuration
HBase configuration location:
# sync the HBase configuration to all region servers
cat /home/hadoop/hbase/conf/regionservers | xargs -i -t scp /home/hadoop/hbase/conf/hbase-site.xml <user>@{}:/home/hadoop/hbase/conf/hbase-site.xml
# graceful restart
cd ~/hbase
bin/graceful_stop.sh --restart --reload --debug inspurXXX.xxx.xxx.org