Reposted from: http://www.cnblogs.com/chay1227/archive/2013/03/17/2964020.html
Reposted from: http://blog.csdn.net/allen879/article/details/40461227
Reposted from: http://blog.itpub.net/28912557/viewspace-776770/
Due to project requirements, the system upgrade called for HBase, and after adopting it, it has indeed proved very good. So the question is: why use HBase here instead of the previou
HTablePool is being phased out; as of release 0.98 it is clearly deprecated. Use the new API instead: HConnection.getTable(...).
Its design philosophy:
"By default, when needed, HConnectionImplementation creates an ExecutorService. That ExecutorService can optionally be passed in by the caller instead. HTableInterfaces obtained from an HConnection use the HConnection's ExecutorService by default, but this can optionally be overridden."
References:
Bug reporting: https://issues.apache.org/jira/browse/
HBase's three-dimensional ordered storage is made up of rowkey, column key (column family + qualifier), and timestamp.
1. rowkey. We know that the rowkey is the row's primary key, and HBase can locate data only by a single rowkey or by a rowkey range, i.e., a scan. The design of the rowkey is therefore crucial and directly affects query efficiency at your application layer. We know rowkeys are ordered alphabetically: the stored bytes are compared in dictionary order, and we
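Since rowkeys are compared byte by byte in dictionary order, numeric suffixes need fixed-width zero-padding to sort numerically. A minimal, self-contained sketch (plain JDK, no HBase dependency; for ASCII keys, String comparison mirrors how HBase orders the stored bytes):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowKeyOrder {
    // HBase compares rowkeys as raw bytes (lexicographic order),
    // so "row10" sorts BEFORE "row2" unless numbers are zero-padded.
    static List<String> sorted(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(String::compareTo); // byte-wise order for ASCII keys
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sorted(Arrays.asList("row2", "row10", "row1")));
        // -> [row1, row10, row2]  (lexicographic, not numeric)
        System.out.println(sorted(Arrays.asList("row02", "row10", "row01")));
        // -> [row01, row02, row10] (zero-padding restores numeric order)
    }
}
```

This is why scan-friendly rowkey designs pad or reverse their numeric components before writing.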
completebulkload tool) to pass it the file location on HDFS; it will use the RegionServer to import the data into the corresponding region.
A simple and clear explanation of the entire process
Image from How-to: Use HBase Bulk Loading, and Why
Note: before running BulkLoad, create an empty table in HBase with the same name and structure as the one used by the program.
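Assuming the HFiles were produced under an HDFS output directory, the load step can be driven from the command line; the path and table name below are illustrative:

```shell
# Hand the HDFS location of the generated HFiles to completebulkload;
# region servers then move the files into the matching regions.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
    /user/hbase/bulkload-output mytable
```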
Java implementation i
Author: those things | This article may be reproduced; please credit the original source and author with a hyperlink.
Source: http://www.cnblogs.com/panfeng412/archive/2012/11/04/hbase-how-to-resolve-not-serving-region-exception.html
During reading and writing, the HBase cluster may take a region offline temporarily because of a region split or region balance. When the client then accesses that region, it can receive a NotServingRegionException.
Regarding HBase, one often hears that it is suitable only for supporting off-line analytical applications, especially as a background data source for MapReduce tasks. Many people hold this view; even at a well-known domestic telecom equipment vendor, HBase is classified into the data analysis p
becomes too large during operation. ZooKeeper: through election it ensures that at any time there is only one active master in the cluster; the master and regionservers register with ZooKeeper when they start; it stores the addressing entry points of all regions, monitors region servers' online and offline status in real time and notifies the master, and stores the HBase schema and table metadata. By default,
Objective: to understand the characteristics and implementation of HBase and how it supports massive-data queries. Characteristics and limitations of traditional relational databases: the traditional database transaction model is particularly strong, requiring data integrity and security, with the result that system availability and scalability are greatly compromised. For high-concurrency traffic,
1: When starting the hbase shell, the error "Class path contains multiple SLF4J bindings" appears because of conflicting jar packages. For jars duplicated with Hadoop's, you can delete the extra copies; if you are not sure a deletion is correct, back up the other jar package or rename it first, to ensure that the operati
last requests per second.
if ((currentTime - lastRan) > 0) {
    long currentRequestCount = getTotalRequestCount();
    requestsPerSecond = (currentRequestCount - lastRequestCount)
            / ((currentTime - lastRan) / 1000.0);
    lastRequestCount = currentRequestCount;
}
lastRan = currentTime;
6. getTotalRequestCount() returns the value of RegionServer.rpcServices.requestCount, where requestCount represents the number of RPC requests recorded by the regionserver; this value is incremented
Code version: hbase-1.2.6
Project: Hbase-server
Class: Org.apache.hadoop.hbase.regionserver.HRegion
Issues that need to be addressed:
1. When is a split triggered?
2. What is the strategy for splitting?
1. Determine whether a split is needed
Method: checkSplit
Return value: splitPoint
After some preliminary checks, it actually calls:
byte[] ret = splitPolicy.getSplitPoint();
2. The split strategy
org.a
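As a rough, self-contained sketch of the decision a size-based policy such as ConstantSizeRegionSplitPolicy makes (class and field names simplified here; the real policy classes live under org.apache.hadoop.hbase.regionserver):

```java
public class SplitCheckSketch {
    // Simplified model of a size-based shouldSplit() check:
    // a region becomes a split candidate once its largest store
    // exceeds hbase.hregion.max.filesize (default 10 GB in hbase-1.2.x).
    static final long MAX_FILESIZE = 10L * 1024 * 1024 * 1024;

    static boolean shouldSplit(long largestStoreSizeBytes) {
        return largestStoreSizeBytes > MAX_FILESIZE;
    }

    public static void main(String[] args) {
        System.out.println(shouldSplit(5L * 1024 * 1024 * 1024));  // false
        System.out.println(shouldSplit(11L * 1024 * 1024 * 1024)); // true
    }
}
```

The actual policy also consults per-table settings and (in IncreasingToUpperBoundRegionSplitPolicy) the region count, so treat this only as the core size comparison.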
Column-oriented
Scalable
Large-scale structured storage clusters can be built on inexpensive PC servers
3. HBase vs. RDBMS. HBase is suitable for storing unstructured data, with a storage model somewhere between a map entry and a DB row. An RDBMS, by contrast, is a database that follows "Codd's 12 rules". The main differences are as follows. Data type: HBase has only a simple string type, and it only
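Because HBase cells hold only uninterpreted bytes, clients serialize typed values themselves (HBase ships a Bytes utility class for this; plain java.nio.ByteBuffer is used below to keep the sketch dependency-free):

```java
import java.nio.ByteBuffer;

public class BytesOnly {
    // HBase cells store only byte arrays; typed values must be
    // serialized/deserialized by the client. HBase's Bytes.toBytes()
    // helpers play this role; ByteBuffer stands in for them here.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    public static void main(String[] args) {
        byte[] cell = encode(42L);
        System.out.println(cell.length);  // 8
        System.out.println(decode(cell)); // 42
    }
}
```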
running the Hadoop cluster; three virtual machines are also running HBase: master, node1, node2
The MySQL database is installed on the master.
At present, there is a need:
There are a bunch of text files, each containing logs;
Each row is a record;
Now, you need to read the records line by line and, based on the MAC address and SN in each record, obtain the account information from HBase or MySQL.
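The log format itself is not shown in the article, so as an assumed illustration take whitespace-separated key=value fields; extracting the mac and sn from one line might then look like:

```java
import java.util.HashMap;
import java.util.Map;

public class RecordParser {
    // Hypothetical record layout: whitespace-separated "key=value" fields,
    // e.g. "mac=00:1A:2B:3C:4D:5E sn=SN12345 bytes=1024".
    // Adjust the parsing to the real log format.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new HashMap<>();
        for (String token : line.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) {
                fields.put(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> r =
                parse("mac=00:1A:2B:3C:4D:5E sn=SN12345 bytes=1024");
        System.out.println(r.get("mac")); // 00:1A:2B:3C:4D:5E
        System.out.println(r.get("sn"));  // SN12345
    }
}
```

The extracted mac and sn would then be used as lookup keys against HBase or MySQL.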
Then, it is me
Problem Description:
Running a web project in IDEA, I thought that pulling in the HBase package with Maven would be enough, and the compilation did pass. Unexpectedly, it threw an error at run time: NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
Solution:
Download hbase-1.1.0.1 from the official web site and copy all the jar packages (excluding the Ruby directories) from the lib di
Recently I wrote an HBase-based MR program. It is summarized as follows:
1. Use TableMapper to read a table.
2. The first way to write to a table is the TableMapReduceUtil.initTableReducerJob method, which can produce output in the map phase as well as in the reduce phase; the difference is whether the reducer class is set to null or to an actual reducer. Below is an example of a table copy:
package com.run.test;
import java.io.IOException;
import java.util.List;
Environment: hadoop-2.2.0, hbase-0.96.0.
1. org.apache.hadoop.hbase.client.Put: in 0.94.6 the class was public class Put extends Mutation implements HeapSize, Writable, Comparable; in 0.96.0 it became public class Put extends Mutation implements HeapSize, Comparable. Workaround: change the declaration public class MonthUserLoginTimeIndexReducer extends Reducer accordingly.
2. org.apache.hadoop.hbase.client.Mutation.familyMap: the type cha
tree, Bloom Filter
Bloom Filter
Lock and transactions
Client timestamp (Dynamo uses a vector clock)
Optimistic concurrency control
Read/write performance
Fast data read/write positioning.
Data read/write positioning may require up to six network RPC round trips, so performance is low.
CAP comparison
1. Weak consistency; data may be lost. 2. High availability. 3. Easy scaling.
1. Strong consistency; no data loss. 2. Lower availability. 3. Easy scaling.
I. Basic concepts of HBase
1. Row Key: the row primary key. Queries in HBase can rely only on the row key; HBase does not support conditional queries or the other query modes common in mainstream databases. Records can be read only via the row primary key or a full-table scan. You can think of the row key as the primary key used in queries against a relational database (for example, an id).
2. Column Family: Co
write data to HDFS. Although every process is important, I personally consider HRegionServer the most central process in HBase. A brief description of HRegionServer's internal structure: an HRegionServer manages a series of HRegion objects internally; HRegion and region are one and the same thing. In fact, an HRegion corresponds to one region of a table and is its encapsulation. Each HReg