Phoenix implementation querying hbase with SQL

Last Update:2014-12-02 Source: Internet

Author: User

Tags sql client zookeeper

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Article source: Daniel notes

HBase, a NoSQL database, can store large amounts of non-relational data.

HBase, which can be manipulated with the hbase shell, or by using the HBase Java API. HBase is a database, but its query statements are not very useful. It would be perfect to use SQL statements to manipulate hbase like a relational database such as MySQL.

Existing tools have many Hive,tez,impala,shark/spark,phoenix and so on. Today the main record is Phoenix.

Phoenix, an open source project by Saleforce.com, then donated it to Apache. It is equivalent to a Java middleware that helps developers, like using JDBC to access relational databases, to access NoSQL database HBase.

Phoenix, operational tables and data, stored on hbase. Phoenix just needs to be associated with the HBase table. Then use the tool for some read or write operations.

In fact, Phoenix can only be seen as a tool to replace HBase's syntax. Although Java can be used to connect Phoenix with JDBC and then manipulate hbase, it cannot be used in OLTP in a production environment. in the context of online transaction processing, low latency is required, while Phoenix does some optimizations when querying hbase, but the latency is still not small. So it is still used in Olat, and then the results are returned to store.

Phoenix official online, the explanation of Phoenix has been very good. If English is good, can crossing net, more formal some.

Phoenix Installation1, download Phoenix.

Phoenix and HBase versions correspondence

Phoenix 2.x-hbase 0.94.x

Phoenix 3.x-hbase 0.94.x

Phoenix 4.x-hbase 0.98.1+

I am currently testing the use of version profiles:

Hadoop1.0.4

HBase0.94.18

So I can use phoenix2.x,phoenix3.x.

Official website download page has

I chose the phoenix3.1.0 version.

2, upload to the main node Linux is OK, unzip

TAR–ZXVF phoenix.tar.gz

Pwd

/root/phoenix

LL Phoenix

The Phoenix directory structure may be a bit different, primarily the location of the bin directory, possibly under HADOOP1, or directly under/root/phoenix. It doesn't matter, it's almost.

3. Copy some files

Since the use of the hadoop1.x cluster, then we use the contents of the Phoenix directory under the HADOOP1 directory.

① will hadoop1 under the Phoenix-core-3.x.jar copy to the Lib directory of each node of the Hadoop cluster hbase.

② re-start HBase

4. Verify if Ann is successful

On the master node, switch to the/root/phoenix/hadoop1/bin directory

Input./sqlline.py master:2181

If this picture appears, it is success. If not successful, there may be some problems with the zookeeper configuration.

OK, first exit this interface, enter!quit return and then you can exit.

This Phoenix is very interesting, some commands need to enter an exclamation mark!

use of Phoenix

MySQL, you can operate CLI command line, can be accessed by using JDBC, in Java code, you can use SQLYOG access management;

Phoenix, how do you use it? ~ can be seen as MySQL. Phoenix can operate under the CLI, can be used with JDBC, and can be accessed squirrel by a client tool in Phoenix;

Say Squirrel first, this is simple.

Squirrel is a client tool that connects to a database. Databases that generally support JDBC can be used to connect. (e.g. squirrel connection to MySQL)

To download the squirrel SQL Client, unzip it. The graphical interface appears when running Squirrel-sql.bat.

It must be said how to even Phoenix?

① add a few jar packages under the Squirrel installation directory Lib

A,phoenix-core-xxx.jar

B,phoenix-3.0-client.jar

C,hbase-0.94.18.jar

D,hadoop1.0.4.jar

E,hadoop-common-xxx.jar

You need to click "Drivers" to add the Phoenix driver.

③ Click the Blue "+" plus sign in the upper left corner to add

In the order above, fill in in turn. First step name: Write a name, mark the connection; the second step example URL: equivalent to the MySQL JDBC connection string, here alias write zookeeper host name, port number, can write, can not write, I generally do not write The third step is to select the Phoenix-core jar package, and the fourth step is to manually enter the Org.apache.phoenix.jdbc.PhoenixDriver.

then click OK.

④ Configuring connections

Name: It's a casual title.

Driver: Select the Phoenix driver added in ③.

URL: Write as above, Jdbc:phoenix:node1,node2,master etc here is mainly zookeeper hostname.

User name: The username of the host to connect to

Password: The password of the host to connect to

Click Test for testing or click OK to connect.

After the connection is complete, you can see the following effect when you start. Here I have created several tables, all of which exist on HBase.

Squirrel Some of the layout of the introduction:

1, some connections built with squirrel

2, under the current connection, all objects, including the assertive system table, ordinary table, view.

3, for tables, these tables are actually present on HBase managed by zookeeper. Right-click this table to manage the table.

4, for the view.

5, where the SQL script is written, you can enter script execution. Script execution method, in 5 there is a villain, select SQL, click on the villain can be executed. or press the CTRL + ENTER key to execute.

6, for the selected object, some basic information, column information, number of rows, and so on.

7, some states that are executed for SQL.

The following creates a table in squirrel

The process of creating a table in Squirrel is primarily written in SQL, for execution. How to write SQL, you need to see what the Phoenix driver supports.

This needs to look at the official website of Phoenix.

It is important to note that Phoenix is case-sensitive, htablename,columnfamily in its own defined hbase, and the field column, which needs to be consistent with Phoenix.

Phoenix Operation HBase, we have two ways to create a table and create a view.

These two ways, there are differences.

For example, if you create a table, you can insert, query, and delete HBase. View, it is generally only possible to query operations

Although it seems that the functionality of the table is more powerful than the view. But like a relational database such as MySQL, deleting a table operation deletes the table. However, deleting a view operation does not affect the structure of the original table.

Because using Phoenix, the associated mappings are automatically established with HBase when the table is created. When you use Phoenix to delete the relationship between hbase, the table in HBase is deleted as well.

So using a view will have less impact on the original HBase table.

Phoenix can create tables,

If there is no htable in HBase:

Create Htablename (

PK VARCHAR PRIMARY key NOT NULL,

Col1 VARCHAR NULL,

Col2 VARCHAR NULL,

Col3 VARCHAR NULL

)

Create Htablename2 (

PK VARCHAR PRIMARY KEY NULL,

"CF". " Col1 "VARCHAR null,

"CF". " Col2 "VARCHAR null,

"Cf2". " Col3 "VARCHAR null,

"Cf2". " Col4 "VARCHAR null,

)

The above SQL script, which can be executed in squirrel, will be prompted under the tool if an error occurs during execution. If successful, you can see the table in HBase.

If there is already a htable named Htablename3 in HBase, then squirrel does not directly show the existing table in HBase, and we need to do some extra work.

After execution in squirrel, the htablename3 of hbase is mapped to squirrel. This allows us to operate in the Java API. Otherwise, it is not possible.

So what are the specific things you need to do? It was very simple, but I didn't think

Just like creating a table, use CREATE table. It's that simple.

But how do you write this statement? How does it correspond?

Create Htablename3 (

PK VARCHAR PRIMARY KEY null,-------this sentence directly to write, so that the Rowkey in HBase into Phoenix in the primary key, column name is called PK. The rowkey automatically corresponds to the primary key.

"CF". " Col1 "VARCHAR null,-------will be named CF under Columnfamily, field named Col1 field, written here.

"CF". " Col2 "VARCHAR null,-------will be named CF columnfamily, field named Col2 field, write here ... Here, and so on.

"Cf2". " Col3 "VARCHAR null,

"Cf2". " Col4 "VARCHAR null,

)

Then execute it in squirrel and then you can see the data.

But at this point, there may be problems, garbled. In squirrel, the primary key and some fields containing Chinese characters, are blocks and so on garbled. How to solve this???

Create an attempt to

CREATE VIEW "Heihei"

(PK VARCHAR primary key)

default_column_family = ' FM '

Once created, the "Heihei" here is the name of the table in HBase. Then define a primary key and it's OK.

Create a View

①create VIEW "DAMAI" (PK VARCHAR PRIMARY KEY) default_column_family= ' FM '

The "DAMAI" inside the double quotes is the same as the table name in HBase, so it is automatically associated.

So, what if you want to build multiple views for a table in HBase?

② second view, can be created on the basis of Phoenix table.

CREATE VIEW my_view (new_col varchar,new_col2 VARCHAR) as SELECT * from Phoenix_table WHERE ...

③ the third view, which is built on top of the view,

CREATE VIEW My_view_on_view as SELECT * from My_view WHERE ...

So, when you create a Damai view, you can include all the fields. You can then create additional views based on this view.

As the data grows, the number of bars of data that can be seen in the view also increases in synchronization.

Use commands to view data in a view

SELECT * FROM Damai this can be done in case. No distinction

The query results now have only one column. It appears that when you create a view, no other columns are associated. It's okay, cut it out, rebuild it.

The grammar in Phoenix

Data types in Phoenix

The method in Phoenix

I use some simple statements of the process myself, as follows:

SELECT * from Shuju;

Select COUNT (1) from Shuju;

Select Cmtid,count (1) as Num from Shuju GROUP by Cmtid ORDER by num DESC;

Select AVG (To_number (AVGT)) from Shuju;

Select Cmtid,count (1) as Num,avg (To_number (AVGT)) as Avgt,avg (To_number (loss)) as loss from Shuju group by Cmtid ORDER by Num DESC;

Select Acm,dtype,cmtid,count (1) as Num,avg (To_number (AVGT)) as Avgt,avg (To_number (loss)) as loss

From Shuju

GROUP BY Acm,dtype,cmtid

ORDER BY num DESC;

Select Acm,dtype,porgcode,orgid,cmtid,count (1) as Num,avg (To_number (AVGT)) as Avgt,avg (To_number (loss)) as loss

From Shuju

GROUP BY Acm,dtype,porgcode,orgid,cmtid

ORDER BY num DESC;

where To_date (ttime, ' Yyyymmddhhmmss ') =to_date (' 20141125 ', ' yyyyMMdd ')

Select Ttime from Shuju order by ttime Desc;

where To_date (ttime, ' Yyyymmddhhmmss ') =to_date (' 20141125 ', ' yyyyMMdd ')

Select To_date (ttime, ' Yyyymmddhhmmss ') from Shuju;

Select To_date (' 20141125 ', ' yyyyMMdd ') from Shuju;

Select (To_date (ttime, ' Yyyymmddhhmmss ') =to_date (' 20141125 ', ' yyyyMMdd ')) as AAA from Shuju Order by AAA ASC;

using shell commands to operate Phoenix

Using the Shell to operate Phoenix is not very useful. When you are helpless, you can shout for help! So,say Help and Help

./sqlline.py master:2181

After entering the shell, enter help

You can see a lot of commands, preceded by an exclamation mark. Guess by the meaning of the function, and then try the effect, you can. It's not recorded.

manipulating HBase with the Phoenix Java API

This process is just like using JDBC.

① first put Phoenix's Core.jar bag and Phoenix's Client.jar package into Lib.

② Create a connection, and the process is similar to MySQL

Public Connection getconnection () {
Connection cc = NULL;
String Driver = "Org.apache.phoenix.jdbc.PhoenixDriver";
String url = "jdbc:phoenix:192.168.206.21:2181";

try {
Class.forName (driver);
} catch (ClassNotFoundException e) {
E.printstacktrace ();
}

if (cc = = null) {
try {
cc = drivermanager.getconnection (URL);
} catch (SQLException e) {
E.printstacktrace ();
}
}
return cc;
}

OK, it's done.

Phoenix implementation querying hbase with SQL

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More