Phoenix: HBase query middleware based on the standard SQL specification

Phoenix is a convenient SQL query layer for HBase. It is easy to install on an HBase cluster, and its client and server components can be configured as described at http://www.cnblogs.com/laov/p/4137136.html.

Phoenix SQL statements map to the following executable statement classes (a JDBC sketch showing how these statement types surface to applications follows the list):

    • Add/drop columns: ExecutableAddColumnStatement, ExecutableDropColumnStatement
    • Create/drop tables: ExecutableCreateTableStatement, ExecutableDropTableStatement
    • Select: ExecutableSelectStatement
    • Upsert (import) data: ExecutableUpsertStatement
    • Explain a query plan: ExecutableExplainStatement
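
These are Phoenix's internal parser/executor classes; applications normally reach them only through JDBC. The following is a minimal sketch, assuming a reachable Phoenix quorum at localhost and a hypothetical table named example_tbl, of how each statement type surfaces through the standard JDBC API.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixStatementTypes {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement()) {

            // Create-table statements are plain DDL issued through executeUpdate
            stmt.executeUpdate("CREATE TABLE IF NOT EXISTS example_tbl (id VARCHAR NOT NULL PRIMARY KEY, val VARCHAR)");

            // Upsert (import) statements also go through executeUpdate
            stmt.executeUpdate("UPSERT INTO example_tbl VALUES ('1', 'hello')");
            conn.commit();   // Phoenix connections default to auto-commit off

            // Select statements go through executeQuery
            try (ResultSet rs = stmt.executeQuery("SELECT id, val FROM example_tbl")) {
                while (rs.next()) {
                    System.out.println(rs.getString("id") + " -> " + rs.getString("val"));
                }
            }

            // Explain statements return the query plan as rows
            try (ResultSet plan = stmt.executeQuery("EXPLAIN SELECT * FROM example_tbl")) {
                while (plan.next()) {
                    System.out.println(plan.getString(1));
                }
            }

            // Add/drop column and drop-table statements are likewise plain DDL
            stmt.executeUpdate("ALTER TABLE example_tbl ADD extra VARCHAR");
            stmt.executeUpdate("DROP TABLE example_tbl");
        }
    }
}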

Phoenix Architecture and Features:

Phoenix compiles a SQL query into a plan that is executed essentially as a series of HBase scans. To minimize data transfer, aggregation work is pushed down to the region servers through coprocessors wherever possible: a RegionObserver uses the postScannerOpen hook to replace the default RegionScanner with a custom scanner that performs the aggregation, and the specific aggregate operation is passed to that scanner through custom Scan attributes. Compared with building query plans on the MapReduce framework, this approach lets each RegionServer execute its share of the aggregation locally through the coprocessor. In addition, various custom filters are applied during the RegionScanner scan so that irrelevant data is filtered out as early as possible. Applications interact with Phoenix through the JDBC interface.
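
As an illustration of the hook described above, here is a minimal sketch of a RegionObserver coprocessor written against the HBase 0.98-era coprocessor API. It only shows where the scanner substitution happens and reads a hypothetical Scan attribute; the real aggregating scanner that Phoenix substitutes is omitted.

import java.io.IOException;

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.RegionScanner;
import org.apache.hadoop.hbase.util.Bytes;

public class AggregationObserver extends BaseRegionObserver {

    // Hypothetical attribute name; Phoenix uses its own internal Scan attributes.
    private static final String AGGREGATE_ATTR = "demo.aggregate.spec";

    @Override
    public RegionScanner postScannerOpen(ObserverContext<RegionCoprocessorEnvironment> ctx,
                                         Scan scan, RegionScanner scanner) throws IOException {
        byte[] spec = scan.getAttribute(AGGREGATE_ATTR);
        if (spec == null) {
            // No aggregation requested for this scan: leave the default scanner in place.
            return scanner;
        }
        // Phoenix returns a custom RegionScanner here that consumes the underlying
        // scanner and emits aggregated rows; that wrapper is omitted from this sketch.
        System.out.println("Aggregation requested: " + Bytes.toString(spec));
        return scanner;
    }
}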

Phoenix currently supports simple SQL for table creation and modification, data deletion, filtering, and retrieval. Syntactically it does not support multi-table operations; in essence it cannot express multi-table union operations such as the various joins, so the WHERE clause cannot contain cross-table comparisons. Because of the limits of what coprocessors and filters can do, complex queries such as multi-table joins are relatively hard to implement with the HBase client API alone without relying on the MapReduce framework: either a great deal of work must be done in client code, or the performance may not meet requirements.

Putting an abstraction layer such as SQL between data access and run-time execution also opens the door to extensive optimization. For a GROUP BY query, for example, Phoenix can take advantage of the coprocessor feature in HBase to run Phoenix code on the HBase servers, so the aggregation is executed on the server side rather than on the client, which greatly reduces the amount of data transferred between client and server. In addition, Phoenix parallelizes the GROUP BY on the client by splitting the scan into chunks based on row key ranges; running the chunks in parallel returns results more quickly. None of these optimizations requires user involvement; the user simply issues the query.
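
Phoenix performs this chunking internally, but the idea can be illustrated manually. The sketch below assumes a hypothetical shuju table with a VARCHAR primary key column id and a cmtid column; it splits a GROUP BY count into three key-range chunks over plain Phoenix JDBC and merges the partial counts on the client.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelGroupBy {
    public static void main(String[] args) throws Exception {
        // Hypothetical key ranges; Phoenix derives its chunks from region boundaries.
        String[][] ranges = { {"00", "40"}, {"40", "80"}, {"80", "zz"} };

        ExecutorService pool = Executors.newFixedThreadPool(ranges.length);
        Map<String, Long> totals = new ConcurrentHashMap<>();
        List<Future<?>> futures = new ArrayList<>();

        for (String[] range : ranges) {
            futures.add(pool.submit(() -> {
                // One connection per chunk; shuju, id and cmtid are hypothetical names.
                try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                     PreparedStatement ps = conn.prepareStatement(
                         "SELECT cmtid, COUNT(1) AS num FROM shuju " +
                         "WHERE id >= ? AND id < ? GROUP BY cmtid")) {
                    ps.setString(1, range[0]);
                    ps.setString(2, range[1]);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            // Merge each chunk's partial counts into the global totals.
                            totals.merge(rs.getString("cmtid"), rs.getLong("num"), Long::sum);
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }));
        }
        for (Future<?> f : futures) f.get();   // wait for all chunks to finish
        pool.shutdown();
        totals.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}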

Advantages:

1: Both the command line and the Java client are simple to use. In particular, the Java client programs directly against the JDBC interface, which encapsulates and optimizes many of HBase's details.
2: Performance on single-table operations is much better than the Hive HBase handler (although the handler may also be upgraded later with features such as coprocessor-based aggregation).
3: Supports multi-column secondary indexes with no limit on the number of columns. Write speed slows as more columns are added to a mutable index, while immutable indexes do not affect write speed (reference: https://github.com/forcedotcom/phoenix/wiki/Secondary-Indexing#mutable-indexing); see the sketch after this list.
4: Top-N query speed far exceeds Hive (reference: https://github.com/forcedotcom/phoenix/wiki/Performance#top-n).
5: Provides row-key salting, which spreads data evenly across regions (reference: https://github.com/forcedotcom/phoenix/wiki/Performance#salting); see the same sketch after this list.
6: Low intrusiveness; it has essentially no effect on existing uses of HBase.
7: The functions provided cover the vast majority of requirements.
8: Unlike Hive, Phoenix's SQL dialect stays close to the standard SQL specification.
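
For items 3 and 5, the following is a minimal sketch of creating a salted table and a multi-column secondary index through JDBC. It assumes a Phoenix quorum at localhost and a simplified hypothetical shuju schema based on the columns used in the query examples below.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateSaltedTable {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
        // Replace the host with your own ZooKeeper quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement()) {
            // Salted table: SALT_BUCKETS pre-splits the row key space into 16 buckets
            // so that writes spread evenly across regions.
            stmt.executeUpdate(
                "CREATE TABLE IF NOT EXISTS shuju (" +
                "  id    VARCHAR NOT NULL PRIMARY KEY," +
                "  cmtid VARCHAR," +
                "  avgt  VARCHAR," +
                "  loss  VARCHAR" +
                ") SALT_BUCKETS = 16");
            // Multi-column secondary index on columns used by the GROUP BY queries.
            stmt.executeUpdate(
                "CREATE INDEX IF NOT EXISTS shuju_cmtid_idx ON shuju (cmtid, avgt)");
        }
    }
}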

Basic query syntax for Phoenix:

SELECT * FROM shuju;

SELECT count(1) FROM shuju;

SELECT cmtid, count(1) AS num FROM shuju GROUP BY cmtid ORDER BY num DESC;

SELECT avg(to_number(avgt)) FROM shuju;

SELECT cmtid, count(1) AS num, avg(to_number(avgt)) AS avgt, avg(to_number(loss)) AS loss
  FROM shuju GROUP BY cmtid ORDER BY num DESC;

SELECT acm, dtype, cmtid, count(1) AS num, avg(to_number(avgt)) AS avgt, avg(to_number(loss)) AS loss
  FROM shuju GROUP BY acm, dtype, cmtid ORDER BY num DESC;

SELECT acm, dtype, porgcode, orgid, cmtid, count(1) AS num, avg(to_number(avgt)) AS avgt, avg(to_number(loss)) AS loss
  FROM shuju GROUP BY acm, dtype, porgcode, orgid, cmtid ORDER BY num DESC;

SELECT ttime FROM shuju
  WHERE to_date(ttime, 'yyyyMMddHHmmss') = to_date('20141125', 'yyyyMMdd')
  ORDER BY ttime DESC;

SELECT to_date(ttime, 'yyyyMMddHHmmss') FROM shuju;

SELECT to_date('20141125', 'yyyyMMdd') FROM shuju;

SELECT (to_date(ttime, 'yyyyMMddHHmmss') = to_date('20141125', 'yyyyMMdd')) AS aaa FROM shuju ORDER BY aaa ASC;

Example of calling Phoenix's JDBC driver from Java (Phoenix SQL is close to the standard SQL specification):

import java.sql.*;

public class PhoenixJDBC {

    public static void main(String[] args) {
        try {
            // Register the Phoenix JDBC driver
            Class.forName("org.apache.phoenix.jdbc.PhoenixDriver").newInstance();
            Connection conn = DriverManager.getConnection("jdbc:phoenix:54.152.31.122", "", "");

            // Create a Statement to execute the SQL
            Statement stmt = conn.createStatement();

            // Execute the SQL statement and get the results in a ResultSet
            ResultSet rs = stmt.executeQuery("SELECT * FROM us_population");

            // Iterate through the ResultSet, displaying the value
            // for each row using the getString method
            while (rs.next()) {
                System.out.println("Name=" + rs.getString("host"));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
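
Since the statement classes listed earlier also cover importing data (ExecutableUpsertStatement), here is a minimal sketch of writing a row with Phoenix's UPSERT VALUES syntax through JDBC. The table and column names (shuju, id, cmtid, avgt, loss) are assumptions carried over from the query examples above.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PhoenixUpsert {
    public static void main(String[] args) throws Exception {
        // Same connection style as the example above; replace the host as needed.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:54.152.31.122")) {
            // Phoenix uses UPSERT rather than INSERT; rows are inserted or updated by primary key.
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO shuju (id, cmtid, avgt, loss) VALUES (?, ?, ?, ?)")) {
                ps.setString(1, "row-001");
                ps.setString(2, "cmt-42");
                ps.setString(3, "12.5");
                ps.setString(4, "0.1");
                ps.executeUpdate();
            }
            // Phoenix connections are not auto-commit by default, so commit explicitly.
            conn.commit();
        }
    }
}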

 

