Email: colorant at 163.com
BLOG: http://blog.csdn.net/colorant/
=What is Wasp=
Wasp is an HBase-based database solution developed by Alibaba Group. Its basic starting point is to follow Google's Megastore: "to provide cross-row transactions, indexes, and SQL functionality on top of HBase without sacrificing linear scalability."
=Architecture Principle=
For the design principles, refer to the Megastore paper. Wasp's own design documentation can be found in the following two places:
https://github.com/alibaba/wasp/wiki/Chinese
http://wenku.baidu.com/view/c85f50d984254b35eefd345c.html
The core idea of the Megastore framework is to partition data into Entity Groups. Each Entity Group's data is replicated across data centers, full ACID semantics are provided within an Entity Group, and writes are synchronously replicated to all data centers.
In terms of its actual implementation, Wasp does not implement Megastore's cross-data-center design; it only uses Entity Groups to partition and manage data.
Much of Megastore's design targets concurrency at very large scale: Entity Groups are replicated across regions, and reads are served by peer replicas with no fixed master/slave roles, coordinated through Paxos. These choices decentralize reads in order to improve performance. Wasp's architecture, by contrast, looks more like HBase's own. There are FMaster nodes and FServer nodes, and ZooKeeper is used to determine the current FMaster. Each FServer manages a number of Entity Group instances, so the arrangement is essentially a fixed, centralized master/slave one. In its use of Entity Groups, Wasp largely keeps Megastore's original design, and it handles the consistency of concurrent reads and writes through a redo log, MVCC, two-phase commit across Entity Groups, and similar mechanisms.
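To make the combination of redo log, MVCC, and cross-Entity-Group two-phase commit more concrete, here is a minimal in-memory sketch in Python. It is not Wasp's code: the class EntityGroup, the function two_phase_commit, and the lock-per-group scheme are all assumptions made for illustration, and commit timestamps are assumed to increase monotonically.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntityGroup:
    """Hypothetical in-memory stand-in for one Entity Group (not Wasp's API)."""
    name: str
    redo_log: list = field(default_factory=list)  # append-only (txn_id, state, writes)
    versions: dict = field(default_factory=dict)  # key -> [(commit_ts, value), ...]
    locked_by: Optional[str] = None

    def prepare(self, txn_id, writes):
        """Phase 1: log the intended writes and lock the group, or vote no."""
        if self.locked_by is not None:
            return False
        self.locked_by = txn_id
        self.redo_log.append((txn_id, "PREPARED", dict(writes)))
        return True

    def commit(self, txn_id, commit_ts):
        """Phase 2: apply the logged writes as new versions, then unlock."""
        writes = next(w for t, s, w in reversed(self.redo_log)
                      if t == txn_id and s == "PREPARED")
        for key, value in writes.items():
            self.versions.setdefault(key, []).append((commit_ts, value))
        self.redo_log.append((txn_id, "COMMITTED", {}))
        self.locked_by = None

    def abort(self, txn_id):
        """Release the lock and record the abort; no versions are written."""
        if self.locked_by == txn_id:
            self.redo_log.append((txn_id, "ABORTED", {}))
            self.locked_by = None

    def read(self, key, read_ts):
        """MVCC read: newest version whose commit_ts is <= read_ts."""
        visible = [v for ts, v in self.versions.get(key, []) if ts <= read_ts]
        return visible[-1] if visible else None


def two_phase_commit(txn_id, commit_ts, writes_by_group):
    """writes_by_group is a list of (EntityGroup, {key: value}) pairs."""
    prepared = []
    for group, writes in writes_by_group:
        if group.prepare(txn_id, writes):
            prepared.append(group)
        else:
            # One participant voted no: roll back everyone prepared so far.
            for g in prepared:
                g.abort(txn_id)
            return False
    for group in prepared:
        group.commit(txn_id, commit_ts)
    return True


users, orders = EntityGroup("users"), EntityGroup("orders")
ok = two_phase_commit("t1", 10, [(users, {"u1": "alice"}),
                                 (orders, {"o1": "book"})])
print(ok, users.read("u1", 10), orders.read("o1", 9))
```

A reader at timestamp 9 does not see the order written at timestamp 10, which is the snapshot behaviour MVCC is meant to give. A real system would also need crash recovery from the redo log and timeouts for stuck participants, both omitted here.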
=Implementation=
Wasp uses Alibaba's Druid project to parse SQL, and uses Netty and Protobuf to build the servers' internal communication protocol framework.
Wasp data is mapped onto four kinds of HBase tables: the global _FMETA_ table, which records the metadata of all Wasp tables; the entity table that holds the data of each Wasp table; and the redo log tables and index tables shared by all tables under the same Entity Group key.
Currently, Wasp supports only a limited subset of SQL. For example, queries support only equality conditions and range (comparison) conditions on indexed columns. Support for data types such as Int also has bugs in comparison operations. Slightly more complex SQL features, such as UDFs, LIMIT, HAVING, GROUP BY, JOIN, and ORDER BY, are not available yet. Of course, this may reflect Wasp's intended application scenarios, which may only need the simplest equality and range queries on specific fields.
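Equality and range conditions on an index map naturally onto a sorted key-value store, which may be why they are the first to be supported. The sketch below shows the general idea in Python. The key layout (an 8-byte big-endian value followed by the primary key) and the IndexTable class are assumptions for illustration, not Wasp's actual index encoding, and the encoding only handles non-negative integers below 2**64 - 1.

```python
import bisect
import struct


def index_key(value: int, primary_key: str) -> bytes:
    """Hypothetical index row key: 8-byte big-endian value, then the primary key.

    Big-endian encoding makes byte order match numeric order for
    non-negative integers, which a sorted store needs for range scans.
    """
    return struct.pack(">Q", value) + primary_key.encode("utf-8")


class IndexTable:
    """In-memory stand-in for a sorted index table (not an HBase API)."""

    def __init__(self):
        self._keys = []  # kept sorted, like row keys in a sorted store

    def put(self, value: int, primary_key: str) -> None:
        bisect.insort(self._keys, index_key(value, primary_key))

    def scan(self, start: bytes, stop: bytes):
        """Return primary keys of index rows with start <= key < stop."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [k[8:].decode("utf-8") for k in self._keys[lo:hi]]

    def equal(self, value: int):
        # WHERE col = value  ->  scan over every key with that value prefix
        return self.scan(struct.pack(">Q", value), struct.pack(">Q", value + 1))

    def between(self, low: int, high: int):
        # WHERE col >= low AND col < high
        return self.scan(struct.pack(">Q", low), struct.pack(">Q", high))


idx = IndexTable()
for age, pk in [(30, "u1"), (9, "u2"), (30, "u3"), (10, "u4")]:
    idx.put(age, pk)
print(idx.equal(30))
print(idx.between(9, 30))
```

Operations such as ORDER BY on a non-indexed column, GROUP BY, or JOIN cannot be expressed as a single contiguous scan like this, so they need extra planning and execution machinery.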
In addition, judging from the SQL plan implementation, statements appear to be translated fairly directly into HBase operations such as Get, Put, and Delete. From HBase's point of view, Wasp is a pure client application: it does not use any RegionServer-side capabilities, such as filters or coprocessors, for optimization. Therefore, if aggregation functions were to be implemented, performance would probably suffer considerably.
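The cost of aggregating in a pure client can be illustrated with a small Python simulation. The Region class and its rows_shipped counter are hypothetical stand-ins, not the HBase client API. They only count how many items would cross the network when summing on the client versus pushing the sum to the server side, as a coprocessor-style approach would.

```python
class Region:
    """Hypothetical stand-in for one region's rows (not the HBase client API)."""

    def __init__(self, rows):
        self.rows = rows        # list of dicts, one per row
        self.rows_shipped = 0   # items that left the "server"

    def scan(self):
        """Client-side path: every row is shipped to the caller."""
        for row in self.rows:
            self.rows_shipped += 1
            yield row

    def server_side_sum(self, column):
        """Coprocessor-like path: only one partial sum is shipped."""
        self.rows_shipped += 1
        return sum(row[column] for row in self.rows)


def client_sum(regions, column):
    # Pull every row from every region, then add them up locally.
    return sum(row[column] for region in regions for row in region.scan())


def pushed_down_sum(regions, column):
    # Each region returns one partial sum; the client merges the partials.
    return sum(region.server_side_sum(column) for region in regions)


def make_regions():
    return [Region([{"amount": i} for i in range(1000)]) for _ in range(3)]


a, b = make_regions(), make_regions()
print(client_sum(a, "amount"), sum(r.rows_shipped for r in a))
print(pushed_down_sum(b, "amount"), sum(r.rows_shipped for r in b))
```

Both paths compute the same total, but the client-side path ships one item per row while the pushed-down path ships one item per region.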
=Summary=
In general, Wasp does not provide a cross-data-center solution for massive data, and its scale is limited by a single HBase cluster, so to some extent there is still a large gap between it and the problems Megastore sets out to solve. What Wasp provides is an enhancement on top of HBase: simple SQL interfaces and cross-row transactions. As SQL on HBase, it lags well behind Salesforce's Phoenix. In cross-row transaction support, however, it is better than Phoenix (Phoenix's transaction support depends almost entirely on HBase's own capabilities). The code is not yet mature, and much depends on future development. That said, judging from the code structure and design patterns, the authors' programming skill is very good and worth learning from.
I have only taken a quick look at Wasp's implementation, and my own abilities are limited, so the accuracy of the above views is not guaranteed. Please correct me if anything is wrong.