Analysis of Trillion-Scale Log and Behavioral Data Storage and Query Technology (Continued): Tindex, a Reworked Lucene and Druid

Tindex

Tindex is a data storage solution built on open source components: the index layer is implemented by modifying Lucene, and the data query and index writing framework is implemented by extending Druid. It guarantees real-time data ingestion and freely defined metrics at query time, while also meeting second-level query latency over large data volumes. The system architecture essentially achieves the goals listed at the beginning of this series.


Tindex mainly involves the following components:

Tindex-segment is responsible for the file storage format, including data indexing and storage, query optimization, and data search and real-time aggregation within a segment. Tindex is rebuilt on the ideas of Lucene: Lucene's index content is more complex than we need, but among open source solutions its indexing is relatively mature and strikes a good balance between data compression and performance. In our rework we keep only the index information that is actually needed, which saves more storage space than stock Lucene and also speeds up queries. The main improvements are the following:

1. Efficient compressed storage format

For massive behavioral data, storage capacity is a problem that cannot be ignored. When an index is used, the indexed data is usually somewhat larger than the original data. Tindex therefore applies different compression techniques to different parts of the index, so that efficient queries are supported while the required capacity stays small. The data content section is dictionary-encoded, with each record storing only a small encoded ID. The dictionary itself is stored with prefix compression to reduce the space consumed by high-cardinality dimensions. In practice, data compressed with Tindex occupies only about 1/5 of the original size.
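
As a rough illustration of these two ideas (dictionary encoding plus a prefix-compressed dictionary), the sketch below is a toy example and not Tindex's actual on-disk format; all class and method names are made up.

import java.util.*;

// Minimal sketch of dictionary encoding plus prefix-compressed dictionary storage.
// Class and method names are illustrative, not Tindex's real API.
public class DictColumnSketch {

    public static void main(String[] args) {
        String[] column = {"android", "android", "ios", "ipad", "iphone", "android"};

        // 1. Build a sorted dictionary: term -> id.
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(column));
        Map<String, Integer> dict = new HashMap<>();
        int id = 0;
        for (String t : terms) dict.put(t, id++);

        // 2. Encode the column: each row stores only a small integer id.
        int[] encoded = new int[column.length];
        for (int i = 0; i < column.length; i++) encoded[i] = dict.get(column[i]);
        System.out.println("encoded rows: " + Arrays.toString(encoded));

        // 3. Prefix-compress the sorted dictionary: store (sharedPrefixLen, suffix) per term.
        String prev = "";
        for (String t : terms) {
            int shared = commonPrefixLength(prev, t);
            System.out.println("term entry: shared=" + shared + ", suffix=\"" + t.substring(shared) + "\"");
            prev = t;
        }
    }

    private static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length()), i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }
}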

2. Combined inverted and columnar forward index storage

In practice we often need to support both search and aggregation, and these two scenarios place completely opposite demands on the index structure. Tindex therefore combines two different types of index: an inverted index and a columnar forward index. The inverted index part uses a term dictionary plus skip lists to support fast retrieval, while the forward part uses efficient compression to support fast reads of a specified column across a massive number of rows. Depending on the use case, you can also choose to build only one of the two (by default both indexes are built for each column), saving roughly half of the storage space and indexing time.
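
The toy example below (illustrative only, with made-up names) shows why the two structures complement each other: the inverted index maps a term to the rows that contain it for fast filtering, while the forward column maps a row back to its value for fast aggregation over the matched rows.

import java.util.*;

// Illustrative sketch of keeping both an inverted index and a forward (columnar) store;
// names are hypothetical, not Tindex internals.
public class DualIndexSketch {

    public static void main(String[] args) {
        String[] city   = {"beijing", "shanghai", "beijing", "shenzhen"}; // dimension column
        long[]   amount = {10, 25, 7, 40};                                // forward (columnar) store

        // Inverted index: term -> sorted list of row ids (a skip list or bitmap in practice).
        Map<String, List<Integer>> inverted = new HashMap<>();
        for (int row = 0; row < city.length; row++) {
            inverted.computeIfAbsent(city[row], k -> new ArrayList<>()).add(row);
        }

        // Search side: find rows where city == "beijing" via the inverted index...
        List<Integer> matched = inverted.getOrDefault("beijing", List.of());

        // ...aggregation side: read the "amount" forward column only for the matched rows.
        long sum = 0;
        for (int row : matched) sum += amount[row];
        System.out.println("sum(amount) where city=beijing: " + sum); // 17
    }
}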

Tindex-druid is responsible for the distributed query engine, the metric definition engine, real-time data ingestion, real-time data and metadata management, and data caching. We chose Druid because its framework is extensible and its query engine is well designed, with attention to many performance details. For example:

    • Off-heap memory is reused to avoid GC problems (a sketch of this idea follows the list);
    • Data is processed in small sequential batches according to the query granularity, which gives higher memory utilization;
    • Queries have a by-segment level of caching, which works well for a wide range of fixed-pattern queries;
    • Multiple query types maximize query performance, such as TopN, TimeSeries, and more.
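
As a rough illustration of the off-heap reuse point above (not Druid's actual implementation; the class names here are made up), the sketch below keeps a fixed pool of direct ByteBuffers that queries borrow and return, so per-query intermediate state never creates garbage on the JVM heap.

import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch of off-heap buffer reuse: intermediate query buffers are allocated
// once with allocateDirect() and returned to a pool, so per-query work creates no heap garbage.
public class OffHeapBufferPoolSketch {
    private final BlockingQueue<ByteBuffer> pool;

    public OffHeapBufferPoolSketch(int buffers, int bufferSizeBytes) {
        pool = new ArrayBlockingQueue<>(buffers);
        for (int i = 0; i < buffers; i++) {
            pool.add(ByteBuffer.allocateDirect(bufferSizeBytes)); // off-heap, invisible to GC
        }
    }

    public ByteBuffer take() throws InterruptedException {
        ByteBuffer buf = pool.take(); // blocks if all buffers are in use
        buf.clear();
        return buf;
    }

    public void giveBack(ByteBuffer buf) {
        pool.offer(buf); // reuse instead of reallocating
    }

    public static void main(String[] args) throws InterruptedException {
        OffHeapBufferPoolSketch poolSketch = new OffHeapBufferPoolSketch(2, 1 << 20);
        ByteBuffer buf = poolSketch.take();
        buf.putLong(42L); // pretend this holds per-segment aggregation state
        poolSketch.giveBack(buf);
    }
}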

The extensibility of the framework was also a very important consideration for us. After we had rewritten the index layer, the Druid community released groupByV2 to improve queries over high-cardinality dimensions, and we were able to port groupByV2 quickly, which shows how flexible the framework is.

In our view, Druid's query engine is very strong, but its index layer only targets OLAP query scenarios, which is the fundamental reason we chose to build our index extension on top of the Druid framework. In addition, Druid takes distributed stability and HA strategies into full account and can be configured flexibly for different hardware and application scenarios to make the most of the available hardware, which is exactly what we value.

Our in-house version is developed on top of open source Druid and inherits all of Druid's advantages, while the query-related code has been completely re-implemented, bringing major improvements in the following areas:

1. Pre-aggregation is removed from the index, and metrics can be freely defined at query time:

During data ingestion there is no need to distinguish between dimensions and metrics; you only need to declare the data type, and the data is stored in the same form as the raw data. When aggregation is needed, the metric is defined at query time. For example, if the data we ingest contains a numeric field, we now only need to define it as an ordinary dimension of type float.
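
A minimal sketch of the idea, with made-up column and metric names: the raw float column is stored as ingested, and different queries define different aggregations over it at query time.

import java.util.List;

// Sketch of "no pre-aggregation": the segment stores the raw float column as ingested,
// and the metric (here a sum and an average) is only defined when the query runs.
public class QueryTimeMetricSketch {
    public static void main(String[] args) {
        // Raw column "duration", ingested as a plain float dimension, no rollup applied.
        List<Float> duration = List.of(1.5f, 0.7f, 3.2f, 2.1f);

        // Metric defined at query time: sum(duration).
        double sum = duration.stream().mapToDouble(Float::doubleValue).sum();
        // A different query can define a different metric over the same column: avg(duration).
        double avg = duration.stream().mapToDouble(Float::doubleValue).average().orElse(0);

        System.out.printf("sum=%.2f avg=%.2f%n", sum, avg);
    }
}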

2. Support for multiple dimension types:

Unlike native Druid, which only supports string dimensions, our improved version supports multiple dimension types such as string, int, long, float, time, and so on. In native Druid, if we need a numeric dimension we can only store it as a string, which has a big drawback: range-based filters cannot take advantage of the ordered inverted index and can only be evaluated by comparing every value (string order does not match numeric order, so for example "12" < "2" lexicographically). This makes performance very poor, because numeric dimensions tend to have high cardinality. In the improved version the problem disappears: the dimension is simply defined with the corresponding type.
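
The small, self-contained example below shows the underlying problem with string-typed numeric dimensions: lexicographic order does not match numeric order, so a range filter cannot walk an inverted index sorted on strings.

import java.util.Arrays;

// Demonstrates why string-ordered dimensions break numeric range filters:
// lexicographic order puts "12" before "2", so a range scan over a string-sorted
// inverted index cannot be used for numeric predicates like value >= 2.
public class StringVsNumericOrder {
    public static void main(String[] args) {
        String[] asStrings = {"2", "12", "100", "3"};
        Integer[] asNumbers = {2, 12, 100, 3};

        Arrays.sort(asStrings);
        Arrays.sort(asNumbers);

        System.out.println("lexicographic: " + Arrays.toString(asStrings)); // [100, 12, 2, 3]
        System.out.println("numeric:       " + Arrays.toString(asNumbers)); // [2, 3, 12, 100]
        System.out.println("\"12\".compareTo(\"2\") < 0 ? " + ("12".compareTo("2") < 0)); // true
    }
}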

3. Dynamic loading of data:

Stock Druid requires all data to be loaded at startup before it can serve queries. With our changes we implemented an LRU strategy: only the metadata and a small number of segments need to be loaded at startup. On the one hand this improves service startup time; on the other hand, because index files are read mostly through mmap, loading a large number of segments on a machine with little free memory forces the OS to swap pages through disk, which seriously hurts query performance. Dynamic data loading avoids this disk paging, keeps queries in memory as much as possible, and can be configured to make the best use of the hardware environment.
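
A minimal sketch of the LRU idea (placeholder types, not Tindex's actual loader): segment handles are loaded on demand and the least recently used one is evicted once a configured limit is exceeded.

import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of LRU-based dynamic segment loading: metadata is kept for every segment,
// while mmap-backed segment data is loaded on first access and the least recently used
// segments are evicted when a configured limit is reached.
public class SegmentLruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxLoadedSegments;

    public SegmentLruCacheSketch(int maxLoadedSegments) {
        super(16, 0.75f, true); // accessOrder=true gives LRU iteration order
        this.maxLoadedSegments = maxLoadedSegments;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict (unmap/close) the least recently used segment once the limit is exceeded.
        return size() > maxLoadedSegments;
    }

    public static void main(String[] args) {
        SegmentLruCacheSketch<String, String> cache = new SegmentLruCacheSketch<>(2);
        cache.put("segment-2017-01-01", "mmap handle A");
        cache.put("segment-2017-01-02", "mmap handle B");
        cache.get("segment-2017-01-01");                  // touch: now most recently used
        cache.put("segment-2017-01-03", "mmap handle C"); // evicts segment-2017-01-02
        System.out.println(cache.keySet());
    }
}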

HDFS: after many years of big data development, HDFS has become the de facto standard for distributed storage at petabyte scale and beyond, and it is very mature, so we chose HDFS rather than reinventing the wheel. Tindex combines with HDFS as a highly compressed, self-indexed file format that remains compatible with Hive and Spark operations.

Kafka/MetaQ: message queues. Tindex currently supports Kafka, MetaQ, and other message queues. Because Tindex's external extension interfaces are implemented with the Java SPI mechanism, support for more message queues can be added if needed.
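
To illustrate, here is how a Java SPI-based extension point for message queues could look; the interface, method, and topic names are hypothetical, and only the ServiceLoader mechanism itself is standard JDK behavior.

import java.util.ServiceLoader;

// Sketch of an SPI-based extension point for message queue sources.
interface MessageQueueSource {
    String name();
    void startConsuming(String topic);
}

public class MessageQueueSpiSketch {
    public static void main(String[] args) {
        // Implementations (e.g. a Kafka or MetaQ adapter) would be declared in
        // META-INF/services/MessageQueueSource inside their own jar.
        ServiceLoader<MessageQueueSource> loader = ServiceLoader.load(MessageQueueSource.class);
        for (MessageQueueSource source : loader) {
            System.out.println("found message queue plugin: " + source.name());
            source.startConsuming("behavior-events");
        }
    }
}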

Ecosystem tools: responsible for Tindex's ecosystem support. Spark and Hive are currently supported, and we plan to extend support to big data query engines such as Impala and Drill.

Cold data can be taken offline and queried through offline engines (Spark/Hive). A common problem for time-series databases is that, for time-sensitive data, we usually do not want old data to keep occupying valuable query resources, yet we still occasionally need to query it. With Tindex, data older than a configurable threshold can be defined as cold data, and the corresponding index data is unloaded from the query nodes. When we need to query it again, we simply call the corresponding offline interface.
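
A simple sketch of what such a cold-data rule might look like, with all names hypothetical: segments whose end time falls outside a configured hot window are treated as cold and unloaded from the query nodes.

import java.time.Duration;
import java.time.Instant;

// Illustrative cold-data rule: segments older than a configured age are dropped from the
// online query nodes and served through an offline (Spark/Hive) path only on request.
public class ColdDataPolicySketch {
    private final Duration hotWindow;

    public ColdDataPolicySketch(Duration hotWindow) {
        this.hotWindow = hotWindow;
    }

    public boolean isCold(Instant segmentEndTime, Instant now) {
        return segmentEndTime.isBefore(now.minus(hotWindow));
    }

    public static void main(String[] args) {
        ColdDataPolicySketch policy = new ColdDataPolicySketch(Duration.ofDays(90));
        Instant now = Instant.now();

        // Cold segments would be offloaded from query nodes and queried via the offline interface.
        System.out.println(policy.isCold(now.minus(Duration.ofDays(200)), now)); // true
        System.out.println(policy.isCold(now.minus(Duration.ofDays(7)), now));   // false
    }
}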

SQL engine: responsible for SQL semantic transformation and expression definition.

ZooKeeper: responsible for cluster state management.

In the future we will continue to optimize the modified Lucene index to achieve even higher query performance, and will optimize metric aggregation, including: processing data in small batches to make full use of the CPU's vectorized parallel computing capability; using code generation to avoid frequent virtual function calls during aggregation; and continuing to improve integration with the big data ecosystem.
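
As a simplified illustration of the batching direction described above (not the actual optimization), compare aggregating a column with one virtual call per row against a tight primitive loop over the whole batch, which the JIT can unroll and auto-vectorize.

// Simplified illustration of batched aggregation versus per-row virtual dispatch.
public class BatchAggregationSketch {

    interface Aggregator { void aggregate(long value); long get(); }

    static class SumAggregator implements Aggregator {
        private long sum;
        public void aggregate(long value) { sum += value; }
        public long get() { return sum; }
    }

    // Row-at-a-time: one virtual call per row.
    static long rowAtATime(long[] column) {
        Aggregator agg = new SumAggregator();
        for (long v : column) agg.aggregate(v);
        return agg.get();
    }

    // Batched: a tight primitive loop over the whole column chunk.
    static long batched(long[] column) {
        long sum = 0;
        for (int i = 0; i < column.length; i++) sum += column[i];
        return sum;
    }

    public static void main(String[] args) {
        long[] column = new long[1_000_000];
        for (int i = 0; i < column.length; i++) column[i] = i % 7;
        System.out.println(rowAtATime(column) == batched(column)); // same result, different cost profile
    }
}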

In follow-up articles I will explain the detailed implementation principles and practical experience of each part in depth, so stay tuned! If you have questions, you can add the author (happyjim2010) to discuss.

About the author

James Wang, founder & CEO of Sugo Intelligence (数果智能).
Formerly head of big data technology and big data architect at Kugou Music, responsible for Kugou's big data technology planning, construction, and applications.
