Kafka SQL Engine


1. Overview

In most cases we use Kafka simply for message processing. In some cases, however, we need to read the data in a Kafka cluster multiple times. We can of course do this by invoking the Kafka API directly, but every different business requirement then means writing a different interface, compiling it, packaging it, releasing it, and so on before we finally see the expected results. So, can we have an easier way to implement this functionality, by writing SQL and visualizing the results? Today I would like to share an approach that uses SQL to fulfil these requirements.

2. Content

The architecture and the ideas behind the implementation are not complex. Here I will walk through the entire implementation process with the help of a schematic diagram, as shown below:

Let me describe the flow in detail. The message data is stored in the Kafka cluster. Two consumer threads are opened, one using the low-level API and one using the high-level API, and the consumed results are shared out in the form of RPC (i.e., to the requester). Once the data is shared, it flows into the SQL engine, which translates the in-memory data into a SQL tree; the Apache Calcite project takes part here. We then respond to the SQL requests of the Web console through the Thrift protocol and finally return the results to the front end, where they are visualized as charts.
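To make the first stage concrete, here is a minimal sketch of a consumer thread that buffers topic messages in memory for the SQL engine to read later. It uses the current KafkaConsumer API rather than the separate low-level and high-level clients mentioned above, and the class name, group id, and buffering strategy are assumptions for illustration only, not the project's actual code.

import java.time.Duration;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Hypothetical consumer thread: it polls one topic and keeps the raw messages
// in memory so that the SQL engine can later expose them as a table.
public class KafkaBufferThread implements Runnable {

    // In-memory buffer shared with the SQL engine (safe for concurrent readers).
    private final List<String> buffer = new CopyOnWriteArrayList<>();
    private final String topic;
    private final Properties props = new Properties();

    public KafkaBufferThread(String bootstrapServers, String topic) {
        this.topic = topic;
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", "kafka-sql-engine");              // assumed group id
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
    }

    @Override
    public void run() {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList(topic));
            while (!Thread.currentThread().isInterrupted()) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    buffer.add(record.value());   // JSON messages, parsed later by the engine
                }
            }
        }
    }

    // Snapshot used by the RPC layer / SQL engine side.
    public List<String> snapshot() {
        return buffer;
    }
}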

3. Plug-in configuration

Here we need to follow Calcite's JSON model format. For the Kafka cluster, for example, we need the following configuration:

{
    version: '1.0',
    defaultSchema: 'Kafka',
    schemas: [
        {
            name: 'Kafka',
            type: 'custom',
            factory: 'cn.smartloli.kafka.visual.engine.KafkaMemorySchemaFactory',
            operand: {
                database: 'kafka_db'
            }
        }
    ]
}
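The model above points at a schema factory class. I do not have the source of KafkaMemorySchemaFactory, so the following is only a rough sketch of what such a Calcite schema factory and in-memory table could look like; the class name MemorySchemaFactory, the static ROWS buffer, and the column list (which mirrors the table declaration shown next) are all assumptions.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;

import org.apache.calcite.DataContext;
import org.apache.calcite.linq4j.Enumerable;
import org.apache.calcite.linq4j.Linq4j;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeFactory;
import org.apache.calcite.schema.ScannableTable;
import org.apache.calcite.schema.Schema;
import org.apache.calcite.schema.SchemaFactory;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.schema.Table;
import org.apache.calcite.schema.impl.AbstractSchema;
import org.apache.calcite.schema.impl.AbstractTable;
import org.apache.calcite.sql.type.SqlTypeName;

// Illustrative stand-in for a factory such as KafkaMemorySchemaFactory:
// it exposes buffered Kafka rows as a single Calcite table named "Kafka".
public class MemorySchemaFactory implements SchemaFactory {

    // Hypothetical shared buffer, filled elsewhere by the consumer threads.
    public static final List<Object[]> ROWS = new CopyOnWriteArrayList<>();

    @Override
    public Schema create(SchemaPlus parentSchema, String name, Map<String, Object> operand) {
        // The "database" value (e.g. kafka_db) arrives from the model's operand block.
        final String database = (String) operand.get("database");
        return new AbstractSchema() {
            @Override
            protected Map<String, Table> getTableMap() {
                Map<String, Table> tables = new HashMap<>();
                tables.put("Kafka", new MemoryTable(ROWS));
                return tables;
            }
        };
    }

    /** In-memory table whose columns mirror the declared schema. */
    static class MemoryTable extends AbstractTable implements ScannableTable {
        private final List<Object[]> rows;

        MemoryTable(List<Object[]> rows) {
            this.rows = rows;
        }

        @Override
        public RelDataType getRowType(RelDataTypeFactory typeFactory) {
            return typeFactory.builder()
                    .add("_plat", SqlTypeName.VARCHAR)
                    .add("_uid", SqlTypeName.VARCHAR)
                    .add("_tm", SqlTypeName.VARCHAR)
                    .add("ip", SqlTypeName.VARCHAR)
                    .add("country", SqlTypeName.VARCHAR)
                    .add("city", SqlTypeName.VARCHAR)
                    .add("location", SqlTypeName.VARCHAR)  // the JSON array kept as a string here
                    .build();
        }

        @Override
        public Enumerable<Object[]> scan(DataContext root) {
            return Linq4j.asEnumerable(rows);
        }
    }
}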

In addition, it is best to declare the table explicitly. The configuration content is as follows:

[
    {
        "table": "Kafka",
        "schemas": {
            "_plat": "varchar",
            "_uid": "varchar",
            "_tm": "varchar",
            "ip": "varchar",
            "country": "varchar",
            "city": "varchar",
            "location": "jsonarray"
        }
    }
]
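With the model and the table declaration in place, the schema can be queried through Calcite's standard JDBC driver. The sketch below shows one way to do that; the model file path and the sample query are assumptions for illustration.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class KafkaSqlQuery {
    public static void main(String[] args) throws Exception {
        // Register the Calcite JDBC driver.
        Class.forName("org.apache.calcite.jdbc.Driver");

        // Point the connection at the JSON model shown above
        // (the file path is an assumption).
        String url = "jdbc:calcite:model=conf/kafka-model.json";

        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             // Identifiers are quoted to preserve the case declared in the model.
             ResultSet rs = stmt.executeQuery(
                     "select \"_uid\", \"country\", \"city\" from \"Kafka\"")) {
            while (rs.next()) {
                System.out.printf("%s %s %s%n",
                        rs.getString(1), rs.getString(2), rs.getString(3));
            }
        }
    }
}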
4. Operation

Below, I show you how to operate on this content through SQL. The relevant screens are as follows:

In the query box, fill in the relevant SQL query statement and click the Table button to get the results shown below:

We can export the results obtained in the form of a report.

Of course, we can browse the query history and the currently running query tasks under the Profile module. The other modules are auxiliary functions (displaying cluster information, topic and partition information, and so on), so I will not go into them here.

5. Summary

Looking back at the analysis, the overall architecture and implementation ideas are not too complicated and present no great difficulty. What does need attention are some implementation details, such as adjusting the consumer API parameters for cluster messages. The low-level consumer API in particular requires care with its fetch size, and its offsets need to be maintained by ourselves. When using Calcite to build the SQL tree, we have to follow its JSON model and standard SQL syntax to manipulate the data source.
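As an illustration of the offset and fetch-size points above, here is a small sketch of reading a partition with manually managed offsets. It uses the modern KafkaConsumer with auto-commit disabled rather than the legacy low-level API the original implementation relies on, and the broker address, topic name, and fetch limit are assumptions.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ManualOffsetRead {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "kafka-sql-engine");
        props.put("enable.auto.commit", "false");            // offsets are maintained by us
        props.put("max.partition.fetch.bytes", "1048576");   // analogous to tuning fetch_size
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("kafka_topic", 0);  // assumed topic name
            consumer.assign(Collections.singletonList(tp));
            consumer.seek(tp, 0L);  // start from an offset we track ourselves

            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                System.out.println(record.offset() + " -> " + record.value());
            }
            consumer.commitSync();  // persist the position explicitly
        }
    }
}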

6. Concluding remarks

That is all I want to share in this blog post. If you run into any problems while studying this, you can join the discussion group or send me an e-mail, and I will do my best to answer you. Let us encourage each other!

