A recent post on the official Amazon blog describes how to use the Kinesis Connector to Elasticsearch to search and interact with streaming data. It helps developers easily build applications that load large-scale streaming data from Kinesis into an Elasticsearch cluster reliably and in real time.
According to the official introduction, Elasticsearch is an open source search and analytics engine that indexes structured and unstructured data in real time. Kibana is Elasticsearch's data visualization engine; it is mainly used by technical operators and business analysts to set up interactive dashboards. Data in an Elasticsearch cluster can also be accessed programmatically via the RESTful API or application SDKs. You can deploy an Elasticsearch cluster on Amazon Elastic Compute Cloud (EC2) instances from the CloudFormation template in the sample, fully managed by Auto Scaling.
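For example, once documents are indexed, any HTTP client can query the cluster through the RESTful API. The following minimal Java sketch (assuming a local cluster listening on the default REST port 9200; the query field name is hypothetical) issues a simple search request and prints the JSON response:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class SimpleSearch {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint: a local Elasticsearch node on the default REST port.
        URL url = new URL("http://localhost:9200/_search?q=message:error");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");

        // Print the raw JSON search response.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}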
Kinesis, Elasticsearch, and Kibana
The following block diagram shows how they fit together:
By creating an application with the new Kinesis Connector to Elasticsearch, you can process data from Kinesis and index it into an Elasticsearch cluster. You can transform, filter, and buffer records before sending them to Elasticsearch, and you can also tune the specifics of the Elasticsearch index operation, adding fields such as time to live (TTL), version number, type, and ID on a per-record basis. The following illustration shows the flow of records:
Note: you can also run the entire connector pipeline from within the Elasticsearch cluster by using a river.
First Steps
Your code needs to perform the following tasks:
- Set up the configuration specific to your application
- Create and configure a KinesisConnectorPipeline with a transformer, filter, buffer, and emitter
- Create a KinesisConnectorExecutor that continuously runs the pipeline
Each of these components has a default implementation, which you can replace with your own custom logic.
Configure the Connector Properties
The sample includes a .properties file and a configurator with many settings, most of which can be left at their default values. For example, the following settings configure the connector to bulk-load data into Elasticsearch only after at least 1,000 records have been collected, and to use a local Elasticsearch cluster endpoint for testing:

bufferRecordCountLimit = 1000
elasticSearchEndpoint = localhost
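A fuller configuration would also name the application, the source stream, and the target cluster. The sketch below illustrates what such a file might look like; the key names other than the two above are assumptions modeled on the connector library's samples, so verify them against the sample's .properties file:

# Buffer at least 1,000 records before each bulk load (as above).
bufferRecordCountLimit = 1000
elasticSearchEndpoint = localhost

# Hypothetical keys -- check the sample .properties file for the exact names.
appName = kinesisToElasticsearchSample
kinesisInputStream = myKinesisStream
elasticSearchClusterName = elasticsearch
elasticSearchPort = 9300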
Implement the Pipeline Components
To provide the transformer, filter, buffer, and emitter, your code must implement the IKinesisConnectorPipeline interface.
public class ElasticsearchPipeline implements IKinesisConnectorPipeline<String, ElasticsearchObject> {

    public IEmitter<ElasticsearchObject> getEmitter(KinesisConnectorConfiguration configuration) {
        return new ElasticsearchEmitter(configuration);
    }

    public IBuffer<String> getBuffer(KinesisConnectorConfiguration configuration) {
        return new BasicMemoryBuffer<String>(configuration);
    }

    public ITransformerBase<String, ElasticsearchObject> getTransformer(KinesisConnectorConfiguration configuration) {
        return new StringToElasticsearchTransformer();
    }

    public IFilter<String> getFilter(KinesisConnectorConfiguration configuration) {
        return new AllPassFilter<String>();
    }
}
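The StringToElasticsearchTransformer returned above is the natural place to set the per-record index fields mentioned earlier (ID, type, TTL, version). Below is a minimal sketch; the ElasticsearchObject constructor and the setTtl/setVersion setters are assumptions based on the connector library's description, so verify them against the actual class:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

import com.amazonaws.services.kinesis.connectors.elasticsearch.ElasticsearchObject;
import com.amazonaws.services.kinesis.connectors.interfaces.ITransformer;
import com.amazonaws.services.kinesis.model.Record;

public class StringToElasticsearchTransformer implements ITransformer<String, ElasticsearchObject> {

    // Decode the raw Kinesis record payload as a UTF-8 string.
    public String toClass(Record record) throws IOException {
        return new String(record.getData().array(), StandardCharsets.UTF_8);
    }

    // Wrap the string (expected to be a JSON document) in an ElasticsearchObject.
    public ElasticsearchObject fromClass(String record) throws IOException {
        // "myindex" and "mytype" are hypothetical names; a random ID is generated per record.
        ElasticsearchObject object = new ElasticsearchObject(
                "myindex", "mytype", UUID.randomUUID().toString(), record);
        // Assumed setters for the per-record options described above.
        object.setTtl(86400000L);  // time to live: one day, in milliseconds
        object.setVersion(1L);     // explicit document version
        return object;
    }
}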
The following code implements the abstract factory method that specifies the pipeline you want to use:
public KinesisConnectorRecordProcessorFactory<String, ElasticsearchObject> getKinesisConnectorRecordProcessorFactory() {
    return new KinesisConnectorRecordProcessorFactory<String, ElasticsearchObject>(new ElasticsearchPipeline(), config);
}
Define the Executor
The following code defines the executor; the input Kinesis records are Strings, and the pipeline's output records are ElasticsearchObjects:
public class ElasticsearchExecutor extends KinesisConnectorExecutor<String, ElasticsearchObject>
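The executor's constructor simply passes the path of the properties file to the base class, which loads the configuration. A minimal sketch, assuming the KinesisConnectorExecutor base class accepts the configuration file name in its constructor:

public ElasticsearchExecutor(String configFile) {
    // The base class reads the .properties file and builds the configuration.
    super(configFile);
}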
The following code implements the main method, which creates the executor and starts running it:
public static void main(String[] args) {
    KinesisConnectorExecutor<String, ElasticsearchObject> executor = new ElasticsearchExecutor(configFile);
    executor.run();
}
At this point, make sure your AWS credentials are correct, set up the project dependencies with ant setup, and run the application with ant run. All of the code is on GitHub, so you can get started right away. You can post your questions in the comments and we can discuss them together.
Kinesis Client Library and Kinesis Connector library
The Kinesis Client Library was introduced in September 2013, when Amazon released Kinesis. With this client library, developers can build applications that process streaming data while the library takes care of complex issues such as load balancing the streaming data, coordinating distributed workers, adapting to changes in stream volume, and processing records with fault tolerance.
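To give a flavor of the client library, a record processor implements three callbacks: one when a shard is assigned, one per batch of records, and one at shutdown. The sketch below is a minimal outline assuming the version 1 IRecordProcessor interface and package layout; verify the imports against the KCL release you use:

import java.util.List;

import com.amazonaws.services.kinesis.clientlibrary.interfaces.IRecordProcessor;
import com.amazonaws.services.kinesis.clientlibrary.interfaces.IRecordProcessorCheckpointer;
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShutdownReason;
import com.amazonaws.services.kinesis.model.Record;

public class LoggingRecordProcessor implements IRecordProcessor {

    // Called once when the KCL assigns this processor a shard.
    public void initialize(String shardId) {
        System.out.println("Initializing for shard " + shardId);
    }

    // Called with each batch of records fetched from the shard.
    public void processRecords(List<Record> records, IRecordProcessorCheckpointer checkpointer) {
        System.out.println("Received " + records.size() + " records");
        try {
            checkpointer.checkpoint(); // record progress so a restart resumes here
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Called when the lease is lost or the shard is closed.
    public void shutdown(IRecordProcessorCheckpointer checkpointer, ShutdownReason reason) {
        System.out.println("Shutting down: " + reason);
    }
}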
For processing stream input, Amazon has released a number of tools that developers can study or contribute to. The main ones are the Kinesis Connector Library, with connectors for Amazon DynamoDB, Amazon Redshift, and Amazon Simple Storage Service (S3); the Kinesis Storm Spout; and the Amazon EMR Connector.
Source: Search and Interact with Your Streaming Data Using the Kinesis Connector to Elasticsearch (translated by Chingling, edited by Yuping)