Elasticsearch Client Connection Selection

Source: Internet
Author: User

Elasticsearch supports two types of protocols:

HTTP protocol.

Native Elasticsearch binary protocol: a protocol that Elasticsearch developed itself for communication between nodes.

You can also extend the set of supported protocols with plugins, some of which are official.

The second option, the binary protocol, is not recommended for languages other than Java, because it requires a great deal of custom serialization.
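To illustrate why the HTTP protocol needs no custom serialization, here is a minimal Python sketch that builds a search request with nothing but the standard library. The base URL `http://localhost:9200` (9200 being the usual default port of the HTTP API), the index name `articles`, and the field name `title` are assumptions chosen for illustration; they do not come from this article.

```python
import json
import urllib.request


def build_search_request(base_url, index, field, text):
    """Build (but do not send) a JSON search request for the HTTP API.

    All arguments are illustrative placeholders supplied by the caller.
    """
    # The request body is plain JSON, so any language with a JSON encoder works.
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_search_request(
        "http://localhost:9200",  # assumed local node
        "articles",               # hypothetical index name
        "title",                  # hypothetical field name
        "client selection",
    )
    print(request.get_method(), request.full_url)
    # Sending it requires a running node:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```

Nothing here is specific to Python: the same request can be produced by any HTTP client, which is what makes this protocol the practical choice outside the JVM.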

Supported Clients

Transport

The transport client is one of the native ways to connect to Elasticsearch. It is part of the official Elasticsearch distribution and therefore requires the client to be written in Java, or at least to run on the JVM. It is very fast and runs natively on the JVM. Serialization is efficient, and there is little overhead in the messages and operations sent to and from Elasticsearch instances.

The Elasticsearch server and client versions must be kept reasonably in sync. Prior to Elasticsearch 1.0, exactly the same version was required; versions 1.0 and later support interaction between different versions. Because of nuances in areas such as exception serialization, it is also beneficial to run the same JVM update version on both the client and the server.

Encryption and authentication are not currently supported, although it has been announced that both will be added soon. To use a transport client with a found.no managed cluster, you can use the custom transport module for Elasticsearch, which takes care of encryption, authentication, and keep-alive.

Node Client

The node client is very similar to the transport client: it is part of the official Elasticsearch release, requires the client to run on Java, and so on, but there are some notable differences. Whereas the cluster is largely indifferent to a transport client connecting to one of its nodes, a node client is considered part of the cluster. This means that the presence of the node client is stored in the cluster state, and that every other node in the cluster will try to establish several TCP connections to the client. This can be a significant disadvantage if the cluster is large or if many clients are used. It may seem excessive, but it is currently required so that the server nodes can propagate changes in the cluster state to the client.

The end result is that the node client always has the latest cluster state and a connection to every other node in the Elasticsearch cluster, which lets it route operations locally, act as the coordinator of its own requests, and so on. This saves a network hop on each request and leaves the remaining nodes in the cluster with less work to do.
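The saved network hop can be illustrated with a toy model in Python. It relies on the rule that a document's primary shard is chosen as a hash of its routing key modulo the number of primary shards; `zlib.crc32` is only a stand-in for the hash that Elasticsearch actually uses, and the function names, node names, and one-shard-per-node layout are hypothetical simplifications rather than anything taken from a real cluster.

```python
import zlib


def shard_for(routing_key, num_primary_shards):
    """Pick a primary shard as hash(routing key) modulo the shard count.

    zlib.crc32 is a stand-in; Elasticsearch uses its own hash function.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def network_hops(client_kind, doc_id, shard_to_node, entry_node):
    """Count hops from the client to the node holding the document's shard.

    shard_to_node[i] names the node that holds primary shard i (toy layout).
    entry_node is the node a transport client happens to contact first.
    """
    target = shard_to_node[shard_for(doc_id, len(shard_to_node))]
    if client_kind == "node":
        # The node client holds the cluster state, so it goes straight to target.
        return 1
    if client_kind == "transport":
        # The contacted node must forward the request unless it holds the shard.
        return 1 if entry_node == target else 2
    raise ValueError(f"unknown client kind: {client_kind!r}")


if __name__ == "__main__":
    layout = ["node-a", "node-b", "node-c"]  # hypothetical cluster
    for entry in layout:
        print(entry,
              network_hops("transport", "doc-42", layout, entry),
              network_hops("node", "doc-42", layout, entry))
```

In this model the transport client pays the extra hop whenever the node it contacts does not hold the shard, while the node client never does. Real requests can touch several shards and replicas, so treat the hop counts as an intuition aid only.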

HTTP Client

HTTP is well supported in most programming languages and is the most common way to connect to Elasticsearch. If you want to use HTTP, there is an important choice to make: use an existing HTTP-based Elasticsearch library, or write a small wrapper around an HTTP client that covers only the operations you need.

Because HTTP is a general-purpose protocol that supports a wide variety of use cases, some important things must be handled by the client: connection pooling and keep-alive. Connection pooling avoids paying the TCP connection setup cost on every request and, more importantly, the additional cryptographic handshake cost when HTTPS is used. Connection pooling in turn usually depends on keep-alive support, because pooled connections should not be dropped for being idle.

It may not be obvious at first that connection setup matters, but consider that establishing a TCP connection takes a three-way handshake. Put simply, with a ping time of 50 milliseconds it takes about 75 milliseconds to establish a connection, on top of the time needed to acquire and release local resources (client ports, connection management, and so on). This does not include the time spent processing requests and responses (for example, serialization) at both ends. Without connection pooling, this time is added on top of every request. With HTTPS, which we recommend for security and privacy, connection setup overhead can sometimes be measured in seconds, which is even more significant. Given the basic guideline that a response must arrive within 100 milliseconds for an end user to perceive it as "immediate", even the unencrypted overhead makes that limit almost impossible to meet.

The official non-Java clients written and supported by Elasticsearch use HTTP under the hood to communicate with Elasticsearch. The general recommendation is to use an official client that wraps the HTTP API, because it takes care of all of these details. HTTP client implementations can be fairly fast, and some even rival the speed of the native protocol. Elasticsearch's HTTP API is widely used and has considerable community support. However, performance depends on the client library, which typically needs configuration or tuning to get the most out of it.
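The effect of keep-alive can be observed with a small self-contained Python experiment. It starts a local stub HTTP server (a stand-in for an Elasticsearch node, not the real thing) and counts how many TCP connections the server accepts when a client reuses one connection versus opening a new one per request. The names `CountingHandler` and `count_connections` are hypothetical helpers written for this sketch.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Stub endpoint that counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    connections = 0
    _lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        with CountingHandler._lock:
            CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the experiment quiet


def count_connections(requests, reuse):
    """Send `requests` GETs and return how many connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    host, port = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        if reuse:
            conn = http.client.HTTPConnection(host, port)
            for _ in range(requests):
                conn.request("GET", "/")
                conn.getresponse().read()  # drain before reusing the socket
            conn.close()
        else:
            for _ in range(requests):
                conn = http.client.HTTPConnection(host, port)
                conn.request("GET", "/")
                conn.getresponse().read()
                conn.close()
        return CountingHandler.connections
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("reused connection:", count_connections(5, reuse=True))
    print("new connection per request:", count_connections(5, reuse=False))
```

With five requests, the reused connection should show up as a single accepted connection and the other variant as five. At the estimate of about 75 milliseconds per connection given above, that is roughly 75 milliseconds of setup time versus 375. A real connection pool generalizes this idea to several reusable connections shared between threads.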

Conclusion

Use a high-performance HTTP client; the easiest way to get one is to use the official language bindings.

If you use Java, generally prefer the transport client over the node client, unless the performance gain from the node client is large enough to justify the additional network complexity. Use benchmarks to verify the performance gain.

If you use another JVM-based language (such as Scala, Clojure, Groovy, or JRuby), you need to weigh a native-language experience against the two high-performance clients, node and transport.
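For the benchmarking advice above, a minimal latency harness might look like the following Python sketch. The `benchmark` function and its parameters are hypothetical; `operation` is any zero-argument callable that performs one request through the client under test. Real measurements should run against a realistic cluster and workload.

```python
import statistics
import time


def benchmark(operation, runs=50, warmup=5):
    """Time `operation` and return latency statistics in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Warm-up calls let connections open and caches fill before timing starts.
    for _ in range(warmup):
        operation()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        # Nearest-rank style estimate; coarse when runs is small.
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
        "max_ms": samples[-1],
    }


if __name__ == "__main__":
    # Placeholder workload; replace with a call through the client being tested.
    print(benchmark(lambda: sum(range(10_000))))
```

Running the same workload through each candidate client and comparing the median and tail latencies gives a more reliable basis for the choice than general expectations about which client is faster.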
