Elasticsearch Basic Knowledge Essentials QA

Source: Internet
Author: User

This post is a set of study notes compiled from others' work; thanks here to the original authors. If you are missing the basics, see the Chinese edition of the Elasticsearch definitive guide. Note your Elasticsearch version: behavior may differ between versions.

Q1: How does Elasticsearch elect a master?
    1. Elasticsearch's master election is handled by the ZenDiscovery module, which consists mainly of ping (nodes discover each other through this RPC) and unicast (the unicast module holds a host list that controls which nodes need to be pinged).
    2. All master-eligible nodes (node.master: true) are sorted by node ID in dictionary order; each node ranks the nodes it knows about and votes for the first (position 0) node, which it considers the master.
    3. If a node's vote count reaches a quorum (master-eligible node count n/2 + 1) and the node has also voted for itself, it becomes master. Otherwise re-election begins and repeats until these conditions are met.
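As a rough illustration of steps 2 and 3, here is a minimal Python sketch of quorum-based election over sorted node IDs. This is not Elasticsearch's actual ZenDiscovery code; the node names and the voting simplification (every node in a partition votes for the same lowest ID) are assumptions made for the sketch.

```python
def elect_master(known_nodes, total_master_eligible):
    """known_nodes: the master-eligible node IDs a voter can see.
    Returns the elected master's ID, or None if there is no quorum."""
    quorum = total_master_eligible // 2 + 1
    # Every voter ranks the nodes it knows and picks the first in
    # dictionary order, so all voters in this partition agree.
    candidate = min(known_nodes)
    votes = len(known_nodes)  # one vote per visible node
    return candidate if votes >= quorum else None

print(elect_master({"node-a", "node-b", "node-c"}, 3))  # node-a
print(elect_master({"node-c"}, 3))  # None: a lone node lacks a quorum
```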

The master node is responsible for managing clusters, nodes, and indexes, but not for document-level management; data nodes can have their HTTP functionality turned off.

Q2: How does Elasticsearch avoid split-brain?
    1. When the cluster has at least 3 master-eligible nodes (node.master: true), split-brain can be avoided by setting discovery.zen.minimum_master_nodes to (N/2) + 1.

node.master: true here means a node is qualified to become master, not that it is the master — a prince, not the emperor. With 10 princes, the setting should be (10/2) + 1 = 6: those 6 princes can confer and elect a new emperor. The remaining 4, even gathered together, are only four people — below the quorum — so they cannot elect a new emperor. What would happen if discovery.zen.minimum_master_nodes were set to 5 with exactly 10 master-eligible nodes? Five princes could form one faction and choose an emperor, while the other five, also meeting the quorum, could elect their own. One world with two emperors — in Elasticsearch terms, that is split-brain.
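The quorum setting from the analogy above, as it would appear in elasticsearch.yml (pre-7.x Zen Discovery; in 7.x and later this setting was removed in favor of automatic quorum management):

```yaml
# elasticsearch.yml — with 10 master-eligible nodes, quorum = 10/2 + 1 = 6,
# so no minority partition can ever elect its own master.
node.master: true
discovery.zen.minimum_master_nodes: 6
```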

    1. If the cluster has only 2 master-eligible nodes, the situation is inherently awkward; it is best to set node.master to false on one of them. If you keep both and follow the (N/2) + 1 formula, discovery.zen.minimum_master_nodes should be set to 2 — but then, with only two master-eligible nodes, the failure of either one makes it impossible to elect a master.

To continue the prince analogy: suppose the first emperor decreed that both of his princes must be present in order to choose 1 of the 2 as successor. If one prince meets with an accident, the remaining prince alone cannot produce a new emperor.

Q3: How does the client select a specific node to execute the request when it is connected to the cluster?
    1. The TransportClient connects to an Elasticsearch cluster remotely through the transport module. It does not join the cluster; it simply obtains one or more initial transport addresses and communicates with them in round-robin fashion.
Q4: Describe the Elasticsearch document indexing process
    1. The coordinating node uses the document ID in the calculation by default (custom routing is also supported) to route the document to the appropriate shard:

shard = hash(document_id) % num_of_primary_shards
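A hedged Python illustration of this routing formula (Elasticsearch actually applies a murmur3 hash to the routing value; hashlib.md5 stands in here purely so the sketch is deterministic):

```python
import hashlib

def route_shard(routing_value: str, num_of_primary_shards: int) -> int:
    # Derive a stable integer from the routing value (the document _id
    # by default) and map it onto one of the primary shards.
    digest = hashlib.md5(routing_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_of_primary_shards

# The same _id always routes to the same shard, which is why the number
# of primary shards cannot be changed after index creation.
assert route_shard("doc-42", 5) == route_shard("doc-42", 5)
```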

    2. When the node holding the shard receives the request from the coordinating node, it writes the request to the in-memory buffer and then periodically (every 1 second by default) writes it to the filesystem cache; this move from memory buffer to filesystem cache is called refresh.
    3. Of course, data in the memory buffer and filesystem cache can still be lost, so ES uses the translog mechanism to guarantee data reliability: when a request is received it is also written to the translog, and once the data in the filesystem cache is written to disk the translog is cleared; this process is called flush.
    4. During flush, the in-memory buffer is cleared, its contents are written to a new segment, fsyncing the segment creates a new commit point and flushes the contents to disk, and the old translog is deleted and a new translog is started.
    5. Flush is triggered on a timer (every 30 minutes by default) or when the translog becomes too large (512 MB by default).
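The refresh and flush knobs from the steps above, expressed as illustrative index settings (setting names follow Elasticsearch's index modules; defaults can vary by version):

```yaml
index.refresh_interval: 1s                  # memory buffer -> filesystem cache
index.translog.flush_threshold_size: 512mb  # translog size that triggers a flush
```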

A supplement on Lucene segments (mentioned in the steps above):
    • A Lucene index is made up of multiple segments, and each segment is itself a fully functional inverted index.
    • Segments are immutable, which allows Lucene to add new documents to the index incrementally without rebuilding the index from scratch.
    • Every search request must search all segments in the index, and each segment consumes CPU cycles, file handles, and memory. This means that the more segments there are, the lower the search performance.
    • To solve this problem, Elasticsearch merges small segments into a larger segment, commits the new merged segment to disk, and deletes the old small segments.
Q5: Describe the Elasticsearch document update and deletion process
    1. Deletion and update are also write operations, but documents in Elasticsearch are immutable, so they cannot be deleted or altered in place;
    2. Each segment on disk has a corresponding .del file. When a delete request is sent, the document is not really deleted but is marked as deleted in the .del file. The document can still match queries, but it is filtered out of the results. When segments are merged, documents marked as deleted in the .del file are not written to the new segment.
    3. When a new document is created, Elasticsearch assigns it a version number. On update, the old version of the document is marked as deleted in the .del file and the new version is indexed into a new segment. The old version can still match queries, but it is filtered out of the results.
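The delete/update mechanics above can be modeled in a few lines of Python — a toy model of "immutable segments plus .del marks", not Lucene's real on-disk format:

```python
class Segment:
    """An immutable batch of documents plus its '.del' mark set."""
    def __init__(self):
        self.docs = {}        # (doc_id, version) -> body
        self.deleted = set()  # (doc_id, version) pairs marked in '.del'

def index_version(segments, doc_id, version, body):
    # Mark every older version as deleted (old segments are never
    # rewritten), then put the new version into a new segment.
    for seg in segments:
        for (d, v) in seg.docs:
            if d == doc_id and v < version:
                seg.deleted.add((d, v))
    seg = Segment()
    seg.docs[(doc_id, version)] = body
    segments.append(seg)

def search(segments, doc_id):
    # Deleted documents still match, but are filtered out of the results.
    hits = [(v, seg.docs[(d, v)]) for seg in segments
            for (d, v) in seg.docs
            if d == doc_id and (d, v) not in seg.deleted]
    return max(hits)[1] if hits else None

segments = []
index_version(segments, "a", 1, "old body")
index_version(segments, "a", 2, "new body")
print(search(segments, "a"))  # new body
```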
Q6: Describe the Elasticsearch search process
    1. Search executes as a two-phase process known as Query then Fetch.
    2. During the initial query phase, the query is broadcast to a copy of every shard in the index (primary or replica shard). Each shard performs the search locally and builds a priority queue of matching documents sized from + size. PS: the search reads the filesystem cache, but some data may still be in the memory buffer, so search is near-real-time.
    3. Each shard returns the IDs and sort values of all documents in its priority queue to the coordinating node, which merges them into its own priority queue to produce a globally sorted result list.
    4. Next comes the fetch phase, in which the coordinating node identifies which documents need to be retrieved and submits multiple GET requests to the relevant shards. Each shard loads and enriches the documents and, if necessary, returns them to the coordinating node. Once all documents have been retrieved, the coordinating node returns the results to the client.
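Step 3, the coordinating node's merge of per-shard priority queues, can be sketched in Python (an illustration of the idea, not Elasticsearch code):

```python
import heapq

def query_phase(shard_results, frm, size):
    """shard_results: per-shard lists of (score, doc_id), each already
    sorted by descending score and truncated to frm + size entries.
    Returns the global slice [frm, frm + size) of hits to fetch."""
    merged = heapq.merge(*shard_results, key=lambda hit: -hit[0])
    return list(merged)[frm:frm + size]

shard1 = [(0.9, "d1"), (0.4, "d3")]
shard2 = [(0.7, "d2"), (0.1, "d4")]
print(query_phase([shard1, shard2], 0, 2))  # [(0.9, 'd1'), (0.7, 'd2')]
```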

With the query_then_fetch search type, relevance scoring uses only the data on each individual shard, which may be inaccurate when the number of documents is small; dfs_query_then_fetch adds a pre-query step that asks every shard for term and document frequencies, producing more accurate scores at the cost of worse performance.

Q7: Does Elasticsearch guarantee read and write consistency under concurrency?
    1. Optimistic concurrency control via version numbers can be used to ensure that a new version is not overwritten by an old one, with the application layer handling specific conflicts;
    2. For write operations, the consistency level supports quorum/one/all and defaults to quorum, which allows a write only when a majority of shards are available. Even when a majority is available, though, a write to a replica may still fail for network or other reasons; in that case the replica is considered failed and the shard is rebuilt on a different node.
    3. For read operations, you can set replication to sync (the default), which makes the operation return only after both the primary and replica shards are complete; if replication is set to async, you can query the primary shard by setting the search request parameter preference to primary, ensuring the document is the latest version.
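Point 1, optimistic concurrency control with version numbers, sketched in Python (the names here are illustrative, not the Elasticsearch API):

```python
class VersionConflict(Exception):
    """Raised when a write carries a stale version; the application decides."""

store = {"doc1": {"version": 3, "body": "v3"}}

def update_if_version(doc_id, expected_version, new_body):
    doc = store[doc_id]
    if doc["version"] != expected_version:
        raise VersionConflict(f"expected {expected_version}, have {doc['version']}")
    doc["version"] += 1          # the new version supersedes the old one
    doc["body"] = new_body
    return doc["version"]

print(update_if_version("doc1", 3, "v4"))  # 4
try:
    update_if_version("doc1", 3, "stale write")  # old version is rejected
except VersionConflict:
    print("conflict: reread and retry")
```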
Q8: What optimizations apply to the Linux setup when deploying Elasticsearch?
    1. Machines with 64 GB of RAM are ideal, but 32 GB and 16 GB machines are also common. Less than 8 GB is counterproductive.
    2. If you want to choose between faster CPUs and more cores, it's better to choose more cores. The extra concurrency provided by multiple cores is far better than a little bit faster clock frequency.
    3. If you can afford SSDs, they far outperform any spinning media and will improve both query and indexing performance.
    4. Avoid clusters spanning multiple data centers, even when data centers are close by. It is absolutely essential to avoid clusters spanning large geographic distances.
    5. Make sure the JVM running your application is exactly the same as the one on the server; Elasticsearch uses Java's native serialization in several places.
    6. By setting gateway.recover_after_nodes, gateway.expected_nodes, and gateway.recover_after_time, you can avoid excessive shard shuffling when the cluster restarts. This can cut data recovery time from hours to seconds.
    7. Elasticsearch is configured by default to use unicast discovery to prevent nodes from inadvertently joining the cluster. Only nodes running on the same machine will automatically make up the cluster. It is best to use unicast instead of multicast.
    8. Do not arbitrarily modify the size of the garbage collector (CMS) and the individual thread pools.
    9. Give (no more than) half of your memory to Lucene, and don't exceed 32 GB of heap; set it via the ES_HEAP_SIZE environment variable.
    10. Swapping memory to disk is fatal for server performance. If memory is swapped to disk, a 100-microsecond operation can become a 10-millisecond one. Add up all those 10-millisecond delays and it is not hard to see how scary swapping is for performance.
    11. Lucene uses a large number of files, and communication between Elasticsearch nodes and with HTTP clients uses a large number of sockets. All of this requires enough file descriptors. Raise your file descriptor limit to a very large value, such as 64,000.
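Points 6 and 10 above translate to configuration like the following sketch (values are illustrative, assuming a 10-node cluster; bootstrap.mlockall is the pre-5.x name of the swap-prevention setting):

```yaml
gateway.recover_after_nodes: 8   # start recovery once 8 nodes are present
gateway.expected_nodes: 10       # ...or immediately when all 10 have joined
gateway.recover_after_time: 5m   # otherwise wait at most 5 minutes
bootstrap.mlockall: true         # lock the heap in RAM, preventing swap
```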
Supplementary methods for improving the indexing stage:
    1. Use bulk requests and size them appropriately: 5–15 MB of data per bulk request is a good starting point.
    2. Segments and segment merging: Elasticsearch's default merge throttle is 20 MB/s, which is a reasonable setting for mechanical disks. If you are using SSDs, consider raising it to 100–200 MB/s. If you are doing a bulk import and not searching at all, you can turn off merge throttling entirely. You can also increase index.translog.flush_threshold_size from its default of 512 MB to a larger value such as 1 GB, which accumulates larger segments in the translog before a flush is triggered.
    3. If your search results do not require near-real-time accuracy, consider changing each index's index.refresh_interval to 30s.
    4. If you are doing a bulk import, consider disabling replicas by setting index.number_of_replicas: 0.
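The bulk-import tuning in points 2–4 combines into index settings like this sketch (remember to restore the values after the import finishes):

```yaml
index.refresh_interval: 30s       # or -1 to disable refresh during the import
index.number_of_replicas: 0       # re-enable replicas after the import
index.translog.flush_threshold_size: 1gb
```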
Q9: For GC, what to pay attention to when using Elasticsearch?
    1. See: https://elasticsearch.cn/article/32
    2. The inverted index dictionary must stay resident in memory and cannot be garbage-collected, so monitor segment memory growth trends on data nodes.
    3. Set reasonable sizes for the various caches — field cache, filter cache, indexing cache, bulk queue, and so on — and judge heap adequacy against the worst case: when all caches are full, is there still heap left for other tasks? Avoid "self-deception" such as clearing caches to free memory.
    4. Avoid returning large result sets from searches and aggregations. Scenarios that genuinely need to fetch large amounts of data can use the scan & scroll API.
    5. Cluster stats reside in memory and cannot scale horizontally; very large clusters can be split into multiple clusters connected via tribe nodes.
    6. To know whether the heap is sufficient, you must combine the actual application scenario with continuous monitoring of the cluster's heap usage.

