Elasticsearch Concept Learning

Source: Internet
Author: User
1. What is Elasticsearch?


Elasticsearch is an open source, scalable engine for full-text search and analysis, built on Lucene. It allows us to store, search, and analyze large volumes of data quickly, in near real time. GitHub's search is reportedly built on Elasticsearch.


2. Some basic concepts of Elasticsearch


Cluster


1. A cluster is a group of one or more nodes that together hold the data and provide indexing and search across all of it.
2. The default cluster name is elasticsearch. Note that a node can belong to only one cluster. Clusters are distinguished by name, so two separate clusters must not use the same name.


In the elasticsearch-xx folder downloaded from the official website, the folder under the data directory is named after the cluster.
Node


1. A node is part of a cluster. It stores data and provides indexing and search capabilities for the cluster.
2. Each node has a name. By default the name is randomly generated at startup, but we can specify it ourselves. A node can be configured to join a particular cluster; by default it joins the cluster named elasticsearch.
3. A cluster can generally contain more than one node. To add another node, start another Elasticsearch process. Note that cluster.name in the configuration file elasticsearch.yml of the newly started instance must be the same as on the existing nodes.


Under data/elasticsearch there is a nodes folder, and inside it a folder named 0. This folder represents the node.
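
The rule that every node of a cluster must share the same cluster.name can be illustrated with a small sketch. cluster.name and node.name are the setting keys used in elasticsearch.yml; the cluster name my-cluster, the node names, and both helper functions are hypothetical and exist only for this illustration.

```python
def render_node_config(cluster_name, node_name):
    """Return the text of a minimal elasticsearch.yml for one node."""
    return f"cluster.name: {cluster_name}\nnode.name: {node_name}\n"


def cluster_name_of(config_text):
    """Read the cluster.name value back out of a rendered config."""
    for line in config_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "cluster.name":
            return value.strip()
    return None


# Hypothetical names: two nodes that are meant to join the same cluster.
configs = [render_node_config("my-cluster", name) for name in ("node-1", "node-2")]

# The nodes can form one cluster only if their cluster.name values match.
same_cluster = len({cluster_name_of(c) for c in configs}) == 1
print(same_cluster)
```

Writing each rendered string to the elasticsearch.yml of a separate installation would be the manual equivalent of the steps listed above.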
Index


An index is a collection of documents.


For example, we might have one index for customer data, another for product data, and another for order data.


An index name must be lowercase. The name is what we use to refer to the index when indexing, searching, updating, and deleting.


If this is hard to understand, we can think of an index as a database. (If you create your own index, for example one called firsttime, then a firsttime folder appears under elasticsearch-xx/data/elasticsearch/nodes/0/indices. The folder name is the index name, and the folder contains the index data.)
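
A quick way to catch a bad index name before sending it to the server is a local check like the sketch below. Only the lowercase rule comes from the text above; the forbidden characters and leading-character checks are a simplified assumption about Elasticsearch's naming rules, and the function name is hypothetical. The server remains the authority on what it accepts.

```python
# Characters assumed to be rejected in index names (simplified list).
FORBIDDEN_CHARS = set('\\/*?"<>| ,#')


def is_valid_index_name(name):
    """Simplified check: non-empty, lowercase, no forbidden characters."""
    if not name or name in (".", ".."):
        return False
    if name != name.lower():
        # The rule stated in the text: index names must be lowercase.
        return False
    if name[0] in "-_+":
        return False
    return not any(ch in FORBIDDEN_CHARS for ch in name)


for candidate in ("firsttime", "Firsttime", "first time"):
    print(candidate, is_valid_index_name(candidate))
```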
Type


A type is a finer-grained division within an index. We can define several types in one index.


If we think of an index as a database, then a type is a table in that database.
Document


We can think of a document as a row of data in a database table. It is the most basic unit that can be indexed.
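
To make the database analogy concrete, the sketch below builds the address and JSON body of one document. It assumes the older typed REST layout /index/type/id, which matches the index, type, document hierarchy described here (newer Elasticsearch releases no longer have types). The index name firsttime is taken from the example above; the type name customer, the field values, and the helper are hypothetical.

```python
import json


def document_path(index, doc_type, doc_id):
    """Build the REST path that addresses a single document."""
    return f"/{index}/{doc_type}/{doc_id}"


# A document is a JSON object, comparable to one row in a table.
customer = {"name": "Alice", "city": "Hangzhou"}

path = document_path("firsttime", "customer", 1)
body = json.dumps(customer, sort_keys=True)

# Shown as the request one would send; nothing is sent here.
print("PUT", path)
print(body)
```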
Shards:


1. Sharding. If an index holds a very large amount of data, it can exceed the storage limits of a single machine (for example, the single-file size limit under Linux), and it also slows down search requests. Elasticsearch introduced shards to solve this problem.
2. When we create an index, we can define the number of shards. Each shard is itself a small, fully functional index that supports search, update, delete, and so on. The benefit of shards is that we can split and scale the stored content horizontally. Shards also allow operations to be distributed and run in parallel across shards (possibly on multiple nodes), which improves performance and throughput.
3. Under elasticsearch-xx/data/elasticsearch/nodes/0/indices/firsttime there are 5 shard folders by default, numbered 0 to 4.
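
How documents end up in one of the shards can be sketched as a hash of a routing value (by default the document ID) taken modulo the number of primary shards, which is how the Elasticsearch documentation describes routing. The sketch below uses zlib.crc32 purely as a stand-in hash; the real hash function is different, and the function name and document IDs are hypothetical.

```python
import zlib


def pick_shard(routing_value, number_of_shards=5):
    """Map a routing value (for example a document ID) to a shard number."""
    digest = zlib.crc32(routing_value.encode("utf-8"))  # stand-in hash
    return digest % number_of_shards


doc_ids = [f"doc-{i}" for i in range(20)]
placement = {doc_id: pick_shard(doc_id) for doc_id in doc_ids}

for doc_id, shard in placement.items():
    print(doc_id, "-> shard", shard)
```

Because the result depends on the shard count, the same document would map to a different shard if the count changed, which is one way to see why the number of shards is fixed for an existing index.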
Replicas:


1. When a shard on a node is damaged or lost, it can be recovered from a replica.
2. Replicas improve the query efficiency of ES, because ES automatically load balances search requests across the copies of a shard. The number of replicas can be set to 0 or more.


Once replicas are set, each shard has one primary shard, and each replica shard is a copy of that primary shard.


The number of shards and the number of replicas are set when the index is created. After the index has been created, we can still change the number of replicas dynamically, but the number of shards cannot be changed.
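
As a minimal sketch, the two request bodies involved look like this: one for creating an index with its shard and replica counts, and one for changing only the replica count afterwards. number_of_shards and number_of_replicas are the index settings names; the helper functions and the index name firsttime (reused from the example above) are for illustration only, and no request is actually sent.

```python
import json


def create_index_body(number_of_shards=5, number_of_replicas=1):
    """Settings supplied once, when the index is created."""
    return {
        "settings": {
            "number_of_shards": number_of_shards,
            "number_of_replicas": number_of_replicas,
        }
    }


def update_replicas_body(number_of_replicas):
    """Settings that can still be changed on an existing index."""
    return {"index": {"number_of_replicas": number_of_replicas}}


# Create the index with 5 primary shards and 1 replica of each.
print("PUT /firsttime")
print(json.dumps(create_index_body(), indent=2))

# Later: raise the replica count. There is no shard count in this body.
print("PUT /firsttime/_settings")
print(json.dumps(update_replicas_body(2), indent=2))
```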
Suppose the cluster has two nodes and each shard has one replica. Take shard 0 as an example: there are two copies of it, one on each node. One copy is chosen as the primary shard and the other is its replica shard. A replica is never placed on the same node as its primary, but the 5 primary shards do not all have to sit on the same node.


In this way we have 5 primary shards and 5 replica shards, 10 shards in total.


If there are 3 nodes and the number of replicas is set to 2, the primary of shard 0 is chosen from the copies on the three nodes, and the same happens for shards 1 to 4.


That gives 5 primary shards in total, and the other 10 are replica shards.
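
The arithmetic in these examples can be checked with a short sketch. shard_counts computes the totals, and allocate is a toy round-robin placement that only demonstrates the rule that copies of the same shard live on different nodes. It is not Elasticsearch's real allocation algorithm, and the function and node names are hypothetical.

```python
def shard_counts(number_of_shards, number_of_replicas):
    """Return (primary shards, replica shards, total shards)."""
    primaries = number_of_shards
    replicas = number_of_shards * number_of_replicas
    return primaries, replicas, primaries + replicas


def allocate(number_of_shards, number_of_replicas, nodes):
    """Toy round-robin placement: copies of one shard go to distinct nodes."""
    copies = 1 + number_of_replicas
    if copies > len(nodes):
        raise ValueError("not enough nodes to keep copies on separate nodes")
    layout = {node: [] for node in nodes}
    for shard in range(number_of_shards):
        for k in range(copies):
            node = nodes[(shard + k) % len(nodes)]
            role = "primary" if k == 0 else "replica"
            layout[node].append((shard, role))
    return layout


print(shard_counts(5, 1))  # two-node example: 1 replica per shard
print(shard_counts(5, 2))  # three-node example: 2 replicas per shard

for node, shards in allocate(5, 1, ["node-1", "node-2"]).items():
    print(node, shards)
```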
