Elastic Stack: Introduction to Elasticsearch

Source: Internet
Author: User

I. Preface

The previous article does not seem to have drawn many readers, but I still want to continue the series. My guess is that many people simply have little exposure to this area. There is a lot to say about Elasticsearch, so let's get started.

II. Choosing between a database and Elasticsearch

Traditional relational databases use B+ tree indexes. When the data volume is very large, for example a single table with 100 million rows or more, a LIKE query (in particular one with a leading wildcard) cannot use the index and falls back to a full table scan, which severely hurts query performance. At that point it is worth considering Elasticsearch, which was built for search. Elasticsearch uses an inverted index; there is no need to understand inverted indexes yet, as I will cover them in more detail later.

You may ask: why not use Elasticsearch as the persistent database? There is no single right answer, and with sound reasons I think it can work. The thing to weigh is transactions: traditional databases support ACID, while Elasticsearch does not. If your application does not need transactions, I am fine with using Elasticsearch as the database. If it does, I suggest treating Elasticsearch as a search and query display tool instead. These are just my views, and different opinions are welcome.
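As a rough illustration of the difference, here is a minimal sketch in plain Python. It does not use Elasticsearch or MySQL; the sample texts and the function names (`like_scan`, `build_inverted_index`, `index_lookup`) are invented for this example, and splitting on whitespace is a simplification of real text analysis.

```python
from collections import defaultdict

# Hypothetical sample rows; in practice these would be table rows or documents.
docs = {
    1: "elasticsearch is a search engine",
    2: "mysql uses a b+ tree index",
    3: "lucene powers the elasticsearch engine",
}

def like_scan(term):
    """Emulates LIKE '%term%': every row has to be inspected."""
    return sorted(doc_id for doc_id, text in docs.items() if term in text)

def build_inverted_index(documents):
    """Maps each word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.split():  # whitespace split stands in for real analysis
            index[word].add(doc_id)
    return index

inverted = build_inverted_index(docs)

def index_lookup(term):
    """One dictionary lookup replaces the scan over all rows."""
    return sorted(inverted.get(term, set()))

print(like_scan("engine"))
print(index_lookup("engine"))
```

Both calls return the same document IDs, but the scan touches every row while the lookup touches only the entry for the term. The cost of the inverted index is paid once, when the documents are indexed.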

III. Introduction to Elasticsearch

Elasticsearch is a distributed search and analytics engine that can be used for full-text search, structured search, and analytics, and can combine all three. It is an open source search engine built on Apache Lucene. Across both open source and proprietary software, Lucene can be considered the most advanced, best-performing, and most fully featured search engine library available. Wikipedia, Stack Overflow, and GitHub all build their search on Elasticsearch.

IV. Elasticsearch concepts

Cluster

A cluster consists of one or more nodes that together hold all the data and jointly provide indexing and search. Each cluster is identified by a unique name, which defaults to "elasticsearch".

Node

A node is a running Elasticsearch instance and the building block of a cluster. Each node in the cluster is also uniquely identified; by default a random UUID is assigned when the node starts. You can give the node a name of your own instead of the default. To join a cluster, a node must specify that cluster's name. The node types are described next:

Master-eligible node

Once a node starts, it uses the Zen discovery mechanism to find the other nodes in the cluster and connect to them. One master node is elected from the master-eligible nodes; it is responsible for creating indexes, deleting indexes, allocating shards, and tracking the state of the nodes in the cluster. Normally a cluster has exactly one elected master. When the master stops responding because of network problems or excessive load, a new master must be elected, and the cluster can end up with more than one master, meaning the nodes disagree about the cluster state. This is known as split brain, and it is why the number of master-eligible nodes should be odd. I also recommend that master-eligible nodes do not hold data, configured as follows:

node.master: true
node.data: false
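The reasoning behind split brain and the node count can be sketched with a little arithmetic. This is an illustration only, not Elasticsearch code: the function names are invented here, and the majority rule shown is the value that Zen discovery's `discovery.zen.minimum_master_nodes` setting is normally set to.

```python
def quorum(master_eligible):
    """Smallest majority of the master-eligible nodes."""
    return master_eligible // 2 + 1

def can_elect(partition_size, master_eligible):
    """A network partition may elect a master only if it holds a majority."""
    return partition_size >= quorum(master_eligible)

# Three master-eligible nodes split 2 | 1: only the larger side can elect.
print(can_elect(2, 3), can_elect(1, 3))

# Four master-eligible nodes split 2 | 2: neither side reaches the quorum of 3,
# so the fourth node adds no extra fault tolerance over three.
print(can_elect(2, 4), quorum(4))
```

With a majority requirement, two partitions can never both elect a master. An odd count is preferred because adding one more node to reach an even count raises the quorum without allowing any additional node to fail.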

Data node

A data node holds the shards that contain the indexed data and handles data storage and data operations such as CRUD, search, and aggregation.

A dedicated data node is configured as follows:

node.master: false
node.data: true

Only these two node types are covered here; for the rest, refer to the official documentation.

Index

An index is comparable to a database in MySQL: it stores a collection of documents with the same kind of structure.

Document

A document is the basic unit of information in an index, comparable to a row in MySQL. Documents are expressed in JSON.

Document metadata

1. _index: the name of the index the document belongs to. In a multi-index query you sometimes need to restrict the query to specific index names, and the _index field makes that convenient. _index is a virtual field and is not actually added to the Lucene index.

2. _type: the type name of the document. You can query, aggregate, script, and sort on _type.

3. _id: the unique ID of the document.

4. _uid: a composite ID made up of _type and _id.

5. _source: the original JSON of the document, from which the content of each field can be retrieved. The _source field is enabled by default and can be disabled.

6. _all: the values of all fields concatenated into one string, separated by spaces. The _all field is analyzed and indexed but not stored. Use it when you want documents that contain a keyword without naming a specific field to search. Disabled by default.

7. _parent: specifies a parent-child relationship between documents in the same index.

8. _routing: the routing value, which defaults to the document's _id or _parent. A custom route can be set with the _routing parameter.
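To make the metadata fields concrete, here is a small sketch of what a stored document might look like when represented as a Python dictionary. The index name, type name, ID, and field values are all invented for illustration, and the `#` separator used to build the composite ID is an assumption about how `_uid` is formatted.

```python
import json

# Hypothetical document; every value below is made up for this example.
hit = {
    "_index": "blog",
    "_type": "article",
    "_id": "1",
    "_source": {"title": "Elasticsearch introduction", "views": 42},
}

def uid(doc):
    """Combines _type and _id; the '#' separator is an assumption."""
    return f"{doc['_type']}#{doc['_id']}"

# _source is the original JSON body, so it round-trips through json unchanged.
body = json.dumps(hit["_source"])
print(uid(hit))
print(json.loads(body)["title"])
```

The metadata fields sit beside `_source` rather than inside it, which is why the document body itself stays exactly as it was submitted.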

Type

An index can define one or more types. A type is a logical category within the index.

Field

A field is the smallest unit in Elasticsearch, comparable to a column in MySQL and similar to a key in JSON. Field types:

String: text, keyword (not analyzed);

Numeric: long, integer, short (-32,768 to 32,767), byte (-128 to 127), double, float, half_float (16-bit half precision), scaled_float (a floating-point number stored with a scaling factor; for example, if a price only needs to be accurate to the cent and the scaling factor is 100, the value 88.88 is stored as 8888);

Boolean: boolean;

Date: date;

Binary: binary;

Range: integer_range, float_range, long_range, double_range, date_range;
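The numeric ranges and the scaled_float example above can be checked with a short sketch. The helper names are invented for this illustration; the rounding step mirrors the idea of storing a scaled integer and is not taken from the Elasticsearch source.

```python
def signed_range(bits):
    """Minimum and maximum of a signed two's-complement integer of this width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def scaled_float_encode(value, scaling_factor):
    """Stores the value as an integer: value * scaling_factor, rounded."""
    return round(value * scaling_factor)

def scaled_float_decode(stored, scaling_factor):
    """Recovers the floating-point value from the stored integer."""
    return stored / scaling_factor

print(signed_range(8))    # byte
print(signed_range(16))   # short
print(scaled_float_encode(88.88, 100))
print(scaled_float_decode(8888, 100))
```

Storing the price as an integer keeps the precision fixed at two decimal places and lets the value be compared and compressed like any other integer.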

Shard

Elasticsearch splits an index into several parts, each of which is a shard; the default is 5 shards (Elasticsearch 7.0 and later default to 1). Shards are distributed across the nodes, and a node never holds two copies of the same shard. Each document is assigned to a shard by hashing its document ID. Each shard is an independent Lucene instance.
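Routing a document to a shard by hashing its ID can be sketched as follows. This is a simplified illustration: `zlib.crc32` is only a stand-in for the hash function Elasticsearch uses internally, and `shard_for` is a name made up for this example.

```python
import zlib

def shard_for(routing, number_of_primary_shards=5):
    """Picks a shard as hash(routing) modulo the number of primary shards.

    zlib.crc32 is a stand-in hash chosen for this sketch, not the real one.
    """
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards

# The same ID always lands on the same shard, so a later lookup by ID
# knows exactly which shard to ask.
first = shard_for("doc-1")
second = shard_for("doc-1")
print(first == second)
print(all(0 <= shard_for(f"doc-{i}") < 5 for i in range(100)))
```

Because the result depends on the number of primary shards, changing that number would send existing IDs to different shards. That is the reason the primary shard count is fixed when the index is created.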

Replica

A replica is a copy of a shard; an index can have one or more, and the default is 1. Replicas mainly provide fault tolerance: if a primary shard is lost, a replica shard is promoted to primary so that no data is lost. Replicas also improve query performance.
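The interplay between primaries and replicas can be sketched with a toy allocator. This is not how Elasticsearch allocates shards; the round-robin placement, the node names, and the functions `allocate` and `surviving_primary` are assumptions made for this example. Raising an error when there are too few nodes is also a simplification.

```python
def allocate(primaries, replicas, nodes):
    """Round-robin placement that keeps all copies of one shard on distinct nodes."""
    if replicas + 1 > len(nodes):
        raise ValueError("not enough nodes to separate every copy of a shard")
    placement = {}
    for shard in range(primaries):
        # Position 0 is the primary; positions 1..replicas are its replicas.
        placement[shard] = [
            nodes[(shard + copy) % len(nodes)] for copy in range(replicas + 1)
        ]
    return placement

def surviving_primary(copies, failed_node):
    """After a node failure, the first surviving copy takes over as primary."""
    alive = [node for node in copies if node != failed_node]
    return alive[0] if alive else None

layout = allocate(primaries=5, replicas=1, nodes=["node-1", "node-2", "node-3"])
print(sum(len(copies) for copies in layout.values()))  # total shard copies
print(surviving_primary(layout[0], "node-1"))
```

With 5 primaries and 1 replica each, the cluster holds 10 shard copies in total, and losing any single node still leaves one copy of every shard available.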

V. Coming up next

The next article will cover index creation, how queries work, analyzers, and more. You are welcome to like this post, join group 438836709, and follow the official account.
