Elasticsearch: learning methods and mapping of complex data types

Overview

Elasticsearch is a search server built on Lucene. Below it is abbreviated as ES, and the version discussed is roughly 2.3.

ES has already moved past version 5.3, but many companies do not run the latest version, and mine is one of them. Before I worked with ES I knew nothing about this full-text indexing framework, and my English is not good, so studying it caused me a lot of trouble and I fell into many pitfalls. Below I describe how I learned ES, starting from zero. Thanks to my mentors, Brother Shi and Brother Xiang, for their guidance.

I. How to learn ES

There is not much Chinese documentation for ES, and because ES covers so much material, translating it is a great deal of work. However, while learning I found some documents and materials that can quickly show you what ES is and how to use it.

1. ES official documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html

The Getting Started part of the official documentation gives you a general picture of ES: its basic concepts, installation, indexing, querying, and so on. For readers whose English is not good, though, it may take a day or two to get through, and even then they may not fully understand it. In that case, see item 2.

2. Chinese blog translation of Getting Started

http://blog.csdn.net/cnweike/article/details/33736429

This blog translates Getting Started in detail. Read alongside the English original, it makes the material easier to understand.

3. ES Chinese translation community (the ES authoritative guide)

https://es.xiaoleilu.com/index.html

This is the electronic edition of the ES authoritative guide (Elasticsearch: The Definitive Guide). The site has already translated many chapters, so if the blog's translation does not satisfy you, you can study with this site as well. The site has one problem: whether because of font rendering on my computer or the document itself, many characters in the translation are wrong. Another problem is that switching between pages takes a long time, so you need to be patient. The guide's greatest advantage is that it chose to translate the material people use most often, which is a great convenience in development.

4. Learning route through the official English documentation

If you need to use ES within a short time, the following route gets you started quickly and covers some advanced material. It was planned for me by a colleague with many years of development experience; I hope it is useful to you. My company uses ES 2.3 wrapped in its own framework. When you read the documentation, make sure you read the matching version, because the structure of the documentation changes between versions.

① Everything under Getting Started;

② Everything under Setup Elasticsearch;

③ Everything under Document APIs;

④ Under Search APIs: Search, URI Search, Request Body Search;

⑤ Under Query DSL: Query and filter context, Match All Query, Match Query under Full text queries, Bool Query under Compound queries, Nested Query under Joining queries;

⑥ Under Term level queries: Term Query, Terms Query, Range Query. These are used all the time when querying;

⑦ Under Mapping, Field datatypes: Array datatype, Binary datatype, Range datatype, Boolean datatype, Date datatype, Object datatype, String datatype, Text datatype. Look at the others depending on what you need; these are the data types ES supports.

That is the route I followed when I first worked with ES. To tell the truth, at first I read not only everything above but a good deal more, and only then understood ES a little.

II. Problems encountered when setting up ES

Setting up ES should be basically no problem for anyone who knows a little Linux, but a novice like me hit a lot of pitfalls. The cause of my errors was permissions. The ES official site states it clearly: do not start ES as the root user. Doing so is fatal.

The second problem: after unpacking the ES archive you can use it directly. If you change the permissions or ownership of the folder, however, startup reports errors such as being unable to find the Java environment. At first I thought I had not set up Java, but that turned out not to be the cause. If you hit this problem, fix it as follows:

[yh@centos bin]$ sudo chown -R yh:yh <elasticsearch installation folder>

For example: sudo chown -R yaohong:yaohong elasticsearch (yaohong is the user)

After running the command above, Elasticsearch starts normally. If you run into other problems, sorry, you will have to search for them on Baidu.
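The two startup problems described above (starting as root, and an installation folder owned by a different user) can be checked before launching ES. The following is only a minimal sketch, assuming a Linux host; the function name startup_problems is hypothetical and not part of ES.

```python
import os


def startup_problems(install_dir, uid=None):
    """Return a list of reasons why starting ES from install_dir may fail.

    uid defaults to the effective user id of the current process.
    """
    uid = os.geteuid() if uid is None else uid
    problems = []
    if uid == 0:
        # The article's first pitfall: ES should not be started as root.
        problems.append("running as root: start ES as a normal user")
    owner = os.stat(install_dir).st_uid
    if owner != uid:
        # The article's second pitfall: the folder belongs to someone else.
        problems.append(
            "folder is owned by uid %d, not uid %d: fix it with chown -R"
            % (owner, uid)
        )
    return problems
```

Calling startup_problems with the installation folder returns an empty list when neither problem applies.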

How to start ES

After extracting the archive, go to the bin directory and run the following command to start ES.

[yh@centos bin]$ ./elasticsearch

To start it in the background, use the -d parameter:

[yh@centos bin]$ ./elasticsearch -d

If you do not start it in the background, you will see output like the following. If you do not see it, ES failed to start.

[2017-04-24 10:36:53,706][INFO][node] [Mist Mistress] version[2.4.4], pid[12159], build[fcbb46d/2017-01-03T11:33:16Z]
[2017-04-24 10:36:53,707][INFO][node] [Mist Mistress] initializing ...
[2017-04-24 10:36:55,021][INFO][plugins] [Mist Mistress] modules [lang-groovy, reindex, lang-expression], plugins [], sites []
[2017-04-24 10:36:55,069][INFO][env] [Mist Mistress] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [11.9GB], net total_space [19.5GB], spins? [unknown], types [rootfs]
[2017-04-24 10:36:55,069][INFO][env] [Mist Mistress] heap size [1015.6MB], compressed ordinary object pointers [true]
[2017-04-24 10:36:59,763][INFO][node] [Mist Mistress] initialized
[2017-04-24 10:36:59,770][INFO][node] [Mist Mistress] starting ...
[2017-04-24 10:37:00,014][INFO][transport] [Mist Mistress] publish_address {127.0.0.1:9301}, bound_addresses {127.0.0.1:9301}
[2017-04-24 10:37:00,031][INFO][discovery] [Mist Mistress] elasticsearch/a-vf8e_uqz2y0k-wb_eeug
[2017-04-24 10:37:03,389][INFO][cluster.service] [Mist Mistress] detected_master {anelle}{k2u_6pgetsg5bifxqxfo2w}{127.0.0.1}{127.0.0.1:9300}, added {{anelle}{k2u_6pgetsg5bifxqxfo2w}{127.0.0.1}{127.0.0.1:9300},}, reason: zen-disco-receive(from master [{anelle}{k2u_6pgetsg5bifxqxfo2w}{127.0.0.1}{127.0.0.1:9300}])
[2017-04-24 10:37:03,855][INFO][http] [Mist Mistress] publish_address {127.0.0.1:9201}, bound_addresses {127.0.0.1:9201}
[2017-04-24 10:37:03,860][INFO][node] [Mist Mistress] started

If you start ES in the background, you will not see any output unless there is an error. You can check whether the HTTP port is in use with the following command:

[yh@centos bin]$ netstat -apn | grep 9200
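The same check can be done from a script instead of netstat. This is only a rough sketch using the Python standard library; the function name port_is_open is hypothetical, and the host and ports are assumptions based on the defaults and on the startup log in this article.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 9200 is the default HTTP port; the startup log in this article
    # reports 9201, apparently because a second node ran on the machine.
    for port in (9200, 9201):
        print(port, port_is_open("127.0.0.1", port))
```

A True result only means that some process is listening on the port, not necessarily ES, so netstat -apn is still useful for seeing which process owns it.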
III. Some ES concepts

1. The bool query in ES has three clause types: must, should, and must_not (ES 2.x also accepts a filter clause, which the query later in this article uses). To be honest, I joke that my language skills were taught by a P.E. teacher: when I first saw this I thought about it for a long time without understanding it, and the fact that it was all in English confused me even more. If you know basic SQL, it becomes much easier to understand.

① must is equivalent to an SQL AND condition. For example, to query by id and price in SQL you would write:

SELECT * FROM table_name WHERE id = 1000 AND price = 50;

Such a statement is equivalent to an ES must query.

② should is equivalent to an SQL OR condition, for example:

SELECT * FROM table_name WHERE id = 1000 OR price = 50;

③ must_not is equivalent to SQL NOT IN, for example:

SELECT * FROM table_name WHERE id NOT IN (1000);

must_not means the results must not contain the given value.
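The three SQL statements above can be written as bool query bodies roughly as follows. This is a sketch only: it builds the request bodies as Python dictionaries and prints them as JSON, and the field names id and price are taken from the SQL examples rather than from a real index.

```python
import json

# SELECT * FROM table_name WHERE id = 1000 AND price = 50;
must_query = {
    "query": {"bool": {"must": [{"term": {"id": 1000}}, {"term": {"price": 50}}]}}
}

# SELECT * FROM table_name WHERE id = 1000 OR price = 50;
# With no must clause present, at least one should clause has to match.
should_query = {
    "query": {"bool": {"should": [{"term": {"id": 1000}}, {"term": {"price": 50}}]}}
}

# SELECT * FROM table_name WHERE id NOT IN (1000);
must_not_query = {
    "query": {"bool": {"must_not": [{"terms": {"id": [1000]}}]}}
}

for name, body in (
    ("must", must_query),
    ("should", should_query),
    ("must_not", must_not_query),
):
    print(name, json.dumps(body))
```

Each printed body can be sent as the request body of a search request.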

For greater-than-or-equal comparisons, interval values, and so on, the API documentation gives a detailed description.
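As a small illustration of such comparisons, a range clause takes gte and lte bounds. The helper below is a hypothetical sketch, not part of any ES client, and the price field is borrowed from the SQL examples.

```python
def range_clause(field, gte=None, lte=None):
    """Build a range clause; bounds left as None are omitted."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte  # greater than or equal
    if lte is not None:
        bounds["lte"] = lte  # less than or equal
    return {"range": {field: bounds}}


# Roughly: WHERE price >= 50 AND price <= 100
price_between = {"query": {"bool": {"must": [range_clause("price", gte=50, lte=100)]}}}
print(price_between)
```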

If you want to know more about these concepts, follow the links I listed above. They explain things more clearly, and I will not reinvent the wheel.

IV. Approach to applying ES

1. Learn the basic knowledge and concepts of ES;
2. Understand what the project needs;
3. Consider whether the features ES offers can meet the current project's needs;
4. Once you have decided, deploy ES. If you are able to, wrap it in a layer of your own: raw ES operations are fairly complex, and assembling request data by hand is error-prone and hard to maintain;
5. Write the mapping and create the index;
6. Import data;
7. Once the data is imported, run all kinds of tests on queries, the index, and so on. Without data, everything is wasted effort;
8. Apply it to the actual project.
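For step 6, data is commonly imported through the bulk API, whose request body is newline-delimited JSON: one action line followed by one document line, with a trailing newline. The sketch below only builds that body. The function name build_bulk_body, the index name yaohong-plan-2017, and the type name plan are assumptions made for illustration.

```python
import json


def build_bulk_body(docs, index, doc_type):
    """Build a bulk request body that indexes each document under its id."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(doc))     # document line
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"id": 1000017, "status": 0, "name": "HHH"},
    {"id": 1000018, "status": 1, "name": "III"},
]
body = build_bulk_body(docs, "yaohong-plan-2017", "plan")
print(body)
```

Wrapping this kind of string assembly in one function is a small example of the encapsulation suggested in step 4.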

I have only recently started working with ES myself, so the steps above will not necessarily suit you. They are the method an experienced colleague taught me, which I summarized from his teaching. I am only offering my own learning route; adjust it to your own ability.

V. Mapping (advanced)

ES can create a mapping automatically from the shape of the data, but it maps every JSON object as type object, which often does not meet a project's requirements. I will not explain the basics of mapping: the official documentation is clearer than I am, you can understand its code samples at a glance without reading the English explanations, and an Internet search turns up plenty more. The mapping I needed in my project, however, I could not find anywhere on the web.

First, look at the JSON data below. We need to query the schedule_info field by time interval or by date range. As you may notice, SQL cannot do this when the schedule_info field is stored in the database as a JSON string. ES can, but it needs a nested query, so the mapping must declare the field as a nested object.

{
    "id": 1000017,
    "status": 0,
    "name": "HHH",
    "daily_budget": 410000,
    "schedule_info": {
        "date": [
            {
                "start": "2017-04-20",
                "end": "2017-04-20"
            },
            {
                "start": "2017-04-20",
                "end": "2017-04-20"
            }
        ],
        "time": [
            {
                "start": "00:00",
                "end": "23:59"
            },
            {
                "start": "00:00",
                "end": "23:59"
            }
        ]
    },
    "serving_speed": 0,
    "create_time": 1492694481227,
    "start_schedule": 1492694481227,
    "end_schedule": 1499994481227,
    "create_user": "Admin",
    "update_user": "Unknown"
}

The schedule_info field contains two properties, date and time, and both are arrays. So how should the mapping be written? When I was doing the project, it took me a day of thinking to work out how ES maps nested objects. Here is my mapping.

{
    "order": 0,
    "template": "yaohong-plan-*",
    "settings": {
        "index": {
            "number_of_replicas": "1",
            "number_of_shards": "5",
            "refresh_interval": "1s"
        }
    },
    "mappings": {
        "_default_": {
            "properties": {
                "update_user": {"index": "not_analyzed", "type": "string"},
                "status": {"type": "integer"},
                "end_schedule": {"format": "yyyy-MM-dd HH:mm:ss||epoch_millis", "type": "date"},
                "serving_speed": {"type": "integer"},
                "id": {"type": "long"},
                "update_time": {"format": "yyyy-MM-dd HH:mm:ss||epoch_millis", "type": "date"},
                "start_schedule": {"format": "yyyy-MM-dd HH:mm:ss||epoch_millis", "type": "date"},
                "daily_budget": {"type": "integer"},
                "create_time": {"format": "yyyy-MM-dd HH:mm:ss||epoch_millis", "type": "date"},
                "create_user": {"index": "not_analyzed", "type": "string"},
                "schedule_info": {
                    "type": "nested",
                    "properties": {
                        "date": {
                            "properties": {
                                "start": {"type": "date", "format": "yyyy-MM-dd||epoch_millis"},
                                "end": {"type": "date", "format": "yyyy-MM-dd||epoch_millis"}
                            }
                        },
                        "time": {
                            "properties": {
                                "start": {"type": "integer"},
                                "end": {"type": "integer"}
                            }
                        }
                    }
                }
            },
            "_all": {"enabled": false}
        }
    },
    "aliases": {"yaohong-plan-active": {}}
}
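A template like this is registered with a PUT to the _template endpoint. The sketch below only builds the HTTP request with the Python standard library and does not send it. The base URL, the template name yaohong-plan, and the function name template_request are assumptions, and the template body is cut down to the nested part of the mapping.

```python
import json
import urllib.request


def template_request(base_url, name, template):
    """Build (but do not send) a PUT request that registers an index template."""
    url = "%s/_template/%s" % (base_url.rstrip("/"), name)
    data = json.dumps(template).encode("utf-8")
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"Content-Type": "application/json"}
    )


date_field = {"type": "date", "format": "yyyy-MM-dd||epoch_millis"}
template = {
    "template": "yaohong-plan-*",
    "mappings": {
        "_default_": {
            "properties": {
                "schedule_info": {
                    "type": "nested",
                    "properties": {
                        "date": {"properties": {"start": date_field, "end": date_field}},
                        "time": {
                            "properties": {
                                "start": {"type": "integer"},
                                "end": {"type": "integer"},
                            }
                        },
                    },
                }
            }
        }
    },
}

req = template_request("http://127.0.0.1:9200", "yaohong-plan", template)
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
# It is left out so the sketch runs without an ES server.
```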

So how do I query for dates that fall within a certain interval? Take a look at the following query:

{
    "bool": {
        "must": [
            {"term": {"status": "0"}},
            {
                "nested": {
                    "query": {
                        "bool": {
                            "filter": [
                                {
                                    "range": {
                                        "schedule_info.date.startDate": {
                                            "from": "2017-03-05",
                                            "to": null,
                                            "format": "yyyy-MM-dd",
                                            "include_lower": true,
                                            "include_upper": true
                                        }
                                    }
                                },
                                {
                                    "range": {
                                        "schedule_info.date.endDate": {
                                            "from": null,
                                            "to": "2017-09-12",
                                            "format": "yyyy-MM-dd",
                                            "include_lower": true,
                                            "include_upper": true
                                        }
                                    }
                                }
                            ]
                        }
                    },
                    "path": "schedule_info"
                }
            }
        ],
        "should": {"match": {"name": {"query": "yanghong", "type": "boolean"}}}
    }
}

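As the query shows, ES addresses fields inside an object with a flattened dotted path, such as schedule_info.date.endDate. The names in that path must match the field names defined in the mapping; the mapping shown in this article defines them as schedule_info.date.start and schedule_info.date.end. The official documentation describes mappings for other objects and for arrays in great detail. The example I give here is not in the official documentation, which is why I am sharing it. Once you understand this example, I believe you can work out other complex mappings as well.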
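To make the dotted paths concrete, here is a sketch of a helper that builds the nested date-range query. It is an illustration under assumptions: the function name schedule_date_query is hypothetical, it uses the field names start and end from the mapping in this article, and it uses gte and lte bounds rather than the from and to form.

```python
def schedule_date_query(start_from, end_to, path="schedule_info"):
    """Match documents with a schedule starting on or after start_from
    and ending on or before end_to (dates given as yyyy-MM-dd strings)."""
    start_field = path + ".date.start"  # flattened dotted path
    end_field = path + ".date.end"
    return {
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": path,
                            "query": {
                                "bool": {
                                    "filter": [
                                        {"range": {start_field: {"gte": start_from, "format": "yyyy-MM-dd"}}},
                                        {"range": {end_field: {"lte": end_to, "format": "yyyy-MM-dd"}}},
                                    ]
                                }
                            },
                        }
                    }
                ]
            }
        }
    }


print(schedule_date_query("2017-03-05", "2017-09-12"))
```

The nested clause needs both the path of the nested object and the full dotted name of each field inside it.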

One more point: when querying a date field, the format used in the query must match the format declared in the mapping, otherwise errors occur. Plain time-of-day values need to be converted to a comparable data type. A value such as "06.30" stored as a string is compared as a string, so range queries do not return the right results. In this example I converted the time values to integers to make time-interval searches work.
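The article does not say which integer encoding was used, so the sketch below assumes minutes since midnight and the HH:MM form that appears in the sample document. The function name time_to_minutes is hypothetical.

```python
def time_to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours_text, minutes_text = hhmm.split(":")
    hours, minutes = int(hours_text), int(minutes_text)
    if not (0 <= hours <= 23 and 0 <= minutes <= 59):
        raise ValueError("not a valid time of day: %r" % hhmm)
    return hours * 60 + minutes


# String comparison goes character by character, so "9:00" sorts after "10:00".
print("9:00" < "10:00")
# Integer comparison gives the expected order.
print(time_to_minutes("9:00") < time_to_minutes("10:00"))
```

The first print shows False and the second shows True, which is the reason for indexing the time fields as integers.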

My ability is limited, so if there are mistakes, please correct me. Thank you.
