Common elasticsearch operations: Mappings

Source: Internet
Author: User


An Elasticsearch field type is either inferred automatically by Elasticsearch or specified explicitly by the user. Mappings can therefore be divided into dynamic mapping and static mapping.

1 Dynamic mapping

1.1 Mapping rules

JSON data                Automatically inferred field type
null                     No field is added
true or false            boolean
Floating point number    float
Integer                  long
JSON object              object
Array                    Determined by the first non-null value in the array
String                   date (if date detection is enabled), double or long (if numeric detection is enabled), otherwise text with a keyword sub-field
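
The rules in the table above can be modeled in a few lines of Python. This is only an illustrative sketch, not Elasticsearch's implementation: infer_type and _DATE_RE are hypothetical names, the regular expression is a simplified stand-in for the default date detection formats, and numeric detection is left out.

```python
import json
import re

# Simplified stand-in for the date patterns Elasticsearch checks by default.
_DATE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?)?$"
)


def infer_type(value, date_detection=True):
    """Return the inferred field type, or None when no field would be added."""
    if value is None:
        return None
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # The first non-null element decides the type of the whole field.
        for item in value:
            item_type = infer_type(item, date_detection)
            if item_type is not None:
                return item_type
        return None
    if isinstance(value, str):
        if date_detection and _DATE_RE.match(value):
            return "date"
        return "text"
    raise TypeError(f"unsupported JSON value: {value!r}")


doc = json.loads(
    '{"id": 1, "postdate": "2018-10-27", "score": 1.5, "tags": [null, "es"], "extra": null}'
)
mapping = {}
for field, value in doc.items():
    field_type = infer_type(value)
    if field_type is not None:  # null values add no field
        mapping[field] = field_type
print(mapping)
```

Calling infer_type("2018-10-27", date_detection=False) returns "text" instead of "date", which mirrors the effect of the date_detection setting.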
1.2 Date detection

Date detection is enabled by default in ES 5.4. A test case follows:

PUT myblog

GET myblog/_mapping

PUT myblog/article/1
{
  "id": 1,
  "postdate": "2018-10-27"
}

GET myblog/_mapping

# Response:
{
  "myblog": {
    "mappings": {
      "article": {
        "properties": {
          "id": {
            "type": "long"
          },
          "postdate": {
            "type": "date"
          }
        }
      }
    }
  }
}

After date detection is disabled, the same string is no longer detected as a date:

# Remove the index created by the previous example
DELETE myblog

PUT myblog
{
  "mappings": {
    "article": {
      "date_detection": false
    }
  }
}

GET myblog/_mapping

PUT myblog/article/1
{
  "id": 1,
  "postdate": "2018-10-27"
}

GET myblog/_mapping

# Response:
{
  "myblog": {
    "mappings": {
      "article": {
        "date_detection": false,
        "properties": {
          "id": {
            "type": "long"
          },
          "postdate": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

2 Static mapping

2.1 Basic example

# Remove the index created by the previous example
DELETE myblog

PUT myblog
{
  "mappings": {
    "article": {
      "properties": {
        "id": {"type": "long"},
        "title": {"type": "text"},
        "postdate": {"type": "date"}
      }
    }
  }
}

GET myblog/_mapping

PUT myblog/article/1
{
  "id": 1,
  "title": "elasticsearch is wonderful!",
  "postdate": "2018-10-27"
}

GET myblog/_mapping

# Response:
{
  "myblog": {
    "mappings": {
      "article": {
        "properties": {
          "id": {
            "type": "long"
          },
          "postdate": {
            "type": "date"
          },
          "title": {
            "type": "text"
          }
        }
      }
    }
  }
}

2.2 The dynamic attribute

By default, when an indexed document contains a field that is not yet in the mapping, ES adds the new field automatically. This behavior can be controlled with the dynamic setting:

dynamic value    Description
true             The default. New fields are added to the mapping automatically.
false            New fields are ignored.
strict           Strict mode. An exception is thrown when a new field is encountered.

# Remove the index created by the previous example
DELETE myblog

PUT myblog
{
  "mappings": {
    "article": {
      "dynamic": "strict",
      "properties": {
        "id": {"type": "long"},
        "title": {"type": "text"},
        "postdate": {"type": "date"}
      }
    }
  }
}

GET myblog/_mapping

# The content field is not in the mapping, so this request is rejected
PUT myblog/article/1
{
  "id": 1,
  "title": "elasticsearch is wonderful!",
  "content": "a long text",
  "postdate": "2018-10-27"
}

# Response:
{
  "error": {
    "root_cause": [
      {
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic introduction of [content] within [article] is not allowed"
      }
    ],
    "type": "strict_dynamic_mapping_exception",
    "reason": "mapping set to strict, dynamic introduction of [content] within [article] is not allowed"
  },
  "status": 400
}
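
The three dynamic values can be compared side by side with a small Python model. It is a sketch under stated assumptions, not how Elasticsearch is implemented: index_document and StrictDynamicMappingError are hypothetical names, and new fields are always given a placeholder text type instead of going through real type inference.

```python
class StrictDynamicMappingError(Exception):
    """Hypothetical stand-in for strict_dynamic_mapping_exception."""


def index_document(properties, doc, dynamic="true"):
    """Return the mapping properties after indexing doc under the given dynamic value."""
    updated = dict(properties)
    for field in doc:
        if field in updated:
            continue  # already mapped, nothing to decide
        if dynamic == "true":
            # Placeholder type; real dynamic mapping would infer it from the value.
            updated[field] = {"type": "text"}
        elif dynamic == "false":
            # The field is ignored by the mapping and not indexed.
            pass
        elif dynamic == "strict":
            raise StrictDynamicMappingError(
                f"mapping set to strict, dynamic introduction of [{field}] is not allowed"
            )
        else:
            raise ValueError(f"unknown dynamic value: {dynamic!r}")
    return updated


properties = {
    "id": {"type": "long"},
    "title": {"type": "text"},
    "postdate": {"type": "date"},
}
doc = {
    "id": 1,
    "title": "elasticsearch is wonderful!",
    "content": "a long text",
    "postdate": "2018-10-27",
}

print(sorted(index_document(properties, doc, "true")))   # content is added
print(sorted(index_document(properties, doc, "false")))  # content is ignored
try:
    index_document(properties, doc, "strict")
except StrictDynamicMappingError as err:
    print("rejected:", err)
```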

3 Field types

3.1 Common field types

Category             Subcategory    Type
Core types           String         string, text, keyword
                     Numeric        long, integer, short, byte, double, float, half_float, scaled_float
                     Date           date
                     Boolean        boolean
                     Binary         binary
                     Range          range
Complex types        Array          array
                     Object         object
                     Nested         nested
Geo types            Geo point      geo_point
                     Geo shape      geo_shape
Specialized types    IP             ip
                     Completion     completion
                     Token count    token_count
                     Attachment     attachment
                     Percolator     percolator

For more information, see the official documentation https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html.

3.1.1 string

The string type is deprecated as of ES 5.x. It can still be specified, but it has been replaced by text and keyword.

3.1.2 text

A text field is used for full-text search and is processed by an analyzer. Before the inverted index is built, the analyzer splits the string into terms.

In practice, text is mostly used for long text fields, such as the body of an article. Sorting and aggregating on such fields is rarely meaningful.

3.1.3 keyword

Unlike a text field, a keyword field can only be searched by its exact value.

The indexed term is the entire content of the field. In practice, keyword is therefore used for exact comparison, sorting, aggregation, and similar operations.

3.1.4 numeric type

For more information, see the official documentation.

3.1.5 date

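JSON does not have a date type, so in Elasticsearch a date can be given as:

  • 1. A formatted date string, such as "yyyy-MM-dd" or "yyyy-MM-ddTHH:mm:ssZ"
    • That is, under the default format a value in the form "yyyy-MM-dd HH:mm:ss" must be written like "2018-10-22T23:12:22Z": a T separates the date from the time, and the time zone is appended.
  • 2. A long integer holding the timestamp in milliseconds
  • 3. An integer holding the timestamp in seconds (this requires the epoch_second format, because the default format is "strict_date_optional_time||epoch_millis")

Internally, Elasticsearch stores dates as a long integer of milliseconds since the epoch.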

Of course, the above is only the default case. When setting the field type, we can also set our own time format:

# Remove the index created by the previous example
DELETE myblog

PUT myblog
{
  "mappings": {
    "article": {
      "properties": {
        "postdate": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
}

The format parameter can also list multiple date formats, separated by "||":

"format": "yyyy-MM-dd HH:mm:ss||yyyy/MM/dd HH:mm:ss"

Then you can write data in the defined time format:

PUT myblog/article/1
{
  "postdate": "2017-09-23 23:12:22"
}

In my own work, when a time value needs to be saved, it is usually converted to a millisecond timestamp first and then stored in ES. For display, it is converted back to a time string.
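
The conversion described above might look like the following Python sketch. The function names and the FORMATS list are assumptions made for illustration. The two patterns are the Python equivalents of "yyyy-MM-dd HH:mm:ss||yyyy/MM/dd HH:mm:ss", and strings without a time zone are treated as UTC, which is also what Elasticsearch assumes.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Python equivalents of "yyyy-MM-dd HH:mm:ss||yyyy/MM/dd HH:mm:ss".
FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M:%S"]


def to_epoch_millis(text, formats=FORMATS):
    """Try each format in order and return milliseconds since the epoch (UTC)."""
    for fmt in formats:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format does not fit, try the next one
        return (parsed - EPOCH) // timedelta(milliseconds=1)
    raise ValueError(f"{text!r} matches none of {formats}")


def from_epoch_millis(millis, fmt=FORMATS[0]):
    """Turn a millisecond timestamp back into a display string (UTC)."""
    return (EPOCH + timedelta(milliseconds=millis)).strftime(fmt)


millis = to_epoch_millis("2017-09-23 23:12:22")
print(millis)
print(from_epoch_millis(millis))
```

Storing the numeric value keeps the index independent of any display format, at the cost of one conversion on write and one on read.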

3.1.6 Boolean

After a field's type is set to boolean, it accepts the following values: true, false, "true", and "false".

3.1.7 binary

The binary type accepts a binary value as a Base64-encoded string.

3.1.8 Array

Elasticsearch does not have a dedicated array type. By default, any field can contain one or more values, but all values in an array must be of the same type. When data is added dynamically, the type of the first value in the array determines the type of the whole array (in fact, the type of the field). Arrays of mixed types are not supported. An array can contain null values. An empty array [] is treated as a missing field. In addition, arrays do not need to be configured in the mapping in advance. They are supported by default.

For example, index a document whose lists field is an array:

DELETE my_index

PUT my_index/my_type/1
{
  "lists": [
    {
      "name": "xpleaf",
      "job": "es"
    }
  ]
}

The name and job fields inside lists are dynamically mapped to text (with a keyword sub-field):

GET my_index/my_type/_mapping

# Response:
{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "lists": {
            "properties": {
              "job": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The field can also be searched directly:

GET my_index/my_type/_search
{
  "query": {
    "term": {
      "lists.name": {
        "value": "xpleaf"
      }
    }
  }
}

Returned results:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "lists": [
            {
              "name": "xpleaf",
              "job": "es"
            }
          ]
        }
      }
    ]
  }
}

3.1.9 object

A JSON object can be written to ES directly, as shown below:

DELETE my_index

PUT my_index/my_type/1
{
  "object": {
    "name": "xpleaf",
    "job": "es"
  }
}

The name and job fields inside object are dynamically mapped to text (with a keyword sub-field):

{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "object": {
            "properties": {
              "job": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The field can also be searched directly:

GET my_index/my_type/_search
{
  "query": {
    "term": {
      "object.name": {
        "value": "xpleaf"
      }
    }
  }
}

Returned results:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "object": {
            "name": "xpleaf",
            "job": "es"
          }
        }
      }
    ]
  }
}

Objects are in fact stored flat in ES. Internally, the document above looks like this:

{"object.name": "xpleaf", "object.job": "es"}
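
The flattening of an object into dotted field names can be sketched in Python as follows. The flatten function is a hypothetical helper written for illustration. It only models how the field names are formed, not how Lucene stores the values.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted field paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into inner objects and extend the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


source = {"object": {"name": "xpleaf", "job": "es"}}
print(flatten(source))
```

This is why the term query above addresses the inner field as object.name.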

3.1.10 nested

The nested type is a specialized version of the object type. It allows arrays of objects to be indexed and queried independently of each other. Lucene has no concept of inner objects, so ES flattens the object hierarchy into a simple list of field names and values.

Although nested is a special case of object, it must be declared explicitly in the mapping with the type nested, which is its biggest difference from object.

So why use the nested type at all? Isn't object enough? The official example illustrates the problem (https://www.elastic.co/guide/en/elasticsearch/reference/5.6/nested.html):

Arrays of inner object fields do not work the way you may expect. Lucene has no concept of inner objects, so Elasticsearch flattens object hierarchies into a simple list of field names and values. For instance, the following document:

PUT my_index/my_type/1
{
  "group": "fans",
  "user": [
    {
      "first": "John",
      "last": "Smith"
    },
    {
      "first": "Alice",
      "last": "White"
    }
  ]
}

would be transformed internally into a document that looks more like this:

{
  "group": "fans",
  "user.first": [ "alice", "john" ],
  "user.last": [ "smith", "white" ]
}

The user.first and user.last fields are flattened into multi-value fields, and the association between alice and white is lost. This document would incorrectly match a query for alice AND smith:

GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "user.first": "Alice" }},
        { "match": { "user.last": "Smith" }}
      ]
    }
  }
}

This is the problem caused by using the object type directly: the document should not match the search above, yet it does. The nested type preserves the independence of each object in the array. It indexes each object in the array as a separate hidden document, which means that each nested object can be searched independently.

If you need to index arrays of objects and to maintain the independence of each object in the array, you should use the nested datatype instead of the object datatype. Internally, nested objects index each object in the array as a separate hidden document, meaning that each nested object can be queried independently of the others, with the nested query:

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "user": {
          "type": "nested"
        }
      }
    }
  }
}

PUT my_index/my_type/1
{
  "group": "fans",
  "user": [
    {
      "first": "John",
      "last": "Smith"
    },
    {
      "first": "Alice",
      "last": "White"
    }
  ]
}

# No match: Alice and Smith are not in the same nested object
GET my_index/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last": "Smith" }}
          ]
        }
      }
    }
  }
}

# Match: Alice and White are in the same nested object
GET my_index/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last": "White" }}
          ]
        }
      },
      "inner_hits": {
        "highlight": {
          "fields": {
            "user.first": {}
          }
        }
      }
    }
  }
}
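
The difference between the two datatypes can be reproduced with a toy model in Python. This is a sketch for intuition only: match_object and match_nested are hypothetical functions, matching is reduced to case-insensitive equality, and no real analysis or scoring takes place.

```python
def flatten_object_array(objects):
    """Model the object type: merge all objects into multi-value fields."""
    fields = {}
    for obj in objects:
        for key, value in obj.items():
            fields.setdefault(key, set()).add(value.lower())
    return fields


def match_object(objects, **criteria):
    """Each criterion is checked against the merged fields, so pairs can cross objects."""
    fields = flatten_object_array(objects)
    return all(value.lower() in fields.get(key, set()) for key, value in criteria.items())


def match_nested(objects, **criteria):
    """Model the nested type: all criteria must hold within a single object."""
    return any(
        all(obj.get(key, "").lower() == value.lower() for key, value in criteria.items())
        for obj in objects
    )


users = [
    {"first": "John", "last": "Smith"},
    {"first": "Alice", "last": "White"},
]

print(match_object(users, first="Alice", last="Smith"))  # crosses two objects
print(match_nested(users, first="Alice", last="Smith"))  # no single object has both
print(match_nested(users, first="Alice", last="White"))  # one object has both
```

The first call succeeds only because the association between first and last names was lost during flattening, which is exactly the false match that nested avoids.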

Indexing a document that contains 100 nested objects actually indexes 101 documents, because each nested object is indexed as a separate document. To guard against defining too many nested fields, each index can define at most 50 nested fields by default (the index.mapping.nested_fields.limit setting).

3.1.11 range

The range type and its value range are as follows:

Type             Range
integer_range    -2^31 to 2^31-1
float_range      32-bit IEEE 754
long_range       -2^63 to 2^63-1
double_range     64-bit IEEE 754
date_range       64-bit integer, time in milliseconds

3.2 Meta fields

Meta fields are the fields used to describe the document itself. Their categories and functions are as follows:

Category                 Meta field      Function
Document attributes      _index          The index the document belongs to
                         _uid            Composed of _type and _id (value: {type}#{id})
                         _type           The type of the document
                         _id             The ID of the document
Source document          _source         The original JSON string of the document
                         _size           The size of the _source field
                         _all            A catch-all field that contains all indexed fields
                         _field_names    All fields in the document that have non-null values
Routing                  _parent         Specifies the parent-child relationship between documents
                         _routing        A custom routing value that routes the document to a specific shard
Custom                   _meta           Custom metadata

For more information about each field, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-fields.html.

4 Mapping parameters

For more information, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-params.html.
