Common elasticsearch operations: Mappings

Last Update:2018-10-29 Source: Internet

Author: User

A field's type in Elasticsearch is either inferred automatically by Elasticsearch or specified explicitly by us. Mappings are therefore divided into dynamic mapping and static mapping.

1 Dynamic mapping

1.1 Mapping rules

JSON data	Automatically inferred field type
null	No field is added
true or false	boolean
Floating-point number	float
Integer	long
JSON object	object
Array	Determined by the first non-null value in the array
String	date (if date detection is enabled), double or long (if numeric detection is enabled), or text with a keyword sub-field
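The rules in the table can be sketched in a few lines of Python. This is only an illustration of the table, not Elasticsearch's actual implementation: the function name `infer_type` is hypothetical, date detection is reduced to a single yyyy-MM-dd pattern, and numeric detection for strings is left out.

```python
import json
import re

# Assumption for this sketch: only the yyyy-MM-dd shape counts as a date.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def infer_type(value, date_detection=True):
    """Return the field type the table above assigns to a parsed JSON value."""
    if value is None:
        return None  # null: no field is added
    if isinstance(value, bool):  # test bool first, since bool is a subclass of int
        return "boolean"
    if isinstance(value, float):
        return "float"
    if isinstance(value, int):
        return "long"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # The first non-null element decides the type of the whole field.
        for item in value:
            if item is not None:
                return infer_type(item, date_detection)
        return None
    if isinstance(value, str):
        if date_detection and DATE_PATTERN.match(value):
            return "date"
        return "text"
    raise TypeError(f"unsupported JSON value: {value!r}")


doc = json.loads('{"id": 1, "postdate": "2018-10-27", "tags": [null, "es"], "note": null}')
inferred = {field: infer_type(value) for field, value in doc.items()}
# Fields whose value is null are dropped, as in the first row of the table.
mapping = {field: kind for field, kind in inferred.items() if kind is not None}
print(mapping)
```

Passing `date_detection=False` makes the same string fall through to text, which mirrors the behavior described in the date detection section.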

1.2 Date detection

In ES 5.4, date detection is enabled by default. A test case:

PUT myblog
GET myblog/_mapping
PUT myblog/article/1
{
  "id": 1,
  "postdate": "2018-10-27"
}
GET myblog/_mapping
# Response:
{
  "myblog": {
    "mappings": {
      "article": {
        "properties": {
          "id": {
            "type": "long"
          },
          "postdate": {
            "type": "date"
          }
        }
      }
    }
  }
}

After date detection is disabled, the same value is no longer detected as a date:

PUT myblog
{
  "mappings": {
    "article": {
      "date_detection": false
    }
  }
}
GET myblog/_mapping
PUT myblog/article/1
{
  "id": 1,
  "postdate": "2018-10-27"
}
GET myblog/_mapping
# Response:
{
  "myblog": {
    "mappings": {
      "article": {
        "date_detection": false,
        "properties": {
          "id": {
            "type": "long"
          },
          "postdate": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

2 Static mapping

2.1 Basic case

PUT myblog
{
  "mappings": {
    "article": {
      "properties": {
        "id": {"type": "long"},
        "title": {"type": "text"},
        "postdate": {"type": "date"}
      }
    }
  }
}
GET myblog/_mapping
PUT myblog/article/1
{
  "id": 1,
  "title": "elasticsearch is wonderful!",
  "postdate": "2018-10-27"
}
GET myblog/_mapping
# Response:
{
  "myblog": {
    "mappings": {
      "article": {
        "properties": {
          "id": {
            "type": "long"
          },
          "postdate": {
            "type": "date"
          },
          "title": {
            "type": "text"
          }
        }
      }
    }
  }
}

2.2 The dynamic attribute

By default, when a document containing a new field is indexed, ES adds that field to the mapping automatically. This behavior can be controlled with the dynamic setting:

dynamic value	Description
true	The default. New fields are added to the mapping automatically.
false	New fields are ignored.
strict	Strict mode. An exception is thrown when a new field is found.

PUT myblog
{
  "mappings": {
    "article": {
      "dynamic": "strict",
      "properties": {
        "id": {"type": "long"},
        "title": {"type": "text"},
        "postdate": {"type": "date"}
      }
    }
  }
}
GET myblog/_mapping
PUT myblog/article/1
{
  "id": 1,
  "title": "elasticsearch is wonderful!",
  "content": "a long text",
  "postdate": "2018-10-27"
}
# Response:
{
  "error": {
    "root_cause": [
      {
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic introduction of [content] within [article] is not allowed"
      }
    ],
    "type": "strict_dynamic_mapping_exception",
    "reason": "mapping set to strict, dynamic introduction of [content] within [article] is not allowed"
  },
  "status": 400
}
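To make the three dynamic values concrete, here is a small Python simulation of how a mapping reacts to a document with an unknown field. It is a sketch of the behavior described in the table, not Elasticsearch code: `apply_dynamic` and `StrictDynamicMappingError` are hypothetical names, and the type of a newly added field is recorded only as a placeholder.

```python
class StrictDynamicMappingError(Exception):
    """Hypothetical stand-in for strict_dynamic_mapping_exception."""


def apply_dynamic(mapping, doc, dynamic="true"):
    """Return the mapping after indexing doc under the given dynamic setting."""
    updated = dict(mapping)  # leave the caller's mapping untouched
    for field in doc:
        if field in updated:
            continue
        if dynamic == "strict":
            raise StrictDynamicMappingError(
                f"mapping set to strict, dynamic introduction of [{field}] is not allowed"
            )
        if dynamic == "true":
            # Placeholder type: real type inference is out of scope here.
            updated[field] = "<inferred>"
        # dynamic == "false": the new field is ignored, the mapping is unchanged.
    return updated


mapping = {"id": "long", "title": "text", "postdate": "date"}
doc = {"id": 1, "title": "elasticsearch is wonderful!",
       "content": "a long text", "postdate": "2018-10-27"}

print(sorted(apply_dynamic(mapping, doc, "true")))
print(sorted(apply_dynamic(mapping, doc, "false")))
try:
    apply_dynamic(mapping, doc, "strict")
except StrictDynamicMappingError as err:
    print("rejected:", err)
```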

3 Field types

3.1 Common field types

Level 1	Level 2	Type
Core types	String types	string, text, keyword
	Numeric types	long, integer, short, byte, double, float, half_float, scaled_float
	Date type	date
	Boolean type	boolean
	Binary type	binary
	Range types	integer_range, float_range, long_range, double_range, date_range
Composite types	Array type	array
	Object type	object
	Nested type	nested
Geographic types	Geographic coordinates	geo_point
	Geographic shapes	geo_shape
Special types	IP type	ip
	Completion type	completion
	Token count type	token_count
	Attachment type	attachment
	Percolator type	percolator

For more information, see the official documentation https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping.html.

3.1.1 string

The string type is deprecated as of ES 5.x. It can still be used, but it has been replaced by text and keyword.

3.1.2 text

Fields of type text are used for full-text search. Their values are processed by an analyzer: before the inverted index is built, the string is split into individual terms.

In practice, text is mostly used for long text fields, such as the content of an article. Sorting and aggregating on such fields is of little use.

3.1.3 keyword

Unlike the text type, a keyword field can only be searched by its exact value.

The indexed term is the entire content of the field. In practice, keyword fields are therefore used for comparison, sorting, aggregation, and similar operations.
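The difference between the two types comes down to which terms end up in the inverted index. The sketch below is only a rough approximation: `text_terms` lowercases and splits on non-alphanumeric ASCII characters as a stand-in for a real analyzer, and both function names are hypothetical.

```python
import re


def text_terms(value):
    """Rough stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    return [term for term in re.split(r"[^0-9a-z]+", value.lower()) if term]


def keyword_terms(value):
    """A keyword field indexes the whole value as one term."""
    return [value]


title = "elasticsearch is wonderful!"
print(text_terms(title))
print(keyword_terms(title))

# A single-word lookup finds the text field but not the keyword field.
print("wonderful" in text_terms(title))
print("wonderful" in keyword_terms(title))
```

This is why a search for one word matches a text field, while a keyword field matches only when the whole string is supplied.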

3.1.4 numeric type

For more information, see the official documentation.

3.1.5 date

JSON does not have a date type, so by default a date in Elasticsearch can be written as:

1. A string such as "yyyy-MM-dd" or "yyyy-MM-ddTHH:mm:ssZ"
- That is, "yyyy-MM-dd HH:mm:ss" has to be written in a form such as "2018-10-22T23:12:22Z", which adds the T separator and the time zone;
2. A long integer representing a timestamp in milliseconds.
3. An integer representing a timestamp in seconds.

Internally, Elasticsearch stores dates as a long integer in milliseconds.

Of course, the above is only the default case. When setting the field type, we can also set our own time format:

PUT myblog
{
  "mappings": {
    "article": {
      "properties": {
        "postdate": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
}

The format parameter can also specify multiple date formats, separated by "||":

"format": "yyyy-MM-dd HH:mm:ss||yyyy/MM/dd HH:mm:ss"

Then you can write data in the defined time format:

PUT myblog/article/1
{
  "postdate": "2017-09-23 23:12:22"
}

In my work, when a time value has to be saved, it is usually converted to a timestamp in milliseconds first and then stored in ES. For display, it is converted back to a time string.
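The round trip between a time string and a millisecond timestamp can be sketched as follows. The two patterns mirror the "yyyy-MM-dd HH:mm:ss||yyyy/MM/dd HH:mm:ss" format above, translated to Python's strptime syntax. Treating the strings as UTC is an assumption of this sketch, and the helper names are hypothetical.

```python
import calendar
from datetime import datetime, timezone

# Python equivalents of "yyyy-MM-dd HH:mm:ss" and "yyyy/MM/dd HH:mm:ss".
FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M:%S"]


def to_epoch_millis(text):
    """Parse text with the first matching format; the time is assumed to be UTC."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format, like the "||" separator does
        return calendar.timegm(parsed.timetuple()) * 1000
    raise ValueError(f"no configured format matches {text!r}")


def from_epoch_millis(millis):
    """Render a millisecond timestamp back into the first format, in UTC."""
    moment = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    return moment.strftime(FORMATS[0])


millis = to_epoch_millis("2017-09-23 23:12:22")
print(millis)
print(from_epoch_millis(millis))
```

If the stored times are in a local time zone, the conversion has to apply that offset explicitly; otherwise the displayed string will be shifted.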

3.1.6 Boolean

After setting the field type to boolean, you can enter the following values: true, false, "true", and "false".

3.1.7 binary

The binary type accepts a binary value encoded as a Base64 string.
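As a minimal sketch of preparing such a value on the client side, the Python standard library can encode raw bytes to Base64 before they are placed in the JSON document. The field name `blob` is a hypothetical example, not something defined in this article.

```python
import base64
import json

raw = b"\x00\x01binary payload"

# Base64 turns arbitrary bytes into an ASCII string that is safe inside JSON.
encoded = base64.b64encode(raw).decode("ascii")
body = json.dumps({"blob": encoded})  # "blob" is a hypothetical field name
print(body)

# Decoding the stored string gives back the original bytes.
decoded = base64.b64decode(json.loads(body)["blob"])
print(decoded == raw)
```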

3.1.8 Array

Elasticsearch does not have a dedicated array type. By default, any field can contain one or more values, but all values in an array must be of the same type. When data is added dynamically, the type of the first value in the array determines the type of the whole array (that is, the type of the field). Arrays of mixed types are not supported. An array can contain null values, and an empty array [] is treated as a missing field. Arrays do not need to be configured in the mapping in advance; they are supported by default.

For example, add the field data of the following array:

DELETE my_index
PUT my_index/my_type/1
{
  "lists": [
    {
      "name": "xpleaf",
      "job": "es"
    }
  ]
}

In fact, the field type will be dynamically mapped to text:

GET my_index/my_type/_mapping
# Response:
{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "lists": {
            "properties": {
              "job": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The field can also be searched directly:

GET my_index/my_type/_search
{
  "query": {
    "term": {
      "lists.name": {
        "value": "xpleaf"
      }
    }
  }
}

Returned results:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "lists": [
            {
              "name": "xpleaf",
              "job": "es"
            }
          ]
        }
      }
    ]
  }
}

3.1.9 object

You can directly write a JSON object to es, as shown below:

DELETE my_index
PUT my_index/my_type/1
{
  "object": {
    "name": "xpleaf",
    "job": "es"
  }
}

In fact, the field type will be dynamically mapped to text:

{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "object": {
            "properties": {
              "job": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The field can also be searched directly:

GET my_index/my_type/_search
{
  "query": {
    "term": {
      "object.name": {
        "value": "xpleaf"
      }
    }
  }
}

Returned results:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "object": {
            "name": "xpleaf",
            "job": "es"
          }
        }
      }
    ]
  }
}

Objects are actually stored flat in ES. Internally, the document above looks like this:

{"object.name": "xpleaf", "object.job": "es"}
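The flattening can be reproduced with a short recursive function. This is an illustrative Python sketch of the idea, not the code ES runs, and the function name `flatten` is hypothetical.

```python
def flatten(obj, prefix=""):
    """Collapse nested dicts into a single level with dotted field names."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


doc = {"object": {"name": "xpleaf", "job": "es"}}
print(flatten(doc))  # {'object.name': 'xpleaf', 'object.job': 'es'}
```

The dotted names are also the names used in queries, which is why the term query on object.name works.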

3.1.10 nested

The nested type is a special case of the object type. It allows arrays of objects to be indexed and queried independently of each other. Lucene has no concept of inner objects, so ES flattens the object hierarchy and converts it into a simple list of field names and values.

Although it is a special case of the object type, its field type is fixed as nested, which is the biggest difference from the object type.

So why should we use the nested type? Is it okay to use objects? Here I post an official example to illustrate (https://www.elastic.co/guide/en/elasticsearch/reference/5.6/nested.html ):

Arrays of inner object fields do not work the way you may expect. Lucene has no concept of inner objects, so Elasticsearch flattens object hierarchies into a simple list of field names and values. For instance, the following document:

PUT my_index/my_type/1
{
  "group": "fans",
  "user": [
    {
      "first": "John",
      "last": "Smith"
    },
    {
      "first": "Alice",
      "last": "White"
    }
  ]
}

would be transformed internally into a document that looks more like this:

{
  "group": "fans",
  "user.first": [ "alice", "john" ],
  "user.last": [ "smith", "white" ]
}

The user.first and user.last fields are flattened into multi-value fields, and the association between alice and white is lost. This document would incorrectly match a query for alice AND smith:

GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "user.first": "Alice" }},
        { "match": { "user.last": "Smith" }}
      ]
    }
  }
}

The above is a problem caused by the direct use of objects. That is to say, this document should not be matched during the above search, but it is indeed matched. The nested object type can maintain the independence of each object in the array. The nested type indexes each object in the array as an independent hidden document, which means that each nested object can be searched independently.

If you need to index arrays of objects and to maintain the independence of each object in the array, you should use the nested datatype instead of the object datatype. Internally, nested objects index each object in the array as a separate hidden document, meaning that each nested object can be queried independently of the others, with the nested query:

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "user": {
          "type": "nested"
        }
      }
    }
  }
}
PUT my_index/my_type/1
{
  "group": "fans",
  "user": [
    {
      "first": "John",
      "last": "Smith"
    },
    {
      "first": "Alice",
      "last": "White"
    }
  ]
}
# Does not match: Alice and Smith are not in the same nested object.
GET my_index/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last": "Smith" }}
          ]
        }
      }
    }
  }
}
# Matches: Alice and White are in the same nested object.
GET my_index/_search
{
  "query": {
    "nested": {
      "path": "user",
      "query": {
        "bool": {
          "must": [
            { "match": { "user.first": "Alice" }},
            { "match": { "user.last": "White" }}
          ]
        }
      },
      "inner_hits": {
        "highlight": {
          "fields": {
            "user.first": {}
          }
        }
      }
    }
  }
}
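The difference between the two behaviors can be simulated in plain Python. The sketch below models only the matching logic described above, under simplifying assumptions: values are lowercased instead of analyzed, and the helper names `flatten_users`, `object_match`, and `nested_match` are hypothetical.

```python
users = [
    {"first": "John", "last": "Smith"},
    {"first": "Alice", "last": "White"},
]


def flatten_users(objs):
    """Object type: merge all inner objects into multi-value fields."""
    flat = {}
    for obj in objs:
        for key, value in obj.items():
            flat.setdefault(f"user.{key}", []).append(value.lower())
    return flat


def object_match(objs, first, last):
    """Each condition is checked against the merged values, so pairs can mix."""
    flat = flatten_users(objs)
    return first.lower() in flat["user.first"] and last.lower() in flat["user.last"]


def nested_match(objs, first, last):
    """Nested type: both conditions must hold inside the same inner object."""
    return any(
        obj["first"].lower() == first.lower() and obj["last"].lower() == last.lower()
        for obj in objs
    )


print(object_match(users, "Alice", "Smith"))  # True: the false positive
print(nested_match(users, "Alice", "Smith"))  # False
print(nested_match(users, "Alice", "White"))  # True
```

The first result is the incorrect match produced by the object type; the nested variant rejects it because no single user is both Alice and Smith.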

Indexing a document that contains 100 nested objects actually indexes 101 documents, because each nested object is indexed as an independent document. To prevent too many nested fields from being defined, each index can define up to 50 nested fields by default.

3.1.11 range

The range types and their value ranges are as follows:

Type	Range
integer_range	-2^31 to 2^31-1
float_range	32-bit IEEE 754
long_range	-2^63 to 2^63-1
double_range	64-bit IEEE 754
date_range	64-bit integer, time in milliseconds

3.2 Meta fields

Meta fields are the fields used to describe the document itself. Their categories and descriptions are as follows:

Meta field category	Field	Function
Document identity meta fields	_index	Index the document belongs to
	_uid	Combination of `_type` and `_id` (value: `{type}#{id}`)
	_type	Document type
	_id	Document ID
Source document meta fields	_source	Original JSON of the document
	_size	Size of the _source field
	_all	Catch-all field that contains the values of all other fields
	_field_names	All fields in the document that contain non-null values
Routing meta fields	_parent	Specifies the parent-child relationship between documents
	_routing	Custom routing value that routes the document to a specific shard
Custom meta field	_meta	Custom metadata

For more information about each field, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-fields.html.

4 Mapping parameters

For more information, see https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-params.html.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.