Getting started with Elasticsearch 2.2.0: aggregation operations

Last Update:2016-02-17 Source: Internet

Author: User


Aggregations group documents and compute statistics over them, much like GROUP BY in a relational database. In Elasticsearch, aggregations can also be nested: you can take the buckets produced by one aggregation and aggregate them again. A single request can return the results of several aggregations, which avoids extra round trips and reduces the load on the network and the server.

Data preparation: insert a few test records, one request per document.

Request: POST localhost:9200/customer/external/?pretty

Parameters:

{"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87}

{"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95}

{"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91}

{"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99}

{"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78}

Five data entries are inserted for testing.
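The five inserts can also be scripted. The following is a minimal sketch using only Python's standard library. It assumes a node at http://localhost:9200, and the helper name build_index_requests is hypothetical. It only builds the requests; actually sending them requires a running Elasticsearch node.

```python
import json
import urllib.request

# Assumption: a local Elasticsearch node listens here; adjust as needed.
BASE_URL = "http://localhost:9200"

CUSTOMERS = [
    {"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]


def build_index_requests(docs, index="customer", doc_type="external"):
    """Build one POST request per document (hypothetical helper; nothing is sent)."""
    url = f"{BASE_URL}/{index}/{doc_type}/?pretty"
    reqs = []
    for doc in docs:
        reqs.append(
            urllib.request.Request(
                url,
                data=json.dumps(doc).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
        )
    return reqs


if __name__ == "__main__":
    for req in build_index_requests(CUSTOMERS):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
        # To index the document for real, uncomment the next line:
        # urllib.request.urlopen(req)
```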

With the data in place, we can test aggregations.

Example: group all customers by state and return the top 10 states (10 is the default), sorted by document count in descending order (also the default):

Request: POST http://localhost:9200/customer/_search?pretty

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      }
    }
  }
}

This query is similar to the following GROUP BY statement in a relational database:
SELECT state, COUNT(*) FROM customer GROUP BY state ORDER BY COUNT(*) DESC

Returned results:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3
      }, {
        "key": "open",
        "doc_count": 2
      }]
    }
  }
}

We can see that there are three customers in the close state and two in the open state.
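To make the bucket counts in the response concrete, here is a minimal local sketch of what a terms aggregation computes, applied to the five test documents. It does not call Elasticsearch; the function name terms_buckets is hypothetical, and sorting ties by key is an assumption made only to keep the output deterministic.

```python
from collections import Counter

# The five test documents, reduced to the field this example needs.
CUSTOMERS = [
    {"state": "open"},
    {"state": "close"},
    {"state": "close"},
    {"state": "open"},
    {"state": "close"},
]


def terms_buckets(docs, field, size=10):
    """Count documents per distinct value of `field`, largest bucket first."""
    counts = Counter(doc[field] for doc in docs)
    # Sort by doc_count descending; break ties by key (assumption for determinism).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered[:size]]


if __name__ == "__main__":
    for bucket in terms_buckets(CUSTOMERS, "state"):
        print(bucket)
```

The size parameter mirrors the default of 10 buckets mentioned in the example.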

Next, we build on the previous query to calculate the average balance for each state in addition to the document count per state.

The request URL is the same as before; only the parameters change:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

The query result is as follows:

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }, {
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }]
    }
  }
}

Note how the average_balance aggregation is nested inside the group_by_state aggregation. This is a common pattern: you can nest aggregations inside other aggregations to compute whatever summary you need.
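The nesting can be pictured as "group first, then compute a metric inside each group". The following local sketch reproduces the averages from the response without contacting Elasticsearch. The function name terms_with_avg is hypothetical, and the output only mirrors the shape of the buckets shown above.

```python
from collections import defaultdict

# The five test documents, reduced to the fields this example needs.
CUSTOMERS = [
    {"state": "open", "balance": 87},
    {"state": "close", "balance": 95},
    {"state": "close", "balance": 91},
    {"state": "open", "balance": 99},
    {"state": "close", "balance": 78},
]


def terms_with_avg(docs, group_field, value_field):
    """Group docs by `group_field`, then average `value_field` within each group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    buckets = [
        {
            "key": key,
            "doc_count": len(values),
            "average_balance": {"value": sum(values) / len(values)},
        }
        for key, values in groups.items()
    ]
    # Largest bucket first; ties broken by key (assumption for determinism).
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets


if __name__ == "__main__":
    for bucket in terms_with_avg(CUSTOMERS, "state", "balance"):
        print(bucket)
```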

In the following example, we extend the previous query to sort the buckets by average balance in descending order.

The request URL is the same as before.

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

The query result is as follows:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }, {
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }]
    }
  }
}
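The only difference from the previous request is the "order" entry, whose key must match the name of the nested aggregation. As a rough sketch, the request body can be built programmatically so that the ordering is optional. The helper name build_state_query is hypothetical, and the body is only printed, not sent.

```python
import json


def build_state_query(order=None):
    """Return the group_by_state request body; `order` is "asc", "desc", or None."""
    terms = {"field": "state"}
    if order is not None:
        # The key under "order" refers to the sub-aggregation defined below.
        terms["order"] = {"average_balance": order}
    return {
        "size": 0,
        "aggs": {
            "group_by_state": {
                "terms": terms,
                "aggs": {
                    "average_balance": {"avg": {"field": "balance"}},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_state_query("desc"), indent=2))
```

Calling build_state_query() with no argument yields the body of the earlier, unordered request.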

This article is an original work by secisland. If you repost it, please credit the author and source.

The following example is more complex. It groups customers by age range (20-29, 30-39, and 40-49), then by gender, and finally computes the average account balance for each gender within each age range:

{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}

Returned results:

{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_age": {
      "buckets": [{
        "key": "20.0-30.0",
        "from": 20.0,
        "from_as_string": "20.0",
        "to": 30.0,
        "to_as_string": "30.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 87.0
            }
          }]
        }
      }, {
        "key": "30.0-40.0",
        "from": 30.0,
        "from_as_string": "30.0",
        "to": 40.0,
        "to_as_string": "40.0",
        "doc_count": 3,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "man",
            "doc_count": 2,
            "average_balance": {
              "value": 93.0
            }
          }, {
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 99.0
            }
          }]
        }
      }, {
        "key": "40.0-50.0",
        "from": 40.0,
        "from_as_string": "40.0",
        "to": 50.0,
        "to_as_string": "50.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 78.0
            }
          }]
        }
      }]
    }
  }
}
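The three levels in this response (age range, then gender, then average) can be traced by hand on the five test documents. The sketch below does this locally in Python without calling Elasticsearch. The function name range_then_terms_avg is hypothetical. Each range includes its "from" value and excludes its "to" value, which is why 20 to 30 covers ages 20-29.

```python
from collections import defaultdict

# The five test documents, reduced to the fields this example needs.
CUSTOMERS = [
    {"age": 25, "gender": "woman", "balance": 87},
    {"age": 32, "gender": "man", "balance": 95},
    {"age": 33, "gender": "man", "balance": 91},
    {"age": 34, "gender": "woman", "balance": 99},
    {"age": 46, "gender": "woman", "balance": 78},
]

AGE_RANGES = [(20, 30), (30, 40), (40, 50)]


def range_then_terms_avg(docs, ranges):
    """Bucket by age range, then by gender, then average the balance."""
    result = []
    for low, high in ranges:
        # "from" is inclusive, "to" is exclusive.
        in_range = [d for d in docs if low <= d["age"] < high]
        by_gender = defaultdict(list)
        for d in in_range:
            by_gender[d["gender"]].append(d["balance"])
        gender_buckets = [
            {
                "key": gender,
                "doc_count": len(balances),
                "average_balance": {"value": sum(balances) / len(balances)},
            }
            for gender, balances in by_gender.items()
        ]
        # Largest bucket first; ties broken by key (assumption for determinism).
        gender_buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
        result.append(
            {
                "key": f"{float(low)}-{float(high)}",
                "doc_count": len(in_range),
                "group_by_gender": gender_buckets,
            }
        )
    return result


if __name__ == "__main__":
    for bucket in range_then_terms_avg(CUSTOMERS, AGE_RANGES):
        print(bucket)
```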

As these examples show, Elasticsearch aggregations can be nested and combined to answer fairly complex questions in a single request.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
