Getting started with Elasticsearch 2.2.0: aggregation operations

Source: Internet
Author: User

Aggregations group documents and compute statistics over them, much like GROUP BY in a relational database. In Elasticsearch, an aggregation can be nested inside another aggregation, so the results of one aggregation can be aggregated again. A single request can also return the results of several aggregations, which avoids multiple round trips and reduces the load on the network and the server.

Data preparation: insert several test documents.

Request: POST localhost:9200/customer/external/?pretty

Parameters (one document per request):

{"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87}

{"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95}

{"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91}

{"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99}

{"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78}

This inserts five documents for testing.
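As a rough sketch of how the five documents could be posted from a script, the Python below builds one POST request per document with the standard library only. The full URL (including the http:// scheme), the lowercase field names, and the helper names build_index_requests and send_all are assumptions made for illustration; nothing is sent unless send_all is called against a running node.

```python
import json
import urllib.request

# Assumed endpoint: index "customer", type "external", as in the request above.
INDEX_URL = "http://localhost:9200/customer/external/?pretty"

# The five test documents (field names assumed to be lowercase).
CUSTOMERS = [
    {"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]


def build_index_requests(docs, url=INDEX_URL):
    """Return one POST request per document; nothing is sent here."""
    requests = []
    for doc in docs:
        body = json.dumps(doc).encode("utf-8")
        requests.append(
            urllib.request.Request(
                url,
                data=body,
                headers={"Content-Type": "application/json"},
                method="POST",
            )
        )
    return requests


def send_all(requests):
    """Send each request in turn; needs a reachable Elasticsearch node."""
    for req in requests:
        with urllib.request.urlopen(req) as resp:
            print(resp.read().decode("utf-8"))
```

Calling send_all(build_index_requests(CUSTOMERS)) would index the documents one at a time, matching the one-request-per-document approach described above.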

With the data in place, we can try out aggregations.

Example: group all customers by state, then return the top 10 states (10 is the default), sorted by document count in descending order (also the default):

Request: POST http://localhost:9200/customer/_search?pretty

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      }
    }
  }
}

This query is similar to the following GROUP BY statement in a relational database:
SELECT state, COUNT(*) FROM customer GROUP BY state ORDER BY COUNT(*) DESC

Returned results:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3
      }, {
        "key": "open",
        "doc_count": 2
      }]
    }
  }
}

We can see that there are three customers in the close state and two in the open state.
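To make the grouping concrete, here is a minimal plain-Python sketch of what the terms aggregation computes for the five test documents. It does not call Elasticsearch; the function name terms_buckets is hypothetical, and only the fields needed for the count are kept.

```python
from collections import Counter

# State of each of the five test documents.
CUSTOMERS = [
    {"state": "open"},
    {"state": "close"},
    {"state": "close"},
    {"state": "open"},
    {"state": "close"},
]


def terms_buckets(docs, field, size=10):
    """Count documents per distinct value and keep the `size` largest buckets."""
    counts = Counter(doc[field] for doc in docs)
    # most_common() orders by count, largest first, like the default terms order.
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


print(terms_buckets(CUSTOMERS, "state"))
# [{'key': 'close', 'doc_count': 3}, {'key': 'open', 'doc_count': 2}]
```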

Next, we build on the previous query: in addition to counting the documents in each state, we also compute the average balance for each state.

The request is the same as before; only the body changes:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

The query result is as follows:

{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }, {
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }]
    }
  }
}

Notice how the average_balance aggregation is nested inside the group_by_state aggregation. This is a common aggregation pattern: you can nest aggregations inside other aggregations to extract the summaries you need.
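The nesting can be pictured as "bucket first, then compute a metric inside each bucket". The following plain-Python sketch reproduces the averages for the five test documents under that reading; it is an illustration only, and the function name terms_with_avg is hypothetical.

```python
from collections import defaultdict

# State and balance of each of the five test documents.
CUSTOMERS = [
    {"state": "open", "balance": 87},
    {"state": "close", "balance": 95},
    {"state": "close", "balance": 91},
    {"state": "open", "balance": 99},
    {"state": "close", "balance": 78},
]


def terms_with_avg(docs, group_field, value_field):
    """Group documents by one field, then average another field per group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[value_field])
    buckets = [
        {
            "key": key,
            "doc_count": len(values),
            "average_balance": sum(values) / len(values),
        }
        for key, values in groups.items()
    ]
    # Largest bucket first, as in the default terms ordering.
    buckets.sort(key=lambda bucket: bucket["doc_count"], reverse=True)
    return buckets


for bucket in terms_with_avg(CUSTOMERS, "state", "balance"):
    print(bucket)
```

For the close bucket this is (95 + 91 + 78) / 3 = 88.0, and for the open bucket (87 + 99) / 2 = 93.0, which matches the response above.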

In the following example, we extend the previous query to sort the buckets by average balance in descending order.

The request is the same as before.

Parameters:

{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}

The query result is as follows:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }, {
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }]
    }
  }
}
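The effect of the order clause can be sketched as a sort over the buckets by the value of the named sub-aggregation. The snippet below applies that idea to the two buckets from the earlier response; the function name order_buckets is hypothetical, and the code only mimics the ordering rather than calling Elasticsearch.

```python
# Buckets as returned by the unsorted query (ordered by doc_count).
BUCKETS = [
    {"key": "close", "doc_count": 3, "average_balance": {"value": 88.0}},
    {"key": "open", "doc_count": 2, "average_balance": {"value": 93.0}},
]


def order_buckets(buckets, metric, direction="desc"):
    """Sort buckets by the value of a single-value sub-aggregation."""
    return sorted(
        buckets,
        key=lambda bucket: bucket[metric]["value"],
        reverse=(direction == "desc"),
    )


print([bucket["key"] for bucket in order_buckets(BUCKETS, "average_balance")])
# ['open', 'close']
```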

This article is an original work by secisland. If you repost it, please credit the author and the source.

The following example is more complex. It groups customers by age bracket (20-29, 30-39, and 40-49), then by gender, and finally computes the average account balance for each gender within each age bracket:

{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}

Returned results:

{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_age": {
      "buckets": [{
        "key": "20.0-30.0",
        "from": 20.0,
        "from_as_string": "20.0",
        "to": 30.0,
        "to_as_string": "30.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 87.0
            }
          }]
        }
      }, {
        "key": "30.0-40.0",
        "from": 30.0,
        "from_as_string": "30.0",
        "to": 40.0,
        "to_as_string": "40.0",
        "doc_count": 3,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "man",
            "doc_count": 2,
            "average_balance": {
              "value": 93.0
            }
          }, {
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 99.0
            }
          }]
        }
      }, {
        "key": "40.0-50.0",
        "from": 40.0,
        "from_as_string": "40.0",
        "to": 50.0,
        "to_as_string": "50.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 78.0
            }
          }]
        }
      }]
    }
  }
}
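The three levels of this query (age range, then gender, then average balance) can be traced by hand with a short plain-Python sketch over the five test documents. It is an illustration of the computation, not a client call; the function name range_then_gender is hypothetical. It treats each range as including "from" and excluding "to", which is how the range aggregation defines its buckets.

```python
from collections import defaultdict

# Age, gender and balance of each of the five test documents.
CUSTOMERS = [
    {"age": 25, "gender": "woman", "balance": 87},
    {"age": 32, "gender": "man", "balance": 95},
    {"age": 33, "gender": "man", "balance": 91},
    {"age": 34, "gender": "woman", "balance": 99},
    {"age": 46, "gender": "woman", "balance": 78},
]

# (from, to) pairs: "from" is inclusive, "to" is exclusive.
AGE_RANGES = [(20, 30), (30, 40), (40, 50)]


def range_then_gender(docs, ranges):
    """Bucket by age range, then by gender, then average the balance."""
    result = {}
    for low, high in ranges:
        balances_by_gender = defaultdict(list)
        for doc in docs:
            if low <= doc["age"] < high:
                balances_by_gender[doc["gender"]].append(doc["balance"])
        result[f"{low}-{high}"] = {
            gender: sum(values) / len(values)
            for gender, values in balances_by_gender.items()
        }
    return result


for age_range, averages in range_then_gender(CUSTOMERS, AGE_RANGES).items():
    print(age_range, averages)
```

The 30-40 bracket, for example, contains two men with balances 95 and 91 (average 93.0) and one woman with balance 99, matching the response above.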

As these examples show, nesting aggregations lets a single Elasticsearch request answer fairly elaborate grouping and summary questions.
