Getting started with Elasticsearch 2.2.0: aggregations
Aggregations group documents and compute statistics over them, much like GROUP BY in a relational database. In Elasticsearch, the results of one aggregation can be aggregated again within the same query. This is very useful: a single request can return the results of several aggregations, which avoids multiple round trips and reduces the load on the network and the server.
Data preparation: insert several documents.
Request: POST localhost:9200/customer/external/?pretty
Parameters (one document per request):
{"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87}
{"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95}
{"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91}
{"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99}
{"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78}
Five data entries are inserted for testing.
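As an alternative to five separate requests, the documents could be sent in a single call to the bulk API. The sketch below only builds the newline-delimited body for POST localhost:9200/customer/external/_bulk and does not send it. The helper name build_bulk_body is hypothetical, and the lowercase field names are an assumption about the intended mapping.

```python
import json

# The five sample customers from the article (field names lowercased: an assumption).
CUSTOMERS = [
    {"name": "secisland", "age": 25, "state": "open", "gender": "woman", "balance": 87},
    {"name": "zhangsan", "age": 32, "state": "close", "gender": "man", "balance": 95},
    {"name": "zhangsan1", "age": 33, "state": "close", "gender": "man", "balance": 91},
    {"name": "lisi", "age": 34, "state": "open", "gender": "woman", "balance": 99},
    {"name": "wangwu", "age": 46, "state": "close", "gender": "woman", "balance": 78},
]


def build_bulk_body(docs):
    """Return an NDJSON body: one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        # An empty "index" action lets Elasticsearch generate the document ID;
        # the index and type come from the request URL.
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(CUSTOMERS)
print(body)
```

The resulting string can be posted with any HTTP client; the response lists one result entry per document.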
With the data in place, we can try out aggregations.
Example: group all customers by state and return the top 10 states (the default size), sorted by document count in descending order (also the default):
Request: POST http://localhost:9200/customer/_search?pretty
Parameters:
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      }
    }
  }
}
This query is similar to the following GROUP BY statement in a relational database:
SELECT state, COUNT(*) FROM customer GROUP BY state ORDER BY COUNT(*) DESC
Returned results:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3
      }, {
        "key": "open",
        "doc_count": 2
      }]
    }
  }
}
We can see that there are three customers in the close state and two in the open state.
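To make the grouping concrete, here is a minimal plain-Python sketch that mimics what the terms aggregation computes on the five sample documents: count each distinct value and list the largest counts first. It runs without Elasticsearch, and terms_buckets is a hypothetical helper, not part of any client library.

```python
from collections import Counter

# The "state" value of each of the five sample customers, in insertion order.
STATES = ["open", "close", "close", "open", "close"]


def terms_buckets(values, size=10):
    """Count each distinct value and return at most `size` buckets, largest first."""
    counts = Counter(values)
    # most_common() sorts by count in descending order, like the default terms ordering.
    return [{"key": key, "doc_count": n} for key, n in counts.most_common(size)]


buckets = terms_buckets(STATES)
print(buckets)
```

The two buckets it produces carry the same keys and counts as the buckets array in the response.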
Next, building on the query above, we also calculate the average balance for each state while counting the customers in it.
The request is the same as before; only the parameters change:
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state"
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
The query result is as follows:
{
  "took": 16,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }, {
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }]
    }
  }
}
Notice how the average_balance aggregation is nested inside the group_by_state aggregation. This is a common aggregation pattern: a sub-aggregation on any field can be nested inside a bucket aggregation to get the summary you need.
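Because the sub-aggregation's result sits inside each bucket of its parent, reading a nested response comes down to walking the buckets. The sketch below assumes the response has already been parsed into a Python dict (only the aggregations part is reproduced here), and flatten is a hypothetical helper name.

```python
# Trimmed copy of the "aggregations" part of the response for the nested query.
response = {
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {"key": "close", "doc_count": 3, "average_balance": {"value": 88.0}},
                {"key": "open", "doc_count": 2, "average_balance": {"value": 93.0}},
            ]
        }
    }
}


def flatten(resp, outer, inner):
    """Return (key, doc_count, metric value) for each bucket of the `outer` aggregation."""
    rows = []
    for bucket in resp["aggregations"][outer]["buckets"]:
        # The sub-aggregation appears under its own name inside each bucket.
        rows.append((bucket["key"], bucket["doc_count"], bucket[inner]["value"]))
    return rows


rows = flatten(response, "group_by_state", "average_balance")
for key, count, average in rows:
    print(f"{key}: {count} customers, average balance {average}")
```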
In the following example, we build on the previous query and sort the buckets by average balance in descending order.
The request is the same as before.
Parameters:
{
  "size": 0,
  "aggs": {
    "group_by_state": {
      "terms": {
        "field": "state",
        "order": {
          "average_balance": "desc"
        }
      },
      "aggs": {
        "average_balance": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
The query result is as follows:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [{
        "key": "open",
        "doc_count": 2,
        "average_balance": {
          "value": 93.0
        }
      }, {
        "key": "close",
        "doc_count": 3,
        "average_balance": {
          "value": 88.0
        }
      }]
    }
  }
}
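The ordering can be checked by hand against the sample data. The following plain-Python sketch groups the balances by state, averages each group, and sorts the groups by that average in descending order. It is a local illustration of the result, not how Elasticsearch computes it, and avg_balance_by_state is a hypothetical helper.

```python
from collections import defaultdict

# (state, balance) pairs for the five sample customers.
CUSTOMERS = [("open", 87), ("close", 95), ("close", 91), ("open", 99), ("close", 78)]


def avg_balance_by_state(rows):
    """Group balances by state, average each group, sort by average descending."""
    groups = defaultdict(list)
    for state, balance in rows:
        groups[state].append(balance)
    buckets = [
        {"key": state, "doc_count": len(vals), "average_balance": sum(vals) / len(vals)}
        for state, vals in groups.items()
    ]
    # Equivalent in effect to "order": {"average_balance": "desc"} in the query.
    buckets.sort(key=lambda b: b["average_balance"], reverse=True)
    return buckets


buckets = avg_balance_by_state(CUSTOMERS)
for b in buckets:
    print(b["key"], b["doc_count"], b["average_balance"])
```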
This article is an original work by secisland. Please credit the author and source when reposting.
The following example is more complex. It groups customers by age range (20-29, 30-39, and 40-49), then by gender within each range, and finally computes the average account balance for each gender in each age range:
{
  "size": 0,
  "aggs": {
    "group_by_age": {
      "range": {
        "field": "age",
        "ranges": [
          {
            "from": 20,
            "to": 30
          },
          {
            "from": 30,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_gender": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "average_balance": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }
  }
}
Returned results:
{
  "took": 15,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "group_by_age": {
      "buckets": [{
        "key": "20.0-30.0",
        "from": 20.0,
        "from_as_string": "20.0",
        "to": 30.0,
        "to_as_string": "30.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 87.0
            }
          }]
        }
      }, {
        "key": "30.0-40.0",
        "from": 30.0,
        "from_as_string": "30.0",
        "to": 40.0,
        "to_as_string": "40.0",
        "doc_count": 3,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "man",
            "doc_count": 2,
            "average_balance": {
              "value": 93.0
            }
          }, {
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 99.0
            }
          }]
        }
      }, {
        "key": "40.0-50.0",
        "from": 40.0,
        "from_as_string": "40.0",
        "to": 50.0,
        "to_as_string": "50.0",
        "doc_count": 1,
        "group_by_gender": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [{
            "key": "woman",
            "doc_count": 1,
            "average_balance": {
              "value": 78.0
            }
          }]
        }
      }]
    }
  }
}
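Each range includes its from value and excludes its to value, which is why the range from 20 to 30 corresponds to ages 20-29. The plain-Python sketch below reproduces the three-level grouping locally on the sample data to show this half-open bucketing. range_then_gender is a hypothetical helper, and the bucket keys are simplified to strings such as "20-30".

```python
from collections import defaultdict

# (age, gender, balance) for the five sample customers.
CUSTOMERS = [
    (25, "woman", 87),
    (32, "man", 95),
    (33, "man", 91),
    (34, "woman", 99),
    (46, "woman", 78),
]
RANGES = [(20, 30), (30, 40), (40, 50)]


def range_then_gender(rows, ranges):
    """Bucket rows by age range, then by gender, and average the balance per gender."""
    result = {}
    for lo, hi in ranges:
        by_gender = defaultdict(list)
        for age, gender, balance in rows:
            if lo <= age < hi:  # "from" is inclusive, "to" is exclusive
                by_gender[gender].append(balance)
        result[f"{lo}-{hi}"] = {g: sum(v) / len(v) for g, v in by_gender.items()}
    return result


averages = range_then_gender(CUSTOMERS, RANGES)
for age_range, per_gender in averages.items():
    print(age_range, per_gender)
```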
From the above example, we can see that Elasticsearch's aggregation capability is very powerful.