Elasticsearch: How to get the full results of a query

Source: Internet
Author: User
Tags: json

By default, an Elasticsearch (ES) query returns only the first 10 results. The size parameter can raise the number of hits returned by a query, but how do we get all the data when a query matches tens of thousands of documents?


The ES API provides scan and scroll for this purpose; they work much like a cursor in a traditional database.

Method 1: Directly use the scroll provided by ES

Step one: Send the following GET request to the ES server. The contents of {} go in the request body. The parameter scroll=10m tells ES to keep the scroll context open for 10 minutes.
GET /old_index/_search?scroll=10m
{
"query": {"match_all": {}},
"size": 1000
}


After this request is sent, ES responds with JSON similar to the following:
{"_scroll_id": "c2nhbjszozm1mtpvrkjrrhnwbfniv2rplvhlbwlyc1h3ozm1mdpvrkjrrhnwbfniv2rplvhlbwlyc1h3oziznzpnv2jcmkq1rvfbdv90d3zjoevhotl3oze7dg90ywxfagl0czo2njuyow==", "took": 3, "timed_out": false, "_shards": {"total": 3, "successful": 3, "failed": 0}, "hits": {"total": 6652, "max_score": 0.0, "hits": []}}
The _scroll_id is needed in the next step; it is equivalent to a cursor object in a traditional database.
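
Step one can be sketched in Python using only the standard library. This is a minimal sketch rather than a definitive client: the helper name build_scroll_search and the address http://localhost:9200 are assumptions made for illustration, and the request is only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_scroll_search(base_url, index, keep_alive="10m", size=1000):
    """Build (but do not send) the initial scroll search request.

    base_url is a hypothetical ES address; keep_alive is the value of the
    scroll parameter, i.e. how long ES keeps the scroll context open.
    """
    url = "{}/{}/_search?{}".format(
        base_url.rstrip("/"), index, urlencode({"scroll": keep_alive})
    )
    body = {"query": {"match_all": {}}, "size": size}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",
    )


req = build_scroll_search("http://localhost:9200", "old_index")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Keeping the request construction separate from the sending makes it easy to inspect the URL and body before anything reaches the server.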


Step two: Send the following GET request to the server, passing the returned _scroll_id as a parameter. The second line goes in the request body.
GET /_search/scroll?scroll=1m
c2nhbjszozm1mtpvrkjrrhnwbfniv2rplvhlbwlyc1h3ozm1mdpvrkjrrhnwbfniv2rplvhlbwlyc1h3oziznzpnv2jcmkq1rvfbdv90d3zjoevhotl3oze7dg90ywxfagl0czo2njuyow==


After this request is sent, ES responds with JSON similar to the following:
{"_scroll_id": "c2nhbjszozm1mtpvrkjrrhnwbfniv2rplvhlbwlyc1h3ozm1mdpvrkjrrhnwbfniv2rplvhlbwlyc1h3oziznzpnv2jcmkq1rvfbdv90d3zjoevhotl3oze7dg90ywxfagl0czo2njuyow==", "took": 2, "timed_out": false, "_shards": {"total": 3, "successful": 3, "failed": 0}, "hits": {"total": 101, "max_score": null, "hits": [{"_index": "old_index", "_type": "3", "_id": "avcoh6dlybq5kuct6s7a", "_score": 1.0, "_source": {document}}, {"_index": "old_index", "_type": "3", "_id": "avcoh6dlybq5kuct6s7a", "_score": 1.0, "_source": {document}}]}}
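
Each response page carries two things the client needs: the _scroll_id for the next request and the batch of documents under hits.hits. Below is a minimal sketch of pulling both out of a parsed response; the helper name split_page and the sample values are hypothetical.

```python
import json


def split_page(response):
    """Return (scroll_id, documents) from one parsed scroll response."""
    scroll_id = response.get("_scroll_id")
    hits = response.get("hits", {}).get("hits", [])
    # Keep only the stored document (_source) of each hit.
    documents = [hit.get("_source") for hit in hits]
    return scroll_id, documents


# Sample response shaped like the one above, with made-up values.
raw = """
{"_scroll_id": "abc123", "took": 2, "timed_out": false,
 "hits": {"total": 2, "max_score": null, "hits": [
   {"_index": "old_index", "_id": "1", "_score": 1.0, "_source": {"n": 1}},
   {"_index": "old_index", "_id": "2", "_score": 1.0, "_source": {"n": 2}}]}}
"""
scroll_id, documents = split_page(json.loads(raw))
print(scroll_id, documents)
```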


Looking closely, we find that ES returns the same _scroll_id as the value sent to the server, which shows that it refers to the same object. The _scroll_id can change between requests, though, so always pass the most recently returned value.

Step three: Repeat step two until the hits array comes back empty. At that point, all the data for the query has been retrieved.

Step four: Delete the _scroll_id to release the scroll context on the server. The DELETE request looks like this:
DELETE /_search/scroll
c2nhbjszozm1mtpvrkjrrhnwbfniv2rplvhlbwlyc1h3ozm1mdpvrkjrrhnwbfniv2rplvhlbwlyc1h3oziznzpnv2jcmkq1rvfbdv90d3zjoevhotl3oze7dg90ywxfagl0czo2njuyow==
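
Steps one to four can be combined into a single loop. The sketch below assumes a client object that exposes search, scroll, and clear_scroll methods in the style of the official Python client; the exact keyword arguments vary between client versions, so treat the calls as assumptions and check them against the version in use. The names fetch_all and FakeClient are hypothetical, and FakeClient is only an in-memory stand-in so the loop can run without a server.

```python
def fetch_all(client, index, body, keep_alive="1m"):
    """Collect every hit of a query by following the scroll cursor."""
    results = []
    response = client.search(index=index, body=body, scroll=keep_alive)
    scroll_id = response["_scroll_id"]
    # The first response may already contain hits, or none (as in the
    # sample above); keep whatever is there.
    results.extend(response["hits"]["hits"])
    try:
        while True:
            response = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            # Always use the most recently returned cursor.
            scroll_id = response.get("_scroll_id", scroll_id)
            hits = response["hits"]["hits"]
            if not hits:
                break  # Step three: an empty batch means we are done.
            results.extend(hits)
    finally:
        # Step four: release the scroll context on the server.
        client.clear_scroll(scroll_id=scroll_id)
    return results


class FakeClient:
    """In-memory stand-in for a real client, for demonstration only."""

    def __init__(self, pages):
        self._pages = list(pages)
        self.cleared = []

    def search(self, index, body, scroll):
        # Mimics a first response that carries a cursor but no hits.
        return {"_scroll_id": "cursor-1", "hits": {"hits": []}}

    def scroll(self, scroll_id, scroll):
        page = self._pages.pop(0) if self._pages else []
        return {"_scroll_id": scroll_id, "hits": {"hits": page}}

    def clear_scroll(self, scroll_id):
        self.cleared.append(scroll_id)


client = FakeClient([[{"_id": "1"}, {"_id": "2"}], [{"_id": "3"}]])
docs = fetch_all(client, "old_index", {"query": {"match_all": {}}, "size": 2})
print([d["_id"] for d in docs])
```

Putting the cleanup in a finally block means the scroll context is released even if processing fails partway through.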


Note:
The response to the scroll request in step two includes the first batch of results. Although we specified a size of 1000, we get back many more documents. When scanning, the size is applied to each shard, so you will get back a maximum of size * number_of_primary_shards documents in each batch.
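
As a quick worked example of this bound, take the size of 1000 from the request and the 3 shards reported in the sample _shards field. The function name is hypothetical, and the sketch assumes those 3 shards are all primary shards.

```python
def max_batch_size(size, primary_shards):
    """Upper bound on documents per batch when size applies per shard."""
    return size * primary_shards


print(max_batch_size(1000, 3))  # 3000
```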


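Method 2: Use the helpers.scan method provided by the Python client
Example code using scan:

from elasticsearch import helpers

# es (the client), _body (the query), _index and _doc_type are defined by the caller.
scan_resp = helpers.scan(es, _body, scroll="10m", index=_index, doc_type=_doc_type, timeout="10m")
for resp in scan_resp:
    print(resp)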



