A search request returns only a single page of results (ten records by default), whereas the scroll API can be used to retrieve a large number of results (or even all of them) from a single search request, much as you would use a cursor in a traditional database.
Scrolling is not intended for real-time user requests, but for processing large amounts of data, for example to reindex the contents of one index into a new index with a different configuration.
The results returned by a scroll request reflect the state of the index at the time the initial search request was made, like a snapshot in time. Subsequent changes to documents (index, update, or delete) affect only later search requests.
To use scrolling, the initial search request must specify the scroll parameter in the query string, which tells Elasticsearch how long it should keep the "search context" alive (see the section on keeping the search context alive below), for example scroll=1m.
curl -XGET 'localhost:9200/twitter/tweet/_search?scroll=1m' -d '{ "query": { "match": { "title": "Elasticsearch" } } }'
The result of the above request includes a _scroll_id:
{
  "_scroll_id": "cxvlcnluagvurmv0y2g7nts2okzlnejsy014vhbhvfneela0zli3ync7nzpgztrcbgnnefrwr1rtrhpqngzsn2j3ozg6rmu0qmxjtxhuceduu0r6udrmujdidzs5okzlnejsy014vhbhvfneela0zli3ync7mta6rmu0qmxjtxhuceduu0r6udrmujdidzswow==",
  "took": 1,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 0, "max_score": null, "hits": [] }
}
To retrieve the next batch of results, pass this ID to the scroll API:
curl -XGET 'localhost:9200/_search/scroll?scroll=1m' -d 'cxvlcnluagvurmv0y2g7nts2okzlnejsy014vhbhvfneela0zli3ync7nzpgztrcbgnnefrwr1rtrhpqngzsn2j3ozg6rmu0qmxjtxhuceduu0r6udrmujdidzs5okzlnejsy014vhbhvfneela0zli3ync7mta6rmu0qmxjtxhuceduu0r6udrmujdidzswow=='
The URL should not include the index or type name; these are specified in the original search request instead.
The scroll parameter tells Elasticsearch to keep the search context open for another 1 minute (1m).
The scroll_id can be passed in the request body or in the query string as the scroll_id parameter.
Each call to the scroll API returns the next batch of results until there are no more results left to return, that is, until the hits array is empty.
The initial search request and each subsequent scroll request return a new _scroll_id; always use the most recent one.
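The loop described above can be sketched in Python. This is only an illustration under stated assumptions: scroll_all and FakeCluster are hypothetical names, and FakeCluster is an in-memory stand-in for Elasticsearch rather than a real client. It applies to a standard (non-scan) scroll, where the initial response already contains hits.

```python
from typing import Callable, Iterator


def scroll_all(start_search: Callable[[], dict],
               next_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every hit, always passing on the most recent _scroll_id."""
    page = start_search()                     # initial search sent with ?scroll=1m
    while page["hits"]["hits"]:               # an empty hits array means we are done
        yield from page["hits"]["hits"]
        page = next_page(page["_scroll_id"])  # latest ID, not the first one


class FakeCluster:
    """Hypothetical in-memory stand-in for Elasticsearch, for illustration only."""

    def __init__(self, docs, batch_size):
        self.docs, self.batch_size = docs, batch_size
        self.offset = 0
        self.seen_ids = []                    # scroll IDs the caller sent back

    def _page(self):
        batch = self.docs[self.offset:self.offset + self.batch_size]
        self.offset += len(batch)
        # A new ID is issued with every response, as the text describes.
        return {"_scroll_id": f"ctx-{self.offset}", "hits": {"hits": batch}}

    def search(self):
        return self._page()

    def scroll(self, scroll_id):
        self.seen_ids.append(scroll_id)
        return self._page()


cluster = FakeCluster([{"_id": str(i)} for i in range(5)], batch_size=2)
ids = [hit["_id"] for hit in scroll_all(cluster.search, cluster.scroll)]
print(ids)
```

In real use, start_search and next_page would wrap the two curl requests shown earlier; they are injected as callables here so the loop can be run without a cluster.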
Note: If the request specifies aggregations, only the initial search response will contain the aggregation results.
Efficient scrolling with scroll-scan
Deep pagination with from and size, for example size=10&from=10000, is very inefficient: just to return ten results, each shard must retrieve its top 10,010 sorted results (from + size), and the coordinating node must then re-sort all of them. This work has to be repeated for every page requested.
The scroll API keeps track of which results have already been returned, so it can return sorted results more efficiently than deep pagination. However, sorting results (which happens by default) still has a cost.
Usually, you just want to retrieve all results and the order does not matter. Scrolling can be combined with the scan search type to disable scoring and sorting and to return results in the most efficient way possible.
All that is needed is to add search_type=scan to the query string of the initial search request:
curl 'localhost:9200/twitter/tweet/_search?scroll=1m&search_type=scan' -d '{ "query": { "match": { "title": "Elasticsearch" } } }'
Setting search_type to scan disables sorting and makes scrolling more efficient.
A scan scroll request differs from a standard scroll request in four ways:
1. No score is calculated and sorting is disabled. Results are returned in the order they appear in the index.
2. Aggregations are not supported.
3. The response to the initial search request does not contain any results in the hits array. The first results are returned by the first scroll request.
4. The size parameter controls the number of results per shard, not per request, so a size of 10 that hits 5 shards returns a maximum of 50 results per scroll request.
If you want scoring to happen even though results are not sorted by score, set the track_scores parameter to true.
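Two of the differences above change how a client loop has to be written: the initial scan response carries no hits, and size applies per shard. The Python sketch below illustrates both with a hypothetical in-memory model (scan_batches and make_fake_scan are made-up names, not part of any Elasticsearch client), assuming 5 shards holding 12 documents each and size=10.

```python
from typing import Callable, Iterator, List


def scan_batches(start_scan: Callable[[], dict],
                 next_page: Callable[[str], dict]) -> Iterator[List[dict]]:
    """Yield one batch per scroll request for a scan-type scroll."""
    page = start_scan()                 # scan: hits is empty here by design
    scroll_id = page["_scroll_id"]
    while True:
        page = next_page(scroll_id)     # the first real results arrive here
        batch = page["hits"]["hits"]
        if not batch:
            return
        yield batch
        scroll_id = page["_scroll_id"]  # carry the latest ID forward


def make_fake_scan(shards, size):
    """Hypothetical model: each scroll request takes up to `size` docs per shard."""
    positions = [0] * len(shards)
    calls = {"n": 0}

    def start_scan():
        return {"_scroll_id": "scan-0", "hits": {"hits": []}}

    def next_page(scroll_id):
        calls["n"] += 1
        batch = []
        for i, shard in enumerate(shards):
            taken = shard[positions[i]:positions[i] + size]
            positions[i] += len(taken)
            batch.extend(taken)
        return {"_scroll_id": f"scan-{calls['n']}", "hits": {"hits": batch}}

    return start_scan, next_page


shards = [[{"shard": s, "n": n} for n in range(12)] for s in range(5)]
start_scan, next_page = make_fake_scan(shards, size=10)
batch_sizes = [len(batch) for batch in scan_batches(start_scan, next_page)]
print(batch_sizes)  # [50, 10]
```

The first scroll request returns 10 documents from each of the 5 shards (50 in total), and the second returns the remaining 2 per shard.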
Keeping the search context alive
The scroll parameter (passed to the search request and to every scroll request) tells Elasticsearch how long it should keep the search context alive. Its value (for example, 1m) does not need to be long enough to process all the data; it only needs to be long enough to process the previous batch of results. Each scroll request that carries the scroll parameter sets a new expiry time.
Normally, the background merge process optimizes the index by merging smaller segments into new, larger segments, after which the smaller segments are deleted. This process continues during scrolling, but an open search context prevents the old segments from being deleted while they are still in use. This is how Elasticsearch is able to return the results of the initial search request regardless of subsequent changes to documents.
Keeping old segments alive means that more file handles are needed, so make sure your nodes are configured with enough free file handles.
You can use the node stats API to see how many search contexts are open:
curl -XGET localhost:9200/_nodes/stats/indices/search?pretty
Clear scroll API
Search contexts are removed automatically when all results have been retrieved or when the scroll timeout is exceeded. You can also clear a search context manually with the clear-scroll API:
curl -XDELETE localhost:9200/_search/scroll -d 'c2nhbjs2ozm0ndg1odpzrlblc0fxnlnynm5jwuc1'
The scroll_id can be passed in the request body or in the query string.
Multiple scroll IDs can be passed as comma-separated values:
curl -XDELETE localhost:9200/_search/scroll -d 'c2nhbjs2ozm0ndg1odpzrlblc0fxnlnynm5jwuc1,agvurmv0y2g7ntsxonkxadz'
All search contexts can be cleared at once with the _all parameter:
curl -XDELETE localhost:9200/_search/scroll/_all
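The clear-scroll calls above could be built in Python roughly as follows. This is a minimal sketch, not an official client: clear_scroll_request is a hypothetical helper, "id-one" and "id-two" are placeholder scroll IDs, and the requests are only constructed here, never sent.

```python
from typing import Iterable, Optional
from urllib.request import Request


def clear_scroll_request(base_url: str,
                         scroll_ids: Optional[Iterable[str]] = None) -> Request:
    """Build a DELETE request for specific scroll IDs, or for all of them."""
    if scroll_ids is None:
        # No IDs given: target every open search context via _all.
        return Request(f"{base_url}/_search/scroll/_all", method="DELETE")
    # Multiple IDs travel in the request body as comma-separated values.
    body = ",".join(scroll_ids).encode("ascii")
    return Request(f"{base_url}/_search/scroll", data=body, method="DELETE")


some = clear_scroll_request("http://localhost:9200", ["id-one", "id-two"])
everything = clear_scroll_request("http://localhost:9200")
print(some.get_method(), some.full_url, some.data)
print(everything.get_method(), everything.full_url)
```

Passing either object to urllib.request.urlopen would actually send the request; that step is left out so the sketch runs without a cluster.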
Original: http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
The Scroll of Elasticsearch