[[search-request-search-after]]
=== Search After

Results can be paginated with the `from` and `size` parameters, but the cost becomes prohibitive for deep pagination.
The `index.max_result_window` setting, which defaults to 10,000, is a safeguard: search requests take heap memory and time proportional to `from + size`.
The <<search-request-scroll,Scroll>> API is recommended for efficient deep scrolling, but scroll contexts are costly and it is not
recommended to use them for real-time user requests.
The `search_after` parameter circumvents this problem by providing a live cursor.
The idea is to use the results from the previous page to help the retrieval of the next page.

Suppose that the query to retrieve the first page looks like this:
[source,js]
--------------------------------------------------
GET twitter/tweet/_search
{
    "size": 10,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "sort": [
        {"date": "asc"},
        {"_uid": "desc"}
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

NOTE: A field with one unique value per document should be used as the tiebreaker of the sort specification.
Otherwise the sort order for documents that have the same sort values would be undefined, and a page boundary that falls
inside such a group may skip or repeat documents. The recommended way is to use
the field `_uid`, which is certain to contain one unique value for each document.
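
The effect of a missing tiebreaker can be sketched outside Elasticsearch. The following is a minimal in-memory model in plain Python, not client code: `page_after` and `walk` are hypothetical helpers, the documents are made up, and both sort keys are ascending for simplicity (the request above sorts `_uid` descending).

```python
# In-memory sketch of "search after" pagination; this is not Elasticsearch code.

def page_after(docs, sort_key, size, after=None):
    """Return the first `size` docs whose sort key is strictly greater than `after`."""
    ordered = sorted(docs, key=sort_key)
    if after is not None:
        ordered = [d for d in ordered if sort_key(d) > after]
    return ordered[:size]


def walk(docs, sort_key, size):
    """Page through all docs, feeding each page's last sort key into the next call."""
    seen, after = [], None
    while True:
        page = page_after(docs, sort_key, size, after)
        if not page:
            return seen
        seen.extend(d["id"] for d in page)
        after = sort_key(page[-1])


# Made-up documents: three of them share the same date.
docs = [
    {"id": "a", "date": 1},
    {"id": "b", "date": 2},
    {"id": "c", "date": 2},
    {"id": "d", "date": 2},
    {"id": "e", "date": 3},
]

# Sorting on date alone: the cursor (2,) excludes every remaining doc with date 2.
without_tiebreaker = walk(docs, lambda d: (d["date"],), size=2)

# Adding a unique field as the last sort key makes every cursor position unambiguous.
with_tiebreaker = walk(docs, lambda d: (d["date"], d["id"]), size=2)

print(without_tiebreaker)
print(with_tiebreaker)
```

In this model the walk without a tiebreaker never returns `c` and `d`, because they share the sort value of the last document on the first page.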

The result from the above request includes an array of `sort values` for each document.
These `sort values` can be used in conjunction with the `search_after` parameter to start returning results "after" any
document in the result list.
For instance, we can use the `sort values` of the last document and pass them to `search_after` to retrieve the next page of results:

[source,js]
--------------------------------------------------
GET twitter/tweet/_search
{
    "size": 10,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "search_after": [1463538857, "tweet#654323"],
    "sort": [
        {"date": "asc"},
        {"_uid": "desc"}
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]

NOTE: The parameter `from` must be set to 0 (or -1) when `search_after` is used.
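
On the client side, deriving the next request from the previous response could look roughly like the sketch below. It works on plain dictionaries and assumes only that each hit carries its `sort` array under `hits.hits`; `next_page_body` is a hypothetical helper, and `sample_response` is a hand-written stand-in rather than real output.

```python
import copy


def next_page_body(body, response):
    """Return a copy of `body` that continues after the last hit of `response`.

    Returns None when the response has no hits, i.e. the walk is finished.
    """
    hits = response["hits"]["hits"]
    if not hits:
        return None
    nxt = copy.deepcopy(body)
    # Drop any `from` so the request falls back to the default offset of 0.
    nxt.pop("from", None)
    # The sort values of the last hit become the cursor for the next page.
    nxt["search_after"] = hits[-1]["sort"]
    return nxt


first_body = {
    "size": 10,
    "query": {"match": {"title": "elasticsearch"}},
    "sort": [{"date": "asc"}, {"_uid": "desc"}],
}

# Hand-written stand-in for a search response, trimmed to the fields used here.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "654324", "sort": [1463538800, "tweet#654324"]},
            {"_id": "654323", "sort": [1463538857, "tweet#654323"]},
        ]
    }
}

second_body = next_page_body(first_body, sample_response)
print(second_body["search_after"])
```

The query and `sort` clauses are copied unchanged on purpose: the cursor is only meaningful when the next request uses the same sort specification as the one that produced it.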

`search_after` is not a solution to jump freely to a random page but rather to scroll many queries in parallel.
It is very similar to the `scroll` API, but unlike it, the `search_after` parameter is stateless: it is always resolved against the latest
version of the searcher. For this reason the sort order may change during a walk, depending on the updates and deletes of your index.
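
What "stateless" means for a walk can be illustrated with a small in-memory model. This is a sketch in plain Python under simplifying assumptions (a list stands in for the index, both sort keys ascending), with `page_after` as a hypothetical helper; it is not Elasticsearch code.

```python
def page_after(index, size, after=None):
    """Evaluate one page against the current contents of `index`; no snapshot is kept."""
    key = lambda d: (d["date"], d["id"])
    ordered = sorted(index, key=key)
    if after is not None:
        ordered = [d for d in ordered if key(d) > after]
    return ordered[:size]


# Made-up documents standing in for an index.
index = [
    {"id": "a", "date": 1},
    {"id": "b", "date": 2},
    {"id": "c", "date": 4},
]

first = page_after(index, size=2)
cursor = (first[-1]["date"], first[-1]["id"])  # sort values of the last hit

# The index changes between the two requests.
index.append({"id": "x", "date": 0})  # sorts before the cursor: this walk never returns it
index.append({"id": "y", "date": 3})  # sorts after the cursor: appears on the next page
index.remove({"id": "c", "date": 4})  # deleted before being reached: never returned

second = page_after(index, size=2, after=cursor)

first_ids = [d["id"] for d in first]
second_ids = [d["id"] for d in second]
print(first_ids, second_ids)
```

A scroll, by contrast, would keep returning results from the point-in-time view taken when it was opened, which is the state that `search_after` avoids holding.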