OpenSearch/docs/reference/search/request/search-after.asciidoc

[[request-body-search-search-after]]
=== Search After

Results can be paginated with the `from` and `size` parameters, but the cost becomes prohibitive when paginating deeply.
The `index.max_result_window` setting, which defaults to 10,000, is a safeguard: search requests take heap memory and time proportional to `from + size`.
The <<search-request-scroll,Scroll>> API is recommended for efficient deep scrolling, but scroll contexts are costly and it is not
recommended for real-time user requests.
The `search_after` parameter circumvents this problem by providing a live cursor.
The idea is to use the results from the previous page to help retrieve the next page.
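
As a rough illustration of the cost difference, the sketch below models the number of top hits a shard has to keep for a single request. It only encodes the proportionality to `from + size` stated above; the function name and its parameters are hypothetical and not part of any API.

[source,python]
--------------------------------------------------
def hits_collected_per_shard(size, from_=0, use_search_after=False):
    """Rough number of top hits each shard keeps for one request.

    Assumption: cost is modeled as proportional to `from + size`.
    With `search_after`, `from` stays at 0, so only `size` hits are kept.
    """
    if use_search_after:
        return size
    return from_ + size


# Last page reachable with the default `index.max_result_window` of 10,000.
deep_page = hits_collected_per_shard(size=10, from_=9_990)
# Same page size, reached with a cursor instead of an offset.
cursor_page = hits_collected_per_shard(size=10, use_search_after=True)
print(deep_page, cursor_page)
--------------------------------------------------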

Suppose that the query to retrieve the first page looks like this:
[source,js]
--------------------------------------------------
GET twitter/_search
{
    "size": 10,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "sort": [
        {"date": "asc"},
        {"tie_breaker_id": "asc"}      <1>
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
// TEST[s/"tie_breaker_id": "asc"/"tie_breaker_id": {"unmapped_type": "keyword"}/]

<1> A copy of the `_id` field with `doc_values` enabled

[IMPORTANT]
A field with one unique value per document should be used as the tiebreaker
of the sort specification. Otherwise the sort order for documents that have
the same sort values is undefined, which can lead to missing or duplicate
results. The <<mapping-id-field,`_id` field>> has a unique value per document,
but using it directly as a tiebreaker is not recommended:
<<doc-values,doc values>> are disabled on this field, so sorting on it requires
loading a lot of data into memory. Instead, duplicate the content of the
<<mapping-id-field,`_id` field>> (client side or with a
<<ingest-processors,set ingest processor>>) into another field that has
<<doc-values,doc values>> enabled, and use this new field as the tiebreaker
for the sort.
Beware that `search_after` looks for the first document which fully or partially
matches the tiebreaker's provided value. Therefore, if a document has a tiebreaker value of
`"654323"` and you `search_after` for `"654"`, it would still match that document
and return results found after it.
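
The effect of a missing tiebreaker can be illustrated with a toy in-memory model. This is a sketch, not search engine code: the documents, the `page` and `walk` helpers, and the rule "keep hits whose sort key is strictly greater than the cursor" are simplifying assumptions made for the example.

[source,python]
--------------------------------------------------
# Toy model: three documents share the same `date` value.
DOCS = [
    {"id": "a", "date": 1},
    {"id": "b", "date": 2},
    {"id": "c", "date": 2},
    {"id": "d", "date": 2},
    {"id": "e", "date": 3},
]


def page(docs, sort_key, size, after=None):
    """Return the next `size` docs in sort order, strictly after `after`."""
    ordered = sorted(docs, key=sort_key)
    if after is not None:
        ordered = [d for d in ordered if sort_key(d) > after]
    return ordered[:size]


def walk(docs, sort_key, size):
    """Collect the ids seen while paging through all docs."""
    seen, after = [], None
    while True:
        hits = page(docs, sort_key, size, after)
        if not hits:
            return seen
        seen.extend(d["id"] for d in hits)
        # The sort values of the last hit become the cursor.
        after = sort_key(hits[-1])


def by_date(doc):
    return (doc["date"],)


def by_date_then_id(doc):
    return (doc["date"], doc["id"])


without_tiebreaker = walk(DOCS, by_date, size=2)
with_tiebreaker = walk(DOCS, by_date_then_id, size=2)
print(without_tiebreaker)
print(with_tiebreaker)
--------------------------------------------------

In this model, sorting on `date` alone loses documents `c` and `d`, because they share the cursor's `date` value and are skipped together with it. Adding the unique `id` as the last sort key returns every document exactly once.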

The result of the above request includes an array of `sort values` for each document.
These `sort values` can be used with the `search_after` parameter to start returning results "after" any
document in the result list.
For instance, we can take the `sort values` of the last document and pass them to `search_after` to retrieve the next page of results:

[source,js]
--------------------------------------------------
GET twitter/_search
{
    "size": 10,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "search_after": [1463538857, "654323"],
    "sort": [
        {"date": "asc"},
        {"tie_breaker_id": "asc"}
    ]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
// TEST[s/"tie_breaker_id": "asc"/"tie_breaker_id": {"unmapped_type": "keyword"}/]

NOTE: The parameter `from` must be set to 0 (or -1) when `search_after` is used.

`search_after` is not a solution for jumping freely to a random page, but rather for scrolling many queries in parallel.
It is very similar to the `scroll` API, but unlike it, the `search_after` parameter is stateless: it is always resolved against the latest
version of the searcher. For this reason the sort order may change during a walk, depending on the updates and deletes of your index.
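
A client-side walk over the pages could be sketched as follows. The `search` argument is a hypothetical callable that sends a request body and returns the parsed JSON response; `fake_search` is an in-memory stand-in used only so that the example runs without a cluster. Only the `hits.hits[].sort` response layout and the `search_after`, `sort`, `size` and `query` request fields come from the API described above.

[source,python]
--------------------------------------------------
def search_after_pages(search, query, sort, size=10):
    """Yield pages of hits, feeding each page's last sort values into the next request."""
    body = {"size": size, "query": query, "sort": sort}
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield hits
        # Reuse the sort values of the last hit as the cursor for the next page.
        body = {**body, "search_after": hits[-1]["sort"]}


def fake_search(body):
    """In-memory stand-in for the cluster, for illustration only."""
    docs = [
        {"_id": str(i), "sort": [1463538857 + i // 2, str(i)]}
        for i in range(5)
    ]
    after = body.get("search_after")
    if after is not None:
        # Keep only hits that sort strictly after the cursor.
        docs = [d for d in docs if d["sort"] > after]
    return {"hits": {"hits": docs[: body["size"]]}}


sort_spec = [{"date": "asc"}, {"tie_breaker_id": "asc"}]
pages = [
    [hit["_id"] for hit in hits]
    for hits in search_after_pages(fake_search, {"match_all": {}}, sort_spec, size=2)
]
print(pages)
--------------------------------------------------

Because each request is resolved independently, no state has to be kept on the server between pages. The loop stops at the first empty page.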