Mention the cost of tracking live docs in scrolls (#41375)

Relates #41337, in which a heap dump shows hundreds of MBs allocated on the heap for tracking the live docs for each scroll.
2019-04-23 15:26:14 +01:00 · 411994b489
parent e2f8ffdde8
commit 411994b489
1 changed files with 20 additions and 10 deletions
--- a/docs/reference/search/request/scroll.asciidoc
+++ b/docs/reference/search/request/scroll.asciidoc
@@ -103,6 +103,12 @@ GET /_search?scroll=1m
 [[scroll-search-context]]
 ==== Keeping the search context alive

+A scroll returns all the documents which matched the search at the time of the
+initial search request. It ignores any subsequent changes to these documents.
+The `scroll_id` identifies a _search context_ which keeps track of everything
+that {es} needs to return the correct documents. The search context is created
+by the initial request and kept alive by subsequent requests.
+
 The `scroll` parameter (passed to the `search` request and to every `scroll`
 request) tells Elasticsearch how long it should keep the search context alive.
 Its value (e.g. `1m`, see <<time-units>>) does not need to be long enough to
@@ -112,17 +118,21 @@ new  expiry time. If a `scroll` request doesn't pass in the `scroll`
 parameter, then the search context will be freed as part of _that_ `scroll`
 request.

-Normally, the background merge process optimizes the
-index by merging together smaller segments to create new bigger segments, at
-which time the smaller segments are deleted. This process continues during
-scrolling, but an open search context prevents the old segments from being
-deleted while they are still in use.  This is how Elasticsearch is able to
-return the results of the initial search request, regardless of subsequent
-changes to documents.
+Normally, the background merge process optimizes the index by merging together
+smaller segments to create new, bigger segments. Once the smaller segments are
+no longer needed they are deleted. This process continues during scrolling, but
+an open search context prevents the old segments from being deleted since they
+are still in use.

-TIP: Keeping older segments alive means that more file handles are needed.
-Ensure that you have configured your nodes to have ample free file handles.
-See <<file-descriptors>>.
+TIP: Keeping older segments alive means that more disk space and file handles
+are needed. Ensure that you have configured your nodes to have ample free file
+handles. See <<file-descriptors>>.
+
+Additionally, if a segment contains deleted or updated documents then the
+search context must keep track of whether each document in the segment was live
+at the time of the initial search request. Ensure that your nodes have
+sufficient heap space if you have many open scrolls on an index that is subject
+to ongoing deletes or updates.

 NOTE: To prevent against issues caused by having too many scrolls open, the
 user is not allowed to open scrolls past a certain limit. By default, the