Clarify doc count stats (#56665)

Today we report some statistics in terms of Lucene-level documents, which
differ from Elasticsearch-level documents in a number of ways and include
things like document tombstones which users cannot directly observe. This
commit clarifies the internal nature of these statistics.

Closes #56497
David Turner 2020-05-13 15:07:15 +01:00
parent 9f1ecd52eb
commit 26382dff19

@@ -181,15 +181,20 @@ given in the query string.
 end::df[]
 
 tag::docs-count[]
-Number of non-deleted documents in the segment, such as `25`. This
-number is based on Lucene documents and may include documents from
-<<nested,nested>> fields.
+The number of documents as reported by Lucene. This excludes deleted documents
+and counts any <<nested,nested documents>> separately from their parents. It
+also excludes documents which were indexed recently and do not yet belong to a
+segment.
 end::docs-count[]
 
 tag::docs-deleted[]
-Number of deleted documents in the segment, such as `0`. This number
-is based on Lucene documents. {es} reclaims the disk space of deleted Lucene
-documents when a segment is merged.
+The number of deleted documents as reported by Lucene, which may be higher or
+lower than the number of delete operations you have performed. This number
+excludes deletes that were performed recently and do not yet belong to a
+segment. Deleted documents are cleaned up by the
+<<index-modules-merge,automatic merge process>> if it makes sense to do so.
+Also, {es} creates extra deleted documents to internally track the recent
+history of operations on a shard.
 end::docs-deleted[]
 
 tag::docs-indexed[]
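
As a rough illustration of why the Lucene-level count can differ from the number of documents a user indexed, the sketch below tallies one Lucene document per Elasticsearch document plus one per nested object, as the new `docs-count` text describes. The function name `lucene_doc_count` and its input shape are hypothetical and not part of Elasticsearch. The model also deliberately ignores the other effects the commit mentions (deleted documents, recently indexed documents that do not yet belong to a segment, and internal history-tracking documents).

```python
def lucene_doc_count(nested_objects_per_doc):
    """Estimate the Lucene-level document count (hypothetical helper).

    nested_objects_per_doc: one integer per Elasticsearch document, giving
    the total number of nested objects that document contains.
    Each Elasticsearch document contributes one Lucene document for itself
    plus one for every nested object, since nested documents are counted
    separately from their parents.
    """
    return sum(1 + nested for nested in nested_objects_per_doc)


# Three Elasticsearch documents holding 2, 0 and 3 nested objects.
print(lucene_doc_count([2, 0, 3]))  # 8
```

Under these assumptions, three user-visible documents would be reported as eight documents, which is the kind of discrepancy the reworded description is meant to explain.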