Today we document the settings used to control rebalancing and
disk-based shard allocation but there isn't really any discussion around
what these processes do so it's hard to know what, if any, adjustments
to make.
This commit adds some words to help folk understand this area better.
In case the local agg sorter queue gets full and no limit has been provided,
the local sorter will now erroneously call the failure callback for every
single row in the original rowset that's left over the local queue limit
(instead for just the first one). The failure response is dispatched in any
case, so this is relatively harmless. The sorter continues iterating on the
original response fetching subsequent pages. In case of correct Elasticsearch
behaviour, this is also harmless, it'll just trigger a number of internal
exceptions. However, in case of a pagination defect in Elasticsearch (like
GH#65685, where the same search_after is returned), this will result in an
effective spin loop, potentially rendering eventually the node unresponsive.
This PR simply breaks both the inner loop iterating over the current unsorted
rowset, as well as the outer one, iterating over the left pages.
It also fixes an outdated documentation limitation.
(cherry picked from commit 638402c387faf79bba38fcc95f371a73146efc0b)
* Add release notes for 7.10.1
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
The `lucene_version_path` variable is used in urls pointing to the Lucene docs.
They use underscore separators for the major and minor versions there.
Correcting this since this has been lost in the latest update on the 7.x branch.
Whether the cold tier can handle years depends a lot on the use case and
for instance our BWC guarantees. This would need to be part of a
specific sizing exercise, so in the spirit of not over-promising, the
description of the cold tier has been changed to not mention years.
Today we describe snapshots as "incremental" but their incrementality is
rather different beast from e.g. incremental filesystem backups. With
traditional backups you take a large and relatively infrequent "full"
backup and then a sequence of smaller "incremental" ones, and this whole
sequence of backups is required for a restore so it must be kept around
until at least the next full backup. In contrast, Elasticsearch
snapshots are logically independent and each can be deleted without
affecting the integrity of the others.
This distinction frequently causes confusion amongst newer users, so
this commit clarifies what we mean by "incremental" in the docs.
Clarify that searchable snapshots only result in cost savings for less
frequently accessed data and that the savings do not apply to the entire
cluster.
This PR adds detail to the explanation of the soft_limit
memory_status in ML job stats. A consequence that was not
mentioned before is that examples are not added to category
definitions.
Relates elastic/ml-cpp#1590
The _Important Elasticsearch configuration_ docs lists a number of items
that you should consider before moving to production. Today this list
does not include configuring snapshots, even though they're very
important to have in production. This commit addresses that omission,
removes some repetition from the introductory paragraphs, and notes that
this config is handled for you on Cloud.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
* Clarify that field data cache includes global ordinals
* Describe that the cache should be cleared once the limit is reached
* Clarify that the `_id` field does not supported aggregations anymore
* Fold the `fielddata` mapping parameter page into the `text field docs
* Improve cross-linking