Co-authored-by: Daniel Huang <danielhuang@tencent.com>
This is a spinoff of #48130 that generalizes the proposal to allow early termination with the composite aggregation when leading sources match a prefix or the entire index sort specification.
In such case the composite aggregation can use the index sort natural order to early terminate the collection when it reaches a composite key that is greater than the bottom of the queue.
The optimization is also applicable when a query other than match_all is provided. However the optimization is deactivated for sources that match the index sort in the following cases:
* Multi-valued source, in such case early termination is not possible.
* missing_bucket is set to true
* Update remote cluster stats to support simple mode (#49961)
Remote cluster stats API currently only returns useful information if
the strategy in use is the SNIFF mode. This PR modifies the API to
provide relevant information if the user is in the SIMPLE mode. This
information is the configured addresses, max socket connections, and
open socket connections.
* Send hostname in SNI header in simple remote mode (#50247)
Currently an intermediate proxy must route conncctions to the
appropriate remote cluster when using simple mode. This commit offers
a additional mechanism for the proxy to route the connections by
including the hostname in the TLS SNI header.
* Rename the remote connection mode simple to proxy (#50291)
This commit renames the simple connection mode to the proxy connection
mode for remote cluster connections. In order to do this, the mode specific
settings which we namespaced by their mode (ex: sniff.seed and
proxy.addresses) have been reverted.
* Modify proxy mode to support a single address (#50391)
Currently, the remote proxy connection mode uses a list setting for the
proxy address. This commit modifies this so that the setting is
proxy_address and only supports a single remote proxy address.
The freeze index API docs state that frozen indices are blocked for
write operations.
While this implies frozen indices are read-only, it does not explicitly
use the term "read-only", which is found in other docs, such as the
force merge docs.
This adds the "ready-only" term to the freeze index API docs as well
as other clarification.
The `filter` rule is not allowed on the top-level of the query, so removing it
from the list of allowed rules. Where it can be nested inside other rules, those
rules already mention it.
Docker bypasses the Uncomplicated Firewall (UFW) on Linux by editing the `iptables` config directly, which leads to the exposure of port 9200, even if you blocked it via UFW.
This adds a warning along with work-arounds to the docs.
Signed-off-by: Kovah <mail@kovah.de>
Users often mistakenly map numeric IDs to numeric datatypes. However,
this is often slow for the `term` and other term-level queries.
The "Tune for search speed" docs includes advice for mapping numeric
IDs to `keyword` fields. However, this tip is not included in the
`numeric` or `keyword` field datatype doc pages.
This rewords the tip in the "Tune for search speed" docs, relocates it
to the `numeric` field docs, and reuses it using tagged regions.
For 7.x and earlier branches, `_cat/repositories` API requests require a
repository name.
This removes an erroneous request example without a repository name
added with a8e0275.
Lucene 8.4 added support for "CONTAINS", therefore in this commit those
changes are integrated in Elasticsearch. This commit contains as well a
bug fix when querying with a geometry collection with "DISJOINT" relation.
* [DOCS] Document JVM node stats
Documents the `jvm` parameters returned by the `_nodes/stats` API.
Co-Authored-By: James Baiera <james.baiera@gmail.com>
The example snippets in the percentile rank agg docs use a test dataset
named `latency`, which is generated from docs/gradle.build.
At some point the dataset and example snippets were updated, but the
text surrounding the snippets was not. This means the text and the
example snippets shown no longer match up.
This corrects that by changing the snippets using /TESTRESPONSE magic comments.
* CSV ingest processor (#49509)
This change adds new ingest processor that breaks line from CSV file into separate fields.
By default it conforms to RFC 4180 but can be tweaked.
Closes#49113
As we discussed in #36371, interval notation is confusing to some users. This makes the intention clearer by just explaining inclusivity and exclusivity in the docs.
Removes a reference to shadow replicas from the cat shards API docs
and a comment in cluster/routing/UnassignedInfo.java.
Shadow replicas were removed with #23906.
This adds a new `randomize_seed` for regression and classification.
When not explicitly set, the seed is randomly generated. One can
reuse the seed in a similar job in order to ensure the same docs
are picked for training.
Backport of #49990
The current snippets in the synced flush docs can cause conflicts with
other background syncs, such as the global checkpoint sync or retention
lease sync, in the docs tests.
This skips tests for those snippets to avoid conflicts.
In the shape query docs, the index mapping snippet uses the "geometry"
shape field mapping. However, the doc index snippet uses the "location"
property.
This changes the "location" property to "geometry". It also adds a
comment containing the search result snippet. This should prevent
similar issues in the future.
This commit changes the recommended repository file for rpm based
systems to be disabled by default. This is a safer practice so upgrades
of the system do no accidentally upgrade elasticsearch itself.
closes#30660
* Allow list of IPs in geoip ingest processor
This change lets you use array of IPs in addition to string in geoip processor source field.
It will set array containing geoip data for each element in source, unless first_only parameter
option is enabled, then only first found will be returned.
Closes#46193