To be consistent with the `search.max_buckets` default setting,
set the hard limit of the PriorityQueue used for in memory sorting,
when sorting on an aggregate function, to 10000.
Fixes: #43168
(cherry picked from commit 079e012fdea68ea0a7daae078359495047e9c407)
The machine learning feature of xpack has native binaries with a
different commit id than the rest of code. It is currently exposed in
the xpack info api. This commit adds that commit information to the ML
info api, so that it may be removed from the info api.
Now emphasises the test is for indexed values.
Previous documentation only mentioned the state of the input JSON doc (null values) but this is only one of several reasons why an indexed value may not exist.
Closes#24256
The description field of xpack featuresets is optionally part of the
xpack info api, when using the verbose flag. However, this information
is unnecessary, as it is better left for documentation (and the existing
descriptions describe anything meaningful). This commit removes the
description field from feature sets.
Rest docs page update
- have the section be on separate pages
- add an Overview page
- add other formats examples
(cherry picked from commit 309bd691ff3f8625f67ca09fc1dd8e265f7e6c92)
* [ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds
* only supporting doc_values for geo_point fields
* moving validation into GeoPointField ctor
Previously, a reindex request had two different size specifications in the body:
* Outer level, determining the maximum documents to process
* Inside the source element, determining the scroll/batch size.
The outer level size has now been renamed to max_docs to
avoid confusion and clarify its semantics, with backwards compatibility and
deprecation warnings for using size.
Similarly, the size parameter has been renamed to max_docs for
update/delete-by-query to keep the 3 interfaces consistent.
Finally, all 3 endpoints now support max_docs in both body and URL.
Relates #24344
This change adds the earliest and latest timestamps into
the field stats for fields of type "date" in the output of
the ML find_file_structure endpoint. This will enable the
cards for date fields in the file data visualizer in the UI
to be made to look more similar to the cards for date
fields in the index data visualizer in the UI.
Adds a metadata field to snapshots which can be used to store arbitrary
key-value information. This may be useful for attaching a description of
why a snapshot was taken, tagging snapshots to make categorization
easier, or identifying the source of automatically-created snapshots.
The `replace` option in the phonetic token filter can have suprising side
effects, e.g. such as described in #26921. This PR adds a note to be mindful
about such scenarios and offers alternatives to using the `replace` option.
Closes#26921
This commit adds functionality so that aliases that are manipulated on
leader indices are replicated by the shard follow tasks to the follower
indices. Note that we ignore write indices. This is due to the fact that
follower indices do not receive direct writes so the concept is not
useful.
Relates #41815
Adding notes to the existing docs about how using `preference` might increase
request cache utilization but also add warning about the downsides.
Closes#24278
This commit addresses a few more frequently-asked questions:
* clarifies that bootstrapping doesn't happen even after a full cluster
restart.
* removes the example that uses IP addresses, to try and further encourage the
use of node names for bootstrapping.
* clarifies that auto-bootstrapping might form different clusters on different
hosts, and gives a process for starting again if this wasn't what you wanted.
* adds the "do not stop half-or-more of the master-eligible nodes" slogan that
was notably absent.
* reformats one of the console examples to a narrower width
For `multi_match` query: link `boost` param to the generic reference
for query usage and `slop` to the `match_phrase` query where its usage
is documented.
Fixes: #40091
(cherry picked from commit 69993049a8bd9e7f042935729fe69a8266d95a0a)
Add an explanatory NOTE section to draw attention to the difference
between small and capital letters used for the index date patterns.
e.g.: HH vs hh, MM vs mm.
Closes: #22322
(cherry picked from commit c8125417dc33215651f9bb76c9b1ffaf25f41caf)
Fix a couple of wrong links because of the order of the anchor
and the usage of backquotes.
(cherry picked from commit 4e0c6525153b60a57202937c2ae57968c8e35285)
When analysing a semi-structured text file the
find_file_structure endpoint merges lines to form
multi-line messages using the assumption that the
first line in each message contains the timestamp.
However, if the timestamp is misdetected then this
can lead to excessive numbers of lines being merged
to form massive messages.
This commit adds a line_merge_size_limit setting
(default 10000 characters) that halts the analysis
if a message bigger than this is created. This
prevents significant CPU time being spent subsequently
trying to determine the internal structure of the
huge bogus messages.
This PR updates the docs for `docvalue_fields` and `stored_fields` to clarify
that nested fields must be accessed through `inner_hits`. It also tweaks the
nested fields documentation to make this point more visible.
Addresses #23766.