Prepares classification analysis to support more than just
two classes. It introduces a new parameter to the process config
which dictates the `num_classes` to the process. It also
changes the max classes limit to `30` provisionally.
Backport of #53539
Add missing asScript() implementation for LIKE/RLIKE expressions.
When LIKE/RLIKE are used for example in GROUP BY or are wrapped with
scalar functions in a WHERE clause, the translation must produce a
painless script which will be executed to implement the correct
behaviour. Previously this was completely missing, and as a
consequence wrong results were silently (no error) returned.
Fixes: #53486
(cherry picked from commit eaa8ead6742a8e7dcf343bcbaff8de031550fd77)
When 'rest_track_total_hits_as_int' is set to true, the total hits count in the response should be accurate. So we should set trackTotalHits to true if needed when parsing the inline script of a search template request.
Closes #52801
Today the NodeConnectionsService emits a DEBUG-level log message each time it
calls TransportService#connectToNode, which happens for every node in the
cluster every ten seconds, and also at every cluster state update. That's a lot
of log messages. Most of these calls are no-ops and can be ignored, but if the
call was not a no-op then it may be worth investigating further. Since the logs
do not distinguish the interesting and uninteresting cases, they are not
useful.
This commit distinguishes the two cases and pushes the noisy logging for the
common no-op case down to TRACE level, leaving only useful and actionable
information in the DEBUG-level logs.
The ML portion of the x-pack info API was erroneously counting both configuration documents and definition documents. The underlying storage implementation keeps the two separate.
This PR filters the query so that only trained model config documents are counted.
Previously, Term Vectors API was returning empty results for
artificial documents with keyword fields. Checking only for `string()`
on `IndexableField` is not enough, since for `KeywordFieldType`
`binaryValue()` must be used instead.
Fixes #53494
(cherry picked from commit 1fc3fe3d32f41eab2101c0536751b7c47e63cc48)
Adds a new parameter for classification that enables choosing whether to assign labels to
maximise accuracy or to maximise the minimum class recall.
Fixes #52427.
Remove excessive testing and keep only the checks for when the queries
are disallowed. Also fix the check for the initial value of the setting
to be compatible with Go client tests.
(cherry picked from commit 314145294ea926e069c6f8629dfc622a7f31a0fb)
* Use snake case for nodes stats/info metric names (#53446)
The REST API uses "thread_pool" as the name of the thread pool metric.
If we use this name internally when we serialize nodes stats and info
requests, we won't need to do any fancy logic to check for and switch
out "threadPool", which was the previous internal name.
This change makes it possible to send secondary authentication
credentials to select endpoints that need to perform a single action
in the context of two users.
Typically this need arises when a server process needs to call an
endpoint that users should not (or might not) have direct access to,
but some part of that action must be performed using the logged-in
user's identity.
Backport of: #52093
This change adds a new parameter to the authenticate methods in the
AuthenticationService to optionally exclude support for the anonymous
user (if an anonymous user exists).
Backport of: #52094
This commit fixes a bug on sorted queries with a primary sort field
that uses different types in the requested indices. In this scenario
the returned min/max values to sort the shards are not comparable so
we should avoid the sorting rather than throwing an obscure exception.
This commit upgrades the jackson-databind dependency to
2.8.11.6. Additionally, we revert a previous change that put
ingest-geoip on the version of jackson-databind from the version
properties file. This is because upgrading ingest-geoip to a later
version of jackson-databind also requires an upgrade to the geoip2
dependency which is currently blocked. Therefore, if we can get to a
point where we otherwise upgrade our Jackson dependencies, we do not
want ingest-geoip to automatically come along with it.
* Add ComponentTemplate to MetaData (#53290)
* Add ComponentTemplate to MetaData
This adds a `ComponentTemplate` data structure that will be used as part of #53101 (Index Templates
v2) to the `MetaData` class. Currently there are no APIs for interacting with this class, so it will
always be an empty map (other than in tests). This infrastructure will be built upon to add APIs in
a subsequent commit.
A `ComponentTemplate` is made up of a `Template`, a version, and a MetaData.Custom class. The
`Template` contains similar information to an `IndexTemplateMetaData` object: settings, mappings,
and alias configuration.
* Update minimal supported version constant
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
This change removes Lucene's experimental flag from the documentation of the following
tokenizers/filters:
* Simple Pattern Split Tokenizer
* Simple Pattern Tokenizer
* Flatten Graph Token Filter
* Word Delimiter Graph Token Filter
The flag is still present in the Lucene codebase, but we have fully supported these tokenizers/filters
in ES for a long time now, so the docs flag is misleading.
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
The spirit of StreamInput/StreamOutput is that common I/O patterns should
be handled by these classes so that the persistence methods in
application classes can be kept short, which facilitates easy visual
comparison between read and write methods, and reduces risks of having
serialization issues due to mismatched implementations.
To this end, this change adds readOptionalVLong and writeOptionalVLong
methods to these classes as we have started to build up cases where
that conditional/null logic has been implemented directly in the read &
write methods.
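As a rough illustration of the pattern described above, the sketch below writes a presence flag followed by a variable-length long. It is not the actual StreamInput/StreamOutput code: the class `OptionalVLongSketch`, the use of `DataInputStream`/`DataOutputStream`, the `roundTrip` helper, and the exact wire layout (one boolean, then a 7-bits-per-byte vlong) are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for the stream classes; names and layout are assumptions.
final class OptionalVLongSketch {

    // Writes a non-negative long using 7 bits per byte, low-order bits first.
    static void writeVLong(DataOutputStream out, long value) throws IOException {
        if (value < 0) {
            throw new IllegalArgumentException("vlong must be non-negative: " + value);
        }
        while ((value & ~0x7FL) != 0) {
            out.writeByte((int) ((value & 0x7F) | 0x80)); // continuation bit set
            value >>>= 7;
        }
        out.writeByte((int) value);
    }

    static long readVLong(DataInputStream in) throws IOException {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = in.readByte();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // The null check lives here once, instead of in every caller's write method.
    static void writeOptionalVLong(DataOutputStream out, Long value) throws IOException {
        if (value == null) {
            out.writeBoolean(false);
        } else {
            out.writeBoolean(true);
            writeVLong(out, value);
        }
    }

    // Mirror image of writeOptionalVLong, which keeps read and write easy to compare.
    static Long readOptionalVLong(DataInputStream in) throws IOException {
        return in.readBoolean() ? readVLong(in) : null;
    }

    // Helper for the example only: serialize, then deserialize, a single value.
    static Long roundTrip(Long value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeOptionalVLong(new DataOutputStream(bytes), value);
            return readOptionalVLong(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With helpers like these, an application class's read and write methods each shrink to one line per optional field.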
Co-authored-by: Tim Vernum <tim.vernum@elastic.co>
The setting, `xpack.logstash.enabled`, exists to enable or disable the
logstash extensions found within x-pack. In practice, this setting had
no effect on the functionality of the extension. Given this, the
setting is now deprecated in preparation for removal.
Backport of #53367
The tests in this class had been failing for a while, but this went unnoticed because they were not run by CI (see #53442).
The reason the tests fail is that the can-match phase is smarter now, and filters out access to a non-existing field.
Closes #53442
Restructures the 'Update an enrich policy' section to:
* Migrate the content to the section. It was previously stored in the
Put Enrich Policy API docs.
* Remove the warning tag admonition from the section content.
* Replace a reused section earlier in the "Set up an enrich processor"
page with a link.
No substantive changes were made to the content.
If an index was created in version 6 and contains a date field with a joda-style pattern, it should still be possible to search it and insert documents into it.
For indices created in version 6 whose date pattern starts with 8, the pattern should be treated as java-style.
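The rule above could be sketched roughly as follows. This is only an illustration: the class `DatePatternStyleSketch`, the `Style` enum, and the `styleFor` signature are hypothetical, and treating indices created in version 7 or later as java-style is an assumption not stated in this message.

```java
// Hypothetical helper; names and signature are assumptions for illustration.
final class DatePatternStyleSketch {

    enum Style { JODA, JAVA }

    // Decides how a date pattern should be interpreted, given the major
    // version the index was created in.
    static Style styleFor(int indexCreatedMajorVersion, String pattern) {
        // A leading '8' marks a java-style pattern even on a version 6 index.
        if (pattern.startsWith("8")) {
            return Style.JAVA;
        }
        // Unprefixed patterns on indices created before 7 are read as joda-style.
        // (Assumption: indices created in 7 or later are always java-style.)
        return indexCreatedMajorVersion < 7 ? Style.JODA : Style.JAVA;
    }
}
```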
The jdk and distribution download plugins create fake ivy repositories,
and use group based repository filtering to ensure no other artifacts
try to resolve against the fake repos. Currently this works by adding a
blanket exclude to all repositories for the given group name. This
commit changes to using the new exclusiveContent feature in
Gradle to do the exclusion.
Today we do not have any infrastructure for adding a deprecation check
for settings that are removed. This commit enables this by adding such
infrastructure. Note that this infrastructure is unused in this commit,
which is deliberate. However, the primary target for this commit is 7.x
where this infrastructure will be used, in a follow-up.
Adds a new `default_field_map` field to trained model config objects.
This allows the model creator to supply a field map if it knows that some mapping is needed for inference to work directly against the training data.
The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.
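A minimal sketch of how such a map might be applied at inference time is shown below. The class `DefaultFieldMapSketch` and the method `applyFieldMap` are hypothetical, and the direction of the map (document field name to the field name the model was trained on, for example `foo` to `foo.keyword`) is an assumption made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not the trained model config implementation.
final class DefaultFieldMapSketch {

    // Copies the source document and adds an alias for each mapped field,
    // so the model can find the field name it was trained on.
    static Map<String, Object> applyFieldMap(Map<String, Object> source, Map<String, String> fieldMap) {
        Map<String, Object> modelInput = new LinkedHashMap<>(source);
        for (Map.Entry<String, String> entry : fieldMap.entrySet()) {
            Object value = source.get(entry.getKey());
            if (value != null) {
                // Assumed direction: document field -> training field.
                modelInput.put(entry.getValue(), value);
            }
        }
        return modelInput;
    }
}
```

Fields that are absent from the document are simply skipped, so a default map does not introduce null values into the model input.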
It looks like `date_nanos` fields weren't likely to work properly in
composite aggs because composites iterate field values using points and
we weren't converting the points into milliseconds. Because the doc
values were coming back in milliseconds we ended up getting very confused
and just never collecting sub-aggregations.
This fixes that by adding a `parsePointAsMillis` method to
`DateFieldMapper.Resolution`, which is similar in name and function to
`NumberFieldMapper.NumberType`'s `parsePoint` except that it normalizes
to milliseconds which is what aggs need at the moment.
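The idea can be sketched in isolation as below. This is not the `DateFieldMapper` code: the class `DateResolutionSketch`, its helper methods, and the point encoding used here (8 big-endian bytes with the sign bit flipped) are assumptions for illustration, as is the restriction to non-negative nanosecond values.

```java
import java.nio.ByteBuffer;

// Hypothetical, self-contained stand-in for the resolution-aware point parsing.
final class DateResolutionSketch {

    enum Resolution {
        MILLISECONDS {
            @Override
            long toMillis(long raw) {
                return raw; // already in milliseconds
            }
        },
        NANOSECONDS {
            @Override
            long toMillis(long raw) {
                return raw / 1_000_000L; // assumes non-negative nanosecond values
            }
        };

        abstract long toMillis(long raw);

        // Decodes the indexed point, then normalizes it to milliseconds so it
        // is comparable with doc values that come back in milliseconds.
        long parsePointAsMillis(byte[] point) {
            return toMillis(decodeSortableLong(point));
        }
    }

    // Assumed encoding: big-endian long with the sign bit flipped so that
    // unsigned byte order matches numeric order.
    static byte[] encodeSortableLong(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }

    static long decodeSortableLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }
}
```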
Closes #53168
Using a Long alone is not strong enough for the id of search contexts
because we reset the id generator whenever a data node is restarted.
This can lead to two issues:
1. Fetch phase can fetch documents from another index
2. A scroll search can return documents from another index
This commit avoids these issues by adding a UUID to SearchContextId.
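The failure mode and the fix can be illustrated with the sketch below, which is not the actual search context code. The classes `SearchContextIdSketch`, `ContextId`, and `Node`, and the choice to model a restart as a new `Node` instance with a fresh random UUID, are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a data node that hands out search context ids.
final class SearchContextIdSketch {

    // A context id is the pair (session UUID, counter) rather than a bare long.
    static final class ContextId {
        final String sessionId;
        final long id;

        ContextId(String sessionId, long id) {
            this.sessionId = sessionId;
            this.id = id;
        }
    }

    // Each instance models one lifetime of a data node: a restart creates a new
    // instance, which resets the counter but also picks a new session UUID.
    static final class Node {
        private final String sessionId = UUID.randomUUID().toString();
        private final AtomicLong idGenerator = new AtomicLong();
        private final Map<Long, String> indexByContext = new HashMap<>();

        ContextId open(String index) {
            long id = idGenerator.incrementAndGet();
            indexByContext.put(id, index);
            return new ContextId(sessionId, id);
        }

        // Returns the index the context reads from, or null if the id was
        // issued by a previous lifetime of the node.
        String find(ContextId contextId) {
            if (!sessionId.equals(contextId.sessionId)) {
                return null; // stale id: the long alone would have matched
            }
            return indexByContext.get(contextId.id);
        }
    }
}
```

In this model a stale id and a fresh id can share the same long value after a restart, and only the session UUID tells them apart.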