Currently, there are a few tests that use autoMinMasterNodes=false and
hence override addExtraClusterBootstrapSettings, mostly this is 10-30
lines of codes that are copy-pasted from class to class.
This PR introduces `InternalTestCluster.setBootstrapMasterNodeIndex`
which is suitable for all classes and copy-paste could be removed.
Removing code is always a good thing!
When the ingest node user agent parses the device field, it
will result in a string value. To match the ecs schema
this commit moves the value of the parsed device to an
object with an inner field named 'name'. There are not
any passivity concerns since this modifies an unreleased change.
closes#38094
relates #37329
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.
This is a continuation of #28667, #36137 and also fixes#37708.
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.
This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.
Relates to #32125
The existing implementation was slow due to exceptions being thrown if
an accessor did not have a time zone. This implementation queries for
having a timezone, local time and local date and also checks for an
instant preventing to throw an exception and thus speeding up the conversion.
This removes the existing method and create a new one named
DateFormatters.from(TemporalAccessor accessor) to resemble the naming of
the java time ones.
Before this change an epoch millis parser using the toZonedDateTime
method took approximately 50x longer.
Relates #37826
Adds reindex.ssl.* settings for reindex from remote.
This uses the ssl-config/ internal library to parse and load SSL
configuration and files. This is applied when using the low level
rest client to connect to a remote ES node
Relates: #37287Resolves: #29755
Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in
tests, but for 7.x nodes this setting is not required as it has no effect.
This commit removes this setting so that nodes are started with more realistic
configurations, and deprecates it.
* Add ECS schema for user-agent ingest processor (#37727)
This switches the format of the user agent processor to use the schema from [ECS](https://github.com/elastic/ecs).
So rather than something like this:
```
{
"patch" : "3538",
"major" : "70",
"minor" : "0",
"os" : "Mac OS X 10.14.1",
"os_minor" : "14",
"os_major" : "10",
"name" : "Chrome",
"os_name" : "Mac OS X",
"device" : "Other"
}
```
The structure is now like this:
```
{
"name" : "Chrome",
"original" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36",
"os" : {
"name" : "Mac OS X",
"version" : "10.14.1",
"full" : "Mac OS X 10.14.1"
},
"device" : "Other",
"version" : "70.0.3538.102"
}
```
This is now the default for 7.0. The deprecated `ecs` setting in 6.x is not
supported.
Resolves#37329
* Remove `ecs` setting from docs
This drops the option for tests to disable strict deprecation mode in
the low level rest client in favor of configuring expected warnings on
any calls that should expect warnings. This behavior is
paranoid-by-default which is generally the right way to handle
deprecations and tests in general.
This PR attempts to remove all typed calls from our YAML REST tests. The PR adds include_type_name: false to create index requests that use a mapping and also to put mapping requests. It also removes _type from index requests where they haven't already been removed. The PR ignores tests named *_with_types.yml since this are specifically testing typed API behaviour.
The change also includes changing the test harness to add the type _doc to index, update, get and bulk requests that do not specify the document type when the test is running against a mixed 7.x/6.x cluster.
The delete and update by query APIs both offer protection against overriding concurrent user changes to the documents they touch. They currently are using internal versioning. This PR changes that to rely on sequences numbers and primary terms.
Relates #37639
Relates #36148
Relates #10708
* Remove empty statements
There are a couple of instances of undocumented empty statements all across the
code base. While they are mostly harmless, they make the code hard to read and
are potentially error-prone. Removing most of these instances and marking blocks
that look empty by intention as such.
* Change test, slightly more verbose but less confusing
The ingest date processor is currently only able to parse joda formats.
However it is not using the existing elasticsearch classes but access
joda directly. This means that our existing BWC layer does not notify
the user about deprecated formats. This commit switches to use the
exising Elasticsearch Joda methods to acquire a date format, that
includes the BWC check and the ability to parse java 8 dates.
The date parsing in ingest has also another extra feature, that the
fallback year, when a date format without a year is used, is the current
year, and not 1970 like usual. This is currently not properly supported
in the DateFormatter class. As this is the only case for this feature
and java time can take care of this using the toZonedDateTime() method,
a workaround just for the joda time parser has been created, that can be
removed soon again from 7.0.
Ranaming as follows:
feature -> rank_feature
feature_vector -> rank_features
feature query -> rank_feature query
Ranaming is done to distinguish from other vector types.
Closes#36723
This commit moves the aggregation and mapping code from joda time to
java time. This includes field mappers, root object mappers, aggregations with date
histograms, query builders and a lot of changes within tests.
The cut-over to java time is a requirement so that we can support nanoseconds
properly in a future field mapper.
Relates #27330
Users may require the sequence number and primary terms to perform optimistic concurrency control operations. Currently, you can get the sequence number via the `docvalues_fields` API but the primary term is not accessible because it is maintained by the `SeqNoFieldMapper` and the infrastructure can't find it.
This commit adds a dedicated sub fetch phase to return both numbers that is connected to a new `seq_no_primary_term` parameter.
This adds a set of helper classes to determine if an agg "has a value".
This is needed because InternalAggs represent "empty" in different
manners according to convention. Some use `NaN`, `+/- Inf`, `0.0`, etc.
A user can pass the Internal agg type to one of these helper methods
and it will report if the agg contains a value or not, which allows the
user to differentiate "empty" from a real `NaN`.
These helpers are best-effort in some cases. For example, several
pipeline aggs share a single return class but use different conventions
to mark "empty", so the helper uses the loosest definition that applies
to all the aggs that use the class.
Sums in particular are unreliable. The InternalSum simply returns 0.0
if the agg is empty (which is correct, no values == sum of zero). But this
also means the helper cannot differentiate from "empty" and `+1 + -1`.
This commit introduces a NetworkMessage class. This class has two
subclasses - InboundMessage and OutboundMessage. These messages can
be serialized and deserialized independent of the transport. This allows
more granular testing. Additionally, the serialization mechanism is now
a simple Supplier. This builds the framework to eventually move the
serialization of transport messages to the network thread. This is the
one serialization component that is not currently performed on the
network thread (transport deserialization and http serialization and
deserialization are all on the network thread).
This adds deprecation to _type in the script contexts for ingest and update.
This adds a DeprecationMap that wraps the ctx Map containing _type for these
specific contexts.
* Default include_type_name to false for get and put mappings.
* Default include_type_name to false for get field mappings.
* Add a constant for the default include_type_name value.
* Default include_type_name to false for get and put index templates.
* Default include_type_name to false for create index.
* Update create index calls in REST documentation to use include_type_name=true.
* Some minor clean-ups around the get index API.
* In REST tests, use include_type_name=true by default for index creation.
* Make sure to use 'expression == false'.
* Clarify the different IndexTemplateMetaData toXContent methods.
* Fix FullClusterRestartIT#testSnapshotRestore.
* Fix the ml_anomalies_default_mappings test.
* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.
We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.
This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.
* Fix more REST tests.
* Improve some wording in the create index documentation.
* Add a note about types removal in the create index docs.
* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.
* Make sure to mention include_type_name in the REST docs for affected APIs.
* Make sure to use 'expression == false' in FullClusterRestartIT.
* Mention include_type_name in the REST templates docs.
Added warnings checks to existing tests
Added “defaultTypeIfNull” to DocWriteRequest interface so that Bulk requests can override a null choice of document type with any global custom choice.
Related to #35190
This change fixes an unreleased bug that trips an assertion because a static instance
shared among threads is modified during the search. This commit copies the static
instance in order to ensure that each thread can modify the value without modifying
the other instances.
Closes#37179Closes#37266
* ingest: compile mustache template only if field includes '{{''
Prior to this change, any field in an ingest node processor that supports
script templates would be compiled as mustache template regardless if they
contain a template or not. Compiling normal text as mustache templates is
harmless. However, each compilation counts against the script compilation
circuit breaker. A large number of processors without any templates or scripts
could un-intuitively trip the too many script compilations circuit breaker.
This change simple checks for '{{' in the text before it attempts to compile.
fixes#37120
* [Analysis] Deprecate Standard Html Strip Analyzer
Deprecate only Standard Html Strip Analyzer
If user create index with the analyzer since 7.0, es throws an exception.
If an index was created before 7.0, es issue deprecation log
We will remove it in 8.0
Related #4704
In Lucene 8 searches can skip non-competitive hits if the total hit count is not requested.
It is also possible to track the number of hits up to a certain threshold. This is a trade off to speed up searches while still being able to know a lower bound of the total hit count. This change adds the ability to set this threshold directly in the track_total_hits search option. A boolean value (true, false) indicates whether the total hit count should be tracked in the response. When set as an integer this option allows to compute a lower bound of the total hits while preserving the ability to skip non-competitive hits when enough matches have been collected.
Relates #33028
Types can be used both in the source and dest section of the body which will
be translated to search and index requests respectively. Adding a deprecation warning
for those cases and removing examples using more than one type in reindex since
support for this is going to be removed.