When analysing a semi-structured text file the
find_file_structure endpoint merges lines to form
multi-line messages using the assumption that the
first line in each message contains the timestamp.
However, if the timestamp is misdetected then this
can lead to excessive numbers of lines being merged
to form massive messages.
This commit adds a line_merge_size_limit setting
(default 10000 characters) that halts the analysis
if a message bigger than this is created. This
prevents significant CPU time being spent subsequently
trying to determine the internal structure of the
huge bogus messages.
This commit clones the existing AnalyzeRequest/AnalyzeResponse classes
to the high-level rest client, and adjusts request converters to use these new
classes.
This is a prerequisite to removing the Streamable interface from the internal
server version of these classes.
In hamcrest 2.1 warnings for unchecked varargs were fixed by hamcrest using @SafeVarargs for those matchers where this warning occurred.
This PR is aimed to remove these annotations when Matchers.contains ,Matchers.containsInAnyOrder or Matchers.hasItems was used
backport #41528
The existing `RequestConverters.Params` is confusing, because it wraps
an underlying request object and mutations of the `Params` object
actually mutate the `Request` that was used in the construction of the
`Params`.
This leads to a situation where we create a `RequestConverter.Params`
object, mutate it, and then it appears nothing happens to it - it
appears to be unused. What happens behind the scenes is that the Request
object is mutated when methods on `Params` are invoked. This results in
unclear, confusing code where mutating one object changes another with
no obvious connection.
This commit refactors `RequestConverters.Params` to be a simple helper
class to produce a `Map` which must be passed explicitly to a Request
object. This makes it apparent that the `Params` are actually used, and
that they have an effect on the `request` object explicit and easier to
understand.
Co-authored-by: Ojas Gulati <ojasgulati100@gmail.com>
* Now that we process the bulk requests themselves on the WRITE threadpool, they can run out of retries too like the item requests even when backoff is active
* Fixes#41324 by using the same logic that checks failed item requests for their retry status for the top level bulk requests as well
* add support for fixed_interval, calendar_interval, remove interval
* adapt HLRC
* checkstyle
* add a hlrc to server test
* adapt yml test
* improve naming and doc
* improve interface and add test code for hlrc to server
* address review comments
* repair merge conflict
* fix date patterns
* address review comments
* remove assert for warning
* improve exception message
* use constants
As a follow-up to #38540 we can use lambda functions and method
references where convenient in the low-level REST client.
Also, we need to update the docs to state that the minimum java version
required is 1.8.
Removing of payload in BulkRequest (#39843) had a side effect of making
`BulkRequest.add(DocWriteRequest<?>...)` (with varargs) recursive, thus
leading to StackOverflowError. This PR adds a small change in
RequestConvertersTests to show the error and the corresponding fix in
`BulkRequest`.
Fixes#41668
The date_histogram accepts an interval which can be either a calendar
interval (DST-aware, leap seconds, arbitrary length of months, etc) or
fixed interval (strict multiples of SI units). Unfortunately this is inferred
by first trying to parse as a calendar interval, then falling back to fixed
if that fails.
This leads to confusing arrangement where `1d` == calendar, but
`2d` == fixed. And if you want a day of fixed time, you have to
specify `24h` (e.g. the next smallest unit). This arrangement is very
error-prone for users.
This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
Calendar only accepts calendar intervals, fixed accepts any combination of
units (meaning `1d` can be used to specify `24h` in fixed time), and both
are mutually exclusive.
The old interval behavior is deprecated and will throw a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the
old dual-purpose interval will be removed.
The change applies to both REST and java clients.
This commit updates the default ciphers and TLS protocols that are used
when the runtime JDK supports them. New cipher support has been
introduced in JDK 11 and 12 along with performance fixes for AES GCM.
The ciphers are ordered with PFS ciphers being most preferred, then
AEAD ciphers, and finally those with mainstream hardware support. When
available stronger encryption is preferred for a given cipher.
This is a backport of #41385 and #41808. There are known JDK bugs with
TLSv1.3 that have been fixed in various versions. These are:
1. The JDK's bundled HttpsServer will endless loop under JDK11 and JDK
12.0 (Fixed in 12.0.1) based on the way the Apache HttpClient performs
a close (half close).
2. In all versions of JDK 11 and 12, the HttpsServer will endless loop
when certificates are not trusted or another handshake error occurs. An
email has been sent to the openjdk security-dev list and #38646 is open
to track this.
3. In JDK 11.0.2 and prior there is a race condition with session
resumption that leads to handshake errors when multiple concurrent
handshakes are going on between the same client and server. This bug
does not appear when client authentication is in use. This is
JDK-8213202, which was fixed in 11.0.3 and 12.0.
4. In JDK 11.0.2 and prior there is a bug where resumed TLS sessions do
not retain peer certificate information. This is JDK-8212885.
The way these issues are addressed is that the current java version is
checked and used to determine the supported protocols for tests that
provoke these issues.
Improve the hard_limit memory audit message by reporting how many bytes
over the configured memory limit the job was at the point of the last
allocation failure.
Previously the model memory usage was reported, however this was
inaccurate and hence of limited use - primarily because the total
memory used by the model can decrease significantly after the models
status is changed to hard_limit but before the model size stats are
reported from autodetect to ES.
While this PR contains the changes to the format of the hard_limit audit
message it is dependent on modifications to the ml-cpp backend to
send additional data fields in the model size stats message. These
changes will follow in a subsequent PR. It is worth noting that this PR
must be merged prior to the ml-cpp one, to keep CI tests happy.
* [ML] adding pivot.size option for setting paging size
* Changing field name to address PR comments
* fixing ctor usage
* adjust hlrc for field name change
* [ML] Adds progress reporting for transforms
* fixing after master merge
* Addressing PR comments
* removing unused imports
* Adjusting afterKey handling and percentage to be 100*
* Making sure it is a linked hashmap for serialization
* removing unused import
* addressing PR comments
* removing unused import
* simplifying code, only storing total docs and decrementing
* adjusting for rewrite
* removing initial progress gathering from executor
Today the `_field_caps` API returns the list of indices where a field
is present only if this field has different types within the requested indices.
However if the request is an index pattern (or an alias, or both...) there
is no way to infer the indices if the response contains only fields that have
the same type in all indices. This commit changes the response to always return
the list of indices in the response. It also adds a way to retrieve unmapped field
in a specific section per field called `unmapped`. This section is created for each field
that is present in some indices but not all if the parameter `include_unmapped` is set to
true in the request (defaults to false).
In order to support empty action metadata in the first msearch item,
we need to remove support for prepending msearch request body with an
empty line, which prevents us from parsing the empty line as action
metadata for the first search item.
Relates to #41011
hamcrest has some improvements in newer versions, like FileMatchers
that make assertions regarding file exists cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
* The test fails for the retry backoff enabled case because the retry handler in the bulk processor hasn't been adjusted to account for #40866 which now might lead to an outright rejection of the request instead of its items individually
* Fixed by adding retry functionality to the top level request as well
* Also fixed the duplicate test for the HLRC that wasn't handling the non-backoff case yet the same way the non-client IT did
* closes#41324
This change adds either ToXContentObject or ToXContentFragment to classes
directly implementing ToXContent currently. This helps in reasoning about
whether those implementations output full xcontent object or just fragments.
Relates to #16347
* moved hlrc parsing tests from xpack to hlrc module and removed dependency on hlrc from xpack core
* deprecated old base test class
* added deprecated jdoc tag
* split test between xpack-core part and hlrc part
* added lang-mustache test dependency, this previously came in via
hlrc dependency.
* added hlrc dependency on a qa module
* duplicated ClusterPrivilegeName class in xpack-core, since x-pack
core no longer has a dependency on hlrc.
* replace ClusterPrivilegeName usages with string literals
* moved tests to dedicated to hlrc packages in order to remove Hlrc part from the name and make sure to use imports instead of full qualified class where possible
* remove ESTestCase. from method invocation and use method directly,
because these tests indirectly extend from ESTestCase
changed hlrc ccr request tests to use AbstractRequestTestCase base class.
This way the request classes are tested in a more realistic setting.
Note this change also adds a test dependency on xpack core module.
Similar to #39844 but then for hlrc request serialization tests.
Removed iterators from hlrc parsing tests.
Use empty xcontent registries.
Relates to #39745
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)
This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.
(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)
* Fix forking JVM runner
* Don't bump shadow plugin version
Made changes to ensure that unique IDs are generated for model snapshots
used by the deleteExpiredDataTest test in the MachineLearningIT suite.
Previously a sleep of 1s was performed between jobs under the assumption
that this would be sufficient to guarantee that the timestamps used in
the composition of the snapshot IDs would be different.
The new approach is to wait on the condition that the old and new
timestamps are in fact different (to 1s resolution).
This change updates our version of httpclient to version 4.5.8, which
contains the fix for HTTPCLIENT-1968, which is a bug where the client
started re-writing paths that contained encoded reserved characters
with their unreserved form.
Many gradle projects specifically use the -try exclude flag, because
there are many cases where auto-closeable resource ignore is never
referenced in body of corresponding try statement. Suppressing this
warning specifically in each case that it happens using
`@SuppressWarnings("try")` would be very verbose.
This change removes `-try` from any gradle project and adds it to the
build plugin. Also this change removes exclude flags from gradle projects
that is already specified in build plugin (for example -deprecation).
Relates to #40366
This commit fixes a problem with BWC that was brought up in #40511. A
newer version of the code was emitting a new value for an enum to an
older version, and the older version could not handle that. It caused
the response to error. The MainResponse is now relaxed, and will accept
whatever values the server expose, and holds most of them as Strings
instead of complex objects.
Fixes#40511
This adds a new `role_templates` field to role mappings that is an
alternative to the existing roles field.
These templates are evaluated at runtime to determine which roles should be
granted to a user.
For example, it is possible to specify:
"role_templates": [
{ "template":{ "source": "_user_{{username}}" } }
]
which would mean that every user is assigned to their own role based on
their username.
You may not specify both roles and role_templates in the same role
mapping.
This commit adds support for templates to the role mapping API, the role
mapping engine, the Java high level rest client, and Elasticsearch
documentation.
Due to the lack of caching in our role mapping store, it is currently
inefficient to use a large number of templated role mappings. This will be
addressed in a future change.
Backport of: #39984, #40504