Historically, the bootstrap checks used 2048 as the minimum limit for
the maximum number of threads. This limit was guided by the fact that
the number of processors was artificially capped at 32. This limit was
removed in 6.0.0 and the minimum limit was raised to 4096 to accommodate
this. However, the docs were not updated and this commit addresses that
miss.
This change adds the current primary term to the header of the current
translog file. Having a term in a translog header is a prerequisite step
that allows us to trim translog operations given the max valid seq# for
that term.
This commit also updates tests to conform the primary term invariant
which guarantees that all translog operations in a translog file have
its terms at most the term stored in the translog header.
* Add a helper method to get a random java.util.TimeZone
This adds a helper method to ESTestCase that returns a randomized
`java.util.TimeZone`. This can be used when transitioning code from Joda to the
JDK's time classes.
This commit moves the `TimeValue` class into the elasticsearch-core project.
This allows us to use this class in many of our other projects without relying
on the entire `server` jar.
Relates to #28504
This commit introduces built in support for adding files to the
keystore when configuring the integration test cluster for a project.
In order to use this support, simply add `keystoreFile` followed by the
secure setting name and the path to the source file inside the
integTestCluster closure for a project. The built in support will
handle the creation of the keystore and the addition of the file to the
keystore.
* Decouple TimeValue from Elasticsearch server classes
This commit decouples the `TimeValue` class from the other server classes. This
is in preperation to move `TimeValue` into the `elasticsearch-core` jar,
allowing us to use it from projects that cannot depend on the elasticsearch-core
library.
Relates to #28504
This change removes the check for extra tokens when parsing a source generated by a templated
_msearch request. This was added unintentionally in #29428 but the intent of this modification was to validate
simple _search request only.
The skeleton of ElasticsearchMergePolicy is quite similar to
MergePolicyWrapper. This commit therefore makes ElasticsearchMergePolicy
inherited from MergePolicyWrapper instead of MergePolicy.
Currently, a flush stats contains only the total flush which is the sum
of manual flush (via API) and periodic flush (async triggered when the
uncommitted translog size is exceeded the flush threshold). Sometimes,
it's useful to know these two numbers independently. This commit tracks
and returns a periodic flush count in a flush stats.
This adds an `include_type_name` option to the `indices.create`,
`indices.get_mapping` and `indices.put_mapping` APIs, which defaults to `true`.
When set to `false`, then mappings will be returned directly in the body of
the `indices.get_mapping` API, without keying them by the type name, the
`indices.create` will expect mappings directly under the `mappings` key, and
the `indices.put_mapping` will use `_doc` as a type name and fail if a `type`
is provided explicitly.
Relates #15613
Today we expose a mutable list of documents in ParseContext via
ParseContext#docs(). This, on the one hand places knowledge how
to access nested documnts in multiple places and on the other
allows for potential illegal access to nested only docs after
the docs are reversed. This change restricts the access and
streamlines nested / non-root doc access.
This change validates that the `_search` request does not have trailing
tokens after the main object and fails the request with a parsing exception otherwise.
Closes#28995
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
Today when a user runs a CLI tool with standard input closed and no tty
attached, the result from reading is null and this usually leads to a
null pointer exception when we try to parse this input. This arises for
example when the user runs the plugin installer through a Docker
container without leaving standard input open and attaching a tty
(docker exec <container ID> bin/elasticsearch-plugin install). When we
try to read that the user accepts the plugin requiring additional
security permissions we will get back null. This commit addresses this
for all cases by throwing an illegal state exception. The solution for
the user is leave standard input open and attach a tty (or, for some
tools, use batch mode).
#29409 removed the nearlyEquals() double comparison snippet, which
makes these tests very flaky because they can generate very large or
very small doubles which don't work well with absolute error comparison.
We need to either refactor these tests to guarantee they stay in a small
range (which could be difficult due to holt/holt-winters) or re-implement
the more robust double comparison.
Tracking issue: #29456
This commit simplifies the exception handling in
TranslogWriter#closeWithTragicEvent. When invoking this method, the
inner close method could throw an exception which we always catch and
suppress into the exception that led us to tragically close. This commit
moves that repeated logic into closeWithTragicException and now callers
simply need to catch, invoke closeWithTragicException, and rethrow.
Currently rest-based tests do not work from the IDE, as the security
manager is configured to permit certain network operations when
using the snapshot jars compiled by gradle. We have an existing
workaround that explicitly associates a codebase with the path
from which the classes are loaded (in this case, the IDE build
directory). This PR adds the rest client to this workaround list.
In the case that a document with a percolator field is matched when using the `percolate` query then
the fetch phase can fail due to the fact that the percolator can't resolve any query from that document.
Closes#29429
In case of a disjunction query with both range and term based clauses and
msm specified, the query analyzer needs to also reduce the msn if a range
based clause for the same field is encountered. This did not happen.
Instead of fixing this bug the logic has been simplified to just set a
percolator query's msm to 1 if a disjunction contains range clauses and
msm on disjunction has been specified. The logic would otherwise just get
to complex and the performance gain isn't that much for this kind of
percolator queries.
In case a percolator query has clauses that have duplicate terms or ranges then
for disjunction clauses with a minimum should match the query extraction of the
clause with the lowest msm should be used and for conjunction queries query
extractions wiht duplicate terms/ranges the msn should be ignored. If this
is not done then percolator queries that should match never match.
Example percolator query: value1 OR value2 OR value2 OR value3 OR value3 OR value3 OR value4 OR value5 (msm set to 3)
In the above example query the extracted msm would be 3
Example document1: value1 value2 value3
With the msm and extracted terms this would match and is expected behaviour
Example document2: value3
This document should match too (value3 appears in 3 clauses), but with msm set to 3 and the fact
that fact that only distinct values are indexed in extracted terms field this document would
Also added another random duel test.
Closes#29393
* Remove copy-pasted code
We had two instances of copy-pasted code with a bad license from
another website. The code was doing something rather simple, and
that functionality already exists within junit. This PR simply leverages
the junit functionality.
`BaseRandomBinaryDocValuesRangeQueryTestCase.testRandomBig` should only run with
nightly tests. It doesn't make sense to make it part of every test run.
`UUIDTests` had a slow test for compression, which I made a bit faster by
decreasing the number of indexed docs.
This change tries to simplify the extraction logic of boolean queries by
concentrating the logic into two methods: one that merges results for
conjunctions, and another one for disjunctions. Other concerns, like the impact
of prohibited clauses or how an `UnsupportedQueryException` should be treated
are applied on top of those two methods.
This is mostly a code reorganization, it doesn't change the result of query
extraction except in the case that a query both has required clauses and a
minimum number of `SHOULD` clauses that is greater than 1, which we now
rewrite into a pure conjunction. For instance `(+A B C)~1` is rewritten into
`(+A +(B C))` prior to extraction.
This commit fixes the formatting of the values in the composite
aggregation response. `date` fields should return timestamp as longs
when used in a `terms` source and `ip` fields should always be formatted as strings.
This commit also fixes the parsing of the `after` key for these field types.
Finally, this commit disables the index optimization for the `ip` field and any source that provides a `missing` value.
This commit simplifies the invocations to
Translog#closeOnTragicEvent. This method already catches all possible
exceptions and suppresses the non-AlreadyClosedExceptions into the
exception that triggered the invocation. Therefore, there is no need for
callers to do this same logic (which would never execute).
* Move Streams.copy into elasticsearch-core and make a multi-release jar
This moves the method `Streams.copy(InputStream in, OutputStream out)` into the
`elasticsearch-core` project (inside the `o.e.core.internal.io` package). It
also makes this class into a multi-release class where the Java 9 equivalent
uses `InputStream#transferTo`.
This is a followup from
https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495
* Move ObjectParser into the x-content lib
This moves `ObjectParser`, `AbstractObjectParser`, and
`ConstructingObjectParser` into the libs/x-content dependency. This decoupling
allows them to be used for parsing for projects that don't want to depend on the
entire Elasticsearch jar.
Relates to #28504
* Move Tuple into elasticsearch-core
This allows us to use Tuple from other projects that don't want to rely on the
entire Elasticsearch jar.
I have also added very simple tests, since there were none.
Relates tangentially to #28504
Today we close the translog write tragically if we experience any I/O
exception on a write. These tragic closes lead to use closing the
translog and failing the engine. Yet, there is one case that is missed
which is when we touch the write channel during a read (checking if
reading from the writer would put us past what has been flushed). This
commit addresses this by closing the writer tragically if we encounter
an I/O exception on the write channel while reading. This becomes
interesting when we consider that this method is invoked from the engine
through the translog as part of getting a document from the
translog. This means we have to consider closing the translog here as
well which will cascade up into us finally failing the engine.
Note that there is no semantic change to, for example, primary/replica
resync and recovery. These actions will take a snapshot of the translog
which syncs the translog to disk. If an I/O exception occurs during the
sync we already close the writer tragically and once we have synced we
do not ever read past the position that was synced while taking the
snapshot.
Currently the ranking evaluation API doesn't support many of the
standard parameters of the search API. Some of these make sense, like
adding support for the common indices options parameters, which this
change adds.