Today the translog of an engine is exposed and can be accessed directly.
While this exposure offers much flexibility, it also causes these troubles:
- Inconsistent behavior between translog method and engine method.
For example, rolling a translog generation via an engine also trims
unreferenced files, but translog's method does not.
- An engine does not get notified when critical errors happen in translog
as the access is direct.
This change isolates translog of an engine and enforces all accesses to
translog via the engine.
The index thread pool is no longer needed as its primary use-case for
single-document indexing requests has been relieved now that
single-document indexing requests are converted to bulk indexing
requests (with a single document payload).
This adds 2 testcases that test if a shard goes idle
pending (uncommitted) segments are committed and unreferenced
files will be freed.
Relates to #29482
This change adds the current primary term to the header of the current
translog file. Having a term in a translog header is a prerequisite step
that allows us to trim translog operations given the max valid seq# for
that term.
This commit also updates tests to conform the primary term invariant
which guarantees that all translog operations in a translog file have
its terms at most the term stored in the translog header.
* Add a helper method to get a random java.util.TimeZone
This adds a helper method to ESTestCase that returns a randomized
`java.util.TimeZone`. This can be used when transitioning code from Joda to the
JDK's time classes.
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
Currently rest-based tests do not work from the IDE, as the security
manager is configured to permit certain network operations when
using the snapshot jars compiled by gradle. We have an existing
workaround that explicitly associates a codebase with the path
from which the classes are loaded (in this case, the IDE build
directory). This PR adds the rest client to this workaround list.
* Move Streams.copy into elasticsearch-core and make a multi-release jar
This moves the method `Streams.copy(InputStream in, OutputStream out)` into the
`elasticsearch-core` project (inside the `o.e.core.internal.io` package). It
also makes this class into a multi-release class where the Java 9 equivalent
uses `InputStream#transferTo`.
This is a followup from
https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495
* Move ObjectParser into the x-content lib
This moves `ObjectParser`, `AbstractObjectParser`, and
`ConstructingObjectParser` into the libs/x-content dependency. This decoupling
allows them to be used for parsing for projects that don't want to depend on the
entire Elasticsearch jar.
Relates to #28504
* Fixes query_string query equals timezone check
This change fixes a bug where two `QueryStringQueryBuilder`s were found
to be equal if they had the same timezone set even if the query string
in the builders were different
Closes#29403
* Adds mutate function to QueryStringQueryBuilderTests
* iter
This improves the way similarities are plugged in in order to:
- reject the classic similarity on 7.x indices and emit a deprecation
warning otherwise
- reject unkwown parameters on 7.x indices and emit a deprecation
warning otherwise
Even though this breaks the plugin API, I'd like to backport to 7.x so
that users can get deprecation warnings when they are doing something
that will become unsupported in the future.
Closes#23208Closes#29035
* Begin moving XContent to a separate lib/artifact
This commit moves a large portion of the XContent code from the `server` project
to the `libs/xcontent` project. For the pieces that have been moved, some
helpers have been duplicated to allow them to be decoupled from ES helper
classes. In addition, `Booleans` and `CheckedFunction` have been moved to the
`elasticsearch-core` project.
This decoupling is a move so that we can eventually make things like the
high-level REST client not rely on the entire ES jar, only the parts it needs.
There are some pieces that are still not decoupled, in particular some of the
XContent tests still remain in the server project, this is because they test a
large portion of the pluggable xcontent pieces through
`XContentElasticsearchException`. They may be decoupled in future work.
Additionally, there may be more piecese that we want to move to the xcontent lib
in the future that are not part of this PR, this is a starting point.
Relates to #28504
Removes a set of assertions in the test framework that verified that
Streamable objects could be serialized and deserialized across different
versions. When this was discussed the consensus was that this approach
has not caught many bugs in a long time and that serialization testing of
objects was best left to their respective unit and integration tests.
This commit also removes a transport interceptor that was used in
ESIntegTestCase tests to make these assertions about objects coming in
or off the wire.
Today we have a few problems with how we handle bad requests:
- handling requests with bad encoding
- handling requests with invalid value for filter_path/pretty/human
- handling requests with a garbage Content-Type header
There are two problems:
- in every case, we give an empty response to the client
- in most cases, we leak the byte buffer backing the request!
These problems are caused by a broader problem: poor handling preparing
the request for handling, or the channel to write to when the response
is ready. This commit addresses these issues by taking a unified
approach to all of them that ensures that:
- we respond to the client with the exception that blew us up
- we do not leak the byte buffer backing the request
We historically removed reading from the transaction log to get consistent
results from _GET calls. There was also the motivation that the read-modify-update
principle we apply should not be hidden from the user. We still agree on the fact
that we should not hide these aspects but the impact on updates is quite significant
especially if the same documents is updated before it's written to disk and made serachable.
This change adds back the ability to read from the transaction log but only for update calls.
Calls to the _GET API will always do a refresh if necessary to return consistent results ie.
if stored fields or DocValues Fields are requested.
Closes#26802
Once a document is deleted and Lucene is refreshed, we will not be able
to look up the `version/seq#` associated with that delete in Lucene. As
conflicting operations can still be indexed, we need another mechanism
to remember these deletes. Therefore deletes should still be stored in
the Version Map, even after Lucene is refreshed. Obviously, we can't
remember all deletes forever so a trimming mechanism is needed.
Currently, we remember deletes for at least 1 minute (the default GC
deletes cycle) and clean them periodically. This is, at the moment, the
best we can do on the primary for user facing APIs but this arbitrary
time limit is problematic for replicas. Furthermore, we can't rely on
the primary and replicas doing the trimming in a synchronized manner,
and failing to do so results in the replica and primary making different
decisions.
The following scenario can cause inconsistency between
primary and replica.
1. Primary index doc (index, id=1, v2)
2. Network packet issue causes index operation to back off and wait
3. Primary deletes doc (delete, id=1, v3)
4. Replica processes delete (delete, id=1, v3)
5. 1+ minute passes (GC deletes runs replica)
6. Indexing op is finally sent to the replica which no processes it
because it forgot about the delete.
We can reply on sequence-numbers to prevent this issue. If we prune only
deletes whose seqno at most the local checkpoint, a replica will
correctly remember what it needs. The correctness is explained as
follows:
Suppose o1 and o2 are two operations on the same document with seq#(o1)
< seq#(o2), and o2 arrives before o1 on the replica. o2 is processed
normally since it arrives first; when o1 arrives it should be discarded:
1. If seq#(o1) <= LCP, then it will be not be added to Lucene, as it was
already previously added.
2. If seq#(o1) > LCP, then it depends on the nature of o2:
- If o2 is a delete then its seq# is recorded in the VersionMap,
since seq#(o2) > seq#(o1) > LCP, so a lookup can find it and
determine that o1 is stale.
- If o2 is an indexing then its seq# is either in Lucene (if
refreshed) or the VersionMap (if not refreshed yet), so a
real-time lookup can find it and determine that o1 is stale.
In this PR, we prefer to deploy a single trimming strategy, which
satisfies both requirements, on primary and replicas because:
- It's simpler - no need to distinguish if an engine is running at
primary mode or replica mode or being promoted.
- If a replica subsequently is promoted, user experience is fully
maintained as that replica remembers deletes for the last GC cycle.
However, the version map may consume less memory if we deploy two
different trimming strategies for primary and replicas.
#28245 has introduced the utility class`EngineDiskUtils` with a set of methods to prepare/change
translog and lucene commit points. That util class bundled everything that's needed to create and
empty shard, bootstrap a shard from a lucene index that was just restored etc.
In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That
would sometime fail due to concurrent shard store fetching or other short activities that require the
files not to be changed while they read from them.
Since there is no way to wait on the index writer lock, the `Store` class has other locks to make
sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this
PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and
shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the
translog manipulations on their own.
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate
the collection when the leading source value is greater than the lowest value in the queue.
Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance the following aggregation:
```
"composite" : {
"sources" : [
{ "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
],
"size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough
composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited.
This mode can execute iff:
* The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`.
* The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
* The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).
If these conditions are not met this aggregation visits each document like any other agg.
* Remove BytesArray and BytesReference usage from XContentFactory
This removes the usage of `BytesArray` and `BytesReference` from
`XContentFactory`. Instead, a regular `byte[]` should be passed. To assist with
this a helper has been added to `XContentHelper` that will preserve the offset
and length from the underlying BytesReference.
This is part of ongoing work to separate the XContent parts from ES so they can
be factored into their own jar.
Relates to #28504
`$_path` is used by documentation tests to ignore a value from a
response, for example:
```
[source,js]
----
{
"count": 1,
"datafeeds": [
{
"datafeed_id": "datafeed-total-requests",
"state": "started",
"node": {
...
"attributes": {
"ml.machine_memory": "17179869184",
"ml.max_open_jobs": "20",
"ml.enabled": "true"
}
},
"assignment_explanation": ""
}
]
}
----
// TESTRESPONSE[s/"17179869184"/$body.$_path/]
```
That example shows `17179869184` in the compiled docs but when it runs
the tests generated by that doc it ignores `17179869184` and asserts
instead that there is a value in that field. This is required because we
can't predict things like "how many milliseconds will this take?" and
"how much memory will this take?".
Before this change it was impossible to use `$_path` when any component
of the path contained a `.`. This fixes the `$_path` evaluator to
properly escape `.`.
Closes#28770
Changes made in #28972 seems to have changed some assumptions about how
SMILE and CBOR write byte[] values and how this is tested. This changes
the generation of the randomized DocumentField values back to BytesArray
while expecting the JSON and YAML deserialisation to produce Base64
encoded strings and SMILE and CBOR to parse back BytesArray instances.
Closes#29080
Currently we have a fairly complicated logic in the engine constructor logic to deal with all the
various ways we want to mutate the lucene index and translog we're opening.
We can:
1) Create an empty index
2) Use the lucene but create a new translog
3) Use both
4) Force a new history uuid in all cases.
This leads complicated code flows which makes it harder and harder to make sure we cover all the
corner cases. This PR tries to take another approach. Constructing an InternalEngine always opens
things as they are and all needed modifications are done by static methods directly on the
directory, one at a time.
* Decouple XContentBuilder from BytesReference
This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.
While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).
Relates to #28504
As we have factored Elasticsearch into smaller libraries, we have ended
up in a situation that some of the dependencies of Elasticsearch are not
available to code that depends on these smaller libraries but not server
Elasticsearch. This is a good thing, this was one of the goals of
separating Elasticsearch into smaller libraries, to shed some of the
dependencies from other components of the system. However, this now
means that simple utility methods from Lucene that we rely on are no
longer available everywhere. This commit copies IOUtils (with some small
formatting changes for our codebase) into the fold so that other
components of the system can rely on these methods where they no longer
depend on Lucene.
I have long wanted an actual test that dying with dignity works. It is
tricky because if dying with dignity works, it means the test JVM dies
which is usually an abnormal condition. And anyway, how does one force a
fatal error to be thrown. I was motivated to investigate this again by
the fact that I missed a backport to one branch leading to an issue
where Elasticsearch would not successfully die with dignity. And now we
have a solution: we install a plugin that throws an out of memory error
when it receives a request. We hack the standalone test infrastructure
to prevent this from failing the test. To do this, we bypass the
security manager and remove the PID file for the node; this tricks the
test infrastructure into thinking that it does not need to stop the
node. We also bypass seccomp so that we can fork jps to make sure that
Elasticsearch really died. And to be extra paranoid, we parse the logs
of the dead Elasticsearch process to make sure it died with
dignity. Never forget.
Today we have two test base classes that have a lot in common when it comes to testing wire and xcontent serialization: `AbstractSerializingTestCase` and `AbstractXContentStreamableTestCase`. There are subtle differences though between the two, in the way they work, what can be overridden and features that they support (e.g. insertion of random fields).
This commit introduces a new base class called `AbstractWireTestCase` which holds all of the serialization test code in common between `Streamable` and `Writeable`. It has two minimal subclasses called `AbstractWireSerializingTestCase` and `AbstractStreamableTestCase` which are specialized for `Writeable` and `Streamable`.
This commit also introduces a new test class called `AbstractXContentTestCase` for all of the xContent testing, which holds a testFromXContent method for parsing and rendering to xContent. This one can be delegated to from the existing `AbstractStreamableXContentTestCase` and `AbstractSerializingTestCase` so that we avoid code duplicate as much as possible and all these base classes offer the same functionalities in the same way. Having this last base class decoupled from the serialization testing may also help with the REST high-level client testing, as there are some classes where it's hard to implement equals/hashcode and this makes it possible to override `assertEqualInstances` for custom equality comparisons (also this base class doesn't require implementing equals/hashcode as it doesn't test such methods.
This commit is related to #27260. Currently there is a weird
relationship between channel contexts and nio channels. The selectors
use the context for read and writing. But the selector operates directly
on the nio channel for registering, closing, and connecting.
This commit works on improving this relationship. The selector operates
directly on the context which wraps the low level java.nio.channels. The
NioChannel class is simply an API that is used to interact with the
channel (sending messages from outside the selector event loop,
scheduling a close, adding listeners, etc). The context is only used
internally by the channel to implement these apis and by the selector to
perform these operations.
This allows us to save a bit of code, but also adds more coverage as it tests serialization which was missing in some of the existing tests. Also it requires implementing equals/hashcode and we get the corresponding tests for them for free from the base test class.
* Pass InputStream when creating XContent parser
Rather than passing the raw `BytesReference` in when creating the xcontent
parser, this passes the StreamInput (which is an InputStream), this allows us to
decouple XContent from BytesReference.
This also removes the use of `commons.Booleans` so it doesn't require more
external commons classes.
Related to #28504
* Undo boolean removal
* Enhance deprecation javadoc
Add support version and version_type in ingest pipelines
Add support for setting document version and version type in set
processor of an ingest pipeline.
* Remove log4j dependency from elasticsearch-core
This removes the log4j dependency from our elasticsearch-core project. It was
originally necessary only for our jar classpath checking. It is now replaced by
a `Consumer<String>` so that the es-core dependency doesn't have external
dependencies.
The parts of #28191 which were moved in conjunction (like `ESLoggerFactory` and
`Loggers`) have been moved back where appropriate, since they are not required
in the core jar.
This is tangentially related to #28504
* Add javadocs for `output` parameter
* Change @code to @link
Today we have several levels of indirection to acquire an Engine.Searcher.
We first acquire a the reference manager for the scope then acquire an
IndexSearcher and then create a searcher for the engine based on that.
This change simplifies the creation into a single method call instead of
3 different ones.
Previously we introduced a new parameter to `acquireIndexCommit` to
allow acquire either a safe commit or a last commit. However with the
new parameters, callers can provide a nonsense combination - flush first
but acquire the safe commit. This commit separates acquireIndexCommit
method into two different methods to avoid that problem. Moreover, this
change should also improve the readability.
Relates #28038
The REST high-level client supports now encoding of path parts, so that for instance documents with valid ids, but containing characters that need to be encoded as part of urls (`#` etc.), are properly supported. We also make sure that each path part can contain `/` by encoding them properly too.
Closes#28625
Currently the Translog constructor is capable both of opening an existing translog and creating a
new one (deleting existing files). This PR separates these two into separate code paths. The
constructors opens files and a dedicated static methods creates an empty translog.
* Move more XContent.createParser calls to non-deprecated version
Part 2
This moves more of the callers to pass in the DeprecationHandler.
Relates to #28504
* Use parser's deprecation handler where appropriate
* Use logging handler in test that uses deprecated field on purpose
* Move more XContent.createParser calls to non-deprecated version
This moves more of the callers to pass in the DeprecationHandler.
Relates to #28504
* Use parser's deprecation handler where available
Version Utils did not previously have logic that removed the last majors
minor snapshot if there was a next bugfix and maintenance bugfix
release. This adds the logic and fixes some broken assumptions in tests
as well.
relates #28505
The build.snapshot was mistakenly passed in to every snapshot version,
so when release tests were run, these versions were mistaken as released
entities and could not be found in maven, because they do not
exist. This fix removes that bug in logic, and always makes them proper
snapshots. This has a benefit of cleaning up the VersionUtilsTests
because they no longer rely on different sets of versions to check
against, which was also a bug.
Currently if a yaml test has a teardown and a test is failing then
a stash dump of a request in the teardown is logged instead of
a stash dump of a request in the test itself.
By handling the logging of stash dumps separately for setup, tests and
teardown yaml sections we shouldn't miss the stash dump of request/response
that is actually causing the yaml test to fail.
The is a follow up to #28567 changing the method used to capture stack traces, as requested
during the review. Instead of creating a throwable, we explicitly capture the stack trace of the
current thread. This should Make Jason Happy Again ™️ .
Generalizing BWC building so that there is less code to modify for a release. This ensures we do not
need to think about what major or minor version is in the gradle code. It follows the general rules of the
elastic release structure. For more information on the rules, see the VersionCollection's javadoc.
This also removes the additional bwc snapshots that will never be released, such as 6.0.2, which were
being built and tested against every time we ran bwc tests.
Additionally, it creates 4 new projects that correspond to the different types of snapshots that may exist
for a given version. Its possible to now run those individual tasks to work out bwc logic whereas
previously it was impossible and the entire suite of bwc tests had to be run to work out any logic
changes in the build tools' bwc project. Please note that if the project does not make sense for the
version that is current, that an error will be thrown from that individual project if an attempt is made to
run it.
This should allow for automating the version bumps as well, since it removes all the hardcoded version
logic from the configs.
Today we acquire a permit from the shard to coordinate between indexing operations, recoveries and other state transitions. When we leak an permit it's practically impossible to find who the culprit is. This PR add stack traces capturing for each permit so we can identify which part of the code is responsible for acquiring the unreleased permit. This code is only active when assertions are active.
The output is something like:
```
java.lang.AssertionError: shard [test][1] on node [node_s0] has pending operations:
--> java.lang.RuntimeException: something helpful 2
at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:223)
at org.elasticsearch.index.shard.IndexShard.<init>(IndexShard.java:322)
at org.elasticsearch.index.IndexService.createShard(IndexService.java:382)
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:514)
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:143)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:552)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:529)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:231)
at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495)
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482)
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.base/java.lang.Thread.run(Thread.java:844)
--> java.lang.RuntimeException: something helpful
at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:223)
at org.elasticsearch.index.shard.IndexShard.<init>(IndexShard.java:311)
at org.elasticsearch.index.IndexService.createShard(IndexService.java:382)
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:514)
at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:143)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:552)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:529)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:231)
at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495)
at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482)
at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.base/java.lang.Thread.run(Thread.java:844)
```
This commit modifies the transport stats with exception test to remove
the requirement that we calculate the published address size when
comparing bytes received. This is tricky and is currently broken as we
also place the address string in the transport exception, however we do
not adjust the bytes for that.
The solution in this commit is to just serialize the transport exception
in the test and use that for the calculation.
* Move to non-deprecated XContentHelper.createParser(...)
This moves away from one of the now-deprecated XContentHelper.createParser
methods in favor of specifying the deprecation logger at parser creation time.
Relates to #28449
Note that this doesn't move all the `createParser` calls because some of them
use the already-deprecated method that doesn't specify the XContentType.
* Remove the deprecated (and now non-needed) createParser method
This commit switches all the modules and server test code to use the
non-deprecated `ParseField.match` method, passing in the parser's deprecation
handler or the logging deprecation handler when a parser is not available (like
in tests).
Relates to #28449
The primary currently replicates writes to all other shard copies as soon as they're added to the routing table. Initially those shards are not even ready yet to receive these replication requests, for example when undergoing a file-based peer recovery. Based on the specific stage that the shard copies are in, they will throw different kinds of exceptions when they receive the replication requests. The primary then ignores responses from shards that match certain exception types. With this mechanism it's not possible for a primary to distinguish between a situation where a replication target shard is not allocated and ready yet to receive requests and a situation where the shard was successfully allocated and active but subsequently failed.
This commit changes replication so that only initializing shards that have successfully opened their engine are used as replication targets. This removes the need to replicate requests to initializing shards that are not even ready yet to receive those requests. This saves on network bandwidth and enables features that rely on the distinction between a "not-yet-ready" shard and a failed shard.
This change adds a shallow copy method for aggregation builders. This method returns a copy of the builder replacing the factoriesBuilder and metaDada
This method is used when the builder is rewritten (AggregationBuilder#rewrite) in order to make sure that we create a new instance of the parent builder when sub aggregations are rewritten.
Relates #27782
Adds allow_partial_search_results flag to search requests with default setting = true.
When false, will error if search either timeouts, has partial errors or has missing shards rather
than returning partial search results. A cluster-level setting provides a default for search requests with no flag.
Closes#27435
This change makes sure that this function does not create field names that end with a '.', more precisely it only allows
alpha-numeric characters to compose the leaf field name.
Closes#27373
The MockUncasedHostProvider accesses nodes that are not fully built yet, where TransportService.getNode() returns null, which means that the null entries end up in the list of seedNodes that UnicastZenPing then uses.
Cluster settings shouldn't leak into the next test.
I played with failing the test if it left over any settings but that
felt like it added more ceremony then it was worth. The advantage is
that any test that intentionally wants to leave settings in place after
the test would fail and require looking at but, so far as I can tell, we
don't have any such tests.
Currently meta plugins will ask for confirmation of security policy
exceptions for each bundled plugin. This commit collects the necessary
permissions of each bundled plugin, and asks for confirmation of all of
them at the same time.
This change adds the test name to the exceptions thrown by the MockPageCacheRecycler and MockBigArrays. Also, if there is more than one page/array which are not released it will add the first one as the cause of the thrown exception and the others as suppressed exceptions.
Relates to #21315
This introduces a settings updater that allows to specify a list of
settings. Whenever one of those settings changes, the whole block of
settings is passed to the consumer.
This also fixes an issue with affix settings, when used in combination
with group settings, which could result in no found settings when used
to get a setting for a namespace.
Lastly logging has been slightly changed, so that filtered settings now
only log the setting key.
Another bug has been fixed for the mock log appender, which did not
work, when checking for the exact message.
Closes#28047
The tests for those field types were removed in #26549 because the range mapper
was moved to a module, but later this mapper was moved back to core in #27854.
This change adds back those two field types like before to the general setup in
AbstractQueryTestCase and adds some specifics to the RangeQueryBuilder and
TermsQueryBuilder tests. Also adding back an integration test in SearchQueryIT that
has been removed before but that can be kept with the mapper back in core now.
Relates to #28147
In many cases we use the `ShardOperationFailedException` interface to abstract an exception that can only be of one type, namely `DefaultShardOperationException`. There is no need to use the interface in such cases, the concrete type should be used instead. That has the additional advantage of simplifying parsing such exceptions back from rest responses for the high-level REST client
This commit is related to #27260. Currently have a channel context that
implements reading and writing logic for socket channels. Additionally,
we have exception contexts to handle exceptions. And accepting contexts
to handle accepted channels. This PR introduces a ChannelContext that
handles close and exception handling for all channel types.
Additionally, it has implementers that provide specific functionality
for socket channels (read and writing). And specific functionality for
server channels (accepting).
There a number of tests in `AbstractSimpleTransportTestCase` that
create `MockTcpTransport` impls. This commit modifies two of these tests
to use the transport implementation that is being tested.
This commit is related to #27260. Right now we have separate read and
write contexts for implementing specific protocol logic. However, some
protocols require a closer relationship between read and write
operations than is allowed by our current model. An example is HTTP
which might require a write if some problem with request parsing was
encountered.
Additionally, some protocols require close messages to be sent when a
channel is shutdown. This is also problematic in our current model,
where we assume that channels should simply be queued for close and
forgotten.
This commit transitions to a single ChannelContext which implements
all read, write, and close logic for protocols. It is the job of the
context to tell the selector when to close the channel. A channel can
still be manually queued for close with a selector. This is how server
channels are closed for now. And this route allows timeout mechanisms on
normal channel closes to be implemented.
This logging message adds considerable noise to many REST tests, if you
are using something like HTTP basic auth in every API call or set any custom
header.
The log level moves from info to debug, so can still be seen if wanted.
The composite aggregation defers the collection of sub-aggregations to a second pass that visits documents only if they
appear in the top buckets. Though the scorer for sub-aggregations is not set on this second pass and generates an NPE if any sub-aggregation
tries to access the score. This change creates a scorer for the second pass and makes sure that sub-aggs can use it safely to check the score of
the collected documents.
The method `initiateChannel` on `TcpTransport` is explicit in that
channels can be connect asynchronously. All production implementations
do connect asynchronously. Only the blocking `MockTcpTransport`
connects in a synchronous manner. This avoids testing some of the
blocking code in `TcpTransport` that waits on connections to complete.
Additionally, it requires a more extensive method signature than
required for other transports.
This commit modifies the `MockTcpTransport` to make these connections
asynchronously on a different thread. Additionally, it simplifies that
`initiateChannel` method signature.
This is related to #27933. It introduces a jar named elasticsearch-core
in the lib directory. This commit moves the JarHell class from server to
elasticsearch-core. Additionally, PathUtils and some of Loggers are
moved as JarHell depends on them.
Today a primary shard transfers the most recent commit point to a
replica shard in a file-based recovery. However, the most recent commit
may not be a "safe" commit; this causes a replica shard not having a
safe commit point until it can retain a safe commit by itself.
This commits collapses the snapshot deletion policy into the combined
deletion policy and modifies the peer recovery source to send a safe
commit.
Relates #10708
We set the watermarks to low values in other test cases to prevent test
failures on nodes with low disk space (if the disk space is too low, the
test will fail anyway but we should not prematurely fail). This commit
sets the watermarks in the single-node test cases to avoid test failures
in such situations.
Relates #28134
This commit adds the ability to package multiple plugins in a single zip.
The zip file for a meta plugin must contains the following structure:
|____elasticsearch/
| |____ <plugin1> <-- The plugin files for plugin1 (the content of the elastisearch directory)
| |____ <plugin2> <-- The plugin files for plugin2
| |____ meta-plugin-descriptor.properties <-- example contents below
The meta plugin properties descriptor is mandatory and must contain the following properties:
description: simple summary of the meta plugin.
name: the meta plugin name
The installation process installs each plugin in a sub-folder inside the meta plugin directory.
The example above would create the following structure in the plugins directory:
|_____ plugins
| |____ <name_of_the_meta_plugin>
| | |____ meta-plugin-descriptor.properties
| | |____ <plugin1>
| | |____ <plugin2>
If the sub plugins contain a config or a bin directory, they are copied in a sub folder inside the meta plugin config/bin directory.
|_____ config
| |____ <name_of_the_meta_plugin>
| | |____ <plugin1>
| | |____ <plugin2>
|_____ bin
| |____ <name_of_the_meta_plugin>
| | |____ <plugin1>
| | |____ <plugin2>
The sub-plugins are loaded at startup like normal plugins with the same restrictions; they have a separate class loader and a sub-plugin
cannot have the same name than another plugin (or a sub-plugin inside another meta plugin).
It is also not possible to remove a sub-plugin inside a meta plugin, only full removal of the meta plugin is allowed.
Closes#27316
This commit changes IndexShardSnapshotStatus so that the Stage is updated
coherently with any required information. It also provides a asCopy()
method that returns the status of a IndexShardSnapshotStatus at a given
point in time, ensuring that all information are coherent.
Closes#26480
Several responses include the shards_acknowledged flag (indicating whether the
requisite number of shard copies started before the completion of the operation)
and there are two different getters used : isShardsAcknowledged() and isShardsAcked().
This PR deprecates the isShardsAcked() in favour of isShardsAcknowledged() in
CreateIndexResponse, RolloverResponse and CreateIndexClusterStateUpdateResponse.
Closes#27784
This is related to #27260. This commit moves the NioTransport from
:test:framework to a new nio-transport plugin. Additionally, supporting
tcp decoding classes are moved to this plugin. Generic byte reading and
writing contexts are moved to the nio library.
Additionally, this commit adds a basic MockNioTransport to
:test:framework that is a TcpTransport implementation for testing that
is driven by nio.
This commit sets the elasticsearch-nio code base in the
BootstrapForTesting class. This is necessary as that codebase needs
socket permissions. Setting the codebase manually is necessary as
intellij does not package our internal libraries when running tests.
Allows TransportResponse objects not to implement Streamable anymore. As an example, I've adapted the response handler for ShardActiveResponse, allowing the fields in that class to become final.
This commit adds the infrastructure to plugin building and loading to
allow one plugin to extend another. That is, one plugin may extend
another by the "parent" plugin allowing itself to be extended through
java SPI. When all plugins extending a plugin are finished loading, the
"parent" plugin has a callback (through the ExtensiblePlugin interface)
allowing it to reload SPI.
This commit also adds an example plugin which uses as-yet implemented
extensibility (adding to the painless whitelist).
This commit disables the nio transport as an option for the test
transport in integration tests. This is because it does not currently
run properly in intellij due to socket permissions. It should be
reenabled once #27881 is merged (and the proper permissions are added).
* Adds task dependenciesInfo to BuildPlugin to generate a CSV file with dependencies information (name,version,url,license)
* Adds `ConcatFilesTask.groovy` to concatenates multiple files into one
* Adds task `:distribution:generateDependenciesReport` to concatenate `dependencies.csv` files into a single file (`es-dependencies.csv` by default)
# Examples:
$ gradle dependenciesInfo :distribution:generateDependenciesReport
## Use `csv` system property to customize the output file path
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dcsv=/tmp/elasticsearch-dependencies.csv
## When branch is not master, use `build.branch` system property to generate correct licenses URLs
$ gradle dependenciesInfo :distribution:generateDependenciesReport -Dbuild.branch=6.x -Dcsv=/tmp/elasticsearch-dependencies.csv
Today we always recover a primary from the last commit point. However
with a new deletion policy, we keep multiple commit points in the
existing store, thus we have chance to find a good starting commit
point. With a good starting commit point, we may be able to throw away
stale operations. This PR rollbacks a primary to a starting commit then
recovering from translog.
Relates #10708
This is related to #27802. This commit adds a jar called
elasticsearch-nio that contains the base nio classes that will be used
for the tcp nio transport and eventually the http nio transport.
The jar does not depend on elasticsearch:core, so all references to core
have been removed.
Today when we get a metadata snapshot directly from a store directory,
we acquire a metadata lock, then acquire an IndexWriter lock. However,
we create a CheckIndex in IndexShard without acquiring the metadata lock
first. This causes a recovery failed because the IndexWriter lock can be
still held by method snapshotStoreMetadata. This commit makes sure to
create a CheckIndex under the metadata lock.
Closes#24481Closes#27731
Relates #24787
The last operation executed in IndicesClientDocumentationIT.testCreate()
is an asynchronous index creation. Because nothing waits for its
completion, on slow machines the index can sometimes be created after
the testCreate() test is finished, and it can fail the following test.
Closes#27754
When the first parameter of `ESTestCase#randomValueOtherThan` is `null`
then run the supplier until it returns non-`null`. Previously,
`randomValueOtherThan` just ran the supplier one time which was
confusing.
Unexpectedly, it looks like not tests rely on the original `null`
handling.
Closes#27821