This commit fixes a minor issue with the shard ID that is logged while
indexing in DiscoveryWithServiceDisruptionsIT#testAckedIndexing. The
issue is that the operation routing hash could lead to a negative
remainder modulo the number of primaries (if the hash itself is
negative) but should instead be the normalized positive remainder. This
issue only impacts the logging of the shard ID as the actual shard ID
used during indexing is computed elsewhere but would cause the shard ID
in the affected logging statement to not match shard IDs that are logged
elsewhere.
this change allows us to open existing IndexMetaData that contains invalid, removed settings
or settings with invalid values and instead of filling up the users disks with exceptions we _archive_
the settings with and `archive.` prefix. This allows us to warn the user via logs (once it's archived) as
well as via external tools like the upgrade validation tool since those archived settings will be preserved
even over restarts etc. It will prevent indices from failing during the allocaiton phase but instead will
print a prominent warning on index metadata recovery from disk.
Affects match, multi_match, query_string and simple_query_string queries.
Direct bool queries are not affected anymore (minimum_should_match is applied even if the coord factor is disabled).
Default headers are now read-only fallbacks if the key is actually not in
the headers map. That way we never serialize them across the wire and also never
prevent them from being overwritten.
This has caused some test failures lately especially on window (which is likely caused
by the rather bad performance of the windows test machines).
See one failure here:
http://build-us-00.elastic.co/job/es_core_master_window-2008/2934/
This fix has now also a unittest that tests this issue separately.
* Cleaned up MapperService#searchFilter(...) and moved it DefaultSearchContext, since it that class was the only user. As part of the cleanup percolate query documents are no longer excluded from the search response.
* Removed resolveClosestNestedObjectMapper(...) method as it was no longer used.
* Removed DocumentTypeListener infrastructure. Before it was used by the percolator and parent/child, but these features no longer use it.
Closes#15924
Adding the ability to parse from xContent to the rescore builder.
Also making RescoreBuilder an interface and renaming the current
base builder that encapsulates the `window_size` setting the the
concrete rescorer implementation to RescoreBaseBuilder.
* Remove remaining 1.x bwc logic.
* Stop storing stored fields and indexed terms. The _parent field's only purpose is to support joins between parent and child type and only storing doc values is sufficient.
* In the mapping the parent field mapper is now known under '{parent}#{child}' key, because this is the field the parent/child join uses too.
* Added new sub fetch phase to lookup that _parent field from doc values field if that is required (before this was fetched from stored _parent field)
* Removed the ability to query directly on `_parent` in the query dsl. Instead the `{parent}#{child}` field should be used. Under the hood a doc values query is used instead of a term query, because only doc values fields are stored now.
* Added a new `parent_id` query to easily query child documents with a specific parent id without having to know what join field to use
* Also in aggregations `_parent` field can't be used any more and `{parent}#{child}` field name should be used instead to aggregate directly on the _parent join field.
While conceptually correct this call can be confusing since if a
realtime GET is exectued after refresh is called with realtime=true it might
fetch the wrong document out of the translog since the isRefreshNeeded guard
might return false once the new searcher is published but it will not
wait until the verision map is flushed and that can cause a slight race condition
if a subsequent call GETs a document that it expects to come from the index.
Ie. the `_size` mapper tests do this and expecte the GET call to return the document
from the index but it comes from the t-log due to this race.
This change only uses the guard in an async refresh that we schedule to prevent unnecessary
refresh calls but will allow API Refresh calls to have the somewhat less confusing semantics.
This was introduces lately only on unrelease major version branches.
Mostly these were pretty easy to clean up by insisting that the request
and response stays consistent across the filter. There are a few places
where we have to make assumptions in tests but those are valid assumptions
for the test.
This adds the ability to filter clients multiple times and allows to inherit the outer context if done so.
This also adds a method to access all tasks in the threadpool in an unwrapped fashion
This commit allows an integ test to wrap all clients that are exposed by the
InternalTestCluster which is useful for request interception or to add general headers
to request or to intercept client to server call even if the client calls are done from within
the test framework.
It defaults to false and when false it returns a task identifier. Right
now all you can do is get the task to see if it is still running. Once
the task finishes it vanishes and you can't get any information about it.
Today the TransportClient uses the given settings rather than the updated setting from the plugin
service to pass on to it's modules etc. It should use the updates settings instead.
Today we already validate all index level setting on startup. For global
settings we are not fully there yet since not all settings are registered.
Yet we can already validate the ones that are know if their values are parseable/correct.
This is better than nothing and an improvement to what we had before. Over time there will
be more an dmore setting converted and we can finally flip the switch.
Cut over all index scope settings to the new setting infrastrucuture
This change moves all index.* settings over to the new infrastructure. This means in short that:
- every setting that has an index scope must be registered up-front
- index settings are validated on index creation, template creation, index settings update
- node level settings starting with index.* are validated on node startup
- settings that are private to ES like index.version.created can only be set by tests when they install a specific test plugin.
- all index settings can be reset by passing null as their value on update
- all index settings defaults can be listed via the settings APIs
Closes#12854Closes#6732Closes#16032Closes#12790
This commit adds handling of channel failures when starting a shard to
o.e.c.a.s.ShardStateAction. This means that shard started requests
that timeout or occur when there is no master or the master leaves
after the request is sent will now be retried from here. The listener
for a shard state request will now only be notified upon successful
completion of the shard state request, or when a catastrophic
non-channel failure occurs.
This commit also refactors the handling of shard failure requests so
that the two shard state actions of shard failure and shard started
now share the same channel-retry and notification logic.
This commit fixes an issue in the handling of TransportExceptions in
ShardStateAction. There were two cases not being handled correctly.
- when the local node is shutting down, handlers will be notified with
a TransportException with a message starting "transport stopped"
- when the remote node disconnects, handlers will be notified with a
NodeDisconnectedException
In both of these cases, the cause of the exception will be null and this
was incorrectly being handled. The first case can passed to the listener
like any other critical non-channel failure, and the second case can be
handled by modifying the logic for detecting master channel exceptions.
There was a third case of NodeNotConnectedException that was not being
treated as a master channel exception but should be.
This commit adds an integration test that simulates the handling of a
shard failure request during a network partition. By isolating the
master from the cluster while a shard failed request is in flight, this
test simulates that we wait until a new master is elected and then retry
sending that shard failed request to the newly elected master.
This commit adds methods to CapturingTransport to separate local and
remote transport exceptions. The motivation for this change is that
local transport exceptions are delivered to listeners (usually, but not
always) wrapped in SendRequestTransportException while remote transport
exceptions are delivered to listeners wrapped in
RemoteTransportException. By making this distinction clear in the
CapturingTransport, this makes it less likely that tests will make
incorrect assumptions about the exceptions coming out of the transport
layer to listeners.
Closes#16057
Currently we use ref counting to manage the life cycles of a translog file. This was done to allow the creation of view and snapshots, making sure that the underlying files are available. This commit takes a simpler route based on the observation that a snapshot doesn't need to have it's own life cycle but rather can lift on the lifecycle of it's parent (translog or view). If code failes to adhere to this assumption it will get a channel already closed exception. As such, each file is now owned by a single owner and there is no need for reference counting. As part of the rewrite TranslogReader is renamed to BaseTranslogReader and ImmutableTranslogReader to TranslogReader
Also, I took the opportunity to clean up legacy translog readers we don't need in master.
Closes#15898
Sometimes action callers might be interested in having an access to the task that they have just initiated. This changes allows callers to get access to the Task object of the actions that they just started if the action supports the task management.
We have two similar tests with the same name, ContextAndHeaderTransportTests.
They shared lots of common code so I extracted much of it into
ActionRecordingPlugin, a plugin which records all action requests for later
inspection.
I also removed all the warnings from both tests. That made lang-mustache
compile cleanly without any custom -Xlint so I removed those. To remove
the warnings I had to add type parameters to ActionFilter which seemed
like a good idea anyway.
This commit tightens the load average test assertions by separating out
OS X as a system where we can make tighter assertions about the load
average values.
This commit fixes the test for load averages on FreeBSD. On FreeBSD, it
is either the case that linprocfs is mounted at /compat/linux/proc in
which case the load averages are available, or this is not the case and
no load average are available. Previously, the test on FreeBSD was
falling back to the catch all case which asserts that the five-minute
and fifteen-minute load averages are not available.
This commit normalizes the one-minute load average obtained from
OperatingSystemMXBean#getSystemLoadAverage to -1 when it is not
available. This is to reflect the Javadocs for this method saying "If
the load average is not available, a negative value is returned."
1. Gets guice out of the business of building ScoreFunctionParsers and
QueryParsers.
2. Moves QueryParser registration to SearchModule
3. Moves NamedWriteableRegistry construction out of guice and into Node and
TransportClient.
4. Moves shape registration into SearchModule so now all named writeable
registration is done in the SearchModule.
This is breaking for plugin authors. Instead of declaring new QueryParser
like:
```java
public void onModule(IndicesModule module) {
module.registerQueryParser(NewQueryParser.class);
}
```
you do it like:
```java
public void onModule(SearchModule module) {
module.registerQueryParser(NewQueryParser::new);
}
```
The QueryParser's argument no longer come from @Inject, now they come from
the declaration in the plugin. The above example is for a no-arg QueryParser.
Most of the QueryParsers in Elasticsearch are no-arg.
ScoreFunctionParsers have a similar but slightly different change. This:
```java
public void onModule(SearchModule module) {
module.registerFunctionScoreParser(NewFunctionScoreParser.class);
}
```
becomes
```java
public void onModule(SearchModule module) {
module.registerFunctionScoreParser(new NewFunctionScoreParser());
}
```
Since all known ScoreFunctionParsers have no arg constructors its simpler to
just build them at registration time rather than specify a supplier that is
used to build them later.
This commit modifies the load_average in the node stats API response
to be an object containing the one-minute, five-minute and
fifteen-minute load averages as fields (if those values are
available). Additionally, this commit modifies the cat nodes API
response to format the one-minute, five-minute and fifteen-minute load
averages as null if any of the respective values are not available.
This would be useful in order to only perform some validations in the case of
a mapping update and in cases when a mapping is restored eg. after a restart,
such as discussed in #15989.
This replaces the current `applyDefault` parameter which can be derived from
the mapping merge reason: the default mapping should be applied only in case of
a mapping update, if the mapping does not exist yet and if this is not the
default mapping.
Setting realTime to false in the get term vector request ensures,
that after a refresh the document is not fetched from the translog,
which seems to yield different in rare test runs.
The change likely triggering this was introduced in #15933
From this commit on we also validate all settings starting with `index.` that are
node-level settings configured in yaml files or via commandline arguments or system properties.
This check happens on node startup before the actual node is started.
The old infa has been removed in this commit such that nothing uses `DynamicSettings` anymore
and all index-scoped settings require to be registered before the node has fully started up.
This commit adds some additional assertions that test success is not
falsely indicated by adding assertions that success / failure methods
are not incorrectly invoked in failure / success scenarios.
Site plugins used to be used for things like kibana and marvel, but
there is no longer a need since kibana (and marvel as a kibana plugin)
uses node.js. This change removes site plugins, as well as the flag for
jvm plugins. Now all plugins are jvm plugins.
This commit adds a test for calculating the age of PrioritizedRunnable
that allows real clock time to elapse. The test ensures that at least
one millisecond has passed, and that the resolution of System#nanoTime
on the underlying system is actually able to detect this.
Relates #15995
This commit wraps a trace logging message in a trace logging level check
to prevent allocating an Object array (to hold the logging parameters)
and a String (from the interval) when trace logging is not enabled every
second (with the default index refresh interval) and every five seconds
(with the default translog sync interval) for every open index when
trace logging is not enabled.
This commit simplifies an equality test in IndexShard#sameException
where the messages for two exceptions are being compared. The previous
condition first tested logical equality if the left exception is not
null, and otherwise tested reference equality. There is a convenience
method since JDK 7 for testing equality in this way: Objects#equals.
Closes#16025
Fixes an issue caused by trying to add a LeafReader for a closed index to
ShardCoreKeyMap. It add itself to the map half way before throwing
AlreadyClosedException, leaking some references and causesing Elasticsearch
to refuse to shut down cleanly when testing.
This commit enhances the master channel exception test in
o.e.c.a.s.ShardStateActionTests to test that a retries loop as expected
when requests to the master repeatedly fail.
When a metadata mapper is not specified in a mapping update, it should default
to the current metadata mapper instead of the general default in order for the
update to not conflict with the current mapping.
Closes#15997
It had some funky errors, like lenient:true not working and queries with
two integer fields blowing up if there was no analyzer defined on the
query. This throws a bunch more tests at it and rejiggers how non-strings
are handled so they don't wander off into scary QueryBuilder-land unless
they have a nice strong analyzer to protect them.
This commit adds a trace log on a cluster state update while waiting for
a new master, and changes the log level on cluster service close to the
warn level.
This commit tightens the tests in o.e.c.a.s.ShardStateActionTests:
- adds a simple test for a success condition that validates the shard
failed request is correct and sent to the correct place
- remove redundant assertions from the no master and master left tests
- an assertion that success is not falsely indicated in the case of a
unhandled error
the environment is now available through NodeModule#getNode#getEnvironment and can be retrieved during onModule(NodeModule), no need for this indirection anymore using the BiFunction
This commit addresses a time unit conversion bug in calculating the age
of a PrioritizedRunnable. The issue was an incorrect conversion from
nanoseconds to milliseconds as instead the conversion was to
microseconds. This leads to the timeInQueue metric for pending tasks to
be off by three orders of magnitude.
* Folded IngestModule into NodeModule
* Renamed IngestBootstrapper to IngestService
* Let NodeService construct IngestService and removed the Guice annotations
* Let IngestService implement Closable
We have a performance bug that if a filter aggregation is below a terms
aggregation that has a cardinality of 1000, we will call Query.createWeight
1000 times as well. However, Query.createWeight can be a costly operation.
For instance in the case of a TermQuery it will seek the term in every
segment. Instead, we should create the Weight once, and then get as many
iterators as we need from this Weight.
I found this problem while trying to diagnose a performance regression while
upgrading from 1.7 to 2.1[1]. While the problem was not introduced in 2.x, the
fact that 1.7 cached very aggressively had hidden this problem, since you don't
need to seek the term anymore on a cached TermFilter.
Doing things once for every aggregator is not easy with the current API but
I discussed this with Colin and Aggregator factories will need to get an init
method for different reasons, where we will be able to put these steps that
need to be performed only once, no matter haw many aggregators need to be
created.
[1] https://discuss.elastic.co/t/aggregations-in-2-1-0-much-slower-than-1-6-0/38056/26
If an operating system reports -1 for the total bytes of a filesystem
path, we should ignore it when capturing the least and most available
statistics.
Relates to #15919
Squashed commit of the following:
commit 5d2258ffeff8a0d156295dcc754ab9b6cbb4b02e
Author: Lee Hinman <lee@writequit.org>
Date: Thu Jan 14 14:14:27 2016 -0700
Change test to test positive total with negative 'free' value
commit 927e61d4b39692fc147220a955b63b291ad80db5
Author: Lee Hinman <lee@writequit.org>
Date: Thu Jan 14 13:09:28 2016 -0700
Skip capturing least/most FS info for an FS with no total
If an operating system reports -1 for the total bytes of a filesystem
path, we should ignore it when capturing the least and most available
statistics.
Relates to #15919
This commit removes the timeout retry mechanism from ShardStateAction
allowing it to instead be handled by the general master channel retry
mechanism. The idea is that if there is a network issue, the master will
miss a ping timeout causing the channel to be closed which will expose
itself via a NodeDisconnectedException. At this point, we can just wait
for a new master and retry, as with any other master channel exception.
This commit moves the handling of channel failures when failing a shard
to o.e.c.a.s.ShardStateAction. This means that shard failure requests
that timeout or occur when there is no master or the master leaves after
the request is sent will now be retried from here. The listener for a
shard failed request will now only be notified upon successful
completion of the shard failed request, or when a catastrophic
non-channel failure occurs.
This commit adds a simulation of the master leaving after a shard
failure request has been sent. In this case, after a new cluster state
is published (simulating a new master having been elected), the request
to fail the shard should be retried.
This commit handles the situation when we are failing a shard and either
no master is known, or the known master left while failing the shard. We
handle this situation by waiting for a new master to be reelected, and
then sending the shard failed request to the new master.
To main concern with the dedicated ingest TP is that there are already many TPs and in the case with beefy nodes we would many more threads. In the case ingest isn't used the all these threads are just idle.
Adding serialization capabilities to RescoreBuilder and make
all QueryRescorer implement NamedWritable, also requiring
all implementations of RescoreBuilder.Rescorer to implement
equals() and hashCode.
In addition, the current rescore mode enumeration is pulled out to a
separate class to make sharing of constants easier between
the query builders XContent rendering coder and the parser.
This change affects get alias, get aliases as well as cat aliases. They all return closed indices too by default. get alias and get aliases also allow to return open indices only through the `expand_wildcards` option (set it to `open`).
Closes#14982
This undocumented setting was mainly used for testing and a safety net for
a new flushOnClose feature back in 1.x. We can now safely remove this setting.
The test usage now uses a mock plugin setting to achive the same.
This creates an reindex plugin with a very basic implementation that is
very like delete-by-query. New we'll integrate it with the task managament
work but for now this works.
Today we maintain a lot of settings on the shard level which are all index level settings.
In order to cut over to the new settings API where we register update listener we have to move
all of them on to the index level otherwise we need a way to un-register listeners which is error-prone
and requires additional handling when shards are closed. It's simpler and also more accurate to handle all of
them on the index level where we can trash the entire registry for update listener once the index goes out of scope.
Don't set the suppressed Exception in Translog.closeOnTragicEvent(Exception ex) if it is an
AlreadyClosedException. ACE is thrown by the TranslogWriter and as cause might
contain the Exception that we add the suppressed ACE to. We then end up with a
circular reference where Exception A has a suppressed Exception B that has as cause A.
This would cause a stackoverflow when we try to serialize it.
For a more detailed description see #15941closes#15941
ContextAndHeaders has a massive impact on the core infrastructure since it has to
be manually passed on to all relevant places across threads/network calls etc. For the same reason
it's also very error prone and easily forgotten on potentially relevant APIs.
The new ThreadContext is associated with a ThreadPool (node or transport client) and ensures that
headers and context registered on a current thread are inherited to new threads spawned, send across
the network to be deserialized on the receiver end as well as restored on the response handling thread
once the response is received.
This commit adds load averages to the OS stats on FreeBSD. For these
stats to be available, linprocfs must be available and mounted at
/compat/linux/proc.
`refresh_interval` is a per index setting but we interpret and maintain it per shard. This
change moves the refresh task outside of IndexShard to the IndexService where it logically belongs
and reuses scheduling infrastructure used for translog fsync (async commit).
This change will use the same task for all shards of an index while previously we used on thread/task
per shard to refresh. This will also prevent too many concurrent refreshes if there are many indices and
shards allocated on a single node.
These filters leak into highlighting and probably other places and cause
things like the type name to be highlighted when using
requireFieldMatch=false. We could have special hacks to keep them out of
highlighting but it feals better to keep them out of any variable named
"originalQuery".
Closes#15689
This commit adds convenience methods to o.e.t.t.CapturingTransport
that enables capturing requests and clearing the captured requests
with a single method. This is to simplify a common pattern in tests of
capturing requests, and then clearing the captured requests.
Also renaming internal methods to reflect that they are dealing with
jts coordinates. Also renamed the list() to build() method for creating
the coordinates lists and adding constructors to PolygonBuilder that
take CoordinatesBuilders and implicitely call build() on them.
Rather than failing the request, when a node with node.ingest set to false receives an index or bulk request with a pipeline id, it should try to redirect the request to another node with node.ingest set to true. If there are no node with ingest set to true based on the current cluster state, an exception will be returned and the request will fail. Note that in case there are no ingest nodes and bulk has a pipeline id specified only for a subset of index requests, the whole bulk will fail.
The indexing buffer on a node (default: 10% of the JVM heap) is now a "shared pool" across all shards on that node. This way, shards doing intense indexing can use much more than other shards doing only light indexing, and only once the sum of all indexing buffers across all shards exceeds the node's indexing buffer will we ask shards to move recently indexed documents to segments on disk.
determined, the UpdateRequest would still try to parse the content instead
of throwing the standard ElasticsearchParseException. This manifests when
passing illegal JSON in the request body that does not begin with a '{'.
By trying to parse the content from an unknown request body content type,
the UpdateRequest was throwing a null pointer exception. This has been
fixed to throw an ElasticsearchParseException, to be consistent with the
behavior of all other requests in the face of undecipherable request
content types.
Closes#15822
1. Uses forbidden patterns to prevent things from referencing
java.io.Serializable or from mentioning serialVersionUID.
2. Uses -Xlint:-serial so we don't have to hear from javac that we aren't
declaring serialVersionUID on any classes that we make that happen to extend
Serializable.
3. Remove Serializable and serialVersionUID declarations.
I didn't use forbidden apis because it doesn't look like it has a way to ban
explicitly implementing Serializable. If you try to ban Serializable with
forbidden apis you end up banning all Exceptions and all Strings.
Closes#15847
So far the validation of geo shapes was only taking place in the
parse methods in ShapeBuilder. With the recent refactoring we no
longer can rely on shapes being parsed from json, so the same kind
of validation should take place when just using the java api.
A lot of validation concerns the number of points a shape needs to
have in order to be valid. Since this is not possible with current
builders where points can be added one by one, the builder constructors
are changed to require the mandatory parameters and validate those
already at construction time. To help with constructing longer lists
of points, a new utility PointsListBuilder is instroduces which can
produce list of coordinates accepted by most of the other shape builder
constructors.
Also adding tests for invalid shape exceptions to the already existing
shape builder tests.
Now that the ingest infra is part of es core we can remove some code that was required by the plugin and have a better integration with es core. We allow to specify the pipeline id in bulk and index as a request parameter, we have a REST filter that parses it and adds it to the relevant action request. That is not required anymore, as we can add this logic to RestIndexAction and RestBulkAction directly, no need for a filter. Also, we can allow to specify a pipeline id for each index requests in a bulk request. The small downside of this is that the ingest filter has to go over each item of a bulk request, all the time, to figure out whether they have a pipeline id.
Running requests via the percolate or mpercolate api is irrelevant.
What is relevant is that when nodes come back that they report the expected number of matches.
We fail today with ClusterBlockExceptions if an alias expands to a closed index
during search since we miss to check the index option down the road after we expanded
aliases.
Closes#13278
This commit removes and now forbids use of
org.apache.lucene.index.IndexWriter#isLocked as this method was
deprecated in LUCENE-6508. The deprecation is due to the fact that
checking if a lock is held before acquiring that lock is subject to a
time-of-check-to-time-of-use race condition. There were three uses of
IndexWriter#isLocked in the code base:
- a logging statement in o.e.i.e.InternalEngine where we are already in
an exceptional condition that the lock was held; in this case,
logging whether or not the directory is locked is superfluous
- in o.e.c.l.u.VersionsTests where we were verifying that a write lock
is released upon closing an IndexWriter; in this case, the check is
not needed as successfully closing an IndexWriter releases its
write lock
- in o.e.t.s.MockFSDirectoryService where we were verifying that a
directory is not write-locked before (implicitly) trying to obtain
such a write lock in org.apache.lucene.index.CheckIndex#<init> (this
is the exact type of a situation that is subject to a race
condition); in this case we can proceed by just (implicitly) trying
to obtain the write lock and failing if we encounter a
LockObtainFailedException
This commit reduces the former ShardIndexinService to a simple stats/metrics
class, moves IndexingSlowLog to the IndexService level since it can be shared
across shards of an index and is now hidden behind IndexingOperationListener.
IndexingOperationListener is now a first class citizen in IndexShard and is passed
in from IndexService.
It had some funky errors, like lenient:true not working and queries with
two integer fields blowing up if there was no analyzer defined on the
query. This throws a bunch more tests at it and rejiggers how non-strings
are handled so they don't wander off into scary QueryBuilder-land unless
they have a nice strong analyzer to protect them.
Closes#15860
It currently tries to align to the page size (16KB) by default. However, this
might waste a significant memory (if many BytesStreamOutputs are allocated)
and is also useless given that BytesStreamOutput does not recycle (on the
contrary to ReleasableBytesStreamOutput). So the initial size has been changed
to 0.
Closes#15789
This commit removes and now forbids all uses of
java.util.concurrent.ThreadLocalRandom across the codebase. The
underlying issue with ThreadLocalRandom is that it can not be
seeded. This means that if ThreadLocalRandom is used in production code,
then tests that cover any code path containing ThreadLocalRandom will be
prevented from being reproducible by use of ThreadLocalRandom. Instead,
using org.elasticsearch.common.random.Randomness#get will give
reproducible sources of random when running under tests and otherwise
still give an instance of ThreadLocalRandom when running as production
code.
On slow machines when this test randomly picks a large number of shards it can occasionally take more than 32.5 seconds to snapshot all shards. That is causing the test to miss the second to last assert in awaitsBusy at 32.5 seconds and then timeout in BlockingClusterStateListener at 60 seconds. Due to the timeout, the pending task queue is cleaned before the last awaitsBusy assert at 65 seconds and as a result the last assert runs on a completely empty queue and fails with a very confusing assert error.
This commit makes the timeout in BlockingClusterStateListener to occur after the last assert in assertBusyPendingTasks and therefore allows assertBusyPendingTasks to perform the last assert before cleaning the pending tasks queue takes place.
This commit also reduces the maximum number of shards used in the test to 10 in order to speed up this test.
We have a bug that makes all per-index bitset caches store bitsets for all
indices. In the case that you have many indices, which is fairly common with
time-based data, this could translate to a lot of wasted memory.
Closes#15820
`ScheduledThreadPoolExecutor` allows you to schedule tasks to run once or periodically at the future. If such a task throws an exception, that exception is caught and reported in the future that `ScheduledThreadPoolExecutor#schedule` returns. However, we typically do not capture the future / do not test it for errors. This results in exception being swallowed and not reported. To mitigate this we now wrap any command in a LoggingRunnable (already used for periodic tasks). Also, RunnableCommand is changed not to swallow exception but percolate them further for reporting by the future.
Closes#15824
CRUD and simulate apis work now fine, every node has the pipelines in memory, but node.ingest disables ingestion, meaning that any index or bulk request with a pipeline id is going to fail
This commit adds a new test that can throw an IOException at any point in time
and ensures that all previously synced documents can be successfully recovered after hitting
an excepiton.
Relates to #15788
Warmers are now barely useful and will be removed in 3.0. Note that this only
removes the warmer API and query-based warmers. We still have warmers internally
for eg. global ordinals.
Close#15607
Right now we define the same sort of methods as taking String arrays and
string varargs. We should standardize on one and varargs is easier to
call so lets use varargs!
This commit uses a CyclicBarrier to correctly and simply sychronize the
driver and test threads in
ClusterServiceIT#testClusterStateUpdateTasksAreExecutedInOrder.
This commit speeds up GeoShapeQueryTests by reducing the size of the random generated shapes and defaulting geo_shape indexes to use quadtree (more efficient for shapes) over geohash.
test failed, because now the percolator returns upto 10 matches whereas before this was unbounded. The test has been updated to take this in account by checking the total count instead of the number of matches
This commit cleans up some of the assertions in
TransportReplicationActionTests#runReplicateTest:
- use a Map to track actual vs. expected requests
- assert that no request was sent to the local node
- use RoutingTable#shardRoutingTable convenience method
- explicitly use false in boolean conditions
- clarify requests are expected on replica shards when assigned and
execution on replicas is true
- test ShardRouting equality when checking the failed shard request
This commit adds tighter assertions in
TransportReplicationActionTests#runReplicateTest that replication
requests are sent to the correct shard copies.
This commit addresses an issue when handling a failed replication
request against a relocating target shard. Namely, if a replication
request fails against the target of a relocation we currently fail both
the source and the target. This leads to an unnecessary
recovery. Instead, only the target of the relocation should be failed.
We will keep this abstractions as it's convenient, otherwise IngestDocument would depend on ScriptService directly, and would explicitly rely on mustache which is not even part of core. better to have the interface in core, and the impl as part of the ingest plugin, which relies on mustache, shipped with core by default.
* Added percolator field mapper that extracts the query terms and indexes these terms with the percolator query.
* At percolate time these extracted terms are used to query percolator queries that are like to be evaluated. This can significantly cut down the time it takes to percolate. Whereas before all percolator queries were evaluated if they matches with the document being percolated.
* Changes made to percolator queries are no longer immediately visible, a refresh needs to happen before the changes are visible.
* By default the percolate api only returns upto 10 matches instead of returning all matching percolator queries.
* Made percolate more modular, so that it is easier to add unit tests.
* Added unit tests for the percolator.
Closes#12664Closes#13646
We today delete the translog-N.tlog file if any subsequent operation fails
but we might actually be in a good state if for instance the creation of the writer
failes after we sucessfully baked the new translog generation into the checkpoint. In this situation
we used to delete the translog-N.tlog file and failed on the next recovery of the translog with a
NoSuchFileException | FileNotFoundException just like in https://discuss.elastic.co/t/cannot-recover-index-because-of-missing-tanslog-files/38336
This commit changes the behavior and cleans up that limbo state on recovery if we already have a generation+1 file written but not baked into
the checkpoint we remove that file but only if the previous ckp file has already been renamed otherwise we know we can't be in this state.
We has a postIndex|DeleteUnderLock listener callback to load percolator
queries which is entirely private to the index shard in the meanwhile. Yet,
it still calls an external callback while holding an indexing lock which is scary
since we have no control over how long the operation could possibly take.
This commit decouples the percolator registry entirely from the ShardIndexingService
by pessimistically fetching percolator documents from the the engine using realtime get.
Even in situations where the same document is changed concurrently we will eventually end up
in the correct state without loosing an update. This also moves the index throtteling stats directly into
the engine to entirely remove the need for the dependency between InternalEngine and ShardIndexingService.
Relocating a non-primary shard from one node to another is actually done by recovering from the active
primary shard in the cluster, and not the node that we are logically relocating from.
Closes#15775
This commit addresses an issue where a cluster state task listener
throwing an exception could prevent other listeners from being notified,
and could prevent the executor from receiving notifications that a new
cluster state was published. Additionally, this commit also addresses a
similar issue for executors handling cluster state publication
notifications.
Adds task manager class and enables all activities to register with the task manager. Currently, the immutable Transport*Activity class represents activity itself shared across all requests. This PR adds and an additional structure Task that keeps track of currently running requests and can be used to communicate with these requests using TransportTaskAction.
Related to #15117
We used to write into an in-memory buffer and if necessary also allow reading
from the memory buffer if the some translog locations that are not flushed to
the channel need to be read. This commit hides all writing behind a buffered output
stream and if ncecessary flushes all buffered data to the channel for reading. This allows
for several simplifcations like reusing javas build in BufferedOutputStream and removes the
need for read write locks on the translog writer. All thread safety is now achived using
the synchronized primitive.
Several IOExceptions are always wrapped in an NotSerializableWrapper which is
annoying to read. These exceptions are important to get right across the network
and we should support the important ones that indicate problems on the Filesystem.
This commit also adds general support for IOException to preserve the parent type
across the network if no specific type is serializable.
As a default in V2, the GeoPointField.stored option was set to true. Since this consumes disk space with no positive benefit the default stored option is being reverted back to false.
This commit restores logging the ShardRouting#shardId at the front of
the log messages in ShardStateAction. The reason for this is so that
shard-level log messages have the format "[component][node][shard]
message".
There are two bugs:
- the 'global_ordinals_low_cardinality' mode requires a fielddata-based impl so
that it can extract the segment to global ordinal mapping
- the 'global_ordinals_hash' mode abusively casts to the values source to a
fielddata-based impl while it is not needed
Closes#14882
This commit modifies the handling of cluster states in
o.e.c.a.s.ShardStateAction so that all necessary state is obtained
externally to the ShardStateAction#shardFailed and
ShardStateAction#shardStarted methods. This refactoring permits the
removal of the ClusterService field from ShardStateAction.
This commit applies a minor code cleanup to
o/e/c/ClusterStateObserver.java. In particular
- employ the diamond operator instead of explicitly specifying a
generic type parameter
- use 'L' instead of 'l' for specifying a long literal
- remove redundant static modifier on a nested interface
- remove redundant public access modifiers on interface methods
- reformat the declaration of the four-argument ChangePredicate#apply
- simplify the bodies of ValidationPredicate#apply
This commit fixes multiField support for GeoPointFieldMapper by passing an externalValueContext to the multiField parser. Unit testing is added for multi field coverage.
The MapperService doesn't currently check the
index.mapper.dynamic setting during index creation,
so indices can be created with dynamic mappings even
if this setting is false. Add a check that throws an
exception in this case. Fixes#15381
DedicatedClusterSnapshotRestoreIT#testRestoreIndexWithMissingShards took ~1.5 min to finish
due to timeouts that are applied if not all shards are allocated. Now that the index that has
unallocated shareds is not refreshed the test is more reasonable and runs in 15 sec
With this commit we check more precisely on the result of a bulk
request. It could either be ok, fail or be rejected due to resource
constraints. Previously, we have relied that by default we never
get rejected.
However, this is a valid condition even when retrying. With this
commit we check that we either retried often enough that we don't
get rejected *and* if we got rejected that we maxed out the number
of specified retries.
When specifying a string field, you can either do:
```
{
"foo": "bar"
}
```
or
```
{
"foo": {
"value": "bar",
"boost": 42
}
}
```
The latter option is now removed.
Closes#15388
Removal of the pattern node.addShard() -> calculate weight -> node.removeShard() which is expensive as, beside map lookups, it invalidates caching of precomputed values in ModelNode and ModelIndex. Replaced by adding an additional parameter to the weight function which accounts for the added / removed shard.
This removes the backward compatibility layer with pre-2.0 indices, notably
the extraction of _id, _routing or _timestamp from the source document when a
path is defined.
Today when dynamically mapping a field that is already defined in another type,
we use the regular dynamic mapping logic and try to copy some settings to avoid
introducing conflicts. However this is quite fragile as we don't deal with every
existing setting. This proposes a different approach that will just reuse the
shared field type.
Close#15568
FunctionScoreQuery should do two things that it doesn't do today:
- propagate the two-phase iterator from the wrapped scorer so that things are
still executed efficiently eg. if a phrase or geo-distance query is wrapped
- filter out docs that don't have a high enough score using two-phase
iteration: this way the score is only checked when everything else matches
While doing these changes, I noticed that minScore was ignored when scores were
not needed and that explain did not take it into account, so I fixed these
issues as well.
This changes a couple of things:
Mappings are truly immutable. Before, each field mapper stored a
MappedFieldTypeReference that was shared across fields that have the same name
across types. This means that a mapping update could have the side-effect of
changing the field type in other types when updateAllTypes is true. This works
differently now: after a mapping update, a new copy of the mappings is created
in such a way that fields across different types have the same MappedFieldType.
See the new Mapper.updateFieldType API which replaces MappedFieldTypeReference.
DocumentMapper is now immutable and MapperService.merge has been refactored in
such a way that if an exception is thrown while eg. lookup structures are being
updated, then the whole mapping update will be aborted. As a consequence,
FieldTypeLookup's checkCompatibility has been folded into copyAndAddAll.
Synchronization was simplified: given that mappings are truly immutable, we
don't need the read/write lock so that no documents can be parsed while a
mapping update is being processed. Document parsing is not performed under a
lock anymore, and mapping merging uses a simple synchronized block.
This adds the required changes/checks so that the build can run on
FreeBSD.
There are a few things that differ between FreeBSD and Linux:
- CPU probes return -1 for CPU usage
- `hot_threads` cannot be supported on FreeBSD
From OpenJDK's `os_bsd.cpp`:
```c++
bool os::is_thread_cpu_time_supported() {
#ifdef __APPLE__
return true;
#else
return false;
#endif
}
```
So this API now returns (for each FreeBSD node):
```
curl -s localhost:9200/_nodes/hot_threads
::: {Devil Hunter Gabriel}{q8OJnKCcQS6EB9fygU4R4g}{127.0.0.1}{127.0.0.1:9300}
hot_threads is not supported on FreeBSD
```
- multicast fails in native `join` method - known bug:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=193246
Which causes:
```
1> Caused by: java.net.SocketException: Invalid argument
1> at java.net.PlainDatagramSocketImpl.join(Native Method)
1> at java.net.AbstractPlainDatagramSocketImpl.join(AbstractPlainDatagramSocketImpl.java:179)
1> at java.net.MulticastSocket.joinGroup(MulticastSocket.java:323)
1> at org.elasticsearch.plugin.discovery.multicast.MulticastChannel$Plain.buildMulticastSocket(MulticastChannel.java:309)
```
So these tests are skipped on FreeBSD.
Resolves#15562
In this commit we increase the queue size of the bulk pool in
BulkProcessorRetryIT to make it less sensitive.
As this test case should stress the pool so bulk processor needs to
back off but not so much that the backoff policy will give up at
some point (which is a valid condition), we still keep it below the
default queue size of 50.
Some tests, but in particular CodecTests, are slow because they test all
versions that ever existed even though they should only test supported
versions.
Today we throttle recoveries only for incoming recoveries. Nodes that have a lot
of primaries can get overloaded due to too many recoveries. To still keep that at bay
we limit the number of threads that are sending files to the target to overcome this problem.
The right solution here is to also throttle the outgoing recoveries that are today unbounded on
the master and don't start the recovery until we have enough resources on both source and target nodes.
The concurrency aspects of the recovery source also added a lot of complexity and additional threadpools
that are hard to configure. This commit removes the concurrent streamns notion completely and sends files
in the thread that drives the recovery simplifying the recovery code considerably.
Outgoing recoveries are not throttled on the master via a allocation decider.
We added this undocumented realtime setting as backup plan long ago
but to date we haven't had a situation where it was a problem. It's reducing
the number of filehandles in the NRT case dramatically and should always be enabled.