This removes the deprecation for the geohash based setter to quickly fix the failure here: http://build-us-00.elastic.co/job/es_core_master_suse/3312
Reintroducing postponed until related test in groovy module is fixed. Need to figure out what went wrong when I ran the build locally w/o failure before.
This commit enableds strict settings validation on node startup. All settings
passed to elasticsearch either through system properties, yaml files or any other
way to pass settings must be registered and valid. Settings that are unknown ie. due to
typos or due to deprecation or removal will cause the node to NOT start up. Plugins
have to declare all their settings on the `SettingsModule#registerSetting` and settings for
plugins that are not installed must be removed.
This commit also removes the ability to specify the nodes name via `-Des.name` or just `name` in the
configuration files. The node name must be prefixed with the node prexif like `node.name: Boom`. Left over
usage of `name` will also cause startup to fail.
Adds to GeoDistanceSortBuilder:
* equals
* hashcode
* writeto/readfrom
* moves xcontent parsing logic over
* adds roundtrip tests
* fixes roundtrip test for xcontent by keeping points just as geopoints not geohashes internally
* fixes xcontent parsing of ignore_malformed if coerce is set/unset
* adds exception to sortMode setter to avoid setting invalid sort modes
Relates to #15178
When primary relocation completes, a cluster state is propagated that deactivates the old primary and marks the new primary as active.
As cluster state changes are not applied synchronously on all nodes, there can be a time interval where the relocation target has processed
the cluster state and believes to be the active primary and the relocation source has not yet processed the cluster state update and
still believes itself to be the active primary. This commit ensures that, before completing the relocation, the reloction source deactivates
writing to its store and delegates requests to the relocation target.
Closes#15900
Adds a container that represents a resource with reference counting capabilities. Provides operations to suspend acquisition of new references. Useful for resource management when resources are intermittently unavailable.
Closes#15956
Its what we say our maximum line length is in CONTRIBUTING.md and it'd be
nice to have a check on that. Unfortunately, we don't actually wrap all lines
at 140.
Cli tools currently catch all exceptions, and only print the exception
message, except when a special system property is set. Even with this
flag set, certain exceptions, like IOException, are captured and their
stack trace is always lost.
This change adds a UserError class, which can be used a cli tools to
specify a message to the user, as well as an exit status. All other
exceptions are propagated out of main, so java will exit with non-zero
and print the stack trace.
This commit fixes an inconsistency between ShardId#equals(Object),
ShardId#hashCode, and ShardId#compareTo(ShardId). In particular,
ShardId#equals(Object) compared only the numerical shard ID and the
index name, but did not compare the index UUID; a similar situation
applies to ShardId#compareTo(ShardId). However, ShardId#hashCode
incorporated the indexUUID into its calculation. This can lead to
situations where two ShardIds compare as equal yet have different hash
codes.
Closes#16319
The following settings were moved to Setting contents in HttpTransportSettings:
* http.cors.allow-origin
* http.port
* http.publish_port
* http.detailed_errors.enabled
* http.max_content_length
* http.max_chunk_size
* http.max_header_size
* http.max_initial_line_length
The following settings were removed:
* http.port
* http.netty.port
* http.netty.publish_port
* http.netty.max_content_length
* http.netty.max_chunk_size
* http.netty.max_header_size
* http.netty.max_initial_line_length
As a prerequisite for refactoring the whole PhraseSuggestionBuilder
to be able to be parsed and streamed from the coordinating node, the
DirectCandidateGenerator must implement Writeable, be able to parse
a new instance (fromXContent()) and later when transported to the
shard to generate a PhraseSuggestionContext.DirectCandidateGenerator.
Also adding equals/hashCode and tests and moving DirectCandidateGenerator
to its own DirectCandidateGeneratorBuilder class.
Long ago (#7834) the owner ship of the local disco node was centralized to the cluster service. LocalDiscovery is still created it's own disco node, which is not used by the cluster service and thus creating confusion (two nodes same name but different ids).
This commit also removes and optimization where when joining a new master we would first copy the master's metadata and only then pull in the rest of the cluster state (and it's nodes).
Closes#16317
The plugin cli currently is extremely lenient, allowing most errors to
simply be logged. This can lead to either corrupt installations (eg
partially installed plugins), or confused users.
This change rewrites the plugin cli to have almost no leniency.
Unfortunately it was not possible to remove all leniency, due in
particular to how config files are handled.
The following functionality was simplified:
* The format of the name argument to install a plugin is now an official
plugin name, maven coordinates, or a URL.
* Checksum files are required, and only checked, for official plugins
and maven plugins. Checksums are also only SHA1.
* Downloading no longer uses a separate thread, and no longer has a timeout.
* Installation, and removal, attempts to be atomic. This only truly works
when no config or bin files exist.
* config and bin directories are verified before copying is attempted.
* Permissions and user/group are no longer set on config and bin files.
We rely on the users umask.
* config and bin directories must only contain files, no subdirectories.
* The code is reorganized so each command is a separate class. These
classes already existed, but were embedded in the plugin cli class, as
an extra layer between the cli code and the code running for each command.
With the recent change to allow ESSingleNodeTestCase to specify plugins
and Version for the node, node creation became lazy, where it now
happens before the first test to run. However, there were many static
methods on ESSingleNodeTestCase that still try to access the node. This
change makes those methods non-static.
This commit removes and forbids the use of lowercase ells ('l') in long
literals because they are often hard to distinguish from the digit
representing one ('1').
Closes#16329
Previously it's possible to get errors like:
```
Caused by: NotSerializableExceptionWrapper[d:\shared_data\afs-issue-1-1\23]
```
Which doesn't tell us what the underlying exception type was that could
not be serialized.
This changes the message to prepend the
`ElasticsearchException.getExceptionName()` of the exception (which is a
underscore case of the class with the leading 'Elasticsearch' removed)
This commit amends a logging statement in TransportReplicationAction to
log the full shard ID instead of just the numerical shard ID (without
the index name).
When there is an exception thrown during pipeline creation within
Rest calls (in put pipeline, and simulate) We now return a structured
error response to the user with details around which processor's
configuration is the cause of the issue, or which configuration property
is misconfigured, etc.
Does so by introducing TaskListener which is just like ActionListener but
gets the Task as each parameter. Unlike ActionListener which is used
_everywhere_ you can only use TaskListener directly with TransportAction.
TransportAction under the covers uses an ActionListener implemetation that
closes over the task to call the TaskListener.
We are leaking all kinds of resources if something during Node#close() barfs.
This commit cuts over to a list of closeables to release resources that
also closed remaining services if one or more services fail to close.
Closes#13685
This is an issue due to some refactoring we did along the lines of Index.java etc.
We pass the index object to Set#contains which should actually be only it's name.
Closes#16299
This can cause premature closing of the underlying file descriptor immediately
if at the same time the thread is blocked on IO. The file descriptor will remain closed
and subsequent access to NIOFSDirectory will throw a ClosedChannelException.
Adds a method that emits a WordScorerFactory to all of the
three SmoothingModel implementatins that will be needed when
we switch to parsing the PhraseSuggestion on the coordinating
node and need to delay creating the WordScorer on the shards.
Now MasterNodeOperations, ReplicationAllShards, ReplicationSingleShard, BroadcastReplication and BroadcastByNode actions keep track of their parent tasks.
With this commit we revert the default value for the setting
'index.requests.cache.enable' again to false, which has been
set to true by accident in PR #16054.
Checked with @s1monw
Note for many of the settings we had a netty specific override to a generic transport setting. I have removed those as they are not needed imo (ex. "transport.connections_per_node.recovery" was overriden by "transport.netty.connections_per_node.recovery", which I removed). The netty prefix was kept for settings that are tied to the netty universe - ex. "transport.netty.max_cumulation_buffer_capacity"
Closes#16200
Today our logger settings are treated differently than all other
settings. This was a shortcut / workaround since it's quite messy but now
that we have strict validation and transactional updates we should integrate the
logger settings there as well. This commit makes the logger settings a build-in
settings updater and removes all the special code-paths we had for these settings.
Logger settings now also updateable and resetttable by passing `null`
In the early days Elasticsearch used to use the index name as the index identity. Around 1.0.0 we introduced a unique index uuid which is stored in the index setting. Since then we used that uuid in a few places but it is by far not the main identifier when working with indices, partially because it's not always readily available in all places.
This PR start to make a move in the direction of using uuids instead of name by making sure that the uuid is available on the Index class (currently just a wrapper around the name) and as such also available via ShardRouting and ShardId.
Note that this is by no means an attempt to do the right thing with the uuid in all places. In almost all places it falls back to the name based comparison that was done before. It is meant as a first step towards slowly improving the situation.
Closes#16217
Adds unit tests for blob operations and integration tests for repository operations. These tests can be used by repository plugins to verify that repository operations were implemented as expected by BlobStoreRepository.
These methods allow to modify and retrieve the content of pipelines, which are stored in the cluster state. Their actions names were already correct under the category cluster:admin/ingest/pipeline/* , the corresponding methods should be moved under client.admin().cluster() .
If we don't do this, and some path.conf is set when starting the tribe node, that path.conf will be ignored and the inner tribe clients will try to read elsewhere, where they most likely don't have permissions to read from.
Closes#16253Closes#16258
This commit limits the `index.translog.sync_interval` to a value not less than `100ms` and
removes the support for fsync on every operation which used to be enabled if `index.translog.sync_interval` was set to `0s`
Now this pr also only schedules an async fsync if the durability is set to `async`. By default not async task is scheduled.
Closes#16152
This commit method renames the ScriptEngineService interface methods
types, extensions, and sandboxed to getTypes, getExtensions, and
isSandboxed, respectively.
This commit converts the script mode settings to the new settings
infrastructure. This is a major refactoring of the handling of script
mode settings. This refactoring is necessary because these settings are
determined at runtime based on the registered script engines and the
registered script contexts.
The search_after parameter provides a way to efficiently paginate from one page to the next. This parameter accepts an array of sort values, those values are then used by the searcher to sort the top hits from the first document that is greater to the sort values.
This parameter must be used in conjunction with the sort parameter, it must contain exactly the same number of values than the number of fields to sort on.
NOTE: A field with one unique value per document should be used as the last element of the sort specification. Otherwise the sort order for documents that have the same sort values would be undefined. The recommended way is to use the field `_uuid` which is certain to contain one unique value for each document.
Fixes#8192
Doc values can now only be enabled by setting `doc_values: true` in the
mappings. Removing this feature also means that we can now fail mapping updates
that try to disable doc values.
Parsing is currently very lenient, which has the bad side-effect that if you
have a typo and pass eg. `store: fasle` this will actually be interpreted as
`store: true`. Since mappings can't be changed after the fact, it is quite bad
if it happens on an index that already contains data.
Note that this does not cover all settings that accept a boolean, but since the
PR was quite hard to build and already covers some main settirgs like `store`
or `doc_values` this would already be a good incremental improvement.
Doc values currently default to `true` if the field is indexed and not analyzed.
So setting `index:no` automatically disables doc values, which is not explicit
in the documentation.
This commit makes doc values default to true for numerics, booleans regardless
of whether they are indexed. Not indexed strings still don't have doc values,
since we can't know whether it is rather a text or keyword field. This
potential source of confusion should go away when we split `string` into `text`
and `keyword`.
This commit adds handling on the master side for shard failure requests
for shards that do not exist at the time that they are processed on the
master node (whether it be from errant requests, duplicate requests, or
both the primary and replica notifying the master of a shard
failure). This change is made because such shard failure requests should
always be considered successful (the failed shard is not there anymore),
but could be marked as failed if batched with a shard failure request
that does in fact fail. This avoids the possibility of an unexpected
catastrophic failure while applying the failed shards from causing such
a request to also be marked as failed setting in motion additional
failures.
Closes#16089
RescoreBuilder: Add parsing and creating of RescoreSearchContext
Adding the ability to parse from xContent to the rescore builder. Also making RescoreBuilder an abstract base class that encapsulates the window_size setting, with QueryRescoreBuilder as its only implementation at the moment.
Relates to #15559
When a Version is passed to `Settings#put(String, Version)` it's id is used as
an integer which should also be used to deserialize it on the consumer end.
Today AssertinLocalTransport expects Version#toString() to be used which can lead
to subtile bugs in tests.
Merge feature/ingest branch into master branch.
This adds the ingest feature to ES that allows to preprocess document before indexing on an ingest node.
By default a node is an ingest node. Documents are preprocessed via a pipeline. A pipeline consists
out of one or more processors Each processor makes one or more modifications to a document processed.
There are many types of processors available out-of-the-box that are designed to make a specific change to a document being processed. In a cluster many pipeline can be configured via dedicated pipeline APIs. An new option on the bulk
and index APIs allows to control what pipeline is picked for preprocessing. If no pipeline is specified then the ingest
feature is skipped and no preprocessing takes place.
The current RescoreBaseBuilder only serves as a container
for a pair of the optional `window_size` parameter and
the actual rescorer. Instead of a wrapper object, this makes
it an abstract class that conrete implementations like
QueryRescoreBuilder can extend.
With this commit we deprecate the widely misunderstood
fuzzy query but will still allow the fuzziness
parameter in match queries and suggesters.
Relates to #15760
Boolean parsing is now strict. Also added isIngestNode methods to DiscoveryNode to align this setting with the existing node.data node.master and node.client. Removed NodeModule#isIngestEnabled methods that are not needed anymore.
This commit removes non-reproducible randomness from
SearchWhileCreatingIndexIT. The cause of the non-reproducible randomness
is the use of a random draws for the shard preference inside of a
non-deterministic while loop. Because the inner while loop executed a
non-deterministic number of times, the draws for the next iteration of
the outer loop would be impacted by this making the random draws
non-reproducible. The solution is to move the random draws outside of
the while loop (just make a single draw for the prefernce and increment
with a counter), and remove the outer loop iteration instead using an
annotation to get the desired repetitions.
Closes#16208
This commit amends the logging statement in two places in
ShardStateAction to log the full shard ID instead of just the numerical
shard ID (without the index name).
This commit migrates all the settings under network service to the new settings infra.
It also adds some chaining utils to make fall back settings slightly less verbose.
Breaking (but I think acceptable) - network.tcp.no_delay and network.tcp.keep_alive used to accept the value `default` which make us not set them at all on netty. Our default was true so we weren't using this feature. I removed it and now we only accept a true boolean.
* `test.cluster.node.seed` was only used in one place where it wasn't adding value and was now replaced with a constant
* `tests.portsfile` is now an official setting and has been renamed to `node.portsfile`
this commit also convert `search.default_keep_alive` and `search.keep_alive_interval` to the new settings infrastrucutre
PhraseSuggestionBuilder uses three smoothing models internally.
In order to enable proper serialization / parsing from xContent
to the phrase suggester later, this change starts by making the
smoothing models writable, adding hashCode/equals and fromXContent.
Currently this fails when loading data from a segment, which means that it will
never fail on an empty index since it does not have segments.
Closes#16135
This commit also uses a try/finally with success pattern instead of catching
and excpetion. TranslogTests reproduce with `-Dtests.seed=DF6A38BAE739227A` every
time.
Closes#16142
TranslogWriter.closeIntoReader transfers the file ownership from a writer to a reader and closes the writer. If the transfer fails, we need to make sure we closed the underlying channel as the writer is already closes.
See: http://build-us-00.elastic.co/job/es_core_master_regression/4355
Relates to #16142
Today we run the metadata upgrade only on the current major version
but this should run on every upgrade at least once to ensure we don't miss
an important check or upgrade.
This commit fixes a minor issue with the shard ID that is logged while
indexing in DiscoveryWithServiceDisruptionsIT#testAckedIndexing. The
issue is that the operation routing hash could lead to a negative
remainder modulo the number of primaries (if the hash itself is
negative) but should instead be the normalized positive remainder. This
issue only impacts the logging of the shard ID as the actual shard ID
used during indexing is computed elsewhere but would cause the shard ID
in the affected logging statement to not match shard IDs that are logged
elsewhere.
this change allows us to open existing IndexMetaData that contains invalid, removed settings
or settings with invalid values and instead of filling up the users disks with exceptions we _archive_
the settings with and `archive.` prefix. This allows us to warn the user via logs (once it's archived) as
well as via external tools like the upgrade validation tool since those archived settings will be preserved
even over restarts etc. It will prevent indices from failing during the allocaiton phase but instead will
print a prominent warning on index metadata recovery from disk.
Affects match, multi_match, query_string and simple_query_string queries.
Direct bool queries are not affected anymore (minimum_should_match is applied even if the coord factor is disabled).
Default headers are now read-only fallbacks if the key is actually not in
the headers map. That way we never serialize them across the wire and also never
prevent them from being overwritten.
This has caused some test failures lately especially on window (which is likely caused
by the rather bad performance of the windows test machines).
See one failure here:
http://build-us-00.elastic.co/job/es_core_master_window-2008/2934/
This fix has now also a unittest that tests this issue separately.
* Cleaned up MapperService#searchFilter(...) and moved it DefaultSearchContext, since it that class was the only user. As part of the cleanup percolate query documents are no longer excluded from the search response.
* Removed resolveClosestNestedObjectMapper(...) method as it was no longer used.
* Removed DocumentTypeListener infrastructure. Before it was used by the percolator and parent/child, but these features no longer use it.
Closes#15924
Adding the ability to parse from xContent to the rescore builder.
Also making RescoreBuilder an interface and renaming the current
base builder that encapsulates the `window_size` setting the the
concrete rescorer implementation to RescoreBaseBuilder.
* Remove remaining 1.x bwc logic.
* Stop storing stored fields and indexed terms. The _parent field's only purpose is to support joins between parent and child type and only storing doc values is sufficient.
* In the mapping the parent field mapper is now known under '{parent}#{child}' key, because this is the field the parent/child join uses too.
* Added new sub fetch phase to lookup that _parent field from doc values field if that is required (before this was fetched from stored _parent field)
* Removed the ability to query directly on `_parent` in the query dsl. Instead the `{parent}#{child}` field should be used. Under the hood a doc values query is used instead of a term query, because only doc values fields are stored now.
* Added a new `parent_id` query to easily query child documents with a specific parent id without having to know what join field to use
* Also in aggregations `_parent` field can't be used any more and `{parent}#{child}` field name should be used instead to aggregate directly on the _parent join field.
While conceptually correct this call can be confusing since if a
realtime GET is exectued after refresh is called with realtime=true it might
fetch the wrong document out of the translog since the isRefreshNeeded guard
might return false once the new searcher is published but it will not
wait until the verision map is flushed and that can cause a slight race condition
if a subsequent call GETs a document that it expects to come from the index.
Ie. the `_size` mapper tests do this and expecte the GET call to return the document
from the index but it comes from the t-log due to this race.
This change only uses the guard in an async refresh that we schedule to prevent unnecessary
refresh calls but will allow API Refresh calls to have the somewhat less confusing semantics.
This was introduces lately only on unrelease major version branches.
Mostly these were pretty easy to clean up by insisting that the request
and response stays consistent across the filter. There are a few places
where we have to make assumptions in tests but those are valid assumptions
for the test.
This adds the ability to filter clients multiple times and allows to inherit the outer context if done so.
This also adds a method to access all tasks in the threadpool in an unwrapped fashion
This commit allows an integ test to wrap all clients that are exposed by the
InternalTestCluster which is useful for request interception or to add general headers
to request or to intercept client to server call even if the client calls are done from within
the test framework.
It defaults to false and when false it returns a task identifier. Right
now all you can do is get the task to see if it is still running. Once
the task finishes it vanishes and you can't get any information about it.
Today the TransportClient uses the given settings rather than the updated setting from the plugin
service to pass on to it's modules etc. It should use the updates settings instead.
Today we already validate all index level setting on startup. For global
settings we are not fully there yet since not all settings are registered.
Yet we can already validate the ones that are know if their values are parseable/correct.
This is better than nothing and an improvement to what we had before. Over time there will
be more an dmore setting converted and we can finally flip the switch.
Cut over all index scope settings to the new setting infrastrucuture
This change moves all index.* settings over to the new infrastructure. This means in short that:
- every setting that has an index scope must be registered up-front
- index settings are validated on index creation, template creation, index settings update
- node level settings starting with index.* are validated on node startup
- settings that are private to ES like index.version.created can only be set by tests when they install a specific test plugin.
- all index settings can be reset by passing null as their value on update
- all index settings defaults can be listed via the settings APIs
Closes#12854Closes#6732Closes#16032Closes#12790
This commit adds handling of channel failures when starting a shard to
o.e.c.a.s.ShardStateAction. This means that shard started requests
that timeout or occur when there is no master or the master leaves
after the request is sent will now be retried from here. The listener
for a shard state request will now only be notified upon successful
completion of the shard state request, or when a catastrophic
non-channel failure occurs.
This commit also refactors the handling of shard failure requests so
that the two shard state actions of shard failure and shard started
now share the same channel-retry and notification logic.
This commit fixes an issue in the handling of TransportExceptions in
ShardStateAction. There were two cases not being handled correctly.
- when the local node is shutting down, handlers will be notified with
a TransportException with a message starting "transport stopped"
- when the remote node disconnects, handlers will be notified with a
NodeDisconnectedException
In both of these cases, the cause of the exception will be null and this
was incorrectly being handled. The first case can passed to the listener
like any other critical non-channel failure, and the second case can be
handled by modifying the logic for detecting master channel exceptions.
There was a third case of NodeNotConnectedException that was not being
treated as a master channel exception but should be.
This commit adds an integration test that simulates the handling of a
shard failure request during a network partition. By isolating the
master from the cluster while a shard failed request is in flight, this
test simulates that we wait until a new master is elected and then retry
sending that shard failed request to the newly elected master.
This commit adds methods to CapturingTransport to separate local and
remote transport exceptions. The motivation for this change is that
local transport exceptions are delivered to listeners (usually, but not
always) wrapped in SendRequestTransportException while remote transport
exceptions are delivered to listeners wrapped in
RemoteTransportException. By making this distinction clear in the
CapturingTransport, this makes it less likely that tests will make
incorrect assumptions about the exceptions coming out of the transport
layer to listeners.
Closes#16057
Currently we use ref counting to manage the life cycles of a translog file. This was done to allow the creation of view and snapshots, making sure that the underlying files are available. This commit takes a simpler route based on the observation that a snapshot doesn't need to have it's own life cycle but rather can lift on the lifecycle of it's parent (translog or view). If code failes to adhere to this assumption it will get a channel already closed exception. As such, each file is now owned by a single owner and there is no need for reference counting. As part of the rewrite TranslogReader is renamed to BaseTranslogReader and ImmutableTranslogReader to TranslogReader
Also, I took the opportunity to clean up legacy translog readers we don't need in master.
Closes#15898
Sometimes action callers might be interested in having an access to the task that they have just initiated. This changes allows callers to get access to the Task object of the actions that they just started if the action supports the task management.
We have two similar tests with the same name, ContextAndHeaderTransportTests.
They shared lots of common code so I extracted much of it into
ActionRecordingPlugin, a plugin which records all action requests for later
inspection.
I also removed all the warnings from both tests. That made lang-mustache
compile cleanly without any custom -Xlint so I removed those. To remove
the warnings I had to add type parameters to ActionFilter which seemed
like a good idea anyway.
This commit tightens the load average test assertions by separating out
OS X as a system where we can make tighter assertions about the load
average values.
This commit fixes the test for load averages on FreeBSD. On FreeBSD, it
is either the case that linprocfs is mounted at /compat/linux/proc in
which case the load averages are available, or this is not the case and
no load average are available. Previously, the test on FreeBSD was
falling back to the catch all case which asserts that the five-minute
and fifteen-minute load averages are not available.
This commit normalizes the one-minute load average obtained from
OperatingSystemMXBean#getSystemLoadAverage to -1 when it is not
available. This is to reflect the Javadocs for this method saying "If
the load average is not available, a negative value is returned."
1. Gets guice out of the business of building ScoreFunctionParsers and
QueryParsers.
2. Moves QueryParser registration to SearchModule
3. Moves NamedWriteableRegistry construction out of guice and into Node and
TransportClient.
4. Moves shape registration into SearchModule so now all named writeable
registration is done in the SearchModule.
This is breaking for plugin authors. Instead of declaring new QueryParser
like:
```java
public void onModule(IndicesModule module) {
module.registerQueryParser(NewQueryParser.class);
}
```
you do it like:
```java
public void onModule(SearchModule module) {
module.registerQueryParser(NewQueryParser::new);
}
```
The QueryParser's argument no longer come from @Inject, now they come from
the declaration in the plugin. The above example is for a no-arg QueryParser.
Most of the QueryParsers in Elasticsearch are no-arg.
ScoreFunctionParsers have a similar but slightly different change. This:
```java
public void onModule(SearchModule module) {
module.registerFunctionScoreParser(NewFunctionScoreParser.class);
}
```
becomes
```java
public void onModule(SearchModule module) {
module.registerFunctionScoreParser(new NewFunctionScoreParser());
}
```
Since all known ScoreFunctionParsers have no arg constructors its simpler to
just build them at registration time rather than specify a supplier that is
used to build them later.
This commit modifies the load_average in the node stats API response
to be an object containing the one-minute, five-minute and
fifteen-minute load averages as fields (if those values are
available). Additionally, this commit modifies the cat nodes API
response to format the one-minute, five-minute and fifteen-minute load
averages as null if any of the respective values are not available.
This would be useful in order to only perform some validations in the case of
a mapping update and in cases when a mapping is restored eg. after a restart,
such as discussed in #15989.
This replaces the current `applyDefault` parameter which can be derived from
the mapping merge reason: the default mapping should be applied only in case of
a mapping update, if the mapping does not exist yet and if this is not the
default mapping.
Setting realTime to false in the get term vector request ensures,
that after a refresh the document is not fetched from the translog,
which seems to yield different in rare test runs.
The change likely triggering this was introduced in #15933
From this commit on we also validate all settings starting with `index.` that are
node-level settings configured in yaml files or via commandline arguments or system properties.
This check happens on node startup before the actual node is started.
The old infa has been removed in this commit such that nothing uses `DynamicSettings` anymore
and all index-scoped settings require to be registered before the node has fully started up.
This commit adds some additional assertions that test success is not
falsely indicated by adding assertions that success / failure methods
are not incorrectly invoked in failure / success scenarios.
Site plugins used to be used for things like kibana and marvel, but
there is no longer a need since kibana (and marvel as a kibana plugin)
uses node.js. This change removes site plugins, as well as the flag for
jvm plugins. Now all plugins are jvm plugins.
This commit adds a test for calculating the age of PrioritizedRunnable
that allows real clock time to elapse. The test ensures that at least
one millisecond has passed, and that the resolution of System#nanoTime
on the underlying system is actually able to detect this.
Relates #15995
This commit wraps a trace logging message in a trace logging level check
to prevent allocating an Object array (to hold the logging parameters)
and a String (from the interval) when trace logging is not enabled every
second (with the default index refresh interval) and every five seconds
(with the default translog sync interval) for every open index when
trace logging is not enabled.
This commit simplifies an equality test in IndexShard#sameException
where the messages for two exceptions are being compared. The previous
condition first tested logical equality if the left exception is not
null, and otherwise tested reference equality. There is a convenience
method since JDK 7 for testing equality in this way: Objects#equals.
Closes#16025
Fixes an issue caused by trying to add a LeafReader for a closed index to
ShardCoreKeyMap. It add itself to the map half way before throwing
AlreadyClosedException, leaking some references and causesing Elasticsearch
to refuse to shut down cleanly when testing.
This commit enhances the master channel exception test in
o.e.c.a.s.ShardStateActionTests to test that a retries loop as expected
when requests to the master repeatedly fail.
When a metadata mapper is not specified in a mapping update, it should default
to the current metadata mapper instead of the general default in order for the
update to not conflict with the current mapping.
Closes#15997
It had some funky errors, like lenient:true not working and queries with
two integer fields blowing up if there was no analyzer defined on the
query. This throws a bunch more tests at it and rejiggers how non-strings
are handled so they don't wander off into scary QueryBuilder-land unless
they have a nice strong analyzer to protect them.
This commit adds a trace log on a cluster state update while waiting for
a new master, and changes the log level on cluster service close to the
warn level.
This commit tightens the tests in o.e.c.a.s.ShardStateActionTests:
- adds a simple test for a success condition that validates the shard
failed request is correct and sent to the correct place
- remove redundant assertions from the no master and master left tests
- an assertion that success is not falsely indicated in the case of a
unhandled error
the environment is now available through NodeModule#getNode#getEnvironment and can be retrieved during onModule(NodeModule), no need for this indirection anymore using the BiFunction
This commit addresses a time unit conversion bug in calculating the age
of a PrioritizedRunnable. The issue was an incorrect conversion from
nanoseconds to milliseconds as instead the conversion was to
microseconds. This leads to the timeInQueue metric for pending tasks to
be off by three orders of magnitude.
* Folded IngestModule into NodeModule
* Renamed IngestBootstrapper to IngestService
* Let NodeService construct IngestService and removed the Guice annotations
* Let IngestService implement Closable
We have a performance bug that if a filter aggregation is below a terms
aggregation that has a cardinality of 1000, we will call Query.createWeight
1000 times as well. However, Query.createWeight can be a costly operation.
For instance in the case of a TermQuery it will seek the term in every
segment. Instead, we should create the Weight once, and then get as many
iterators as we need from this Weight.
I found this problem while trying to diagnose a performance regression while
upgrading from 1.7 to 2.1[1]. While the problem was not introduced in 2.x, the
fact that 1.7 cached very aggressively had hidden this problem, since you don't
need to seek the term anymore on a cached TermFilter.
Doing things once for every aggregator is not easy with the current API but
I discussed this with Colin and Aggregator factories will need to get an init
method for different reasons, where we will be able to put these steps that
need to be performed only once, no matter haw many aggregators need to be
created.
[1] https://discuss.elastic.co/t/aggregations-in-2-1-0-much-slower-than-1-6-0/38056/26
If an operating system reports -1 for the total bytes of a filesystem
path, we should ignore it when capturing the least and most available
statistics.
Relates to #15919
Squashed commit of the following:
commit 5d2258ffeff8a0d156295dcc754ab9b6cbb4b02e
Author: Lee Hinman <lee@writequit.org>
Date: Thu Jan 14 14:14:27 2016 -0700
Change test to test positive total with negative 'free' value
commit 927e61d4b39692fc147220a955b63b291ad80db5
Author: Lee Hinman <lee@writequit.org>
Date: Thu Jan 14 13:09:28 2016 -0700
Skip capturing least/most FS info for an FS with no total
If an operating system reports -1 for the total bytes of a filesystem
path, we should ignore it when capturing the least and most available
statistics.
Relates to #15919
This commit removes the timeout retry mechanism from ShardStateAction
allowing it to instead be handled by the general master channel retry
mechanism. The idea is that if there is a network issue, the master will
miss a ping timeout causing the channel to be closed which will expose
itself via a NodeDisconnectedException. At this point, we can just wait
for a new master and retry, as with any other master channel exception.
This commit moves the handling of channel failures when failing a shard
to o.e.c.a.s.ShardStateAction. This means that shard failure requests
that timeout or occur when there is no master or the master leaves after
the request is sent will now be retried from here. The listener for a
shard failed request will now only be notified upon successful
completion of the shard failed request, or when a catastrophic
non-channel failure occurs.
This commit adds a simulation of the master leaving after a shard
failure request has been sent. In this case, after a new cluster state
is published (simulating a new master having been elected), the request
to fail the shard should be retried.
This commit handles the situation when we are failing a shard and either
no master is known, or the known master left while failing the shard. We
handle this situation by waiting for a new master to be reelected, and
then sending the shard failed request to the new master.
To main concern with the dedicated ingest TP is that there are already many TPs and in the case with beefy nodes we would many more threads. In the case ingest isn't used the all these threads are just idle.
Adding serialization capabilities to RescoreBuilder and make
all QueryRescorer implement NamedWritable, also requiring
all implementations of RescoreBuilder.Rescorer to implement
equals() and hashCode.
In addition, the current rescore mode enumeration is pulled out to a
separate class to make sharing of constants easier between
the query builders XContent rendering coder and the parser.
This change affects get alias, get aliases as well as cat aliases. They all return closed indices too by default. get alias and get aliases also allow to return open indices only through the `expand_wildcards` option (set it to `open`).
Closes#14982
This undocumented setting was mainly used for testing and a safety net for
a new flushOnClose feature back in 1.x. We can now safely remove this setting.
The test usage now uses a mock plugin setting to achive the same.
This creates an reindex plugin with a very basic implementation that is
very like delete-by-query. New we'll integrate it with the task managament
work but for now this works.
Today we maintain a lot of settings on the shard level which are all index level settings.
In order to cut over to the new settings API where we register update listener we have to move
all of them on to the index level otherwise we need a way to un-register listeners which is error-prone
and requires additional handling when shards are closed. It's simpler and also more accurate to handle all of
them on the index level where we can trash the entire registry for update listener once the index goes out of scope.
Don't set the suppressed Exception in Translog.closeOnTragicEvent(Exception ex) if it is an
AlreadyClosedException. ACE is thrown by the TranslogWriter and as cause might
contain the Exception that we add the suppressed ACE to. We then end up with a
circular reference where Exception A has a suppressed Exception B that has as cause A.
This would cause a stackoverflow when we try to serialize it.
For a more detailed description see #15941closes#15941
ContextAndHeaders has a massive impact on the core infrastructure since it has to
be manually passed on to all relevant places across threads/network calls etc. For the same reason
it's also very error prone and easily forgotten on potentially relevant APIs.
The new ThreadContext is associated with a ThreadPool (node or transport client) and ensures that
headers and context registered on a current thread are inherited to new threads spawned, send across
the network to be deserialized on the receiver end as well as restored on the response handling thread
once the response is received.
This commit adds load averages to the OS stats on FreeBSD. For these
stats to be available, linprocfs must be available and mounted at
/compat/linux/proc.
`refresh_interval` is a per index setting but we interpret and maintain it per shard. This
change moves the refresh task outside of IndexShard to the IndexService where it logically belongs
and reuses scheduling infrastructure used for translog fsync (async commit).
This change will use the same task for all shards of an index while previously we used on thread/task
per shard to refresh. This will also prevent too many concurrent refreshes if there are many indices and
shards allocated on a single node.
These filters leak into highlighting and probably other places and cause
things like the type name to be highlighted when using
requireFieldMatch=false. We could have special hacks to keep them out of
highlighting but it feals better to keep them out of any variable named
"originalQuery".
Closes#15689
This commit adds convenience methods to o.e.t.t.CapturingTransport
that enables capturing requests and clearing the captured requests
with a single method. This is to simplify a common pattern in tests of
capturing requests, and then clearing the captured requests.