TRA currently resolves incoming requests to IndexShards in order to acquire operations locks on them. There is no need for all subclasses to have to go through the same IndicesService/IndexService song and dance. Also, doing it once means we don't need to worry about edge cases where the shard is removed while a TRA is in flight.
This commit improves the logic flow of BalancedShardsAllocator in
preparation for separating out components of this class to be used
in the cluster allocation explain APIs. In particular, this commit:
1. Adds a minimum value for the index/shard balance factor settings (0.0)
2. Makes the Balancer data structures immutable and pre-calculated at
construction time.
3. Removes difficult to follow labeled blocks / GOTOs
4. Better logic for skipping over the same replica set when one of
the replicas received a NO decision
5. Separates the decision making logic for a single shard from the logic
to iterate over all unassigned shards.
Before this change the processing of the ranges in the date range (and
other range type) aggregations was done when the Aggregator was created.
This meant that the SearchContext did not know that now had been used in
a range until after the decision to cache was made.
This change moves the processing of the ranges to the aggregation builders
so that the search context is made aware that now has been used before
it decides if the request should be cached
This commit adds a did you mean feature to the strict REST params error
message. This works by comparing any unconsumed parameters to all of the
consumer parameters, comparing the Levenstein distance between those
parameters, and taking any consumed parameters that are close to an
unconsumed parameter as candiates for the did you mean.
* Fix pluralization in strict REST params message
This commit fixes the pluralization in the strict REST parameters error
message so that the word "parameter" is not unconditionally written as
"parameters" even when there is only one unrecognized parameter.
* Strength strict REST params did you mean test
This commit adds an unconsumed parameter that is too far from every
consumed parameter to have any candidate suggestions.
Relates #20747
This commit changes the strict REST parameters message to say that
unconsumed parameters are unrecognized rather than unused. Additionally,
the test is beefed up to include two unused parameters.
Relates #20745
Before this change the processing of the ranges in the date range (and
other range type) aggregations was done when the Aggregator was created.
This meant that the SearchContext did not know that now had been used in
a range until after the decision to cache was made.
This change moves the processing of the ranges to the aggregation builders
so that the search context is made aware that now has been used before
it decides if the request should be cached
This commit adds a did you mean feature to the strict REST params error
message. This works by comparing any unconsumed parameters to all of the
consumer parameters, comparing the Levenstein distance between those
parameters, and taking any consumed parameters that are close to an
unconsumed parameter as candiates for the did you mean.
* Fix pluralization in strict REST params message
This commit fixes the pluralization in the strict REST parameters error
message so that the word "parameter" is not unconditionally written as
"parameters" even when there is only one unrecognized parameter.
* Strength strict REST params did you mean test
This commit adds an unconsumed parameter that is too far from every
consumed parameter to have any candidate suggestions.
Relates #20747
This commit changes the strict REST parameters message to say that
unconsumed parameters are unrecognized rather than unused. Additionally,
the test is beefed up to include two unused parameters.
Relates #20745
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).
This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.
Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.
Relates #20722
Today, the individual allocation deciders appear in random
order when initialized in AllocationDeciders, which means
potentially more performance intensive allocation deciders
could run before less expensive deciders. This adds to the
execution time when a less expensive decider could terminate
the decision making process early with a NO decision. This
commit orders the initialization of allocation deciders,
based on a general assessment of the big O runtime of each
decider, moving the likely more expensive deciders last.
Closes#12815
IndicesClusterStateService and IndicesStore are responsible for synchronizing local shard state based on incoming cluster state updates. On client/tribe nodes, which don't store any such shard/index data/metadata, all of the logic that computes which data is to be deleted, which shards to be initialized etc. can be completely skipped, saving precious CPU cycles.
CompositeIndicesRequest should be implemented by all requests that are composed of multiple subrequests which relate to one or more indices. A composite request is
executed by its own transport action class (e.g. TransportMultiSearchAction for _msearch), which goes through all the subrequests and delegates their execution to the appropriate transport action (e.g. TransportSearchAction for _msearch) for each single item. IndicesAliasesRequest is a particular request as it holds multiple items that implement AliasesRequest, but it shouldn't be considered a composite request, as it has no specific transport action for each of its items. Also, either all of its subitems fail or succeed.
Also clarified javadocs for CompositeIndicesRequest.
We have new icons for elastic products with 5.0. This change updates the
favicon embedded in elasticsearch that users see when using the rest api
through a browser.
When a node get disconnected from the cluster and rejoins during a master election, it may be that the new master already has that node in it's cluster and will try to assign it shards. If the node hosts started primaries, the new shards will be initializing and will have the same allocation id as the allocation ids of the current started size. We currently do not recognize this currently. We should clean the current IndexShard instances and initialize new ones.
This also hardens test assertions in the same area.
This makes geo-distance sorting use `LatLonDocValuesField.newDistanceSort`
whenever applicable, which should be faster that the current approach since it
tracks a bounding box that documents need to be in in order to be competitive
instead of doing a costly distance computation all the time.
Closes#20450
today it's not possible to use date-math efficiently with the `_rollover`
API. This change adds support for date-math in the target index as well as
support for preserving the math logic when an existing index that was created with
a date math expression all subsequent indices are created with the same expression.
Right now our unit tests in that area only simulate indexing single documents. As we go forward it should be easy
to add other actions, like delete & bulk indexing. This commit extracts the common parts of the current indexing
logic to a based class make it easier to extend.
The Setting.timeValue() method uses TimeValue.toString() which can produce fractional time values. These fractional time values cannot be parsed again by the settings framework.
This commit fix a method that still use the .toString() method and replaces it with .getStringRep(). It also changes a second method so that it's not up to the caller to decide which stringify method to call.
closes#20662
This commit fixes a failing cluster settings tests, namely the logger
level update test. The test was incorrectly assuming the default log
level was info, but it could be non-info, for example, if
tests.es.logger.level is set to some non-info level.
Closes#20318
Today we allow system bootstrap checks to be ignored with a
setting. Yet, the system bootstrap checks are as vital to the health of
a production node as the non-system checks (e.g., the original bootstrap
check, the file descriptor check, is critical for reducing the chances
of data loss from being too low). This commit removes the ability to
ignore system bootstrap checks.
Relates #20511
The invalid ingest configuration field name used to show itself,
even when it was null, in error messages. Sometimes this does not make
sense.
e.g.
```[null] Only one of [file], [id], or [inline] may be configure```
vs.
```Only one of [file], [id], or [inline] may be configure```
The above deals with three fields, therefore this no one property
responsible.
this change adds a hard limit to `index.number_of_shard` that prevents
indices from being created that have more than 1024 shards. This is still
a huge limit and can only be changed via settings a system property.
Today when getting setting via an API like the cluster settings API,
complex settings are excluded (e.g.,
discovery.zen.ping.unicast.hosts). This commit adds these settings to
the output of such APIs.
Relates #20622
It currently returns something like:
```
"No feature for name [_siohgjoidfhjfihfg]"
```
Which is not the most understandable message, this changes it to be a
little more readable.
Resolves#10946
Today when executing the install plugin command without a plugin id, we
end up throwing an NPE because the plugin id is null yet we just keep
going (ultimatley we try to lookup the null plugin id in a set, the
direct cause of the NPE). This commit modifies the install command so
that a missing plugin id is detected and help is provided to the user.
Relates #20660
reindex-from-remote should ignore unknown fields so it is mostly
future compatible. This makes it ignore unknown fields by adding an
option to `ObjectParser` and `ConstructingObjectParser` that, if
enabled, causes them to ignore unknown fields.
Closes#20504
Many of our unit tests instantiate an `AllocationService`, which requires having a `GatewayAllocator`. Today almost all of our test use a class called `NoopGatewayAllocator` which does nothing, effectively leaving all shard assignments to the balanced allocator. This is sad as it means we test a system that behaves differently than our production logic in very basic things. For example, a started primary that is lost will be assigned to a node that didn't use to have it.
This PR removes `NoopGatewayAllocator` in favor of a new `TestGatewayAllocator` that inherits the standard `GatewayAllocator` and overrides shard information fetching to return information based on historical assignments the allocator has done. The only exception is `BalanceConfigurationTests` which does test only the balancer and I opted to not have it work around the `GatewayAllocator` being in it's way.
Changes the API of GatewayAllocator#applyStartedShards and
GatewayAllocator#applyFailedShards to take both a RoutingAllocation
and a list of shards to apply. This allows better mock allocators
to be created as being done in #20637.
Closes#20642
Removes the FailedRerouteAllocation class and StartedRerouteAllocation
class, as they were just wrappers for RerouteAllocation that stored
started and failed shards, but these started and failed shards can
be passed in directly to the methods that needed them, removing the
need for this wrapper class and extra level of indirection.
Closes#20626
When initializing a new index routing table, we make a decision where the primary shards should be recovered from. This can be an empty folder for new indices, a set of specific allocation ids for old indices or a snapshot. We currently allow callers of `IndexRoutingTable.initializeEmpty` to supply the source but also set it automatically if null is given. Sadly the current logic is reusing the supplied parameter to store the result of the automatic decision. This is flawed if some of the decision should be *different* between the different index shard (as the first decision that is maid sticks).
This commit fixes this but also simplifies the API to always make an automatic decision.
This was discovered while working on #20637 which strengthens the testing infra and caused this to bubble up. I put it as a separate commit to make sure it is not lost as part of a bigger test only PR.
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.
Closes#19828
When testing tribe nodes in an integration test, we should pass the classpath
plugins of the node down to the tribe client nodes. Without this the tribe client
nodes could be prevented from communicating with the tribes.
When an active shadow replica is reinitialized during primary promotion, the recovery stats that are used by the allocation decider settings `cluster.routing.allocation.node_concurrent_recoveries` and `cluster.routing.allocation.node_concurrent_incoming_recoveries` have to be updated.
If your native script needs to do some heavy computation on initialization,
the fact that we create a new one for every segment rather than for the whole
index could have a negative performance impact.
This commit changes the default behavior of `_flush` to block if other flushes are ongoing.
This also removes the use of `FlushNotAllowedException` and instead simply return immediately
by skipping the flush. Users should be aware if they set this option that the flush might or might
not flush everything to disk ie. no transactional behavior of some sort.
Closes#20569
Translog#read is a left-over from realtime-get that allows to read
from an arbitrary location in the transaction log. This method is unused
and can be replaced with snapshots in tests.
`index.routing.allocation.initial_recovery` is used with index shrinking to make sure the new index's primary is assigned to the node that holds a copy of each of the source index shards. Sadly with the introduction of `RecoverySource` a regression was introduced that limits the allocation of replicas of the new index.
Today when CLI tools are executed, logging statements can intentionally
or unintentionally be executed when logging is not configured. This
leads to log messages that the status logger is not configured. This
commit reworks logging configuration for CLI tools so that logging is
always configured.
Relates #20575
This commit removes `ByteSizeValue`'s methods that are duplicated (ex: `mbFrac()` and `getMbFrac()`) in order to only keep the `getN` form.
It also renames `mb()` -> `getMb()`, `kb()` -> `getKB()` in order to be more coherent with the `ByteSizeUnit` method names.
Adds a cat api endpoint: /_cat/templates and its more specific version, /_cat/templates/{name}.
It looks something like:
$ curl "localhost:9200/_cat/templates?v"
name template order version
sushi_california_roll *avocado* 1 1
pizza_hawaiian *pineapples* 1
pizza_pepperoni *pepperoni* 1
The specified version (only allows * globs) looks like:
$ curl "localhost:9200/_cat/templates/pizza*"
name template order version
pizza_hawaiian *pineapples* 1
pizza_pepperoni *pepperoni* 1
Partially specified columns:
$ curl "localhost:9200/_cat/templates/pizza*?v=true&h=name,template"
name template
pizza_hawaiian *pineapples*
pizza_pepperoni *pepperoni*
The help text:
$ curl "localhost:9200/_cat/templates/pizza*?help"
name | n | template name
template | t | template pattern string
order | o | template application order number
version | v | version
Closes#20467
This commit adds a new test TribeIT#testClusterStateNodes() to verify that the tribe node correctly reflects the nodes of the remote clusters it is connected to.
It also changes the existing tests so that they really use two remote clusters now.
IndexResponse#toString method outputs an error caused by the shards object needing to be wrapped into another object. It is fixed by calling a different variant of Strings.toString(XContent) which accepts a second boolean argument that makes sure that a new object is created before outputting ShardInfo. I didn't change ShardInfo#toString directly as whether it needs a new object or not very much depends on where it is printed out. IndexResponse seemed a specific case as the rest of the info were not json, hence the shards object was the first one, but it is usually not the case.
With the unified release process across the elastic stack, download
links for all products are changing. This change updates docs referring
to the old download and packages urls.
Note that this change also updates the plugin installation command as
the url for downloads is being changed to be consistent with that for
packages (both plural).
The serial collector is not suitable for running with a server
application like Elasticsearch and can decimate performance and lead to
cluster instability. This commit adds a bootstrap check to prevent usage
of the serial collector when Elasticsearch is running in production
mode.
Relates #20558
Today when acquiring a prefix logger for a logger info stream, we obtain
a new prefix logger per invocation. This can lead to contention on the
markers lock in the constructor of PrefixLogger. Usually this is not a
problem (because the vast majority of callers hold on to the logger they
obtain). Unfortunately, under heavy indexing with multiple threads, the
contention on the lock can be devastating. This commit modifies
LoggerInfoStream to hold on to the loggers it obtains to avoid
contending over the lock there.
Relates #20571
* Build: Remove old maven deploy support
This change removes the old maven deploy that we have in parallel to
maven-publish, and makes maven-publish fully work with publishing to
maven local. Using `gradle publishToMavenLocal` should be used to
publish to .m2.
Note that there is an unfortunate hack that means for
zip artifacts we must first create/publish a dummy pom file, and then
follow that with the real pom file. It would be nice to have the pom
file contains packaging=zip, but maven central then requires sources and
javadocs. But our zips are really just attached artifacts, so we already
set the packaging type to pom for our zip files. This change just works
around a limitation of the underlying maven publishing library which
silently skips attached artifacts when the packaging type is set to pom.
relates #20164closes#20375
* Remove unnecessary extra spacing
This change removes all guice interaction from Transport, HttpServerTransport,
HttpServer and TransportService. All these classes as well as their subclasses
or extended version configured via plugins are now created by using plain old
bloody java constructors. YAY!
Since #19975 we are aggressively failing with AssertionError when we catch an ACE
inside the InternalEngine. We treat everything that is neither a tragic even on
the IndexWriter or the Translog as a bug and throw an AssertionError. Yet, if the
engine hits an IOException on refresh of some sort and the IW doesn't realize it since
it's not fully under it's control we fail he engine but neither IW nor Translog are marked
as failed by tragic event while they are already closed.
This change takes the `failedEngine` exception into account and if it's set we know
that the engine failed by some other even than a tragic one and can continue.
This change also uses the `ReferenceManager#RefreshListener` interface in the engine rather
than it's concrete implementation.
Relates to #19975
Currently all the reroute-like methods of `AllocationService` return a result object of type `RoutingAllocation.Result`. The result object contains the new `RoutingTable` and `MetaData` plus an indication whether those were changed. The caller is then responsible of updating a cluster state with these. These means that things can easily go wrong and one can take one of these but not the other causing inconsistencies. We already have a utility method on the `ClusterState` builder that does but no one forces you to do so. Also 99% of the callers do the same thing: i.e., check if the result was changed and if so update the very same cluster state that was passed to `AllocationService`. This PR folds this pattern into `AllocationService` and changes almost all it's methods to return a new cluster state (potentially the original one). This saves some 500 lines of code.
The one exception here is the reroute API which executes allocation commands and potentially returns an explanation as well (next to the routing table and metadata). That API now returns a `CommandsResult` object which encapsulate a cluster state and the explanation.
A few of our unit tests generate a random search request body nd run tests against it. The source can optionally contain ext elements under the ext sections, which can be parsed by plugins. With this commit we introduce a plugin so that the tests don't use the one from FetchSubPhasePluginIT anymore. They rather generate multiple search ext elements. The plugin can parse and deal with all those. This extends the test coverage as we may have multiple elements with random names.
Took the chance to introduce a common test base class for search requests, called AbstractSearchTestCase, given that the setup phase is the same for all three tests around search source. Then we can have the setup isolated to the base class and the subclasses relying on it.
Closes#17685
* Throw error if query element doesn't end with END_OBJECT
Followup to #20515 where we added validation that after we parse a query within a query element, we should not get a field name. Truth is that the only token allowed at that point is END_OBJECT, as our DSL allows only one single query within the query object:
```
{
"query" : {
"term" : { "field" : "value" }
}
}
```
We can then check that after parsing of the query we have an end_object that closes the query itself (which we already do). Following that we can check that the query object is immediately closed, as there are no other tokens that can be present in that position.
Relates to #20515
I'd made some mistakes that hadn't caused the test to fail but did
slow it down and partially invalidate some of the assertions. This
fixes those mistakes.
* Fix FieldStats deserialization of `ip` field
Add missing readBytes in `ip` field deserialization
Add (de)serialization tests for all types
This change also removes the ability to set FieldStats.minValue or FieldStats.maxValue to null.
This is not required anymore since the stats are built on fields with values only.
Fixes#20516
If an index was created with pre 2.0 we should not treat it as supported
even if all segments have been upgraded to a supported lucene version.
Closes#20512
TransportService is such a central part of the core server, replacing
it's implementation is risky and can cause serious issues. This change removes the ability to
plug in TransportService but allows registering a TransportInterceptor that enables
plugins to intercept requests on both the sender and the receiver ends. This is a commonly used
and overwritten functionality but encapsulates the custom code in a contained manner.
During a networking partition, cluster states updates (like mapping changes or shard assignments)
are committed if a majority of the masters node received the update correctly. This means that the current master has access to enough nodes in the cluster to continue to operate correctly. When the network partition heals, the isolated nodes catch up with the current state and get the changes they couldn't receive before. However, if a second partition happens while the cluster
is still recovering from the previous one *and* the old master is put in the minority side, it may be that a new master is elected which did not yet catch up. If that happens, cluster state updates can be lost.
This commit fixed 95% of this rare problem by adding the current cluster state version to `PingResponse` and use them when deciding which master to join (and thus casting the node's vote).
Note: this doesn't fully mitigate the problem as a cluster state update which is issued concurrently with a network partition can be lost if the partition prevents the commit message (part of the two phased commit of cluster state updates) from reaching any single node in the majority side *and* the partition does allow for the master to acknowledge the change. We are working on a more comprehensive fix but that requires considerate work and is targeted at 6.0.
Currently, we silently accept malformed query where more
than one key is defined at the top-level for query object.
If all the keys have a valid query body, only the last query
is executed, besides throwing off parsing for additional suggest,
aggregation or highlighting defined in the search request.
This commit throws a parsing exception when we encounter a query
with multiple keys.
closes#20500
The default of 30s causes some tests to timeout when running ensureGreen and similar. This is because network delays simulation blocks connect until either the connect timeout expires or the disruption configured time stops. We do *not* immediately connect when the disruption is stopped.
Today when starting Elasticsearch without a Log4j 2 configuration file,
we end up throwing an array index out of bounds exception. This is
because we are passing no configuration files to Log4j. Instead, we
should throw a useful error message to the user. This commit modifies
the Log4j configuration setup to throw a user exception if no Log4j
configuration files are present in the config directory.
Relates #20493
The test uses a NetworkDelay that drops requests and slows down connecting. Next to that it disable node fault detection to make sure nodes are not removed before we check our publishing. Sadly that can lead to huge slow downs if the disruption hits while a node is still pinging (and tries to connect, which is slowed down). Instead we can start the disruption on the cluster state thread, making sure the result of fault detection won't be processed before we publish
This was actually a byproduct of trying to remove a //norelease for
index shard setting validation in MetaDataIndexService. This //norelease
is now removed. Previously this check was *only* used by the template
service, so we validated twice, once in the Settings infrastructure and
once when actually creating the index. We now instead use the Settings
infrastructure to validate the settings for shard count.
`TransportService#registerRequestHandler` allowed to register
handlers more than once and issues an annoying warn log message when
this happens. This change simple throws an exception to prevent regsitering
the same handler more than once. This commit also removes the ability
to remove request handlers.
Relates to #20468
We still use some crazy poor mans compression in InternalSearchHit that
uses a thread local and an unordered map as a lookup table if requested.
Stuff like this should be handled by compression on the transport layer
rather than in-line in the serialization code. This code is complex enough.
This utility class is used in 3 places while we only need to register
the handlers once per node. Otherwise we will see nasty `WARN` logs like:
`registered two transport handlers for action indices:data/read/search[phase/fetch/id/scroll]...`
This change will only register handlers inside the main TransportSearchAction.
After this change SearchModule doesn't subclass AbstractModule anymore and all wiring
happens in `Node.java`. As a side-effect several tests don't need a guice injector anymore.
This commit modifies the logger names within Elasticsearch to be the
fully-qualified class name as opposed removing the org.elasticsearch
prefix and dropping the class name. This change separates the root
logger from the Elasticsearch loggers (they were equated from the
removal of the org.elasticsearch prefix) and enables log levels to be
set at the class level (instead of the package level).
Relates #20457
Today when setting the logging level via the command-line or an API
call, the expectation is that the logging level should trickle down the
hiearchy to descendant loggers. However, this is not necessarily the
case. For example, if loggers x and x.y are already configured then
setting the logging level on x will not descend to x.y. This is because
the logging config for x.y has already been forked from the logging
config for x. Therefore, we must explicitly descend the hierarchy when
setting the logging level and that is what this commit does.
Relates #20463
This commit introduces a new plugin for file-based unicast hosts
discovery. This allows specifying the unicast hosts participating
in discovery through a `unicast_hosts.txt` file located in the
`config/discovery-file` directory. The plugin will use the hosts
specified in this file as the set of hosts to ping during discovery.
The format of the `unicast_hosts.txt` file is to have one host/port
entry per line. The hosts file is read and parsed every time
discovery makes ping requests, thus a new version of the file that
is published to the config directory will automatically be picked
up.
Closes#20323
This commit fixes the following geo_point bwc tests:
* GeoDistanceIT to test deprecated GeoDistanceRangeQuery on legacy indexes only.
* ExternalFieldMapperTests to correctly handle LatLonPoint type
* GeoPointFieldMapperTests to correctly test stored geo_point fields
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.
The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain
The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain
The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.
Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.
Fixes#20155
Today we add a prefix when logging within Elasticsearch. This prefix
contains the node name, and index and shard-level components if
appropriate.
Due to some implementation details with Log4j 2 , this does not work for
integration tests; instead what we see is the node name for the last
node to startup. The implementation detail here is that Log4j 2 there is
only one logger for a name, message factory pair, and the key derived
from the message factory is the class name of the message factory. So,
when the last node starts up and starts setting prefixes on its message
factories, it will impact the loggers for the other nodes.
Additionally, the prefixes are lost when logging an exception. This is
due to another implementation detail in Log4j 2. Namely, since we log
exceptions using a parameterized message, Log4j 2 decides that that
means that we do not want to use the message factory that we have
provided (the prefix message factory) and so logs the exception without
the prefix.
This commit fixes both of these issues.
Relates #20429
This commit cuts over geo_point fields to use Lucene's new point-based LatLonPoint type for indexes created in 5.0. Indexes created prior to 5.0 continue to use their respective encoding type. Below is a description of the changes made to support the new encoding type:
* New indexes use a new LatLonPointFieldMapper which provides a parse method for the new type
* The new LatLonPoint parse method removes support for lat_lon and geohash parameters
* Backcompat testing for deprecated lat_lon and geohash parameters is added to all unit and integration tests
* LatLonPointFieldMapper provides DocValues support (enabled by default) which uses Lucene's new LatLonDocValuesField type
* New LatLonPoint field data classes are added for aggregation support (wraps LatLonPoint's Numeric Doc Values)
* MultiFields use the geohash as the string value instead of the lat,lon string making it easier to perform geo string queries on the geohash instead of a lat,lon comma delimited string.
Removed Features:
* With the removal of geohash indexing, GeoHashCellQuery support is removed for all new indexes (still supported on existing indexes)
* LatLonPoint does not support a Distance Range query because it is super inefficient. Instead, the geo_distance_range query should be accomplished using either the geo_distance aggregation, sorting by descending distance on a geo_distance query, or a boolean must not of the excluded distance (which is what the distance_range query did anyway).
TODO:
* fix/finish yaml changes for plugin and rest integration tests
* update documentation
When generating random bogus documents, it could happen that they contain both the terms "the" and "ultimate", which would match the query "the ultimate" better than all the other non bogus documents, which would cause testCrossFieldMode to fail. "the" is a term that's relatively likely to be randomly generated given its length; we can simply increase the minimum length of randomly generated terms to 5, so that there are no collisions, as "the" cannot be generated anymore (nor can "ultimate" as the lenght doesn't go up to 8).
Also made some assertions more accurate to check how many hits match a query rather than checking only that the first or second hits are there.
Closes#18873
This commit adds a -q/--quiet option to Elasticsearch so that it does not log anything in the console and closes stdout & stderr streams. This is useful for SystemD to avoid duplicate logs in both journalctl and /var/log/elasticsearch/elasticsearch.log while still allows the JVM to print error messages in stdout/stderr if needed.
closes#17220
This commit adds a health status parameter to the cat indices API for
filtering on indices that match the specified status (green|yellow|red).
Relates #20393
Previously when trying to listen on virtual interfaces during
bootstrap the application would stop working - the interface
couldn't be found by the NetworkUtils class.
The NetworkUtils utilize the underlying JDK NetworkInterface
class which, when asked to lookup by name only takes physical
interfaces into account, failing at virtual (or subinterfaces)
ones (returning null).
Note that when interating over all interfaces, both physical and
virtual ones are taken into account.
This changeset asks for all known interfaces, iterates over them
and matches on the given name as part of the loop, allowing it
to catch both physical and virtual interfaces.
As a result, elasticsearch can now also serve on virtual
interfaces.
A test case has been added which makes sure that all
iterable interfaces can be found by their respective name.
Note that this PR is a second iteration over the previously
merged but later reverted #19537 because it causes tests
to fail when interfaces are down. The test has been modified
to take this into account now.
Closes#17473Closes#19568
Relates #19537
Match query throws parsing errors when an array of terms is provided, we should test that to make sure this behaviour doesn't change.
Relates to #15741
Once a primary is marked as relocated, we can not safely move it back to started (as we have no way of waiting on inflight operations that are performed on the target primary). If the master cancels the relocation in that state, we fail the primary. Sadly, there is a racing condition between the `updateRoutingEntry` method (which is called when the relocation is cancelled by the master) and the `relocated` method. That racing condition can leave the shard as marked "relocated" but have the routing entry not reflect the target relocation. This in turns causes NPEs in TransportReplicationAction:
```
java.util.Objects requireNonNull Objects.java 203
org.elasticsearch.action.support.replication.TransportReplicationAction$ConcreteShardRequest <init> TransportReplicationAction.java 982
```
Sadly, once we end up in this state, we will never recover.
This commit fixes that race condition by making sure `updateRoutingEntry` acquires the mutex when checking for the relocated status. While at it, I also tightened up the code and added lots of assertions/hard checks.
Adds the entire DiscoveryNode object to the trace log in AllocationDeciders.
The allocation decider logging at TRACE level can sometimes be helpful to determine why a shard is not getting allocated on specific nodes. Currently, we only log the node id for these messages. It will be helpful to also include the node name (esp. when dealing with a lot of nodes in the cluster).
In 5.x we allowed this with a deprecation warning. This removes the code
added for that deprecation, requiring the cluster name to not be in the
data path.
Resolves#20391
This change removes the guice dependency handling for SearchService and
several related classes like SearchTransportController and SearchPhaseController.
The latter two now have package private constructors and dependencies like FetchPhase
are now created by calling their constructors explicitly. This also cleans up several users
of the DefaultSearchContext and centralized it's creation inside SearchService.
The cause early termination of tests, which means we don't clean up and close shards, but also don't cause a failure. This in turns makes TestRuleTemporaryFilesCleanup fail on windows (because it does try to clean up, but the files are referenced). Getting stuff like:
```
> C:\jenkins\workspace\es_core_master_windows-2012-r2\core\build\testrun\test\J3\temp\org.elasticsearch.index.shard.IndexShardTests_68B5E1103D78A58B-001\tempDir-006\indices\_na_\0\translog\translog-1.tlog: java.nio.file.AccessDeniedException: C:\jenkins\workspace\es_core_master_windows-2012-r2\core\build\testrun\test\J3\temp\org.elasticsearch.index.shard.IndexShardTests_68B5E1103D78A58B-001\tempDir-006\indices\_na_\0\translog\translog-1.tlog
```
Splits the PrimaryShardAllocator and ReplicaShardAllocator's decision
making for a shard from the implementation of that decision on the
routing table. This is a step toward making it easier to use the same
logic for the cluster allocation explain APIs.
Introduce a base class for unit tests that are based on real `IndexShard`s. The base class takes care of all the little details needed to create and recover shards.
This commit also moves `IndexShardTests` and `ESIndexLevelReplicationTestCase` to use the new base class. All tests in `IndexShardTests` that required a full node environment were moved to a new `IndexShardIT` suite.
Before, when there was a new cluster state to publish,
zen discovery would first update the set of nodes to
ping based on the new cluster state, then publish the new
cluster state. This is problematic because if the cluster
state failed to publish, then the set of nodes to ping
should not have been updated.
This commit fixes the issue by updating the set of
nodes to ping for fault detection only *after* the new
cluster state has been published.
We should rather throw a clear exception to clearly point out that we cannot extract fields from _source. Note that this happens only when explicitly trying to extract fields from source. When source is disabled and no _source parameter is specified, no errors will be thrown and no source will be returned.
Closes#20408
Relates to #20093
Search section supports an ext section that is used to provide additional config needed from plugins. It is now tied to sub fetch phases because it is the only section that may need additional config, but there is no reason for the two to be tightly coupled.
It is now possible to register a searchExtParser independently from a sub fetch phase. All a search ext parser does is parsing some ext section of a search request, whose parsed resulting object is stored in the search context for later retrieval.
The parser is now needed only for sub fetch phases, but doesn't have to be strictly connected to them, it could be used for something else as well potentially
The context was an object where the parsed info are stored. That is more of what we call the builder since after the search refactoring. No need for generics in FetchSubPhaseParser then. Also the previous setHitsExecutionNeeded wasn't useful, it can be removed as well, given that once there is a parsed ext section, it will become a builder that can be retrieved by the sub fetch phase. The sub fetch phase is responsible for doing nothing in case the builder is not set, meaning that the fetch sub phase is plugged in but the request didn't have the corresponding section.
SearchParseElement is renamed to FetchSubPhaseParser and moved to the search.fetch package. Its parse method doesn't get the SearchContext as argument anymore, only the XContentParser, and the return type is what gets parsed (the fetch sub phase context which we may as well rename later).
It is the parser that initializes the FetchSubPhaseContext then. SearchService retrieves the parser by name, calls parse against it and stores the result of parsing by name. No need for FetchSubPhase.ContextFactory anymore, which can be removed.
Given that doc value fields is our own fetch sub phase, it doesn't need to be implemented like if it was plugged in from the outside. It doesn't need its own fetch sub phase context, but it can just be an instance member in SearchContext
Parse elements are always empty for all of our search phases. They can be non empty only for sub fetch phases as they are pluggable and search parse element is left to be used only for plugins that plug in their own sub fetch phase. Only FetchPhase needs now the parseElements method then.
Log4j has a bug where on shutdown it ignores that JMX might be disabled;
since it does not respect this on shutdown, it proceeds to attempt to
access JMX leading to a security exception that should have otherwise
not occurred had it respected that JMX is disabled. This commit
intentionally introduces jar hell with the Server class to work around
this bug until a fix is released.
Relates #20389
Previously we would disable console logging in certain circumstances
(for example, if Elasticsearch is not in the foreground, or if
Elasticsearch is in the foreground but an exception was thrown during
bootstrap). This commit makes this handling work with Log4j 2. This will
prevent users from seeing double bootstrap check failure messages.
Relates #20387
Since the sub query of a function score query is checked on CustomQueryScorer#extractUnknwonQuery we try to extract the terms from the rewritten form of the sub query.
MultiTermQuery rewrites query within a constant score query/weight which returns an empty array when extractTerms is called.
The extraction of the inner terms of a constant score query/weight changed in Lucene somewhere between ES version 2.3 and 2.4 (https://issues.apache.org/jira/browse/LUCENE-6425) which is why this problem occurs on ES > 2.3.
This change moves the extraction of the sub query from CustomQueryScorer#extractUnknownQuery to CustomQueryScorer#extract in order to do the extraction of the terms on the original form of the sub query.
This fixes highlighting of sub queries that extend MultiTermQuery since there is a special path for this kind of query in the QueryScorer (which extract the terms to highlight).
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.
Resolves#19769
This adds a version field to Templates, which is itself is unused by Elasticsearch, but exists for users to better manage their own templates. Like description, it's optional.
Previous versions of Elasticsearch permitted unquoted JSON field names even though this is against the JSON spec. This leniency was disabled by default in the 5.x series of Elasticsearch but a backwards compatibility layer was added via a system property with the intention of removing this layer in 6.0.0. This commit removes this backwards compatibility layer.
Relates #20388
This commit removes an assertion regarding removing the support for
cluster name being part of the data path in favor of a tracking issue.
Relates #20391
This includes:
- All regular numeric types such as int, long, scaled-float, double, etc
- IP addresses
- Dates
- Geopoints and Geoshapes
Relates to #19784
The 5.x series of Elasticsearch emits a warning if any of the old
logging configuration formats are present. This commit removes that
warning.
Relates #20386
By default, when an exception causes the JVM to terminate, the stack
trace is printed. In the case of failing bootstrap checks, this stack
trace is useless to the user, and might even distract them from seeing
that the bootstrap checks failed for reasons under their control. With
this commit, we cause the stack trace for a failing bootstrap check to
be truncated.
We also modify some methods to not declare that they throw the top level
checked exception type Exception, but instead explicitly declare the
exceptions that they throw. These exceptions are caught and wrapped in a
BootstrapException so that we can percolate only two exception types out
of Bootstrap#init as checked exception, BootstrapException and
NodeValidationException.
Relates #19989
The collect_payloads parameter of the span_near query was previously
deprecated with the intention to be removed. This commit removes this
parameter.
Relates #20385
This commit cleans most of the methods of XContentBuilder so that:
- Jackson's convenience methods are used instead of our custom ones (ie field(String,long) now uses Jackson's writeNumberField(String, long) instead of calling writeField(String) then writeNumber(long))
- null checks are added for all field names and values
- methods are grouped by type in the class source
- methods have the same parameters names
- duplicated methods like field(String, String...) and array(String, String...) are removed
- varargs methods now have the "array" name to reflect that it builds arrays
- unused methods like field(String,BigDecimal) are removed
- all methods now follow the execution path: field(String,?) -> field(String) then value(?), and value(?) -> writeSomething() method. Methods to build arrays also follow the same execution path.
This change checks that `index.merge.scheduler.max_thread_count` < `index.merge.scheduler.max_merge_count` and fails index creation
and settings update if the condition is not met.
Fixes#20380
The logging configuration tests write to log files which are deleted at
the end of the test. If these files are not closed, some operating
systems will complain when these deletes are performed. This commit
ensures that the logging system is properly shutdown so that these files
can be properly deleted.
This was an error-prone version type that allowed overriding previous
version semantics. It could cause primaries and replicas to be out of
sync however, so it has been removed.
Resolves#19769
This change adds a `field.with.dots` to all 2.4 bwc indicse and above.
It also adds verification code to OldIndexBackwardsCompatibilityIT to
ensure we upgrade the indices cleanly and the field is present.
Closes#19956
Due to the way the nodes where shut down etc. we always flushed
away the translog. This means we never tested upgrades of transaction
logs from older version. This change regenerates all valid bwc indices
and repositories with transaction logs and adds correspondent changes
to the OldIndexBackwardsCompatibilityIT.java
Parsing a script on retrieval causes it to be re-parsed on every single script call, which can be very expensive for large frequently called scripts. This change switches to parsing scripts only once during store operation.
With the search refactoring we don't use SearchParseElement anymore to define our own parsing code but only for plugins. There was an abstract subclass called FetchSubPhaseParseElement in our production code, only used in one of our tests. We can remove that abstract class as it is not needed and not that useful for the test that depends on it.
FsInfo#total is removed in favour of getTotal, which allows to retrieve the total value
[TEST] fix FsProbeTests: null is not accepted as path constructor argument
During adding the new settings infrastructure the option to specify the
size of the filter cache as a percentage of the heap size which accidentally
removed. This change adds that ability back.
In addition the `Setting` class had multiple `.byteSizeSetting` methods
which all except one used `ByteSizeValue.parseBytesSizeValue` to parse
the value. One method used `MemorySizeValue.parseBytesSizeValueOrHeapRatio`.
This was confusing as the way the value was parsed depended on how many
arguments were provided.
This change makes all `Setting.byteSizeSetting` methods parse the value
the same way using `ByteSizeValue.parseBytesSizeValue` and adds
`Setting.memorySizeSetting` methods to parse settings that express memory
sizes (i.e. can be absolute bytes values or percentages). Relevant settings
have been moved to use these new methods.
Closes#20330
Exposing lucene 6.x minhash tokenfilter
Generate min hash tokens from an incoming stream of tokens that can
be used to estimate document similarity.
Closes#20149
This changes DiskThresholdDecider to only factor in leaving shards when
checking if a shard can remain. Previously, leaving shards were factored
in for both the `canAllocate` and `canRemain` checks, however, this
makes only the leaving shard sizes subtracted in the `canRemain` check.
It was possible that multiple shards relocating away from the node would
have their entire size subtracted, and the node had a chance to go over
the disk threshold (or hit the disk full) because it subtracted space
that was still being used for other in-progress relocations.
** The default script language is now maintained in `Script` class.
* Added `script.legacy.default_lang` setting that controls the default language for scripts that are stored inside documents (for example percolator queries). This defaults to groovy.
** Added `QueryParseContext#getDefaultScriptLanguage()` that manages the default scripting language. Returns always `painless`, unless loading query/search request in legacy mode then the returns what is configured in `script.legacy.default_lang` setting.
** In the aggregation parsing code added `ParserContext` that also holds the default scripting language like `QueryParseContext`. Most parser don't have access to `QueryParseContext`. This is for scripts in aggregations.
* The `lang` script field is always serialized (toXContent).
Closes#20122
When Elasticsearch depended on Log4j 1, there was jar hell from the
log4j and the apache-log4j-extras jar. As these dependencies are gone,
the jar hell exemption for Log4j 1 can be removed.
Relates #20336
This commit expands on the message printed when config files are
preserved when removing a plugin to give the user an indication of the
reason the config files are preserved.
Replicated operation consist of a routing action (the original), which is in charge of sending the operation to the primary shard, a primary action which executes the operation on the resolved primary and replica actions which performs the operation on a specific replica. This commit adds the targeted shard's allocation id to the primary and replica actions and makes sure that those match the shard the actions end up executing on.
This helps preventing extremely rare failure mode where a shard moves off a node and back to it, all between an action is sent and the time it's processed.
For example:
1) Primary action is sent to a relocating primary on node A.
2) The primary finishes relocation to node B and start relocating back.
3) The relocation back gets to the phase and opens up the target engine, on the original node, node A.
4) The primary action is executed on the target engine before the relocation finishes, at which the shard copy on node B is still the official primary - i.e., it is executed on the wrong primary.
When removing a plugin with a config directory, we preserve the config
directory. This is because the workflow for upgrading a plugin involves
removing and then installing the plugin again and losing the plugin
config in this case would be terrible. This commit causes a message
regarding this to be printed in case the user wants to manually delete
these files.
This commit removes a line-length violation in RemovePluginCommand.java
and removes this file from the list of files for which the line-length
check is suppressed.
We have intentionally introduced leniency for ThrowableProxy from Log4j
to work around a bug there. Yet, a test for this introduced leniency was
not addded. This commit introduces such a test.
Relates #20329
Previously we had an exemption for Joda-Time BaseDateTime because we
forked this class to remove the usage of a volatile field. This hack is
no longer in place, so the exemption is no longer necessary. This commit
removes that exemption.
Relates #20328
The BackgroundIndexer now uses auto-generated IDs randomly. This causes some problems
for tests that still rely on the fact that the IDs are increasing integers. This change
exposes all IDs via a Set<String> to iterate over for tests.
A warning was introduced if old log config files are present (e.g.,
logging.yml). However, this check is executed unconditionally. This can
lead to no such file exceptions when logging configs are not being
resolved, for example when installing a plugin. This commit moves this
check to only execute when logging configs are being resolved.
Some assertions in MaxMapCountCheckTests assert that certain messages
are logged. These assertions pass everywhere except Windows where the
JVM seems confused. The issue is not the javac compiler as the bytecode
produced on OS X and Windows is identical for the relevant classes so
this leaves a possible JVM bug. It is not worth investigating the
ultimate cause of this bug so instead this commit introduces a
workaround.
Log4j has a bug where it does not handle a security exception that can
be thrown when it is rendering a stack trace. This commit intentionally
introduces jar hell with the ThrowableProxy class to work around this
bug until a fix is a released.
Relates #20306
To ensure we don't add documents more than once even if it's mostly paranoia
except of one case where we relocated a shards away and back to the same node
while an initial request is in flight but has not yet finished AND is retried.
Yet, this is a possible case and for that reason we ensure we pass on the
maxUnsafeAutoIdTimestamp on when we prepare for translog recovery.
Relates to #20211
Currently it does not because our parsers do not support big integers/decimals
(on purpose) but we do not have to ask our parser for the number type, we can
just ask the jackson parser for a number representation of the value with the
right type.
Note that I did not add similar tests for big decimals because Jackson seems to
never return big decimals, even for decimal values that are out of the range of
values that can be represented by doubles.
Closes#11508
This commit configures test logging for Log4j 2. The default logger
configuration uses the console appender but at the error level, so most
tests are missing logging. Instead, this commit provides a configuration
for tests which is picked up from the classpath by Log4j 2 when it
initializes. However, this now means that we can no longer initialize
Log4j with a bare-bones configuration when tests run as doing so will
prevent Log4j 2 from attempting to configure logging via the
classpath. Consequently, we move this needed initialization (as
commented, to avoid a message about a status logger not being configured
when we are preparing to configure Log4j from properties files in the
config directory) to only run when we are explicitly configuring Log4j
from properties files.
Relates #20284
Rather than checking that those values are greater than 0, we can sum up the values gotten from all nodes and check that what is returned is that same value.
The mem section was buggy in cluster stats and removed. It is now added back with the same structure as in node stats, containing total memory, available memory, used memory and percentages. All the values are the sum of all the nodes across the cluster (or at least the ones that we were able to get the values from).
If elasticsearch controls the ID values as well as the documents
version we can optimize the code that adds / appends the documents
to the index. Essentially we an skip the version lookup for all
documents unless the same document is delivered more than once.
On the lucene level we can simply call IndexWriter#addDocument instead
of #updateDocument but on the Engine level we need to ensure that we deoptimize
the case once we see the same document more than once.
This is done as follows:
1. Mark every request with a timestamp. This is done once on the first node that
receives a request and is fixed for this request. This can be even the
machine local time (see why later). The important part is that retry
requests will have the same value as the original one.
2. In the engine we make sure we keep the highest seen time stamp of "retry" requests.
This is updated while the retry request has its doc id lock. Call this `maxUnsafeAutoIdTimestamp`
3. When the engine runs an "optimized" request comes, it compares it's timestamp with the
current `maxUnsafeAutoIdTimestamp` (but doesn't update it). If the the request
timestamp is higher it is safe to execute it as optimized (no retry request with the same
timestamp has been run before). If not we fall back to "non-optimzed" mode and run the request as a retry one
and update the `maxUnsafeAutoIdTimestamp` unless it's been updated already to a higher value
Relates to #19813
* master:
Avoid NPE in LoggingListener
Randomly use Netty 3 plugin in some tests
Skip smoke test client on JDK 9
Revert "Don't allow XContentBuilder#writeValue(TimeValue)"
[docs] Remove coming in 2.0.0
Don't allow XContentBuilder#writeValue(TimeValue)
[doc] Remove leftover from CONSOLE conversion
Parameter improvements to Cluster Health API wait for shards (#20223)
Add 2.4.0 to packaging tests list
Docs: clarify scale is applied at origin+offest (#20242)
We have specific support for writing `TimeValue`s in the form of
`XContentBuilder#timeValueField`. Writing a `TimeValue` using
`XContentBuilder#writeValue` is a bug waiting to happen.
* Params improvements to Cluster Health API wait for shards
Previously, the cluster health API used a strictly numeric value
for `wait_for_active_shards`. However, with the introduction of
ActiveShardCount and the removal of write consistency level for
replication operations, `wait_for_active_shards` is used for
write operations to represent values for ActiveShardCount. This
commit moves the cluster health API's usage of `wait_for_active_shards`
to be consistent with its usage in the write operation APIs.
This commit also changes `wait_for_relocating_shards` from a
numeric value to a simple boolean value `wait_for_no_relocating_shards`
to set whether the cluster health operation should wait for
all relocating shards to complete relocation.
* Addresses code review comments
* Don't be lenient if `wait_for_relocating_shards` is set
* master:
Increase visibility of deprecation logger
Skip transport client plugin installed on JDK 9
Explicitly disable Netty key set replacement
percolator: Fail indexing percolator queries containing either a has_child or has_parent query.
Make it possible for Ingest Processors to access AnalysisRegistry
Allow RestClient to send array-based headers
Silence rest util tests until the bogusness can be simplified
Remove unknown HttpContext-based test as it fails unpredictably on different JVMs
Tests: Improve rest suite names and generated test names for docs tests
Add support for a RestClient base path
The deprecation logger is an important way to make visible features of
Elasticsearch that are deprecated. Yet, the default logging makes the
log messages for the deprecation logger invisible. We want these log
messages to be visible, so the default logging for the deprecation
logger should enable these log messages. This commit changes the log
level of deprecation log message to warn, and configures the deprecation
logger so that these log messages are visible out of the box.
Relates #20254
This commit enables CLI tools to have console logging. For the CLI
tools, we skip configuring the logging infrastructure via the config
file, and instead set the level only via a system property.
This commit modifies the call sites that allocate a parameterized
message to use a supplier so that allocations are avoided unless the log
level is fine enough to emit the corresponding log message.
While removing an index isn't actually an alias action, if we add
an alias action that deletes an index then we can delete and index
and add an alias with the same name as the index atomically, in
the same cluster state update.
Closes#20064
This commit adds the support for exclusion filter to the response filtering (filter_path) feature. It changes the XContentBuilder APIs so that it now accepts two types of filters: inclusive and exclusive. Filters are no more String arrays but sets of String instead.
Instead of get, set and remove we do get, remove and then set to avoid type conflicts in IngestDocument.
If the set still fails we try to restore the original field in ingest document.
Closes#19892
already has a file of the same name in the Store, but is different
in content (different checksum/length), then those files are first
deleted before restoring the files in question.