discovery.zen.minimum_master_nodes is the single most important setting to set on a production cluster. We have no way of supplying a good default so it must be set by the user. Binding a node to a public IP (as opposed to the default local host) is a good enough indication that a node will be part of a production cluster cluster and thus it's a good tradeoff to enforce the settings. Note that nothing prevent users from setting it to 1 in a single node cluster.
Closes#17288
Primary terms is a way to make sure that operations replicated from stale primary are rejected by shards following a newly elected primary.
Original PRs adding this to the seq# feature branch #14062 , #14651 . Unlike those PR, here we take a different approach (based on newer code in master) where the primary terms are stored in the meta data only (and not in `ShardRouting` objects).
Relates to #17038Closes#17044
If a Writeable.Reader returns null it is always a bug, probably one that
will cause corruption in the StreamInput it was trying to read from. This
commit adds a check that attempts to catch these errors quickly including
the name of the reader.
We currently have a `discovery.zen.master_election.filter_client` setting that control whether their ping responses are ignored for master election (which is the current default). With the push to treat client nodes as normal nodes (and promote the transport/rest clients for client work), this should be changed. This commit remove this setting and it's companion `discovery.zen.master_election.filter_data` setting (currently defaulting to false) in favor of singe `discovery.zen.master_election.ignore_non_master_pings` setting with more intuitive name (defaulting to false).
Resolves#17325Closes#17329
readFrom is confusing because it requires an instance of the type that it
is reading but it doesn't modify it. But we also have (deprecated) methods
named readFrom that *do* modify the instance. The "right" way to implement
the non-modifying readFrom is to delegate to a constructor that takes a
StreamInput so that the read object can be immutable. Now that we have
`@FunctionalInterface`s it is fairly easy to register things by referring
directly to the constructor.
This change modifying NamedWriteableRegistry so that it does that. It keeps
supporting `registerPrototype` which registers objects to be read by
readFrom but deprecates it and delegates it to a new `register` method
that allows passing a simple functional interface. It also cuts Task.Status
subclasses over to using that method.
The start of #17085
The available memory metric was always set to `0` since 2.0.beta1 (bug). was left behind but never set. Turns out the section wasn't that useful, as it would only output the total memory available throughout all nodes in the cluster. We decided to remove the section then.
This commit fixes a line-length checkstyle violation in
YamlSettingsLoaderTests.java and removes this file from the checkstyle
line-length suppressions.
This commit fixes a line-length checkstyle violation in
JsonSettingsLoaderTests.java and removes this file from the checkstyle
line-length suppressions.
This commit adds a test that NoDuplicatesProperties throws a
NullPointerException if an attempt is made to put a key that corresponds
to a null value. This behavior ultimately comes from the super class
Properties but the test ensures that we retain this behavior.
This commit fixes a line-length checkstyle violation in
PropertiesSettingsLoader.java and removes this file from the checkstyle
line-length suppressions.
The sole constructor of XContentSettingsLoader accepts a boolean flag
that indicates whether or not null values parsed from the source should
be rejected or not. Previously a true value for this flag meant that
null values should be rejected. With this commit, the meaning of this
flag is reversed so that a true value means that null values should be
accepted (note that this is needed due to the way that settings are
unset via the cluster update settings API). The name of this flag has
been changed from guardAgainstNullValuedSettings to allowNullValues, for
clarity.
All our fields are supposed to support multi fields, so we could put the logic in
`TypeParsers.parseField` instead of duplicating the logic in every type parser.
Change version, required a minor fix in the RPM building.
In case of a alpha/beta version, the release will contain alpha/beta
as the RPM version cannot contains dashes/tildes.
Waiting for completion of list tasks tasks can cause an infinite loop of a list tasks task waiting for its own completion or completion of its children. To reproduce run:
```
curl "localhost:9200/_tasks?wait_for_completion"
```
This commit adds a guard against null-valued settings that are loaded
from yaml or json settings files, and also adds a test that ensures
the same remains true for settings loaded from properties files.
In #17198, we removed suggest transport action, which
used the `suggest` threadpool to execute requests. Now
`suggest` threadpool is unused and suggest requests are
executed on the `search` threadpool.
Switching from using list of BytesReference to real SortBuilder list in
SearchSourceBuilder, TopHitsAggregatorBuilder and TopHitsAggregatorFactory.
Removing SortParseElement and related sort parsers.
We can be better at checking `buffer_size` and `chunk_size` for S3 repositories.
For example, we know that:
* `buffer_size` should be more than `5mb`
* `chunk_size` should be no more than `5tb`
* `buffer_size` should be lower than `chunk_size`
Otherwise, setting `buffer_size` is useless.
For the record:
`chunk_size` is a Snapshot setting whatever the implementation is.
`buffer_size` is an S3 implementation setting.
Let say that you are snapshotting a 500mb file. If you set `chunk_size` to `200mb`, then Snapshot service will call S3 repository to snapshot 3 files with the following sizes:
* `200mb`
* `200mb`
* `100mb`
If you set `buffer_size` to `100mb` (AWS maximum size recommendation), the first file of `200mb` will be uploaded on S3 using the multipart feature in 2 chunks and the workflow is basically the following:
* create the multipart request and get back an `id` from AWS S3 platform
* upload part1: `100mb`
* upload part2: `100mb`
* "commit" the full upload using the `id`.
Closes#17244.
We lost some accounting code in the translog recover code during refactoring
which triggers a very rare assertion. If we fail on a recovery target with an
illegal mapping update (which can happen if the clusterstate is behind), then
we miss to rollback the # of processed ops in that batch and once we resume
the batch we trip an assertion that the stats are off.
This commit brings back the code lost in 8bc2332d9a
and improves the comment that explains why we need this rollback logic.
Now that string has been splitted into text and keyword, we use text as a
dynamic type when encountering string fields in a json document. However
this does not play well with existing templates that look like
```
{
"mapping": {
"index": "not_analyzed",
"type": "{dynamic_type}"
},
"match": "*"
}
```
Since we want existing templates to keep working as much as possible in 5.0,
this commit adds a hack to dynamic templates so that elasticsearch will create
a keyword field if the `index` property is set and is either `no` or
`not_analyzed`, similarly to what was done in #16991.
While this will make upgrades easier, we still need to figure out a way to
allow users to create keyword fields when using dynamic types.
The fielddata settings in mappings have been refatored so that:
- text and string have a `fielddata` (boolean) setting that tells whether it
is ok to load in-memory fielddata. It is true by default for now but the
plan is to make it default to false for text fields.
- text and string have a `fielddata_frequency_filter` which contains the same
thing as `fielddata.filter.frequency` used to (but validated at parsing time
instead of being unchecked settings)
- regex fielddata filtering is not supported anymore and will be dropped from
mappings automatically on upgrade.
- text, string and _parent fields have an `eager_global_ordinals` (boolean)
setting that tells whether to load global ordinals eagerly on refresh.
- in-memory fielddata is not supported on keyword fields anymore at all.
- the `fielddata` setting is not supported on other fields that text and string
and will be dropped when upgrading if specified.
Currently, both Gsub and Grok parse regex strings during
Pipeline creation. Thrown parsing exceptions were leaking out, this
commit wraps those exceptions in ElasticsearchParseExceptions.
This commit mocks the value of rlimit infinity in the max size virtual
memory check test. This is to avoid attempting to load the native C
library during the test on Windows which would lead to a permissions
violation (the native C library needs to be loaded before the security
manager is setup).
Archive cluster level settings if unknown or broken
We already archive index level settings if we find an unknown or invalid/broken
value for a setting on node startup. The same could potentially happen for persistent
cluster level settings if we remove a setting or if we add validation to a setting that
didn't exist in the past. To ensure that only valid settings are recovered into the cluster
state we archive them (prefix them with `archive.` and log a warning. Tools that check the
cluster settings can then warn users that they have broken settings in their clusterstate that
got archived.
This commit adds a bootstrap check on Linux and OS X for the max size of
virtual memory (address space) to the user running the Elasticsearch
process.
Closes#16935
Currently if you run an `exists` query on an object, it will resolve all sub
fields and create a disjunction for all those fields. However the `_field_names`
mapper indexes paths for objects so we could query object paths directly.
I also changed the query parser to reject `exists` queries if the `_field_names`
field is disabled since it would be a big performance trap.
We already archive index level settings if we find an unknown or invalid/broken
value for a setting on node startup. The same could potentially happen for persistent
cluster level settings if we remove a setting or if we add validation to a setting that
didn't exist in the past. To ensure that only valid settings are recovered into the cluster
state we archive them (prefix them with `archive.` and log a warning. Tools that check the
cluster settings can then warn users that they have broken settings in their clusterstate that
got archived.
This commit sets up the default filesystem used during install plugins
tests. A hack is neeeded to handle the temporary directory because the
system property "java.io.tmpdir" will have been initialized to a value
that is sensible for the default filesystem, but not necessarily to a
value that makes sense for the mock filesystem in use during the
tests. This property is restored after each test.
For the refactoring of SortBuilders related to #10217, each SortBuilder needs to get a build()
method that produces a SortField according to the SortBuilder parameters on the shard.
This commit fixes string formatting issues in the error handling and
provides a bettter error message if malformed input is detected.
This commit also adds tests for both situations.
Relates to #17212
this is the last step to remove node level service from IndexShard.
This means that tests can now more easily create an IndexShard instance
without starting a node and removes the dependency between IndexShard and Client/ScriptService
When a master is elected, it reaches out to all master nodes for their cluster state, selecting the one with the highest version. At the moment, we do another round to select the index metadata with the highest version as well. This is not needed - the election of a cluster state is enough - we should just use whatever indices are in it.
Closes#17233
In #17187, we upgrade index state after upgrading
index folder structure. As we don't have to write
the upgraded state in the old index folder structure,
we can cleanup how we write upgraded index state.
We can do better than just throwing an error when we don't find a
setting. It's actually trivial to leverage lucenes slow LD StringDistance
to find possible candiates for a setting to detect missspellings and suggest
a possible setting.
This commit adds error messages like:
* `unknown setting [index.numbe_of_replica] did you mean [index.number_of_replicas]?`
rather than just reporting the setting as unknown
Today if something is wrong with the IndexMetaData we detect it very
late and most of the time if that happens we already allocated the index
and get endless loops and full log files on data-nodes. This change tries
to verify IndexService creattion during initial state recovery on the master
and if the recovery fails the index is imported as `closed` and won't be allocated
at all.
Closes#17187
In 5.0 we don't allow index settings to be specified on the node level ie.
in yaml files or via commandline argument. This can cause problems during
upgrade if this was used extensively. For instance if analyzers where
specified on a node level this might cause the index to be closed when
imported (see #17187). In such a case all indices relying on this
must be updated via `PUT /${index}/_settings`. Yet, this API has slightly
different semantics since it overrides existing settings. To make this less
painful this change adds a `preserve_existing` parameter on that API to ensure
we have the same semantics as if the setting was applied on the node level.
This change also adds a better error message and a change to the migration guide
to ensure upgrades are smooth if index settings are specified on the node level.
If a index setting is detected this change fails the node startup and prints a message
like this:
```
*************************************************************************************
Found index level settings on node level configuration.
Since elasticsearch 5.x index level settings can NOT be set on the nodes
configuration like the elasticsearch.yaml, in system properties or command line
arguments.In order to upgrade all indices the settings must be updated via the
/${index}/_settings API. Unless all settings are dynamic all indices must be closed
in order to apply the upgradeIndices created in the future should use index templates
to set default values.
Please ensure all required values are updated on all indices by executing:
curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{
"index.number_of_shards" : "1",
"index.query.default_field" : "main_field",
"index.translog.durability" : "async",
"index.ttl.disable_purge" : "true"
}'
*************************************************************************************
```
This commit adds a hack to detect when Jimfs throws an IAE where it
should be throwing an UOE. Namely, the method
FileSystemProvider#createDirectory should be throwing an UOE if an
attempt is made to set attributes that the filesystem does not support,
but instead Jimfs violates this and throws an IAE.
There are some implementation of StreamInput that implement the available method
and there are others that do not implement this method. This change makes the
available method abstract in the StreamInput class and implements the method where
it was not previously implemented.
This commit refactors the unit tests for installing plugins to test
against mock filesystems (as well as the native filesystem) for better
test coverage. This commit also adds tests that cover the POSIX
attributes handling when installing plugins (e.g., ensuring that the
plugins directory has the right permissions, the bin directory has
execute permissions, and the config directory has the same owner and
group as its parent).
We current have a ClusterService interface, implemented by InternalClusterService and a couple of test classes. Since the decoupling of the transport service and the cluster service, one can construct a ClusterService fairly easily, so we don't need this extra indirection.
Closes#17183
During the aggregation refactoring the default value for `keyed` in the `percentiles` and `percentile_ranks` aggregation was inadvertently changed from `true` to `false`. This change reverts the defaults to the old (correct) value
Also replaced the PercolatorQueryRegistry with the new PercolatorQueryCache.
The PercolatorFieldMapper stores the rewritten form of each percolator query's xcontext
in a binary doc values field. This make sure that the query rewrite happens only during
indexing (some queries for example fetch shapes, terms in remote indices) and
the speed up the loading of the queries in the percolator query cache.
Because the percolator now works inside the search infrastructure a number of features
(sorting fields, pagination, fetch features) are available out of the box.
The following feature requests are automatically implemented via this refactoring:
Closes#10741Closes#7297Closes#13176Closes#13978Closes#11264Closes#10741Closes#4317
This commit fixes the line-length checkstyle violations in
InstallPluginCommand.java and removes this from the list of files for
which the line-length check is suppressed.
Also adding checks for median SortMode on non-numeric field types
to FieldSortBuilder, removing some unused code and switching
GeoDistanceSortBuilder to using ParseField.
Provide better error message when an incompatible node connects to a node
We should give a better exception message when an incompatible node connects
and we receive a messeage. This commit adds a clear excpetion based on the
protocol version received instead of throwing cryptic messages about not fully reaed
buffer etc.
Relates to #17090
We should give a better exception message when an incompatible node connects
and we receive a messeage. This commit adds a clear excpetion based on the
protocol version received instead of throwing cryptic messages about not fully reaed
buffer etc.
Relates to #17090
The previous method sorted first by _score, then _uid. In certain situations, this allowed
floating point errors to slightly alter the sort order, causing test failure.
We only sort on _uid now, which should be deterministic and allow comparison of ten
documents. Not quite as useful, but less fragile and we still check to make sure num hits
and max score are identical.
Closes#17164
If this query doesn't take the child type into account then it can match other
child document types pointing to the same parent type and that have the same id too.
Today we allow to set all kinds of index level settings on the node level which
is error prone and difficult to get right in a consistent manner.
For instance if some analyzers are setup in a yaml config file some nodes might
not have these analyzers and then index creation fails.
Nevertheless, this change allows some selected settings to be specified on a node level
for instance:
* `index.codec` which is used in a hot/cold node architecture and it's value is really per node or per index
* `index.store.fs.fs_lock` which is also dependent on the filesystem a node uses
All other index level setting must be specified on the index level. For existing clusters the index must be closed
and all settings must be updated via the API on each of the indices.
Closes#16799
This commit fixes an OOM error that happens when the XContentParser.readList() method is asked to parse a single value instead of an array. It fixes the UpdateRequest parsing as well as remove some leniency in the readList() method so that it expect to be in an array before parsing values.
closes#15338
Now we have 16870 we can enable the request cache by default. The caching can still be disabled on a per request basis and can still be disabled in the settings, only the default value has changed. For now this is done regardless of whether the shard is active or inactive.
Closes#17134
This change adds a rewrite phase to the queries on the shard before they are assessed for caching or executed. This allows the opportunity to rewrite queries as faster running simpler queries based on attributes known to only the shard itself. The first query to implement this is the RangeQueryBuilder which will rewrite to a MatchAllQueryBuilder if the range of terms on the shard is a subset of the query and rewrites to a MatchNoneQueryBuilder if the range of terms on the shard is completely outside the query.
Currently, global and index level state format type can be configured through gateway.format.
This commit removes the ability to configure format type for these states.
Now we always store these states in SMILE format and ensure we always write them
to disk in the most compact way.
Adds support for scheduling commands to run at a later time on another
thread pool in the current thread's context:
```java
Runnable someCommand = () -> {System.err.println("Demo");};
someCommand = threadPool.getThreadContext().preserveContext(someCommand);
threadPool.schedule(timeValueMinutes(1), Names.GENERAL, someCommand);
```
This happens automatically for calls to `threadPool.execute` but `schedule`
and `scheduleWithFixedDelay` don't do that, presumably because scheduled
tasks are usually context-less. Rather than preserve the current context
on all scheduled tasks this just makes it possible to preserve it using
the syntax above.
To make this all go it moves the Runnables that wrap the commands from
EsThreadPoolExecutor into ThreadContext.
This, or something like it, is required to support reindex throttling.
For the current refactoring of SortBuilders related to #10217,
each SortBuilder should get a build() method that produces a
SortField according to the SortBuilder parameters on the shard.
This change also slightly refactors the current parse method in
SortParseElement to extract an internal parse method that returns
a list of sort fields only needs a QueryShardContext as input
instead of a full SearchContext. This allows using this internal
parse method for testing.
Without this commit fetching the status of a reindex from a node that isn't
coordinating the reindex will fail. This commit properly registers reindex's
status so this doesn't happen. To do so it moves all task status registration
into NetworkModule and creates a method to register other statuses which the
reindex plugin calls.
When specifying more than one `filter` in a `constant_score`
query, the last one will be the only one that will be
executed, overwriting previous filters. It should rather
raise a ParseException to notify the user that only one
filter query is accepted.
Closes#17126
Fix a potential parsing problem in GeoDistanceSortParser.
For an input like `{ [...], "coerce" = true, "ignore_malformed" = false }` the parser will
fail to parse the ignore_malformed boolean flag and will fall through to the last
else-branch where the boolean flag will be parsed as geo-hash and `ignore_malformed`
treated as field name.
The build currently uses the old maven support in gradle. This commit
switches to use the newer maven-publish plugin. This will allow future
changes, for example, easily publishing to artifactory.
An additional part of this change makes publishing of build-tools part
of the normal publishing, instead of requiring a separate upload step
from within buildSrc. That also sets us up for a follow up to enable
precomit checks on the buildSrc code itself.
This commit removes some dead code that resulted from removing the
ability for a field to have different names (after enforcing that fields
have the same full and index name).
Closes#17127
Test revealed a potential problem in the current GeoDistanceSortParser.
For an input like `{ [...], "coerce" = true, "ignore_malformed" = false }
the parser will fail to parse the `ignore_malformed` boolean flag and
will fall through to the last else-branch where the boolean flag will be
parsed as geo-hash and `ignore_malformed` treated as field name.
Adding fix and test that will fail with the old parser code.
Try to renew sync ID if `flush=true` on forceMerge
Today we do a force flush which wipes the sync ID if there is one which
can cause the lost of all benefits of the sync ID ie. fast recovery.
This commit adds a check to renew the sync ID if possible. The flush call
is now also not forced since the IW will show pending changes if the forceMerge added new segments.
if we keep using force we will wipe the sync ID even if no renew was actually needed.
Closes#17019
Adding methods and tests to ScriptSortBuilder that makes it implement NamedWritable and adds a fromXContent() method needed to read itseld from xContent. Also changing sortMode() setters in
FieldSortBuilder, GeoDistanceSortBuilder and ScriptSortBuilder to accept an enum instead of a String
value.
Relates to #15178
IndicesStore checks for `allocated elsewhere` for every shard not alocated on the local node
On each cluster-state update we check on the local node if we can delete some shards content.
For this we linearly walk all shards and check if they are allocated and started on another node
and if we can delete them locally. if we can delete them locally we go and ask other nodes if we can
delete them and then if the shared IS active elsewhere issue a state update task to delete it. Yet,
there is a bug in IndicesService#canDeleteShardContent which returns `true` even if that shards
datapath doesn't exist on the node which causes tons of unnecessary node to node communication and
as many state update task to be issued. This can have large impact on the cluster state processing
speed.
**NOTE:** This only happens for shards that have at least one shard allocated on the node ie. if an `IndexService` exists.
On each clusterstate update we check on the local node if we can delete some shards content.
For this we linearly walk all shards and check if they are allocated and started on another node
and if we can delete them locally. if we can delete them locally we go and ask other nodes if we can
delete them and then if the shared IS active elsewhere issue a state update task to delete it. Yet,
there is a bug in IndicesService#canDeleteShardContent which returns `true` even if that shards
datapath doesn't exist on the node which causes tons of unnecessary node to node communciation and
as many state update task to be issued. This can have large impact on the cluster state processing
speed.
**NOTE:** This only happens for shards that have at least one shard allocated on the node
ie. if an `IndexService` exists.
Closes#17106
Today we do a force flush which wipes the sync ID if there is one which
can cause the lost of all benefits of the sync ID ie. fast recovery.
This commit adds a check to renew the sync ID if possible. The flush call
is now also not forced since the IW will show pending changes if the forceMerge added new segments.
if we keep using force we will wipe the sync ID even if no renew was actually needed.
Closes#17019
Add infrastructure to run REST tests on a multi-version cluster
This change adds the infrastructure to run the rest tests on a multi-node
cluster that users 2 different minor versions of elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.
Since we don't have a real version to test against, the tests uses the current version
until the first minor / RC is released to ensure the infrastructure works.
Given the amount of problems this change already found I think it's worth having this run with our test suite by default. The structure of this infra will likely change over time but for now it's a step into the right direction. We will likely want to split it up into integTests and integBwcTests etc. so each plugin can have it's own bwc tests but that's left for future refactoring.