If a terms aggregation was ordered by a metric nested in a single bucket aggregator which did not collect any documents (e.g. a filters aggregation which did not match in that term bucket) an ArrayOutOfBoundsException would be thrown when the ordering code tried to retrieve the value for the metric. This fix fixes all numeric metric aggregators so they return their default value when a bucket ordinal is requested which was not collected.
Closes#17225
This adds a new `/_cluster/allocation/explain` API that explains why a
shard can or cannot be allocated to nodes in the cluster. Additionally,
it will show where the master *desires* to put the shard, according to
the `ShardsAllocator`.
It looks like this:
```
GET /_cluster/allocation/explain?pretty
{
"index": "only-foo",
"shard": 0,
"primary": false
}
```
Though, you can optionally send an empty body, which means "explain the
allocation for the first unassigned shard you find".
The output when a shard is unassigned looks like this:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : false
},
"assigned" : false,
"unassigned_info" : {
"reason" : "INDEX_CREATED",
"at" : "2016-03-22T20:04:23.620Z"
},
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 0.06666675,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "NO",
"weight" : -1.3833332,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 2.3166666,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
And when the shard *is* assigned, the output looks like:
```
{
"shard" : {
"index" : "only-foo",
"index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
"id" : 0,
"primary" : true
},
"assigned" : true,
"assigned_node_id" : "Qc6VL8c5RWaw1qXZ0Rg57g",
"nodes" : {
"V-Spi0AyRZ6ZvKbaI3691w" : {
"node_name" : "Susan Storm",
"node_attributes" : {
"bar" : "baz"
},
"final_decision" : "NO",
"weight" : 1.4499999,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
},
"Qc6VL8c5RWaw1qXZ0Rg57g" : {
"node_name" : "Slipstream",
"node_attributes" : {
"bar" : "baz",
"foo" : "bar"
},
"final_decision" : "CURRENTLY_ASSIGNED",
"weight" : 0.0,
"decisions" : [ {
"decider" : "same_shard",
"decision" : "NO",
"explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
} ]
},
"PzdyMZGXQdGhqTJHF_hGgA" : {
"node_name" : "The Symbiote",
"node_attributes" : { },
"final_decision" : "NO",
"weight" : 3.6999998,
"decisions" : [ {
"decider" : "filter",
"decision" : "NO",
"explanation" : "node does not match index include filters [foo:\"bar\"]"
} ]
}
}
}
```
Only "NO" decisions are returned by default, but all decisions can be
shown by specifying the `?include_yes_decisions=true` parameter in the
request.
Resolves#14593
* master: (419 commits)
Remove PROTOTYPE from ShapeBuilders
Take filterNodeIds into consideration while sending tasks actions requests to nodes
test: cleanup imports and method rename
Remove PROTOTYPE from SortBuilders
percolator: Add query extract support for the blended term query and the common terms query.
Don't iterate over shard routing if it's null
[TEST] Reduce size of random shapes
Add some debug logging to testPrimaryRelocationWhileIndexing
Order methods in IndicesClusterStateService according to execution
Tidied up percolator doc annotations
In cat.snapshots, repository is required
Do not retrieve all indices stats when checking for cache resets
Enforce `discovery.zen.minimum_master_nodes` is set when bound to a public ip #17288
Port Primary Terms to master #17044
Revert "Add debug logging for Vagrant upgrade test"
Ownership for data, logs, and configs for packages
add on_failure exception metadata to ingest document for verbose simulate
Revert "Merge pull request #16843 from xuzha/s3-encryption"
Update Format, add new settings into the setting test
Update and rebase the init implementation.
...
Node roles are now serialized as well, they are not part of the node attributes anymore. DiscoveryNodeService takes care of dividing settings into attributes and roles. DiscoveryNode always requires to pass in attributes and roles separately.
Primary terms is a way to make sure that operations replicated from stale primary are rejected by shards following a newly elected primary.
Original PRs adding this to the seq# feature branch #14062 , #14651 . Unlike those PR, here we take a different approach (based on newer code in master) where the primary terms are stored in the meta data only (and not in `ShardRouting` objects).
Relates to #17038Closes#17044
Change version, required a minor fix in the RPM building.
In case of a alpha/beta version, the release will contain alpha/beta
as the RPM version cannot contains dashes/tildes.
The fielddata settings in mappings have been refatored so that:
- text and string have a `fielddata` (boolean) setting that tells whether it
is ok to load in-memory fielddata. It is true by default for now but the
plan is to make it default to false for text fields.
- text and string have a `fielddata_frequency_filter` which contains the same
thing as `fielddata.filter.frequency` used to (but validated at parsing time
instead of being unchecked settings)
- regex fielddata filtering is not supported anymore and will be dropped from
mappings automatically on upgrade.
- text, string and _parent fields have an `eager_global_ordinals` (boolean)
setting that tells whether to load global ordinals eagerly on refresh.
- in-memory fielddata is not supported on keyword fields anymore at all.
- the `fielddata` setting is not supported on other fields that text and string
and will be dropped when upgrading if specified.
this is the last step to remove node level service from IndexShard.
This means that tests can now more easily create an IndexShard instance
without starting a node and removes the dependency between IndexShard and Client/ScriptService
We current have a ClusterService interface, implemented by InternalClusterService and a couple of test classes. Since the decoupling of the transport service and the cluster service, one can construct a ClusterService fairly easily, so we don't need this extra indirection.
Closes#17183
Also replaced the PercolatorQueryRegistry with the new PercolatorQueryCache.
The PercolatorFieldMapper stores the rewritten form of each percolator query's xcontext
in a binary doc values field. This make sure that the query rewrite happens only during
indexing (some queries for example fetch shapes, terms in remote indices) and
the speed up the loading of the queries in the percolator query cache.
Because the percolator now works inside the search infrastructure a number of features
(sorting fields, pagination, fetch features) are available out of the box.
The following feature requests are automatically implemented via this refactoring:
Closes#10741Closes#7297Closes#13176Closes#13978Closes#11264Closes#10741Closes#4317
Today we allow to set all kinds of index level settings on the node level which
is error prone and difficult to get right in a consistent manner.
For instance if some analyzers are setup in a yaml config file some nodes might
not have these analyzers and then index creation fails.
Nevertheless, this change allows some selected settings to be specified on a node level
for instance:
* `index.codec` which is used in a hot/cold node architecture and it's value is really per node or per index
* `index.store.fs.fs_lock` which is also dependent on the filesystem a node uses
All other index level setting must be specified on the index level. For existing clusters the index must be closed
and all settings must be updated via the API on each of the indices.
Closes#16799
Without this commit fetching the status of a reindex from a node that isn't
coordinating the reindex will fail. This commit properly registers reindex's
status so this doesn't happen. To do so it moves all task status registration
into NetworkModule and creates a method to register other statuses which the
reindex plugin calls.
Today, certain bootstrap properties are set and read via system
properties. This action-at-distance way of managing these properties is
rather confusing, and completely unnecessary. But another problem exists
with setting these as system properties. Namely, these system properties
are interpreted as Elasticsearch settings, not all of which are
registered. This leads to Elasticsearch failing to startup if any of
these special properties are set. Instead, these properties should be
kept as local as possible, and passed around as method parameters where
needed. This eliminates the action-at-distance way of handling these
properties, and eliminates the need to register these non-setting
properties. This commit does exactly that.
Additionally, today we use the "-D" command line flag to set the
properties, but this is confusing because "-D" is a special flag to the
JVM for setting system properties. This creates confusion because some
"-D" properties should be passed via arguments to the JVM (so via
ES_JAVA_OPTS), and some should be passed as arguments to
Elasticsearch. This commit changes the "-D" flag for Elasticsearch
settings to "-E".
The ingest stats include the following statistics:
* `ingest.total.count`- The total number of document ingested during the lifetime of this node
* `ingest.total.time_in_millis` - The total time spent on ingest preprocessing documents during the lifetime of this node
* `ingest.total.current` - The total number of documents currently being ingested.
* `ingest.total.failed` - The total number ingest preprocessing operations failed during the lifetime of this node
Also these stats are returned on a per pipeline basis.
Currently, the cluster service is tightly coupled to the transport service by both managing node connections and requiring the bound address in order to create the local disco node. This commit introduces a new NodeConnectionsService which is in charge of node connection management and makes it possible to remove all network related calls from the cluster service. The local DiscoNode is now created by DiscoveryNodeService and is set both the cluster service and the transport service during node start up.
Closes#16788Closes#16872
Use index UUID to lookup indices on IndicesService
Today we use the index name to lookup index instances on the IndicesService
which applied to search requests but also to index deletion etc. This commit
moves the interface to expect an Index instance which is a tuple
and looks up the index by uuid rather than by name. This prevents accidental modification
of the wrong index if and index is recreated or searching from the wrong index in such a case.
Accessing an index that has the same name but different UUID will now result in an IndexNotFoundException.
Today we use the index name to lookup index instances on the IndicesService
which applied to search reqeusts but also to index deletion etc. This commit
moves the interface to expcet and `Index` instance which is a <name, uuid> tuple
and looks up the index by uuid rather than by name. This prevents accidential modificaiton
of the wrong index if and index is recreated or searching from the _wrong_ index in such a case.
Accessing an index that has the same name but different UUID will now result in an IndexNotFoundException.
Closes#17001
_wait_for_completion defaults to false. If set to true then the API will
wait for all the tasks that it finds to stop running before returning. You
can use the timeout parameter to prevent it from waiting forever. If you
don't set a timeout parameter it'll default to 30 seconds.
Also adds a log message to rest tests if any tasks overrun the test. This
is just a log (instead of failing the test) because lots of tasks are run
by the cluster on its own and they shouldn't cause the test to fail. Things
like fetching disk usage from the other nodes, for example.
Switches the request to getter/setter style methods as we're going that
way in the Elasticsearch code base. Reindex is all getter/setter style.
Closes#16906
* master: (350 commits)
Note to configuration docs on number of threads
Reduce maximum number of threads in boostrap check
Limit generic thread pool
Remove NodeService injection to Discovery
Prevent closing index during snapshot restore
[TEST] Fix newline issue in PluginCliTests on Windows
ParseFieldMatcher should log when using deprecated settings. #16988
fix checkstyle error
Add test for the index_options on a keyword field. #16990
Analysis : Allow string explain param in JSON
Analysis : Allow string explain param in JSON
fix typo
Remove SNAPSHOT from versions in plugin descriptors
Add support for alpha versions
Enable unmap hack for java 9
Simplify mock scripts
Adding `time_zone` parameter to daterange-aggregation docs
Adding tests for `time_zone` parameter for date range aggregation
Added ingest info to node info API, which contains a list of available processors.
Remove bw compat from size mapper
...
Closes#16964
Squashed commit of the following:
commit a23f9d2d29220991aa498214530753d7a5a148c6
Merge: eec9c4e 0b0a251
Author: Robert Muir <rmuir@apache.org>
Date: Mon Mar 7 04:12:02 2016 -0500
Merge branch 'master' into lucene6
commit eec9c4e5cd11e9c3e0b426f04894bb2a6dae4f21
Merge: bc67205 675d940
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 13:45:00 2016 -0500
Merge branch 'master' into lucene6
commit bc67205bdfe1526eae277ab7856fc050ecbdb7b2
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 09:56:31 2016 -0500
fix test bug
commit a60723b007ff12d97b1810cef473bd7b553a0327
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:35:35 2016 +0100
Fix SimpleValidateQueryIT to put braces around boosted terms
commit ae3a49d7ba7ced448d2a5262e5d8ec98671a9090
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:27:25 2016 +0100
fix multimatchquery
commit ae23fdb88a8f6d3fb7ba60fd1aaf3fd72d899aa5
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:20:49 2016 +0100
Rewrite DecayFunctionScoreIT to be independent of the similarity used
This test relied a lot on the term scoring and compared scores
that are dependent on the similarity. This commit changes the base query
to be a predictable constant score query.
commit 366c2d518c35d31251033f1b6f6a93f6e2ae327d
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 14:06:14 2016 +0100
Fix scoring in tests due to changes to idf calculation.
Lucene 6 uses a different default similarity as well as a different
way to calculate IDF. In contrast to older version lucene 6 uses docCount per field
to calculate the IDF not the # of docs in the index to overcome the sparse field
cases.
commit dac99fd64ac2fa71b8d8d106fe68825e574c49f8
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 08:21:57 2016 -0500
don't hardcoded expected termquery score
commit 6e9f340ba49ab10eed512df86d52a121aa775b0f
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 08:04:45 2016 -0500
suppress deprecation warning until migrated to points
commit 3ac8908424b3fdad44a90a4f7bdb3eff7efd077d
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:21:43 2016 -0500
Remove invalid test: all commits have IDs, and its illegal to do this.
commit c12976288124ad1a26467e7e848fb810548e7eab
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:06:14 2016 -0500
don't test with unsupported back compat
commit 18bbfe76128570bc70883bf91ff4c44c82d27817
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:02:18 2016 -0500
remove now invalid lucene 4 backcompat test
commit 7e730e572886f0ef2d3faba712e4256216ff01ec
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:58:52 2016 -0500
remove now invalid lucene 4 backwards test
commit 244d2ab6868ba5ac9e0bcde3c2833743751a25ec
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:47:23 2016 -0500
use 6.0 codec
commit 5f64d4a431a6fdaa1234adca23f154c2a1de8284
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:43:08 2016 -0500
compile, javadocs, forbidden-apis, etc
commit 1f273cd62a7fe9ca8f8944acbbfc5cbdd3d81ccb
Merge: cd33921 29e3443
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 10:45:29 2016 +0100
Merge branch 'master' into lucene6
commit cd33921ac742ef9fb351012eff35f3c7dbda7264
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:58:37 2016 -0500
fix hunspell dictionary loading
commit c7fdbd837b01f7defe9cb1c24e2ec65604b0dc96
Merge: 4d4190f d8948ba
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:41:53 2016 -0500
Merge branch 'master' into lucene6
commit 4d4190fd82601aaafac6b8254ccb3edf218faa34
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:39:14 2016 -0500
remove nocommit
commit 77ca69e288b1a41aa9595c921ed166c272a00ea8
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:38:24 2016 -0500
clean up numericutils vs legacynumericutils
commit a466d696fbaad04b647ffbc0857a9439b583d0bf
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:32:43 2016 -0500
upgrade spatial4j
commit 5412c747a8cfe638bacedbc8233163cb75cc3dc5
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:19:28 2016 -0500
move to 6.0.0-snapshot-8eada27
commit b32bfe924626b87e540692375ece09e7c2edb189
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:30:09 2016 +0100
Fix some test compile errors.
commit 6ccde35e9840b03c68d1a2cd47c7923a06edf64a
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:25:51 2016 +0100
Current Lucene version is 6.0.0.
commit f62e1015d931b4cc04c778298a8fa1ba65e97ad9
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:20:48 2016 +0100
Fix compile errors in NGramTokenFilterFactory.
commit 6837c6eabf96075f743649da9b9b52dd39611c58
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:50:59 2016 +0100
Fix the edge ngram tokenizer/filter.
commit ccd7f070de5efcdfbeb34b9555c65c4990bf1ba6
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:42:44 2016 +0100
The missing value is now accessible through a getter.
commit bd3b77f9b28e5b05daa3d49683a9922a6baf2963
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:41:51 2016 +0100
Remove IndexCacheableQuery.
commit 05f3091c347aeae80eeb16349ac51d2b53cf86f7
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:39:43 2016 +0100
Fix compilation of function_score queries.
commit 81cda79a2431ac78f56b0cc5a5765387f662d801
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:35:02 2016 +0100
Fix compile errors in BlendedTermQuery.
commit 70994ce8dd1eca0b995870974a38e20f26f96a7b
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 23:33:03 2016 -0500
add bug ID
commit 29d4f1a71f36f646b5a6060bed3db019564a279d
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 21:02:32 2016 -0500
easy .store changes
commit 5e1a1e6fd665fa455e88d3a8987362fad5f44bb1
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 20:47:24 2016 -0500
cleanups mostly around boosting
commit 333a669ec6c305ada5645d13ed1da0e19ec1d053
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 20:27:56 2016 -0500
more simple fixes
commit bd5cd98a1e089c866b6b4a5e159400b110140ce6
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 19:49:38 2016 -0500
more easy fixes and removal of ancient cruft
commit a68f419ee47da5f9c9ce5b372f01d707e902474c
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 19:35:02 2016 -0500
cutover numerics
commit 4ca5dc1fa47dd5892db00899032133318fff3116
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:34:18 2016 -0500
fix some constants
commit 88710a17817086e477c6c021ec346d0534b7fb88
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:14:25 2016 -0500
Add spatial-extras jar as a core dependency
commit c8cd6726583e5ce3f546ed355d4eca037164a30d
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:03:33 2016 -0500
update to lucene 6 jars
This commit simplifies and consolidates the two different
implementations of terminals used in tests. There is now a single
MockTerminal which captures output, and allows accessing as one large
string (with unix style \n as newlines), as well as configuring
input.
No changes were really needed in our test infra as it didn't use `node.client`. Yet it didn't take into account ingest nodes, what we used to call client nodes in InternalTestCluster were actually ingest only nodes, which now become coordinating only nodes.
Also renamed some method to get rid of the node client terminology as much as possible in favour or coordinating only node.
As discussed in #16565, the node.client setting is an unnecessary shortcut to node.data: false and node.master: false. We have places where we treat nodes with node.client set to true differently compared to master false and data false, which is not correct. Also, with the addition of node.ingest or potentially new roles, it becomes confusing to figure out if a node client should support ingestion or not.
This commit removes the node.client setting in favour being explicit using node.master, node.data and node.ingest instead.
This commit modifies TransportBulkAction to use relative time instead of
absolute time when measuring how long a bulk request took to be
processed, and adds tests for this functionality.
Closes#16916
The big win here is catching tests that are incorrectly named and will
be skipped by gradle, providing a false sense of security.
The whole thing takes about 10 seconds on my Macbook Air, not counting
compiling the test classes, which seems worth it. Because this runs as
a gradle task with propery UP-TO-DATE handling it can be skipped if the
tests haven't been changed which should save some time.
I chose to keep this in test:framework rather than a new subproject of
buildSrc because ESIntegTestCase and doesn't inroduce any additional
dependencies.
DiscoveryService was a bridge into the discovery universe. This is unneeded and we can just access discovery directly or do things in a different way.
One of those different ways, is not having a dedicated discovery implementation for each our dicovery plugins but rather reuse ZenDiscovery.
UnicastHostProviders are now classified by discovery type, removing unneeded checks on plugins.
Closes#16821
Instead of modifying methods each time we need to add a new behavior for settings, we can simply pass `SettingsProperty... properties` instead.
`SettingsProperty` could be defined then:
```
public enum SettingsProperty {
Filtered,
Dynamic,
ClusterScope,
NodeScope,
IndexScope
// HereGoesYours;
}
```
Then in setting code, it become much more flexible.
TODO: Note that we need to validate SettingsProperty which are added to a Setting as some of them might be mutually exclusive.
This commit works around an issue with hostname verification in HttpClient when using IPv6
addresses in URLs. When an IPv6 address is used in a URL it is typically wrapped with square
brackets. The hostname verifier for HttpClient does not recognize these as valid IPv6 addresses
and instead treats them as a DNS name. We wrap the strict hostname verifier for this version
of HttpClient and strip brackets if we need to.
The corresponding issue in HttpClient is https://issues.apache.org/jira/browse/HTTPCLIENT-1698
but the fix has not been released yet in a stable version.
Currently dynamic mappings propgate through call semantics, where deeper
dynamic mappings are merged into higher level mappings through
return values of recursive method calls. This makese it tricky
to handle multiple updates in the same method, for example when
trying to create parent object mappers dynamically for a field name
that contains dots.
This change makes the api for adding mappers a simple list
of new mappers, and moves construction of the root level mapping
update to the end of doc parsing.
Today we might start a node and some of the paths might not have the
required permissions. This commit goes through all data directories as
well as index, shard and state directories and ensures we have write access.
To make this work across all OS etc. we are trying to write a real file
and remove it again in each of those directories
Today we have the notion of a snapshot inside Version.java which makes
releasing complicated since to do a release Version.java must be changed.
This commit removes all notions of snapshot from the code and allows to
switch between snapshot and release build by specifying a system property on
the build. For instance running:
```
gradle run -Dbuild.snapshot=false
```
will build and package a release build while the default always
builds snapshots. Calls to the main rest action will still get the snapshot
information rendered out with the response.
This commit moves IndicesRequestCache into o.e.indics and makes all API in this
class package private. All references to SearchReqeust, SearchContext etc. have been factored
out and relevant glue code has been added to IndicesService. The IndicesRequestCache is not a
simple class without any hard dependencies on ThreadPool nor SearchService or IndexShard. This now
allows to add unittests.
This commit also removes two settings `indices.requests.cache.clean_interval` and `indices.fielddata.cache.clean_interval`
in favor of `indices.cache.clean_interval` which cleans both caches.
this is a minor cleanup that detaches `IndicesRequestCache` and `IndicesQueryCache`
from guice and moves it into `IndicesService`. It also decouples the `IndexShard` and `IndexService`
from these caches which are unnecessary dependencies.
This PR renames the following three variables to fix a typo `settting` into `setting`.
* Rename a static class member:
INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTTING -> INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTING
* Rename a parameter: aSettting --> aSetting
* Rename a local variable: indexSetttings -> indexSettings
IndexShard currently holds an arbitraritly used `getQueryShardContext` that comes
out of a ThreadLocal. It's usage is undefined and arbitraty since there is also
such a method with different semantics on `IndexService` This commit removes the threadLocal on
IndexShard as well as on the context itself. It's types are now a member and the QueryShardContext
lifecycle is managed byt SearchContext which passes the types on from the SearchRequest.
There is no need for IndicesWarmer to be a global accessible class. All it needs
access to is inside IndexService. It also doesn't need to be mutable once it's not a per node
instance. This commit move IndicesWarmer to IndexWarmer and makes the default impls like field data and
norms warming an impl detail. Also the IndexShard doesn't depend on this class anymore, instead it accepts
an Engine.Warmer as a ctor argument which delegates to the actual warmer from the index.
Indices level field data cacheing belongs into IndicesService and doesn't need to be
wired by guice. This commit also moves the async cache refresh out of the class into
IndicesService such that threadpool dependencies are removed and testing / creation becomes
simpler.
This change documents the Terminal abstraction that cli tools use, as
well as simplifies the api to be a minimal set of methods to interact
with a terminal.
This commit adds a convenience method for producing random positive time
values which can be useful for places where non-negative and non-zero
time values are expected.
This commit enableds strict settings validation on node startup. All settings
passed to elasticsearch either through system properties, yaml files or any other
way to pass settings must be registered and valid. Settings that are unknown ie. due to
typos or due to deprecation or removal will cause the node to NOT start up. Plugins
have to declare all their settings on the `SettingsModule#registerSetting` and settings for
plugins that are not installed must be removed.
This commit also removes the ability to specify the nodes name via `-Des.name` or just `name` in the
configuration files. The node name must be prefixed with the node prexif like `node.name: Boom`. Left over
usage of `name` will also cause startup to fail.
When primary relocation completes, a cluster state is propagated that deactivates the old primary and marks the new primary as active.
As cluster state changes are not applied synchronously on all nodes, there can be a time interval where the relocation target has processed
the cluster state and believes to be the active primary and the relocation source has not yet processed the cluster state update and
still believes itself to be the active primary. This commit ensures that, before completing the relocation, the reloction source deactivates
writing to its store and delegates requests to the relocation target.
Closes#15900
The plugin cli currently is extremely lenient, allowing most errors to
simply be logged. This can lead to either corrupt installations (eg
partially installed plugins), or confused users.
This change rewrites the plugin cli to have almost no leniency.
Unfortunately it was not possible to remove all leniency, due in
particular to how config files are handled.
The following functionality was simplified:
* The format of the name argument to install a plugin is now an official
plugin name, maven coordinates, or a URL.
* Checksum files are required, and only checked, for official plugins
and maven plugins. Checksums are also only SHA1.
* Downloading no longer uses a separate thread, and no longer has a timeout.
* Installation, and removal, attempts to be atomic. This only truly works
when no config or bin files exist.
* config and bin directories are verified before copying is attempted.
* Permissions and user/group are no longer set on config and bin files.
We rely on the users umask.
* config and bin directories must only contain files, no subdirectories.
* The code is reorganized so each command is a separate class. These
classes already existed, but were embedded in the plugin cli class, as
an extra layer between the cli code and the code running for each command.
With the recent change to allow ESSingleNodeTestCase to specify plugins
and Version for the node, node creation became lazy, where it now
happens before the first test to run. However, there were many static
methods on ESSingleNodeTestCase that still try to access the node. This
change makes those methods non-static.
This commit removes and forbids the use of lowercase ells ('l') in long
literals because they are often hard to distinguish from the digit
representing one ('1').
Closes#16329
We are leaking all kinds of resources if something during Node#close() barfs.
This commit cuts over to a list of closeables to release resources that
also closed remaining services if one or more services fail to close.
Closes#13685
In junit5, a neat assertion method is added which makes testing expected
failures a little more straightforward. The block of code that is
expected to throw is passed in with a lambda expression, and the caught
exception returned for inspection.
This change adds an implementation of expectThrows to ESTestCase. When
junit5 is eventually releassed and we switch to it, we can remove.
Now MasterNodeOperations, ReplicationAllShards, ReplicationSingleShard, BroadcastReplication and BroadcastByNode actions keep track of their parent tasks.
Note for many of the settings we had a netty specific override to a generic transport setting. I have removed those as they are not needed imo (ex. "transport.connections_per_node.recovery" was overriden by "transport.netty.connections_per_node.recovery", which I removed). The netty prefix was kept for settings that are tied to the netty universe - ex. "transport.netty.max_cumulation_buffer_capacity"
Closes#16200
In the early days Elasticsearch used to use the index name as the index identity. Around 1.0.0 we introduced a unique index uuid which is stored in the index setting. Since then we used that uuid in a few places but it is by far not the main identifier when working with indices, partially because it's not always readily available in all places.
This PR start to make a move in the direction of using uuids instead of name by making sure that the uuid is available on the Index class (currently just a wrapper around the name) and as such also available via ShardRouting and ShardId.
Note that this is by no means an attempt to do the right thing with the uuid in all places. In almost all places it falls back to the name based comparison that was done before. It is meant as a first step towards slowly improving the situation.
Closes#16217
This commit limits the `index.translog.sync_interval` to a value not less than `100ms` and
removes the support for fsync on every operation which used to be enabled if `index.translog.sync_interval` was set to `0s`
Now this pr also only schedules an async fsync if the durability is set to `async`. By default not async task is scheduled.
Closes#16152
This commit method renames the ScriptEngineService interface methods
types, extensions, and sandboxed to getTypes, getExtensions, and
isSandboxed, respectively.
This commit converts the script mode settings to the new settings
infrastructure. This is a major refactoring of the handling of script
mode settings. This refactoring is necessary because these settings are
determined at runtime based on the registered script engines and the
registered script contexts.
The search_after parameter provides a way to efficiently paginate from one page to the next. This parameter accepts an array of sort values, those values are then used by the searcher to sort the top hits from the first document that is greater to the sort values.
This parameter must be used in conjunction with the sort parameter, it must contain exactly the same number of values than the number of fields to sort on.
NOTE: A field with one unique value per document should be used as the last element of the sort specification. Otherwise the sort order for documents that have the same sort values would be undefined. The recommended way is to use the field `_uuid` which is certain to contain one unique value for each document.
Fixes#8192
When a Version is passed to `Settings#put(String, Version)` it's id is used as
an integer which should also be used to deserialize it on the consumer end.
Today AssertinLocalTransport expects Version#toString() to be used which can lead
to subtile bugs in tests.
This commit migrates all the settings under network service to the new settings infra.
It also adds some chaining utils to make fall back settings slightly less verbose.
Breaking (but I think acceptable) - network.tcp.no_delay and network.tcp.keep_alive used to accept the value `default` which make us not set them at all on netty. Our default was true so we weren't using this feature. I removed it and now we only accept a true boolean.
* `test.cluster.node.seed` was only used in one place where it wasn't adding value and was now replaced with a constant
* `tests.portsfile` is now an official setting and has been renamed to `node.portsfile`
this commit also convert `search.default_keep_alive` and `search.keep_alive_interval` to the new settings infrastrucutre
This commit allows an integ test to wrap all clients that are exposed by the
InternalTestCluster which is useful for request interception or to add general headers
to request or to intercept client to server call even if the client calls are done from within
the test framework.
This commit fixes an issue in the handling of TransportExceptions in
ShardStateAction. There were two cases not being handled correctly.
- when the local node is shutting down, handlers will be notified with
a TransportException with a message starting "transport stopped"
- when the remote node disconnects, handlers will be notified with a
NodeDisconnectedException
In both of these cases, the cause of the exception will be null and this
was incorrectly being handled. The first case can passed to the listener
like any other critical non-channel failure, and the second case can be
handled by modifying the logic for detecting master channel exceptions.
There was a third case of NodeNotConnectedException that was not being
treated as a master channel exception but should be.
This commit adds an integration test that simulates the handling of a
shard failure request during a network partition. By isolating the
master from the cluster while a shard failed request is in flight, this
test simulates that we wait until a new master is elected and then retry
sending that shard failed request to the newly elected master.
This commit adds methods to CapturingTransport to separate local and
remote transport exceptions. The motivation for this change is that
local transport exceptions are delivered to listeners (usually, but not
always) wrapped in SendRequestTransportException while remote transport
exceptions are delivered to listeners wrapped in
RemoteTransportException. By making this distinction clear in the
CapturingTransport, this makes it less likely that tests will make
incorrect assumptions about the exceptions coming out of the transport
layer to listeners.
Closes#16057
The rest test framework, because it used to be tightly integrated with
ESIntegTestCase, currently expects the addresses for the test cluster to
be passed using the transport protocol port. However, it only uses this
to then find the http address.
This change makes ESRestTestCase extend from ESTestCase instead of
ESIntegTestCase, and changes the sysprop used to tests.rest.cluster,
which now takes the http address.
closes#15459
We have two similar tests with the same name, ContextAndHeaderTransportTests.
They shared lots of common code so I extracted much of it into
ActionRecordingPlugin, a plugin which records all action requests for later
inspection.
I also removed all the warnings from both tests. That made lang-mustache
compile cleanly without any custom -Xlint so I removed those. To remove
the warnings I had to add type parameters to ActionFilter which seemed
like a good idea anyway.
The old infa has been removed in this commit such that nothing uses `DynamicSettings` anymore
and all index-scoped settings require to be registered before the node has fully started up.
Site plugins used to be used for things like kibana and marvel, but
there is no longer a need since kibana (and marvel as a kibana plugin)
uses node.js. This change removes site plugins, as well as the flag for
jvm plugins. Now all plugins are jvm plugins.
This undocumented setting was mainly used for testing and a safety net for
a new flushOnClose feature back in 1.x. We can now safely remove this setting.
The test usage now uses a mock plugin setting to achive the same.
Today we maintain a lot of settings on the shard level which are all index level settings.
In order to cut over to the new settings API where we register update listener we have to move
all of them on to the index level otherwise we need a way to un-register listeners which is error-prone
and requires additional handling when shards are closed. It's simpler and also more accurate to handle all of
them on the index level where we can trash the entire registry for update listener once the index goes out of scope.
ContextAndHeaders has a massive impact on the core infrastructure since it has to
be manually passed on to all relevant places across threads/network calls etc. For the same reason
it's also very error prone and easily forgotten on potentially relevant APIs.
The new ThreadContext is associated with a ThreadPool (node or transport client) and ensures that
headers and context registered on a current thread are inherited to new threads spawned, send across
the network to be deserialized on the receiver end as well as restored on the response handling thread
once the response is received.
This commit adds convenience methods to o.e.t.t.CapturingTransport
that enables capturing requests and clearing the captured requests
with a single method. This is to simplify a common pattern in tests of
capturing requests, and then clearing the captured requests.
1. Uses forbidden patterns to prevent things from referencing
java.io.Serializable or from mentioning serialVersionUID.
2. Uses -Xlint:-serial so we don't have to hear from javac that we aren't
declaring serialVersionUID on any classes that we make that happen to extend
Serializable.
3. Remove Serializable and serialVersionUID declarations.
I didn't use forbidden apis because it doesn't look like it has a way to ban
explicitly implementing Serializable. If you try to ban Serializable with
forbidden apis you end up banning all Exceptions and all Strings.
Closes#15847
This commit removes and now forbids use of
org.apache.lucene.index.IndexWriter#isLocked as this method was
deprecated in LUCENE-6508. The deprecation is due to the fact that
checking if a lock is held before acquiring that lock is subject to a
time-of-check-to-time-of-use race condition. There were three uses of
IndexWriter#isLocked in the code base:
- a logging statement in o.e.i.e.InternalEngine where we are already in
an exceptional condition that the lock was held; in this case,
logging whether or not the directory is locked is superfluous
- in o.e.c.l.u.VersionsTests where we were verifying that a write lock
is released upon closing an IndexWriter; in this case, the check is
not needed as successfully closing an IndexWriter releases its
write lock
- in o.e.t.s.MockFSDirectoryService where we were verifying that a
directory is not write-locked before (implicitly) trying to obtain
such a write lock in org.apache.lucene.index.CheckIndex#<init> (this
is the exact type of a situation that is subject to a race
condition); in this case we can proceed by just (implicitly) trying
to obtain the write lock and failing if we encounter a
LockObtainFailedException
* Added percolator field mapper that extracts the query terms and indexes these terms with the percolator query.
* At percolate time these extracted terms are used to query percolator queries that are like to be evaluated. This can significantly cut down the time it takes to percolate. Whereas before all percolator queries were evaluated if they matches with the document being percolated.
* Changes made to percolator queries are no longer immediately visible, a refresh needs to happen before the changes are visible.
* By default the percolate api only returns upto 10 matches instead of returning all matching percolator queries.
* Made percolate more modular, so that it is easier to add unit tests.
* Added unit tests for the percolator.
Closes#12664Closes#13646
Adds task manager class and enables all activities to register with the task manager. Currently, the immutable Transport*Activity class represents activity itself shared across all requests. This PR adds and an additional structure Task that keeps track of currently running requests and can be used to communicate with these requests using TransportTaskAction.
Related to #15117
When waiting indefinitely for a new cluster state in a test,
TestClusterService#add will throw a NullPointerException if the timeout
is null. Instead, TestClusterService#add should guard against a null
timeout and not even attempt to add a notification for the timeout
expiring. Note that the usage of null is the agreed upon contract for
specifying an indefinite wait from ClusterStateObserver.
DedicatedClusterSnapshotRestoreIT#testRestoreIndexWithMissingShards took ~1.5 min to finish
due to timeouts that are applied if not all shards are allocated. Now that the index that has
unallocated shareds is not refreshed the test is more reasonable and runs in 15 sec
Today we throttle recoveries only for incoming recoveries. Nodes that have a lot
of primaries can get overloaded due to too many recoveries. To still keep that at bay
we limit the number of threads that are sending files to the target to overcome this problem.
The right solution here is to also throttle the outgoing recoveries that are today unbounded on
the master and don't start the recovery until we have enough resources on both source and target nodes.
The concurrency aspects of the recovery source also added a lot of complexity and additional threadpools
that are hard to configure. This commit removes the concurrent streamns notion completely and sends files
in the thread that drives the recovery simplifying the recovery code considerably.
Outgoing recoveries are not throttled on the master via a allocation decider.
Today the logic to async - commit the translog is in every translog instance
itself. While the setting is a per index setting we manageing it per shard. This
polluts the translog code and can more easily be managed in IndexService.
Today we have two variants of translogs for indexing. We only recommend the buffered
one which also has a 20% advantage in indexing speed. This commit removes the option and defaults
to the buffered case. It also hard-wires the translog buffer to 8kb instead of 64kb. We used to
adjust that buffer based on if the shard is active or not, this code has also been removed and
instead we just keep an 8kb buffer arround.
This commit removes `index.translog.flush_threshold_ops` and `index.translog.disable_flush`
in favor of `index.translog.flush_threshold_size`. The number of operations is meaningless by itself and
can easily be turned into a size value with knowledge of the data. Disabling the flush is only useful in
tests and we can set the size value to a really high value. If users really need to do this they can
also apply a very high value like `1PB`.
This change adds a Fixture class for use by gradle. A Fixture is an
external process that integration tests will use. It can be added as a
dependsOn for integTest, and will automatically be shutdown upon success
or failure, as well as relevant information dumped on failure. There is
also an example fixture in this change.