This commit removes some dead code that resulted from removing the
ability for a field to have different names (after enforcing that fields
have the same full and index name).
Closes#17127
Test revealed a potential problem in the current GeoDistanceSortParser.
For an input like `{ [...], "coerce" = true, "ignore_malformed" = false }
the parser will fail to parse the `ignore_malformed` boolean flag and
will fall through to the last else-branch where the boolean flag will be
parsed as geo-hash and `ignore_malformed` treated as field name.
Adding fix and test that will fail with the old parser code.
Try to renew sync ID if `flush=true` on forceMerge
Today we do a force flush which wipes the sync ID if there is one which
can cause the lost of all benefits of the sync ID ie. fast recovery.
This commit adds a check to renew the sync ID if possible. The flush call
is now also not forced since the IW will show pending changes if the forceMerge added new segments.
if we keep using force we will wipe the sync ID even if no renew was actually needed.
Closes#17019
Adding methods and tests to ScriptSortBuilder that makes it implement NamedWritable and adds a fromXContent() method needed to read itseld from xContent. Also changing sortMode() setters in
FieldSortBuilder, GeoDistanceSortBuilder and ScriptSortBuilder to accept an enum instead of a String
value.
Relates to #15178
IndicesStore checks for `allocated elsewhere` for every shard not alocated on the local node
On each cluster-state update we check on the local node if we can delete some shards content.
For this we linearly walk all shards and check if they are allocated and started on another node
and if we can delete them locally. if we can delete them locally we go and ask other nodes if we can
delete them and then if the shared IS active elsewhere issue a state update task to delete it. Yet,
there is a bug in IndicesService#canDeleteShardContent which returns `true` even if that shards
datapath doesn't exist on the node which causes tons of unnecessary node to node communication and
as many state update task to be issued. This can have large impact on the cluster state processing
speed.
**NOTE:** This only happens for shards that have at least one shard allocated on the node ie. if an `IndexService` exists.
On each clusterstate update we check on the local node if we can delete some shards content.
For this we linearly walk all shards and check if they are allocated and started on another node
and if we can delete them locally. if we can delete them locally we go and ask other nodes if we can
delete them and then if the shared IS active elsewhere issue a state update task to delete it. Yet,
there is a bug in IndicesService#canDeleteShardContent which returns `true` even if that shards
datapath doesn't exist on the node which causes tons of unnecessary node to node communciation and
as many state update task to be issued. This can have large impact on the cluster state processing
speed.
**NOTE:** This only happens for shards that have at least one shard allocated on the node
ie. if an `IndexService` exists.
Closes#17106
Today we do a force flush which wipes the sync ID if there is one which
can cause the lost of all benefits of the sync ID ie. fast recovery.
This commit adds a check to renew the sync ID if possible. The flush call
is now also not forced since the IW will show pending changes if the forceMerge added new segments.
if we keep using force we will wipe the sync ID even if no renew was actually needed.
Closes#17019
Add infrastructure to run REST tests on a multi-version cluster
This change adds the infrastructure to run the rest tests on a multi-node
cluster that users 2 different minor versions of elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.
Since we don't have a real version to test against, the tests uses the current version
until the first minor / RC is released to ensure the infrastructure works.
Given the amount of problems this change already found I think it's worth having this run with our test suite by default. The structure of this infra will likely change over time but for now it's a step into the right direction. We will likely want to split it up into integTests and integBwcTests etc. so each plugin can have it's own bwc tests but that's left for future refactoring.
After another round of input from @cbuescher this adds a few more sanity
checks to request parsing.
In addition adds (back) support for the reverse option.
Today index names are often resolved lazily, only when they are really
needed. This can be problematic especially when it gets to mapping updates
etc. when a node sends a mapping update to the master but while the request
is in-flight the index changes for whatever reason we would still apply the update
since we use the name of the index to identify the index in the clusterstate.
The problem is that index names can be reused which happens in practice and sometimes
even in a automated way rendering this problem as realistic.
In this change we resolve the index including it's UUID as early as possible in places
where changes to the clusterstate are possible. For instance mapping updates on a node use a
concrete index rather than it's name and the master will fail the mapping update iff
the index can't be found by it's <name, uuid> tuple.
Closes#17048
Changes:
- no more option to configure eager/lazy loading of the norms (useless now
that orms are disk-based)
- only the `string`, `text` and `keyword` fields support the `norms` setting
- the `norms` setting takes a boolean that decides whether norms should be
stored in the index but old options are still supported to give users time
to upgrade
- setting a `boost` no longer implicitely enables norms (for new indices only,
this is still needed for old indices)
Today, certain bootstrap properties are set and read via system
properties. This action-at-distance way of managing these properties is
rather confusing, and completely unnecessary. But another problem exists
with setting these as system properties. Namely, these system properties
are interpreted as Elasticsearch settings, not all of which are
registered. This leads to Elasticsearch failing to startup if any of
these special properties are set. Instead, these properties should be
kept as local as possible, and passed around as method parameters where
needed. This eliminates the action-at-distance way of handling these
properties, and eliminates the need to register these non-setting
properties. This commit does exactly that.
Additionally, today we use the "-D" command line flag to set the
properties, but this is confusing because "-D" is a special flag to the
JVM for setting system properties. This creates confusion because some
"-D" properties should be passed via arguments to the JVM (so via
ES_JAVA_OPTS), and some should be passed as arguments to
Elasticsearch. This commit changes the "-D" flag for Elasticsearch
settings to "-E".
To make API's output more easy to read we are suppressing stack traces (#12991) unless explicitly requested by setting `error_trace=true` on the request. To compensate we are logging the stacktrace into the logs so people can look it up even the error_trace wasn't enabled. Currently we do so using the `INFO` level which can be verbose if an api is called repeatedly by some automation. For example, if someone tries to read from an index that doesn't exist we will respond with a 404 exception and log under info every time. We should reduce the level to `DEBUG` as we do with other API driven errors. Internal errors (rest codes >=500) are logged as WARN.
Closes#16627
This change adds the infrastructure to run the rest tests on a multi-node
cluster that users 2 different minor versions of elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.
Since we don't have a real version to test against, the tests uses the current version
until the first minor / RC is released to ensure the infrastructure works.
Relates to #14406Closes#17072
When unzipping a plugin zip, the zip entries are resolved relative to
the directory being unzipped into. However, there are currently no
checks that the entry name was not absolute, or relatively points
outside of the plugin dir. This change adds a check for those two cases.
This adds methods and tests to ScriptSortBuilder that
makes it implement NamedWritable and adds the fromXContent
method needed to read itseld from xContent.
Test failures showed problems with passing down
the same collate parameter map reference from the
phrase suggestion builder to the context where.
This changes the collate parameter setters to
make a shallow copy of the map passed in.
This commit adds fields bytes_recovered and files_recovered to the cat
recovery API. These fields, respectively, indicate the total number of
bytes and files recovered. Additionally, for consistency, some totals
fields and translog recovery fields have been renamed.
Closes#17064
This tries to remove friction to upgrade to 5.0 that would be caused by mapping
changes:
- old ways to specify mapping settings (eg. store: yes instead of store:true)
will still work but a deprecation warning will be logged
- string mappings that only use the most common options will be upgraded
automatically to text/keyword
The change to move dynamic mapping handling to the end of document
parsing has an edge case which can cause dynamic mappings to fail
document parsing. If field a.b is added as an as part of the root update,
followed by a.c.d, then we need to expand the mappers on the stack,
since a is hidden inside the root update which exists on the stack.
This change adds a test for this case, as well as tries to better
document how the logic works for building up the stack before adding a
dynamic mapper.
Sequence of events that lead to the NPE:
- avg metric returns NaN for buckets
- Movavg skips NaN or null buckets, and simply re-uses the existing bucket (e.g. doesn't add
a 'movavg' field)
- Derivative references Movavg, the bucket resolution returns null because Movavg wasn't added
to the bucket, NPE when trying to subtract null values
The ingest stats include the following statistics:
* `ingest.total.count`- The total number of document ingested during the lifetime of this node
* `ingest.total.time_in_millis` - The total time spent on ingest preprocessing documents during the lifetime of this node
* `ingest.total.current` - The total number of documents currently being ingested.
* `ingest.total.failed` - The total number ingest preprocessing operations failed during the lifetime of this node
Also these stats are returned on a per pipeline basis.
Currently, the cluster service is tightly coupled to the transport service by both managing node connections and requiring the bound address in order to create the local disco node. This commit introduces a new NodeConnectionsService which is in charge of node connection management and makes it possible to remove all network related calls from the cluster service. The local DiscoNode is now created by DiscoveryNodeService and is set both the cluster service and the transport service during node start up.
Closes#16788Closes#16872
Currently all SortBuilder implementations have their separate order
field. This PR moves this up to SortBuilder, together with setter
and getter and makes sure the default is set to SortOrder.ASC
except for `_score` sorting where the default is SortOrder.DESC.
This commit updates the documentation for GeoPointField by removing all references to the coerce and doc_values parameters. DocValues are enabled in lucene GeoPointField by default (required for boundary filtering). The QueryBuilders are updated to automatically normalize points (ignoring the coerce parameter) for any index created onOrAfter version 2.2.
Use index UUID to lookup indices on IndicesService
Today we use the index name to lookup index instances on the IndicesService
which applied to search requests but also to index deletion etc. This commit
moves the interface to expect an Index instance which is a tuple
and looks up the index by uuid rather than by name. This prevents accidental modification
of the wrong index if and index is recreated or searching from the wrong index in such a case.
Accessing an index that has the same name but different UUID will now result in an IndexNotFoundException.
Today we use the index name to lookup index instances on the IndicesService
which applied to search reqeusts but also to index deletion etc. This commit
moves the interface to expcet and `Index` instance which is a <name, uuid> tuple
and looks up the index by uuid rather than by name. This prevents accidential modificaiton
of the wrong index if and index is recreated or searching from the _wrong_ index in such a case.
Accessing an index that has the same name but different UUID will now result in an IndexNotFoundException.
Closes#17001
Occasionally the .geohash suffix in Geo{Distance|DistanceRange}Query would conflict with a mapping that defines a sub-field by the same name. This occurs often with nested and multi-fields a mapping defines a geo_point sub-field using the field name "geohash". Since the QueryParser already handles parsing geohash encoded geopoints without requiring the ".geohash" suffix, the suffix parsing can be removed altogether.
This commit removes the .geohash suffix parsing, adds explicit test coverage for the nested query use-case, and adds random distance queries to the nested query test suite.
In particular, this test ensures we don't restart the master node until
we know the index deletion has taken effect on master and the master
eligible nodes.
Closes#16917Closes#16890
This change makes ScoreSortBuilder implement NamedWriteable, adds
equals() and hashCode() and also implements parsing ScoreSortBuilder
back from xContent. This is needed for the ongoing Search refactoring.
This was lost in refactoring even on the 2.x branch. The slow-log
is not per index not per shard anymore such that we don't add the
shard ID as the logger prefix. This commit adds back the index
name as part of the logging message not as a prefix on the logger
for better testabilitly.
Closes#17025
_wait_for_completion defaults to false. If set to true then the API will
wait for all the tasks that it finds to stop running before returning. You
can use the timeout parameter to prevent it from waiting forever. If you
don't set a timeout parameter it'll default to 30 seconds.
Also adds a log message to rest tests if any tasks overrun the test. This
is just a log (instead of failing the test) because lots of tasks are run
by the cluster on its own and they shouldn't cause the test to fail. Things
like fetching disk usage from the other nodes, for example.
Switches the request to getter/setter style methods as we're going that
way in the Elasticsearch code base. Reindex is all getter/setter style.
Closes#16906
Currently the message stays in the `UnassignedInfo` for the shard,
however, it would be very useful to know the exact point (time-wise)
that the cancellation happened when diagnosing an issue.
Relates to debugging #16357
Apparently lucene6 is way more picky with respect to corrupting files
that are not fsynced that's why this test sometimes failed after the lucene6
upgrade.
This commit reduces the maximum number of threads required in the
bootstrap check. This limit can be reduced since the generic thread pool
is no longer unbounded.
Relates #17003
The generic thread pool was previously configured to be able to create
an unlimited number of threads. The thinking is that tasks that are
submitted to its work queue must execute and should not block waiting
for a worker. However, in cases of heavy load, this can lead to an
explosion in the number of threads; this can even lead to a feedback
loop that exacerbates the problem. What is more, this can even bump into
OS limits on the number of threads that can be created.
This commit limits the number of threads in the generic thread pool to
four times the bounded number of processors.
Relates #17003
This merge commit merges #16682 which adds the ability to define a index-global
default similarity but at the same time prevents overriding built-in similarities
for new indices. Old indices where able to do this int the past which should not
be punished even though this is not possible anymore.
Closes#16594Closes#16682
This removes the need for accessing the SearchContext when parsing Sort elements
to queries. After applying the patch only a QueryShardContext is needed.
Relates to #15178
Move some test methods from AnalylzeActionIT to RestAnalyzeActionTest
Allow string explain param if it can parse
Fix wrong param name in rest-api-spec
Closes#16925
We removed leniencey from version parsing which caught problems with
-SNAPSHOT suffixes on plugin properies. This commit removes the -SNAPSHOT
from both es and the extension version and adds tests to ensure we can
parse older versions that allowed -SNAPSHOT in BWC way.
Elasticsearch 5.0 will come with alpha versions which is not supported
in the current version scheme. This commit adds support for aplpha starting
with es 5.0.0 in a backwards compatible way.
Date range aggregations used to be unable to use a `time_zone` parameter to e.g. be applied
int date math roundings like in `now/d` (see #10130 as an example). After the aggregation
refactoring, the time_zone parameter has been pulled up to ValuesSourceAggregatorBuilder
and can now be used in date range aggregations as well.
This change adds randomized time zone settings to the existing IT tests to verify that the
`time_zone` parameter is honored when calculating the bucket boundaries. Also moving
the DateRangeTests from module-groovy/messy back to core as DateRangeIT, sharing
common script mocks with DateHistogramIT and adding documentation for the
`time_zone` parameter in the date range aggregation docs.
Closes#10130
Internally the put pipeline API uses this information in node info API to validate if all specified processors in a pipeline exist on all nodes in the cluster.
Closes#16964
Squashed commit of the following:
commit a23f9d2d29220991aa498214530753d7a5a148c6
Merge: eec9c4e 0b0a251
Author: Robert Muir <rmuir@apache.org>
Date: Mon Mar 7 04:12:02 2016 -0500
Merge branch 'master' into lucene6
commit eec9c4e5cd11e9c3e0b426f04894bb2a6dae4f21
Merge: bc67205 675d940
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 13:45:00 2016 -0500
Merge branch 'master' into lucene6
commit bc67205bdfe1526eae277ab7856fc050ecbdb7b2
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 09:56:31 2016 -0500
fix test bug
commit a60723b007ff12d97b1810cef473bd7b553a0327
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:35:35 2016 +0100
Fix SimpleValidateQueryIT to put braces around boosted terms
commit ae3a49d7ba7ced448d2a5262e5d8ec98671a9090
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:27:25 2016 +0100
fix multimatchquery
commit ae23fdb88a8f6d3fb7ba60fd1aaf3fd72d899aa5
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 15:20:49 2016 +0100
Rewrite DecayFunctionScoreIT to be independent of the similarity used
This test relied a lot on the term scoring and compared scores
that are dependent on the similarity. This commit changes the base query
to be a predictable constant score query.
commit 366c2d518c35d31251033f1b6f6a93f6e2ae327d
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 14:06:14 2016 +0100
Fix scoring in tests due to changes to idf calculation.
Lucene 6 uses a different default similarity as well as a different
way to calculate IDF. In contrast to older version lucene 6 uses docCount per field
to calculate the IDF not the # of docs in the index to overcome the sparse field
cases.
commit dac99fd64ac2fa71b8d8d106fe68825e574c49f8
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 08:21:57 2016 -0500
don't hardcoded expected termquery score
commit 6e9f340ba49ab10eed512df86d52a121aa775b0f
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 08:04:45 2016 -0500
suppress deprecation warning until migrated to points
commit 3ac8908424b3fdad44a90a4f7bdb3eff7efd077d
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:21:43 2016 -0500
Remove invalid test: all commits have IDs, and its illegal to do this.
commit c12976288124ad1a26467e7e848fb810548e7eab
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:06:14 2016 -0500
don't test with unsupported back compat
commit 18bbfe76128570bc70883bf91ff4c44c82d27817
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 07:02:18 2016 -0500
remove now invalid lucene 4 backcompat test
commit 7e730e572886f0ef2d3faba712e4256216ff01ec
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:58:52 2016 -0500
remove now invalid lucene 4 backwards test
commit 244d2ab6868ba5ac9e0bcde3c2833743751a25ec
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:47:23 2016 -0500
use 6.0 codec
commit 5f64d4a431a6fdaa1234adca23f154c2a1de8284
Author: Robert Muir <rmuir@apache.org>
Date: Fri Mar 4 06:43:08 2016 -0500
compile, javadocs, forbidden-apis, etc
commit 1f273cd62a7fe9ca8f8944acbbfc5cbdd3d81ccb
Merge: cd33921 29e3443
Author: Simon Willnauer <simonw@apache.org>
Date: Fri Mar 4 10:45:29 2016 +0100
Merge branch 'master' into lucene6
commit cd33921ac742ef9fb351012eff35f3c7dbda7264
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:58:37 2016 -0500
fix hunspell dictionary loading
commit c7fdbd837b01f7defe9cb1c24e2ec65604b0dc96
Merge: 4d4190f d8948ba
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:41:53 2016 -0500
Merge branch 'master' into lucene6
commit 4d4190fd82601aaafac6b8254ccb3edf218faa34
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:39:14 2016 -0500
remove nocommit
commit 77ca69e288b1a41aa9595c921ed166c272a00ea8
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:38:24 2016 -0500
clean up numericutils vs legacynumericutils
commit a466d696fbaad04b647ffbc0857a9439b583d0bf
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:32:43 2016 -0500
upgrade spatial4j
commit 5412c747a8cfe638bacedbc8233163cb75cc3dc5
Author: Robert Muir <rmuir@apache.org>
Date: Thu Mar 3 23:19:28 2016 -0500
move to 6.0.0-snapshot-8eada27
commit b32bfe924626b87e540692375ece09e7c2edb189
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:30:09 2016 +0100
Fix some test compile errors.
commit 6ccde35e9840b03c68d1a2cd47c7923a06edf64a
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:25:51 2016 +0100
Current Lucene version is 6.0.0.
commit f62e1015d931b4cc04c778298a8fa1ba65e97ad9
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 11:20:48 2016 +0100
Fix compile errors in NGramTokenFilterFactory.
commit 6837c6eabf96075f743649da9b9b52dd39611c58
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:50:59 2016 +0100
Fix the edge ngram tokenizer/filter.
commit ccd7f070de5efcdfbeb34b9555c65c4990bf1ba6
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:42:44 2016 +0100
The missing value is now accessible through a getter.
commit bd3b77f9b28e5b05daa3d49683a9922a6baf2963
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:41:51 2016 +0100
Remove IndexCacheableQuery.
commit 05f3091c347aeae80eeb16349ac51d2b53cf86f7
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:39:43 2016 +0100
Fix compilation of function_score queries.
commit 81cda79a2431ac78f56b0cc5a5765387f662d801
Author: Adrien Grand <jpountz@gmail.com>
Date: Thu Mar 3 10:35:02 2016 +0100
Fix compile errors in BlendedTermQuery.
commit 70994ce8dd1eca0b995870974a38e20f26f96a7b
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 23:33:03 2016 -0500
add bug ID
commit 29d4f1a71f36f646b5a6060bed3db019564a279d
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 21:02:32 2016 -0500
easy .store changes
commit 5e1a1e6fd665fa455e88d3a8987362fad5f44bb1
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 20:47:24 2016 -0500
cleanups mostly around boosting
commit 333a669ec6c305ada5645d13ed1da0e19ec1d053
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 20:27:56 2016 -0500
more simple fixes
commit bd5cd98a1e089c866b6b4a5e159400b110140ce6
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 19:49:38 2016 -0500
more easy fixes and removal of ancient cruft
commit a68f419ee47da5f9c9ce5b372f01d707e902474c
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 19:35:02 2016 -0500
cutover numerics
commit 4ca5dc1fa47dd5892db00899032133318fff3116
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:34:18 2016 -0500
fix some constants
commit 88710a17817086e477c6c021ec346d0534b7fb88
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:14:25 2016 -0500
Add spatial-extras jar as a core dependency
commit c8cd6726583e5ce3f546ed355d4eca037164a30d
Author: Robert Muir <rmuir@apache.org>
Date: Wed Mar 2 18:03:33 2016 -0500
update to lucene 6 jars
This commit simplifies and consolidates the two different
implementations of terminals used in tests. There is now a single
MockTerminal which captures output, and allows accessing as one large
string (with unix style \n as newlines), as well as configuring
input.
If the recovery throws an exception we miss to notify the recovery
listener and bubble up the uncaught exception. This commit uses
an AbstractRunnable that also catches rejected execution exceptions
etc. and notifies the listener accordingly.
ClusterModuleTests tests what ShardsAllocatorModuleIT tests without starting
a cluster. Unittests should be preferrred over IT tests anyway and the instantiation
of the balanced shards allocator is tested with every other integration test.
This seems to be an error introduced in refactoring around #16821
where we now wait 30seconds by default if the node already joined
a cluster and got a master. This can slow down tests dramatically
espeically on slow boxes and notebooks.
Closes#16956
Decommissioning a node or applying a filter inclusion / exclusion can potentially lead to many shards that need to be moved to other nodes. This commit reuses the model across all
shard movements in an allocation round: It calculates the shard model once and simulates the application of all shards that can be moved on this model.
Closes#16926
Today we might run into a rejected execution exception when
we shutdown the node while handling a transport exception. The
exception is run in a seperate thread but that thread might
not be able to execute due to the shutdown. Today we barf and fill
the logs with large exception. This commit catches this exception
and logs it as debug logging instead.