Commit Graph

3763 Commits

Author SHA1 Message Date
Martijn van Groningen d4901a71d7
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-10-14 10:27:17 +02:00
Nhat Nguyen 8180cf1e68 Mute testDoNotInfinitelyWaitForMapping
Tracked at #47974
2019-10-13 22:06:50 -04:00
Nhat Nguyen 2995d4a9c0 Sequence number based replica allocation (#46959)
With this change, shard allocation prefers allocating replicas on a node
that already has a copy of the shard that is as close as possible to the
primary, so that it is as cheap as possible to bring the new replica in
sync with the primary. Furthermore, if we find a copy that is identical
to the primary then we cancel an ongoing recovery because the new copy
which is identical to the primary needs no work to recover as a replica.

We no longer need to perform a synced flush before performing a rolling
upgrade or full cluster start with this improvement.

Closes #46318
2019-10-13 22:06:50 -04:00
Nhat Nguyen 4f06225928 Avoid unneeded refresh with concurrent realtime gets (#47895)
This change should reduce refreshes for a use-case where we perform 
multiple realtime gets at the same time on an active index. Currently,
we only call refresh if the index operation is still on the versionMap.
However, at the time we call refresh, that operation might be already or
will be included in the latest reader. Hence, we do not need to refresh.
Adding another lock here is not an issue as the refresh is already
sequential.
2019-10-13 20:08:21 -04:00
Nhat Nguyen 4c1bb210cb Force flush in translog retention policy test (#47879)
If we roll translog but do not index, then a flush without force is a 
noop. In this case, the number of retained translog files will be higher
than the value specified by the retention policy.

Closes #4741
2019-10-13 20:08:21 -04:00
Przemyslaw Gomulka 6ab58de7ef
[7.x] Enable ResolverStyle.STRICT for java formatters backport(#46675) (#47913)
Joda was using ResolverStyle.STRICT when parsing. This means that date will be validated to be a correct year, year-of-month, day-of-month
However, we also want to make it works with Year-Of-Era as Joda used to, hence custom temporalquery.localdate in DateFormatters.from
Within DateFormatters we use the correct uuuu year instead of yyyy year of era

worth noting: if yyyy(without an era) is used in code, the parsing result will be a TemporalAccessor which will fail to be converted into LocalDate. We mostly use DateFormatters.from so this takes care of this. If possible the uuuu format should be used.
2019-10-11 21:19:56 +02:00
Christoph Büscher 2ef12c37f5 Add builder for distance_feature to QueryBuilders (#47846)
The QueryBuilders convenience class is currently missing a shortcut to construct
a DistanceFeatureQueryBuilder, which is added here.

Closes #47767
2019-10-11 18:20:01 +02:00
Alan Woodward ec9198d0e2
Adjust Version.V_6_8_4 to refer to Lucene 7.7.2 (#47926)
6.8.4 will ship with Lucene 7.7.2, so we need to change our version settings to
reflect this.

Relates #47901
2019-10-11 17:01:42 +01:00
David Turner ba62eb3dce Allow truncation of clean translog (#47866)
Today the `elasticsearch-shard remove-corrupted-data` tool will only truncate a
translog it determines to be corrupt. However there may be other cases in which
it is desirable to truncate the translog, for instance if an operation in the
translog cannot be replayed for some reason other than corruption. This commit
adds a `--truncate-clean-translog` option to skip the corruption check on the
translog and blindly truncate it.
2019-10-11 15:48:12 +01:00
Henning Andersen a0d0866f59 Shrink should not touch max_retries (#47719)
Shrink would set `max_retries=1` in order to avoid retrying. This
however sticks to the shrunk index afterwards, causing issues when a
shard copy later fails to allocate just once.

Avoiding a retry of a shrink makes sense since there is no new node
to allocate to and a retry will likely fail again. However, the downside of
having max_retries=1 afterwards outweigh the benefit of not retrying
the failed shrink a few times. This change ensures shrink no longer
sets max_retries and also makes all resize operations (shrink, clone,
split) leave the setting at default value rather than copy it from source.
2019-10-11 14:22:56 +02:00
Przemyslaw Gomulka 0c439fe495
[7.x] Allow partial parsing dates (#47872) backport(#46814)
Enable partial parsing of date part. This is making the behaviour in java.time implementation the same as with joda.
2018, 2018-01 and 2018-01-01 are all valid dates for date_optional_time or strict_date_optional_time
closes #45284
closes #47473
2019-10-11 11:17:19 +02:00
Zachary Tong 2de3411c9c Make sibling pipeline agg ctor's protected (#42808)
SiblingPipelineAggregator is a public interfaces,
but the ctor was package-private.  These should be protected so that
plugin authors can extend and implement their own sibling pipeline agg.
2019-10-10 12:31:14 -04:00
Martijn van Groningen 102016d571
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-10-10 14:44:05 +02:00
Jim Ferenczi bd6e2592a7 Remove the SearchContext from the highlighter context (#47733)
Today built-in highlighter and plugins have access to the SearchContext through the
highlighter context. However most of the information exposed in the SearchContext are not needed and a QueryShardContext
would be enough to perform highlighting. This change replaces the SearchContext by the informations that are absolutely
required by highlighter: a QueryShardContext and the SearchContextHighlight. This change allows to reduce the exposure of the
complex SearchContext and remove the needs to clone it in the percolator sub phase.

Relates #47198
Relates #46523
2019-10-10 10:34:10 +02:00
Jim Ferenczi 3d334a262b Ensure that we don't call listener twice when detecting a partial failure in _search (#47694)
This change fixes a bug that can occur when a shard failure is detected while we build
the search response and accept partial failures in set to false. In this case we currently
call onFailure on the provided listener but also continue the search as if the failure didn't
occur. This can lead to a listener called twice, once with onFailure and once with onSuccess
which is forbidden by design.
2019-10-10 09:59:49 +02:00
dengweisysu dc4224fbdf Sync translog without lock before trim unreferenced readers (#47790)
This commit is similar to the optimization made in #45765. With this
change, we fsync most of the data of the current generation without
holding writeLock when trimming unreferenced readers.

Relates #45765
2019-10-09 17:56:30 -04:00
Armin Braun 302e09decf
Simplify some Common ActionRunnable Uses (#47799) (#47828)
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convinience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.
2019-10-09 23:29:50 +02:00
Igor Motov 12e4e7ef54 Geo: implement proper handling of out of bounds geo points (#47734)
This is the first iteration in improving of handling of out of
bounds geopoints with a latitude outside of the -90 - +90 range
and a longitude outside of the -180 - +180 range.

Relates to #43916
2019-10-09 20:30:59 +04:00
Igor Motov f8b8afdc70 Geo: Fixes indexing of linestrings that go around the globe (#47471)
LINESTRING (0 0, 720 20) is now decomposed into 3 strings:
multilinestring (
  (0.0 0.0, 180.0 5.0),
  (-180.0 5.0, 180 15),
  (-180.0 15.0, 0 20)
)

It also fixes issues with linestrings that intersect antimeridian
more than 5 times.

Fixes #43837
Fixes #43826
2019-10-09 20:30:59 +04:00
Tim Brooks d18ff24dbe
Fix BulkByScrollResponseTests exception assertions (#45519)
Currently in the x content serialization tests we compare the exception
messages that are serialized. These exceptions messages are not
equivalent because the exception often changes when serialized to x
content. This commit removes this assertion.
2019-10-09 10:15:58 -06:00
Tim Brooks 02622c1ef9
Fix issues with serializing BulkByScrollResponse (#45357)
Currently there are two issues with serializing BulkByScrollResponse.
First, when deserializing from XContent, indexing exceptions and search
exceptions are switched. Additionally, search exceptions do no retain
the appropriate RestStatus code, so you must evaluate the status code
from the exception. However, the exception class is not always correctly
retained when serialized.

This commit adds tests in the failure case. Additionally, fixes the
swapping of failure types and adds the rest status code to the search
failure.
2019-10-09 10:12:14 -06:00
Martijn van Groningen da1e2ea461
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-10-09 09:06:13 +02:00
Armin Braun 96b36b5a8c
Make loadShardSnapshot Exceptions Consistent (#47728) (#47735)
Similar to #47507. We are throwing `SnapshotException` when
you (and SLM tests) would expect a `SnapshotMissingException`
for concurrent snapshot status and snapshot delete operations
with a very low probability.
Fixed the exception type and added a test for this scenario.
2019-10-08 21:04:51 +02:00
Armin Braun 5cef4752f7
Fix Ex. Handling in SnapshotsService#snapshots (#47507) (#47727)
We're needlessly wrapping a `SnapshotMissingException` which
itself is a `SnapshotException` when trying to load a missing
snapshot. This leads to failure #47442 which expects a
`SnapshotMissingException` in this case.

Closes #47442
2019-10-08 17:01:54 +02:00
Henning Andersen ce91ba7c25 Dangling indices strip aliases (#47581)
Importing dangling indices with aliases risks breaking functionalities
using those aliases. For instance, writing to an alias may break if
there is no is_write_index indication on the existing alias and the
dangling index import adds a second index to the alias. Or an
application could have an assumption about the alias only ever pointing
to one index and suddenly seeing the alias also linked to an old index
could break it. With this change we strip aliases of the index meta data
found before importing a dangling index.
2019-10-08 12:09:30 +02:00
David Turner bb5f750ab4 Deprecate include_relocations setting (#47443)
Setting `cluster.routing.allocation.disk.include_relocations` to `false` is a
bad idea since it will lead to the kinds of overshoot that were otherwise fixed
in #46079. This commit deprecates this setting so it can be removed in the next
major release.
2019-10-08 08:19:04 +01:00
Tal Levy a17f394e27
Geo-Match Enrich Processor (#47243) (#47701)
this commit introduces a geo-match enrich processor that looks up a specific
`geo_point` field in the enrich-index for all entries that have a geo_shape match field
that meets some specific relation criteria with the input field.

For example, the enrich index may contain documents with zipcodes and their respective
geo_shape. Ingesting documents with a geo_point field can be enriched with which zipcode
they associate according to which shape they are contained within.

this commit also refactors some of the MatchProcessor by moving a lot of the shared code to
AbstractEnrichProcessor.

Closes #42639.
2019-10-07 15:03:46 -07:00
Armin Braun b669b8f046
Simplify Snapshot Delete Further (#47626) (#47644)
This change removes the special path for deleting
the index metadata blobs and moves deleting them to
the bulk delete of unreferenced blobs at the end of
the snapshot delete process.
This saves N RPC calls for a snapshot containing N
indices and simplifies the code.
Also, this change moves the unreferenced data cleanup up
the stack to make it more obvious that any exceptions during
this pahse will be ignored and not fail the delete request.
Lastly, this change removes the needless chaining of
first deleting unreferenced data from the snapshot delete
and then running the stale data cleanup (that would also run
from the cleanup endpoint) and simply fires off the cleanup
right after updating the repository data (index-N) in parallel
to the other delete operations to speed up the delete some more.
2019-10-07 14:18:41 +02:00
Armin Braun 1359ef73a3
Add IT for Snapshot Issue in 47552 (#47627) (#47634)
* Add IT for Snapshot Issue in 47552 (#47627)

Adding a specific integration test that reproduces the problem
fixed in #47552. The issue fixed only reproduces in the snapshot
resiliency otherwise which are not available in 6.8 where the
fix is being backported to as well.
2019-10-07 10:38:19 +02:00
Armin Braun 6bd033931b
Add Consistency Assertion to SnapshotsInProgress (#47598) (#47633)
Assert given input shards and indices are consistent.
Also, fixed the equality check for SnapshotsInProgress.
Before this change the tests never had more than a single waiting
shard per index so they never failed as a result of the
waiting shards list not being ordered.

Follow up to #47552
2019-10-07 10:37:56 +02:00
Luca Cavanna 736fceb18b
Fold InitialSearchPhase into AbstractSearchAsyncAction (#47182)
Historically, we have two base classes for search actions that generally need to fan out to multiple shards and then move on to the following phase: InitialSearchPhase and AbstractSearchAsyncAction that extends it. Practically, every search action extends the latter, and there are no direct subclasses of InitialSearchPhase in our codebase.

This commit folds InitialSearchPhase into AbstractSearchAsyncAction in the attempt of simplifying things and making the search code running on the coordinating node easier to reason about.
2019-10-07 10:10:04 +02:00
Martijn van Groningen f2f2304c75
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-10-07 10:07:56 +02:00
Armin Braun 22679c7932
Fix Snapshot Corruption in Edge Case (#47552) (#47620)
This fixes missing to marking shard snapshots as failures when
multiple data-nodes are lost during the snapshot process or
shard snapshot failures have occured before a node left the cluster.

The problem was that we were simply not adding any shard entries for completed
shards on node-left events. This has no effect for a successful shard, but
for a failed shard would lead to that shard not being marked as failed during
snapshot finalization. Fixed by corectly keeping track of all previous completed
shard states as well in this case.
Also, added an assertion that without this fix would trip on almost every run of the
resiliency tests and adjusted the serialization of SnapshotsInProgress.Entry so
we have a proper assertion message.

Closes #47550
2019-10-05 15:01:06 +02:00
Armin Braun f2d2ca21e2
Cleaner Handling of Store Refcount in BlobStoreRepository (#47560) (#47594)
If a shard gets closed we properly abort its snapshot
before closing it. We should in thise case make sure to
not throw a confusing exception about trying to increment
the reference on an already closed shard in the async tasks
if the snapshot is already aborted.
Also, added an assertion to make sure that aborts are in
fact the only situation in which we run into a concurrently
closed store.
2019-10-05 09:45:10 +02:00
Gordon Brown e47bdf760e
Fix Rollover error when alias has closed indices (#47148) (#47539)
Rollover previously requested index stats for all indices in the
provided alias, which causes an exception when there is a closed index
with that alias.

This commit adjusts the IndicesOptions used on the index stats
request so that closed indices are ignored, rather than throwing
an exception.
2019-10-04 17:40:05 -06:00
Jason Tedor 35ca3d68d7
Validating monitoring hosts setting while parsing (#47571)
This commit lifts the validation of the monitoring hosts setting into
the setting itself, rather than when the setting is used. This prevents
a scenario where an invalid value for the setting is accepted, but then
later fails while applying a cluster state with the invalid setting.
2019-10-04 17:32:49 -04:00
Mark Tozzi e404f7ea80
DocValueFormat implementation for date range fields (#47472) (#47605) 2019-10-04 17:21:17 -04:00
Lee Hinman 79376b7219 Set default SLM retention invocation time (#47604)
This adds a default for the `slm.retention_schedule` setting, setting it
to `0 30 1 * * ?` which is 1:30am every day.

Having retention unset meant that it would never be invoked and clean up
snapshots. We determined it would be better to have a default than never
to be run. When coming to a decision, we weighed the option of an
absolute time (such as 1:30am) versus a periodic invocation (like every
12 hours). In the end we decided on the absolute time because it has
better predictability and consistency than a periodic invocation, which
would rely on when the master node were elected or restarted.

Relates to #43663
2019-10-04 15:00:20 -06:00
Armin Braun c1be7a802c
Simplify Snapshot Delete Process (#47439) (#47533)
We don't need to read the SnapshotInfo for a
snapshot to determine the indices that need to
be updated when it is deleted as the `RepositoryData`
contains that information already.
This PR makes it so the `RepositoryData` is used to
determine which indices to update and also removes
the special handling for deleting snapshot metadata
and the CS snapshot blob and has those simply be
deleted as part of the deleting of other unreferenced
blobs in the last step of the delete.

This makes the snapshot delete a little faster and
more resilient by removing two RPC calls
(the separate delete and the get).

Also, this shortens the diff with #46250 as a
side-effect.
2019-10-04 13:55:16 +02:00
David Roberts defc97a300 Remove fallback for controller location (#47104)
This change removes the temporary controller
location fallback introduced in #47013.

Relates elastic/ml-cpp#593
2019-10-04 09:50:26 +01:00
Ryan Ernst f32692208e
Add explanations to script score queries (#46693) (#47548)
While function scores using scripts do allow explanations, they are only
creatable with an expert plugin. This commit improves the situation for
the newer script score query by adding the ability to set the
explanation from the script itself.

To set the explanation, a user would check for `explanation != null` to
indicate an explanation is needed, and then call
`explanation.set("some description")`.
2019-10-03 21:05:05 -07:00
Nhat Nguyen 5e4732f2bb Limit number of retaining translog files for peer recovery (#47414)
Today we control the extra translog (when soft-deletes is disabled) for
peer recoveries by size and age. If users manually (force) flush many
times within a short period, we can keep many small (or empty) translog
files as neither the size or age condition is reached. We can protect
the cluster from running out of the file descriptors in such a situation
by limiting the number of retaining translog files.
2019-10-03 20:45:29 -04:00
Armin Braun bac119f672
Fix getSnapshotIndexMetaData Exception Behavior (#47488) (#47496)
If we fail to read the global metadata in a snapshot
we would throw `SnapshotMissingException` but wouldn't
do so for the index metadata.
This is breaking SLM tests at a low rate because they
use `SnapshotMissingException` thrown from snapshot status APIs
to wait for a snapshot being gone.
Also, we should be consistent here in general and not leak the
`NoSuchFileException` to the transport layer for index meta.

Closes #46508
2019-10-03 12:47:50 +02:00
Armin Braun 7549be4489
Fix es.http.cname_in_publish_address Deprecation Logging (#47451)
Since the property defaulted to `true` this deprecation logging
runs every time unless its set to `false` manually (in which case
it should've also logged but didn't).
I didn't add a tests and removed the tests we had in `7.x` that
covered this logging. I did move the check out of the `if (InetAddresses.isInetAddress(hostString) == false) {` condition so this is sort-of covered by the REST tests.
IMO, any unit-test of this would be somewhat redundant and would've forced
adding a field that just indicates that the deprecated property was used to
every instance which seemed pointless.

Closes #47436
2019-10-03 11:10:48 +02:00
Alpar Torok 0a14bb174f Remove eclipse conditionals (#44075)
* Remove eclipse conditionals

We used to have some meta projects with a `-test` prefix because
historically eclipse could not distinguish between test and main
source-sets and could only use a single classpath.
This is no longer the case for the past few Eclipse versions.

This PR adds the necessary configuration to correctly categorize source
folders and libraries.
With this change eclipse can import projects, and the visibility rules
are correct e.x. auto compete doesn't offer classes from test code or
`testCompile` dependencies when editing classes in `main`.

Unfortunately the cyclic dependency detection in Eclipse doesn't seem to
take the difference between test and non test source sets into account,
but since we are checking this in Gradle anyhow, it's safe to set to
`warning` in the settings. Unfortunately there is no setting to ignore
it.

This might cause problems when building since Eclipse will probably not
know the right order to build things in so more wirk might be necesarry.
2019-10-03 11:55:00 +03:00
Armin Braun 0beb5263b4
Fix Snapshot Finalization not Waiting for Index Metadata (#47445) (#47459)
* Fix Snapshot Finalization not Waiting for Index Metadata

We were mixing up the listeners here which led to the final listener
that should be called after all the metadata has been written
to be called before that.
I fixed this by removing the one redundant listener and flattening
the logic out.

* Closes #47425
2019-10-02 23:26:18 +02:00
Jason Tedor 52b97ec539
Allow setting validation against arbitrary types (#47264)
Today when settings validate, they can only validate against settings
that are of the same type. While this strong-type is convenient from a
development perspective, it is too limiting in that some settings need
to validate against settings of a different type. For example, the list
setting xpack.monitoring.exporters.<namespace>.host wants to validate
that it is non-empty if and only if the string setting
xpack.monitoring.exporters.<namespace>.type is "http". Today this is
impossible since the settings validation framework only allows that
setting to validate against other list settings. This commit increases
the flexibility here to validate against settings of arbitrary type, at
the expense of losing strong-typing during development.
2019-10-02 16:31:06 -04:00
Jim Ferenczi c340814b34 Fix highlighting of overlapping terms in the unified highlighter (#47227)
The passage formatter that the unified highlighter use doesn't handle terms with overlapping offsets.
For tokenizer that provides multiple segmentation of the same terms (edge ngram for instance) the formatter
should select the largest span in order to highlight the term only once. This change implements this logic.
2019-10-02 16:34:12 +02:00
Yannick Welsch f7980e9745 Adapt version constants after backport (#47353) 2019-10-02 14:26:23 +02:00
Yannick Welsch 99d2fe295d Use optype CREATE for single auto-id index requests (#47353)
Changes auto-id index requests to use optype CREATE, making it compliant with our docs.
This will also make these auto-id index requests compatible with the new "create-doc" index
privilege (which is based on the optype), the default optype is changed to create, just as it is
already documented.
2019-10-02 14:16:52 +02:00