2747 Commits

Author SHA1 Message Date
Armin Braun
0a604e3b24
Fix Two Races that Lead to Stuck Snapshots (#37686)
* Fixes two broken spots:
    1. Master failover while deleting a snapshot that has no shards will get stuck if the new master finds the 0-shard snapshot in `INIT` when deleting
    2. Aborted shards that were never seen in `INIT` state by the `SnapshotsShardService` will not be notified as failed, leading to the snapshot staying in `ABORTED` state and never getting deleted with one or more shards stuck in `ABORTED` state
* Tried to make fixes as short as possible so we can backport to `6.x` with the least amount of risk
* Significantly extended test infrastructure to reproduce the above two issues
  * Two new test runs:
      1. Reproducing the effects of node disconnects/restarts in isolation
      2. Reproducing the effects of disconnects/restarts in parallel with shard relocations and deletes
* Relates #32265 
* Closes #32348
2019-02-01 05:45:40 +01:00
Nhat Nguyen
b8b843476d
Disable dynamic mapping in testSimpleGetFieldMappingsWithDefaults (#38045)
Since #31140 we no longer require acking on the dynamic mapping of index
requests. Thus, a returned mapping from a get mapping request does not
necessarily contain the dynamic updates from the index request. This
commit replaces the dynamic mapping update with a manual put mapping.

Relates #31140
Closes #37928
2019-01-31 21:01:41 -05:00
Nhat Nguyen
a8ebe2a217
Fix random params in testSoftDeletesRetentionLock (#38114)
Since #37992 the retainingSequenceNumber is initialized with 0 
while the global checkpoint can be -1.

Relates #37992
2019-01-31 20:50:41 -05:00
Lee Hinman
c67a9663af
Fix MasterServiceTests.testClusterStateUpdateLogging (#38116)
This changes the test to not use a `CountDownlatch`, instead adding an assertion
for the final logging message and waiting until the `MockAppender` has seen it
before proceeding.

Related to df2c06f6f30f7e23a6863a3f72fc3bdb7648885c
Resolves #23739
2019-01-31 17:13:19 -07:00
Yuri Astrakhan
f3cde06a1d
geotile_grid implementation (#37842)
Implements `geotile_grid` aggregation

This patch refactors previous implementation https://github.com/elastic/elasticsearch/pull/30240

This code uses the same base classes as `geohash_grid` agg, but uses a different hashing
algorithm to allow zoom consistency.  Each grid bucket is aligned to Web Mercator tiles.
2019-01-31 19:11:30 -05:00
Pascal Christoph
a3d9ba3f4b Log document id when MapperParsingException occurs (#37800)
Closes #37658
2019-01-31 16:33:13 -05:00
Nhat Nguyen
237fcda2cc
Disable dynamic mapping update in testTransportBulkTasks (#38073)
If a replica does not have a right mapping yet, we will retry the index
request on that replica; then the actual tasks is higher than the
expected tasks. Since #31140 this happens more frequently for we no
longer require acking on the dynamic mapping of index requests.

Relates #31140
Closes #37893
2019-01-31 13:16:52 -05:00
Przemyslaw Gomulka
28b5c7ce78
Do not set up NodeAndClusterIdStateListener in test (#38110)
When extending ESIntegTestCase are run on the same jvm, the static field in
NodeAndClusterIdConverter will throw an AlreadySet exceptions.
overriding the configuration method from Node.configureNodeAndClusterIdStateListener in the MockNode will prevent the listener registration from happening
relates #32850
2019-01-31 18:59:40 +01:00
Nhat Nguyen
8e95780f98
Soft-deletes policy should always fetch latest leases (#37940)
If a new retention lease is added while a primary's soft-deletes policy
is locked for peer-recovery, that lease won't be baked into the Lucene
commit.

Relates #37165
Relates #37375
2019-01-31 12:02:57 -05:00
Henning Andersen
68ed72b923
Handle scheduler exceptions (#38014)
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of #28667, #36137 and also fixes #37708.
2019-01-31 17:51:45 +01:00
David Turner
7f738e8541
Minor logging improvements (#38084)
Fixes some log messages that caused some minor confusion when digging through a
log generated by a failing test.
2019-01-31 16:41:04 +00:00
Tal Levy
9923f0fe6a
fix a few versionAdded values in ElasticsearchExceptions (#37877)
TooManyBucketsException was introduced in v6.2
and SnapshotInProgressException was introduced in v6.7
2019-01-31 08:28:20 -08:00
Tanguy Leroux
7a597cad0d
Reenable BWC tests after backport of #37899 (#38093)
This commit adapts the version used in StartedShardEntry serialization
 after the backport of  #37899 and reenables bwc tests.

Related to #37899
Related to #38074
2019-01-31 16:53:28 +01:00
Henning Andersen
7487be3d3c Un-mute NoMasterNodeIT.testNoMasterActionsWriteMasterBlock 2019-01-31 15:31:01 +01:00
Jason Tedor
a9b12b38f0
Push primary term to replication tracker (#38044)
This commit pushes the primary term into the replication tracker. This
is a precursor to using the primary term to resolving ordering problems
for retention leases. Namely, it can be that out-of-order retention
lease sync requests arrive on a replica. To resolve this, we need a
tuple of (primary term, version). For this to be, the primary term needs
to be accessible in the replication tracker. As the primary term is part
of the replication group anyway, this change conceptually makes sense.
2019-01-31 09:19:49 -05:00
Luca Cavanna
622fb7883b
Introduce ability to minimize round-trips in CCS (#37828)
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.

This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.

Relates to #32125
2019-01-31 15:12:14 +01:00
Armin Braun
ae9f4df361
Don't Assert Ack on when Publish Timeout is 0 in Test (#38077)
* Publish timeout is set to `0` so out of order processing of states on the node can lead to a `false` ack response
  * See #30672
* Closes #36813
2019-01-31 14:35:11 +01:00
Alexander Reelsen
9f026bb8ad
Reduce object creation in Rounding class (#38061)
This reduces objects creations in the rounding class (used by aggs) by properly
creating the objects only once. Furthermore a few unneeded ZonedDateTime objects
were created in order to create other objects out of them. This was
changed as well.

Running the benchmarks shows a much faster performance for all of the
java time based Rounding classes.
2019-01-31 14:18:28 +01:00
Adrien Grand
a536fa7755
Treat put-mapping calls with _doc as a top-level key as typed calls. (#38032)
Currently the put-mapping API assumes that because the type name is `_doc` then
it is dealing with a typeless put-mapping call. Yet we still allow running the
put-mapping API in a typed fashion with `_doc` as a type name. The current logic
triggers surprising errors when doing a typed put-mapping call with `_doc` as a
type name on an index that has a type already.

This is a bit of a corner-case, but is more important on 6.x due to the fact
that using the index API with `_doc` as a type name triggers typed calls to the
put-mapping API with `_doc` as a type name.
2019-01-31 13:57:42 +01:00
David Turner
eadcb5f0f8
Fix size of rolling-upgrade bootstrap config (#38031)
Zen2 nodes will bootstrap themselves once they believe there to be no remaining
Zen1 master-eligible nodes in the cluster, as long as minimum_master_nodes is
satisfied.

Today the bootstrap configuration comprises just the ids of the known
master-eligible nodes, and this might be too small to be safe. For instance, if
there are 5 master-eligible nodes (so that minimum_master_nodes is 3) then the
bootstrap configuration could comprise just 3 nodes, of which 2 form a quorum,
and this does not intersect other quorums that might arise, leading to a
split-brain.

This commit fixes this by expanding the bootstrap configuration so that its
quorums satisfy minimum_master_nodes, by adding some of the IDs of the other
master-eligible nodes in the last-published cluster state.
2019-01-31 08:00:11 +00:00
Alexander Reelsen
b94acb608b
Speed up converting of temporal accessor to zoned date time (#37915)
The existing implementation was slow due to exceptions being thrown if
an accessor did not have a time zone. This implementation queries for
having a timezone, local time and local date and also checks for an
instant preventing to throw an exception and thus speeding up the conversion.

This removes the existing method and create a new one named
DateFormatters.from(TemporalAccessor accessor) to resemble the naming of
the java time ones.

Before this change an epoch millis parser using the toZonedDateTime
method took approximately 50x longer.

Relates #37826
2019-01-31 08:55:40 +01:00
Alexander Reelsen
160d1bd4dd
Work around JDK8 timezone bug in tests (#37968)
The timezone GMT0 cannot be properly parsed on java8.
The randomZone() method now excludes GMT0, if java8 is used.

Closes #37814
2019-01-31 08:52:35 +01:00
Nhat Nguyen
f5398d6511 Mute testRetentionLeasesSyncOnExpiration
Tracked at #37963
2019-01-31 00:57:27 -05:00
Jason Tedor
a6a534f1f0
Reenable BWC testing after retention lease stats (#38062)
This commit adjusts the BWC version on retention leases in stats, so
with this we also reenable BWC testing.
2019-01-30 20:34:27 -05:00
Tim Brooks
b88bdfe958
Add dispatching to HandledTransportAction (#38050)
This commit allows implementors of the `HandledTransportAction` to
specify what thread the action should be executed on. The motivation for
this commit is that certain CCR requests should be performed on the
generic threadpool.
2019-01-30 15:40:49 -07:00
Michael Basnight
945ad05d54
Update verify repository to allow unknown fields (#37619)
The subparser in verify repository allows for unknown fields. This
commit sets the value to true for the parser and modifies the test such
that it accurately tests it.

Relates #36938
2019-01-30 14:31:16 -06:00
David Turner
81c443c9de
Deprecate minimum_master_nodes (#37868)
Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in
tests, but for 7.x nodes this setting is not required as it has no effect.
This commit removes this setting so that nodes are started with more realistic
configurations, and deprecates it.
2019-01-30 20:09:15 +00:00
Armin Braun
a070b8acc0
Extract TransportRequestDeduplication from ShardStateAction (#37870)
* Extracted the logic for master request duplication so it can be reused by the snapshotting logic
* Removed custom listener used by `ShardStateAction` to not leak these into future users of this class
* Changed semantics slightly to get rid of redundant instantiations of the composite listener
* Relates #37686
2019-01-30 19:21:09 +01:00
Jason Tedor
6500b0cbd7
Expose retention leases in shard stats (#37991)
This commit exposes retention leases via shard-level stats.
2019-01-30 13:20:40 -05:00
Jason Tedor
c468b2f7ca
Make primary terms fields private in index shard (#38036)
This commit encapsulates the primary terms fields in index shard. This
is a precursor to pushing the operation primary term down to the
replication tracker.
2019-01-30 12:56:58 -05:00
Nhat Nguyen
ed460c2815 Log flush_stats and commit_stats in testMaybeFlush
This test failed a few times over the last several months. It seems that
we triggered a flush, but CI was too slow to finish it in several
seconds. I added the flush stats and commit stats and unmuted this test.
We should have a good clue if this test fails again.

Relates #37896
2019-01-30 12:54:31 -05:00
Christoph Büscher
ecbaa38864
Remove deprecated Plugin#onModule extension points (#37866)
Removes some guice index level extension point marked as @Deprecated since at
least 6.0. They served as a signpost for plugin authors upgrading from 2.x but
this shouldn't be relevant in 7.0 anymore.
2019-01-30 17:17:54 +01:00
Igor Motov
23805fa41a
Geo: Fix Empty Geometry Collection Handling (#37978)
Fixes handling empty geometry collection and re-enables
testParseGeometryCollection test.

Fixes #37894
2019-01-30 09:20:30 -05:00
Luca Cavanna
b91d587275
Move SearchHit and SearchHits to Writeable (#37931)
This allowed to make SearchHits immutable, while quite a few fields in
SearchHit have to stay mutable unfortunately.

Relates to #34389
2019-01-30 12:05:54 +01:00
Jason Tedor
ba285a56a7
Fix limit on retaining sequence number (#37992)
We only assign non-negative sequence numbers to operations, so the lower
limit on retaining sequence numbers should be that it is non-negative
only.
2019-01-30 05:25:17 -05:00
Alexander Reelsen
9ec4abc31e
Ensure date parsing BWC compatibility (#37929)
In order to retain BWC this changes the java date formatters to be able to
parse nanoseconds resolution, even if only milliseconds are supported.
This used to work on joda time as well so that a user could store a date
like `2018-10-03T14:42:44.613469+0000` and then just loose the precision
on anything lower than millisecond level.
2019-01-30 10:47:12 +01:00
Adrien Grand
c8af0f4bfa
Use mappings to format doc-value fields by default. (#30831)
Doc-value fields now return a value that is based on the mappings rather than
the script implementation by default.

This deprecates the special `use_field_mapping` docvalue format which was added
in #29639 only to ease the transition to 7.x and it is not necessary anymore in
7.0.
2019-01-30 10:31:51 +01:00
Adrien Grand
b63b50b945
Give precedence to index creation when mixing typed templates with typeless index creation and vice-versa. (#37871)
Currently if you mix typed templates and typeless index creation or typeless
templates and typed index creation then you will end up with an error because
Elasticsearch tries to create an index that has multiple types: `_doc` and
the explicit type name that you used.

This commit proposes to give precedence to the index creation call so that
the type from the template will be ignored if the index creation call is
typeless while the template is typed, and the type from the index creation
call will be used if there is a typeless template.

This is consistent with the fact that index creation already "wins" if a field
is defined differently in the index creation call and in a template: the
definition from the index creation call is used in such cases.

Closes #37773
2019-01-30 10:28:24 +01:00
Jim Ferenczi
2732bb5cf3
Fix fetch source option in expand search phase (#37908)
This change fixes the copy of the fetch source option into the
expand search request that is used to retrieve the documents of each
collapsed group.

Closes #23829
2019-01-30 08:46:14 +01:00
Jim Ferenczi
5dcc805dc9
Restore a noop _all metadata field for 6x indices (#37808)
This commit restores a noop version of the AllFieldMapper that is instanciated only
for indices created in 6x. We need this metadata field mapper to be present in this version
in order to allow the upgrade of indices that explicitly disable _all (enabled: false).
The mapping of these indices contains a reference to the _all field that we cannot remove
in 7 so we'll need to keep this metadata mapper in 7x. Since indices created in 6x will not
be compatible with 8, we'll remove this noop mapper in the next major version.

Closes #37429
2019-01-30 08:45:50 +01:00
Marios Trivyzas
f5b9b4d89c Add version 6.6.1 (#37975) 2019-01-30 15:33:01 +11:00
markharwood
b889221f75
Types removal - deprecate include_type_name with index templates (#37484)
Added deprecation warnings for use of include_type_name in put/get index templates.
HLRC changes:
GetIndexTemplateRequest has a new client-side class which is a copy of server's GetIndexTemplateResponse but modified to be typeless.
PutIndexTemplateRequest has a new client-side counterpart which doesn't use types in the mappings
Relates to #35190
2019-01-29 20:52:41 +00:00
jimczi
193017672a Handle completion suggestion without contexts
This change fixes the handling of completion suggestion without contexts.

Relates #36996
2019-01-29 20:31:46 +01:00
Tim Brooks
00ace369af
Use CcrRepository to init follower index (#35719)
This commit modifies the put follow index action to use a
CcrRepository when creating a follower index. It routes 
the logic through the snapshot/restore process. A 
wait_for_active_shards parameter can be used to configure
how long to wait before returning the response.
2019-01-29 11:47:29 -07:00
Albert Zaharovits
d05a4b9d14
Get Aliases with wildcard exclusion expression (#34230)
This commit adds the code in the HTTP layer that will parse exclusion wildcard
expressions.
The existing code issues 404s for wildcards as well as explicit indices.
But, in general, in an expression with exclude wildcards (-...*) following other
include wildcards, there is no way to tell if the include wildcard produced no
results or they were subsequently excluded.
Therefore, the proposed change is breaking the behavior of 404s for
wildcards. Specifically, no 404s will be returned for wildcards, even
if they are not followed by exclude wildcards or the exclude wildcards
could not possibly exclude what has previously been included.
Only explicitly requested aliases will be called out as missing.
2019-01-29 18:56:20 +02:00
Boaz Leskes
218df3009a
Move update and delete by query to use seq# for optimistic concurrency control (#37857)
The delete and update by query APIs both offer protection against overriding concurrent user changes to the documents they touch. They currently are using internal versioning. This PR changes that to rely on sequences numbers and primary terms.

Relates #37639 
Relates #36148 
Relates #10708
2019-01-29 10:23:05 -05:00
Yannick Welsch
3c9f7031b9
Enforce cluster UUIDs (#37775)
This commit adds join validation around cluster UUIDs, preventing a node to join a cluster if it was
previously part of another cluster. The commit introduces a new flag to the cluster state,
clusterUUIDCommitted, which denotes whether the node has locked into a cluster with the given
uuid. When a cluster is committed, this flag will turn to true, and subsequent cluster state updates
will keep the information about committal. Note that coordinating-only nodes are still free to switch
clusters at will (after restart), as they don't carry any persistent state.
2019-01-29 15:41:05 +01:00
Luca Cavanna
09a11a34ef
Remove clusterAlias instance member from QueryShardContext (#37923)
The clusterAlias member is only used in the copy constructor, to be able
to reconstruct the fully qualified index. It is also possible to remove
the instance member and add a private constructor that accepts the already built Index object which contains the cluster alias.
2019-01-29 15:31:49 +01:00
Boaz Leskes
65a9b61a91
Add Seq# based optimistic concurrency control to UpdateRequest (#37872)
The update request has a lesser known support for a one off update of a known document version. This PR adds an a seq# based alternative to power these operations.

Relates #36148 
Relates #10708
2019-01-29 09:18:05 -05:00
Tanguy Leroux
5d1964bcbf
Ignore shard started requests when primary term does not match (#37899)
This commit changes the StartedShardEntry so that it also contains the 
primary term of the shard to start. This way the master node can also 
checks that the primary term from the start request is equal to the current 
shard's primary term in the cluster state, and it can ignore any shard 
started request that would concerns a previous instance of the shard that 
would have been allocated to the same node.

Such situation are likely to happen with frozen (or restored) indices and 
the replication of closed indices, because with replicated closed indices 
the shards will be initialized again after the index is closed and can 
potentially be re initialized again if the index is reopened as a frozen 
index. In such cases the lifecycle of the shards would be something like:
* shard is STARTED
* index is closed
* shards is INITIALIZING (index state is CLOSED, primary term is X)
* index is reopened
* shards are INITIALIZING again (index state is OPENED, potentially frozen, 
primary term is X+1)

Adding the primary term to the shard started request will allow to discard 
potential StartedShardEntry requests received by the master node if the 
request concerns the shard with primary term X because it has been 
moved/reinitialized in the meanwhile under the primary term X+1.

Relates to #33888
2019-01-29 15:09:40 +01:00