Commit Graph

2433 Commits

Author SHA1 Message Date
Jason Tedor a9b12b38f0
Push primary term to replication tracker (#38044)
This commit pushes the primary term into the replication tracker. This
is a precursor to using the primary term to resolving ordering problems
for retention leases. Namely, it can be that out-of-order retention
lease sync requests arrive on a replica. To resolve this, we need a
tuple of (primary term, version). For this to be, the primary term needs
to be accessible in the replication tracker. As the primary term is part
of the replication group anyway, this change conceptually makes sense.
2019-01-31 09:19:49 -05:00
Luca Cavanna 622fb7883b
Introduce ability to minimize round-trips in CCS (#37828)
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.

This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.

Relates to #32125
2019-01-31 15:12:14 +01:00
Armin Braun ae9f4df361
Don't Assert Ack on when Publish Timeout is 0 in Test (#38077)
* Publish timeout is set to `0` so out of order processing of states on the node can lead to a `false` ack response
  * See #30672
* Closes #36813
2019-01-31 14:35:11 +01:00
Alexander Reelsen 9f026bb8ad
Reduce object creation in Rounding class (#38061)
This reduces objects creations in the rounding class (used by aggs) by properly
creating the objects only once. Furthermore a few unneeded ZonedDateTime objects
were created in order to create other objects out of them. This was
changed as well.

Running the benchmarks shows a much faster performance for all of the
java time based Rounding classes.
2019-01-31 14:18:28 +01:00
Adrien Grand a536fa7755
Treat put-mapping calls with `_doc` as a top-level key as typed calls. (#38032)
Currently the put-mapping API assumes that because the type name is `_doc` then
it is dealing with a typeless put-mapping call. Yet we still allow running the
put-mapping API in a typed fashion with `_doc` as a type name. The current logic
triggers surprising errors when doing a typed put-mapping call with `_doc` as a
type name on an index that has a type already.

This is a bit of a corner-case, but is more important on 6.x due to the fact
that using the index API with `_doc` as a type name triggers typed calls to the
put-mapping API with `_doc` as a type name.
2019-01-31 13:57:42 +01:00
David Turner eadcb5f0f8
Fix size of rolling-upgrade bootstrap config (#38031)
Zen2 nodes will bootstrap themselves once they believe there to be no remaining
Zen1 master-eligible nodes in the cluster, as long as minimum_master_nodes is
satisfied.

Today the bootstrap configuration comprises just the ids of the known
master-eligible nodes, and this might be too small to be safe. For instance, if
there are 5 master-eligible nodes (so that minimum_master_nodes is 3) then the
bootstrap configuration could comprise just 3 nodes, of which 2 form a quorum,
and this does not intersect other quorums that might arise, leading to a
split-brain.

This commit fixes this by expanding the bootstrap configuration so that its
quorums satisfy minimum_master_nodes, by adding some of the IDs of the other
master-eligible nodes in the last-published cluster state.
2019-01-31 08:00:11 +00:00
Alexander Reelsen b94acb608b
Speed up converting of temporal accessor to zoned date time (#37915)
The existing implementation was slow due to exceptions being thrown if
an accessor did not have a time zone. This implementation queries for
having a timezone, local time and local date and also checks for an
instant preventing to throw an exception and thus speeding up the conversion.

This removes the existing method and create a new one named
DateFormatters.from(TemporalAccessor accessor) to resemble the naming of
the java time ones.

Before this change an epoch millis parser using the toZonedDateTime
method took approximately 50x longer.

Relates #37826
2019-01-31 08:55:40 +01:00
Alexander Reelsen 160d1bd4dd
Work around JDK8 timezone bug in tests (#37968)
The timezone GMT0 cannot be properly parsed on java8.
The randomZone() method now excludes GMT0, if java8 is used.

Closes #37814
2019-01-31 08:52:35 +01:00
Nhat Nguyen f5398d6511 Mute testRetentionLeasesSyncOnExpiration
Tracked at #37963
2019-01-31 00:57:27 -05:00
Jason Tedor a6a534f1f0
Reenable BWC testing after retention lease stats (#38062)
This commit adjusts the BWC version on retention leases in stats, so
with this we also reenable BWC testing.
2019-01-30 20:34:27 -05:00
Tim Brooks b88bdfe958
Add dispatching to `HandledTransportAction` (#38050)
This commit allows implementors of the `HandledTransportAction` to
specify what thread the action should be executed on. The motivation for
this commit is that certain CCR requests should be performed on the
generic threadpool.
2019-01-30 15:40:49 -07:00
Michael Basnight 945ad05d54
Update verify repository to allow unknown fields (#37619)
The subparser in verify repository allows for unknown fields. This
commit sets the value to true for the parser and modifies the test such
that it accurately tests it.

Relates #36938
2019-01-30 14:31:16 -06:00
David Turner 81c443c9de
Deprecate minimum_master_nodes (#37868)
Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in
tests, but for 7.x nodes this setting is not required as it has no effect.
This commit removes this setting so that nodes are started with more realistic
configurations, and deprecates it.
2019-01-30 20:09:15 +00:00
Armin Braun a070b8acc0
Extract TransportRequestDeduplication from ShardStateAction (#37870)
* Extracted the logic for master request duplication so it can be reused by the snapshotting logic
* Removed custom listener used by `ShardStateAction` to not leak these into future users of this class
* Changed semantics slightly to get rid of redundant instantiations of the composite listener
* Relates #37686
2019-01-30 19:21:09 +01:00
Jason Tedor 6500b0cbd7
Expose retention leases in shard stats (#37991)
This commit exposes retention leases via shard-level stats.
2019-01-30 13:20:40 -05:00
Jason Tedor c468b2f7ca
Make primary terms fields private in index shard (#38036)
This commit encapsulates the primary terms fields in index shard. This
is a precursor to pushing the operation primary term down to the
replication tracker.
2019-01-30 12:56:58 -05:00
Nhat Nguyen ed460c2815 Log flush_stats and commit_stats in testMaybeFlush
This test failed a few times over the last several months. It seems that
we triggered a flush, but CI was too slow to finish it in several
seconds. I added the flush stats and commit stats and unmuted this test.
We should have a good clue if this test fails again.

Relates #37896
2019-01-30 12:54:31 -05:00
Christoph Büscher ecbaa38864
Remove deprecated Plugin#onModule extension points (#37866)
Removes some guice index level extension point marked as @Deprecated since at
least 6.0. They served as a signpost for plugin authors upgrading from 2.x but
this shouldn't be relevant in 7.0 anymore.
2019-01-30 17:17:54 +01:00
Igor Motov 23805fa41a
Geo: Fix Empty Geometry Collection Handling (#37978)
Fixes handling empty geometry collection and re-enables
testParseGeometryCollection test.

Fixes #37894
2019-01-30 09:20:30 -05:00
Luca Cavanna b91d587275
Move SearchHit and SearchHits to Writeable (#37931)
This allowed to make SearchHits immutable, while quite a few fields in
SearchHit have to stay mutable unfortunately.

Relates to #34389
2019-01-30 12:05:54 +01:00
Jason Tedor ba285a56a7
Fix limit on retaining sequence number (#37992)
We only assign non-negative sequence numbers to operations, so the lower
limit on retaining sequence numbers should be that it is non-negative
only.
2019-01-30 05:25:17 -05:00
Alexander Reelsen 9ec4abc31e
Ensure date parsing BWC compatibility (#37929)
In order to retain BWC this changes the java date formatters to be able to
parse nanoseconds resolution, even if only milliseconds are supported.
This used to work on joda time as well so that a user could store a date
like `2018-10-03T14:42:44.613469+0000` and then just loose the precision
on anything lower than millisecond level.
2019-01-30 10:47:12 +01:00
Adrien Grand c8af0f4bfa
Use mappings to format doc-value fields by default. (#30831)
Doc-value fields now return a value that is based on the mappings rather than
the script implementation by default.

This deprecates the special `use_field_mapping` docvalue format which was added
in #29639 only to ease the transition to 7.x and it is not necessary anymore in
7.0.
2019-01-30 10:31:51 +01:00
Adrien Grand b63b50b945
Give precedence to index creation when mixing typed templates with typeless index creation and vice-versa. (#37871)
Currently if you mix typed templates and typeless index creation or typeless
templates and typed index creation then you will end up with an error because
Elasticsearch tries to create an index that has multiple types: `_doc` and
the explicit type name that you used.

This commit proposes to give precedence to the index creation call so that
the type from the template will be ignored if the index creation call is
typeless while the template is typed, and the type from the index creation
call will be used if there is a typeless template.

This is consistent with the fact that index creation already "wins" if a field
is defined differently in the index creation call and in a template: the
definition from the index creation call is used in such cases.

Closes #37773
2019-01-30 10:28:24 +01:00
Jim Ferenczi 2732bb5cf3
Fix fetch source option in expand search phase (#37908)
This change fixes the copy of the fetch source option into the
expand search request that is used to retrieve the documents of each
collapsed group.

Closes #23829
2019-01-30 08:46:14 +01:00
Jim Ferenczi 5dcc805dc9
Restore a noop _all metadata field for 6x indices (#37808)
This commit restores a noop version of the AllFieldMapper that is instanciated only
for indices created in 6x. We need this metadata field mapper to be present in this version
in order to allow the upgrade of indices that explicitly disable _all (enabled: false).
The mapping of these indices contains a reference to the _all field that we cannot remove
in 7 so we'll need to keep this metadata mapper in 7x. Since indices created in 6x will not
be compatible with 8, we'll remove this noop mapper in the next major version.

Closes #37429
2019-01-30 08:45:50 +01:00
Marios Trivyzas f5b9b4d89c Add version 6.6.1 (#37975) 2019-01-30 15:33:01 +11:00
markharwood b889221f75
Types removal - deprecate include_type_name with index templates (#37484)
Added deprecation warnings for use of include_type_name in put/get index templates.
HLRC changes:
GetIndexTemplateRequest has a new client-side class which is a copy of server's GetIndexTemplateResponse but modified to be typeless.
PutIndexTemplateRequest has a new client-side counterpart which doesn't use types in the mappings
Relates to #35190
2019-01-29 20:52:41 +00:00
jimczi 193017672a Handle completion suggestion without contexts
This change fixes the handling of completion suggestion without contexts.

Relates #36996
2019-01-29 20:31:46 +01:00
Tim Brooks 00ace369af
Use `CcrRepository` to init follower index (#35719)
This commit modifies the put follow index action to use a
CcrRepository when creating a follower index. It routes 
the logic through the snapshot/restore process. A 
wait_for_active_shards parameter can be used to configure
how long to wait before returning the response.
2019-01-29 11:47:29 -07:00
Albert Zaharovits d05a4b9d14
Get Aliases with wildcard exclusion expression (#34230)
This commit adds the code in the HTTP layer that will parse exclusion wildcard
expressions.
The existing code issues 404s for wildcards as well as explicit indices.
But, in general, in an expression with exclude wildcards (-...*) following other
include wildcards, there is no way to tell if the include wildcard produced no
results or they were subsequently excluded.
Therefore, the proposed change is breaking the behavior of 404s for
wildcards. Specifically, no 404s will be returned for wildcards, even
if they are not followed by exclude wildcards or the exclude wildcards
could not possibly exclude what has previously been included.
Only explicitly requested aliases will be called out as missing.
2019-01-29 18:56:20 +02:00
Boaz Leskes 218df3009a
Move update and delete by query to use seq# for optimistic concurrency control (#37857)
The delete and update by query APIs both offer protection against overriding concurrent user changes to the documents they touch. They currently are using internal versioning. This PR changes that to rely on sequences numbers and primary terms.

Relates #37639 
Relates #36148 
Relates #10708
2019-01-29 10:23:05 -05:00
Yannick Welsch 3c9f7031b9
Enforce cluster UUIDs (#37775)
This commit adds join validation around cluster UUIDs, preventing a node to join a cluster if it was
previously part of another cluster. The commit introduces a new flag to the cluster state,
clusterUUIDCommitted, which denotes whether the node has locked into a cluster with the given
uuid. When a cluster is committed, this flag will turn to true, and subsequent cluster state updates
will keep the information about committal. Note that coordinating-only nodes are still free to switch
clusters at will (after restart), as they don't carry any persistent state.
2019-01-29 15:41:05 +01:00
Luca Cavanna 09a11a34ef
Remove clusterAlias instance member from QueryShardContext (#37923)
The clusterAlias member is only used in the copy constructor, to be able
to reconstruct the fully qualified index. It is also possible to remove
the instance member and add a private constructor that accepts the already built Index object which contains the cluster alias.
2019-01-29 15:31:49 +01:00
Boaz Leskes 65a9b61a91
Add Seq# based optimistic concurrency control to UpdateRequest (#37872)
The update request has a lesser known support for a one off update of a known document version. This PR adds an a seq# based alternative to power these operations.

Relates #36148 
Relates #10708
2019-01-29 09:18:05 -05:00
Tanguy Leroux 5d1964bcbf
Ignore shard started requests when primary term does not match (#37899)
This commit changes the StartedShardEntry so that it also contains the 
primary term of the shard to start. This way the master node can also 
checks that the primary term from the start request is equal to the current 
shard's primary term in the cluster state, and it can ignore any shard 
started request that would concerns a previous instance of the shard that 
would have been allocated to the same node.

Such situation are likely to happen with frozen (or restored) indices and 
the replication of closed indices, because with replicated closed indices 
the shards will be initialized again after the index is closed and can 
potentially be re initialized again if the index is reopened as a frozen 
index. In such cases the lifecycle of the shards would be something like:
* shard is STARTED
* index is closed
* shards is INITIALIZING (index state is CLOSED, primary term is X)
* index is reopened
* shards are INITIALIZING again (index state is OPENED, potentially frozen, 
primary term is X+1)

Adding the primary term to the shard started request will allow to discard 
potential StartedShardEntry requests received by the master node if the 
request concerns the shard with primary term X because it has been 
moved/reinitialized in the meanwhile under the primary term X+1.

Relates to #33888
2019-01-29 15:09:40 +01:00
Luca Cavanna 2325fb9cb3
Remove test only SearchShardTarget constructor (#37912)
Remove SearchShardTarget test only constructor and replace all the usages with calls to the other constructor that accepts a ShardId.
2019-01-29 14:58:11 +01:00
Luca Cavanna 42eec55837
Replace failure.get().addSuppressed with failure.accumulateAndGet() (#37649)
Also add a test for concurrent incoming failures
2019-01-29 14:57:33 +01:00
Luca Cavanna a6d4838a67
Clean up allowPartialSearchResults serialization (#37911)
When serializing allowPartialSearchResults to the shards through ShardSearchTransportRequest, we use an optional boolean field, though
the corresponding instance member is declared `boolean` which can never
be null. We also have an assert to verify that the incoming search
request provides a non-null value for the flag, and a comment explaining
that null should be considered a bug.

This commit makes the allowPartialSearchResults method in
ShardSearchRequest return a `boolean` rather than a `Boolean` and
changes the serialization from optional to non optional, in a bw comp manner.
2019-01-29 14:56:22 +01:00
Tanguy Leroux 460f10ce60
Close Index API should force a flush if a sync is needed (#37961)
This commit changes the TransportVerifyShardBeforeCloseAction so that it issues a 
forced flush, forcing the translog and the Lucene commit to contain the same max seq 
number and global checkpoint in the case the Translog contains operations that were 
not written in the IndexWriter (like a Delete that touches a non existing doc). This way 
the assertion added in #37426 won't trip.

Related to #33888
2019-01-29 13:15:58 +01:00
Yannick Welsch 504a89feaf
Step down as master when configured out of voting configuration (#37802)
Abdicates to another master-eligible node once the active master is reconfigured out of the voting
configuration, for example through the use of voting configuration exclusions.

Follow-up to #37712
2019-01-29 12:43:04 +01:00
Yannick Welsch 827c4f6567 Make Version.java aware of 6.x Lucene upgrade
Relates to #37913
2019-01-29 10:44:01 +01:00
Przemyslaw Gomulka 891320f5ac
Elasticsearch support to JSON logging (#36833)
In order to support JSON log format, a custom pattern layout was used and its configuration is enclosed in ESJsonLayout. Users are free to use their own patterns, but if smooth Beats integration is needed, they should use ESJsonLayout. EvilLoggerTests are left intact to make sure user's custom log patterns work fine.

To populate additional fields node.id and cluster.uuid which are not available at start time, 
a cluster state update will have to be received and the values passed to log4j pattern converter.
A ClusterStateObserver.Listener is used to receive only one ClusteStateUpdate. Once update is received the nodeId and clusterUUid are set in a static field in a NodeAndClusterIdConverter. 

Following fields are expected in JSON log lines: type, tiemstamp, level, component, cluster.name, node.name, node.id, cluster.uuid, message, stacktrace
see ESJsonLayout.java for more details and field descriptions

Docker log4j2 configuration is now almost the same as the one use for ES binary. 
The only difference is that docker is using console appenders, whereas ES is using file appenders.

relates: #32850
2019-01-29 07:20:09 +01:00
Nhat Nguyen 9ceb218d85 Adjust bwc version for put mapping requests
Relates #37675
2019-01-28 10:57:11 -05:00
Armin Braun 0d109396fa
Increase Timeout in #testSnapshotCanceled (#37890)
* The test failure reported in the issue looks like a mere timeout. Logging suggestst hat the snapshot completes/aborts correctly but the busy
loop polling the snapshot state times out too early.
* Closes #37888
2019-01-28 14:13:02 +01:00
Luca Cavanna a9adc16922 Mute failing SearchQueryIT test
Relates to #37814
2019-01-28 13:41:13 +01:00
Alpar Torok 64b98db973
Add an alias for :server:integTest so it runs as part of internalClusterTest (#37910) 2019-01-28 14:26:22 +02:00
Jason Tedor 194cdfe208
Sync retention leases on expiration (#37902)
This commit introduces a sync of retention leases when a retention lease
expires. As expiration of retention leases is lazy, their expiration is
managed only when getting the current retention leases from the
replication tracker. At this point, we callback to our full retention
lease sync to sync and flush these on all shard copies. With this
change, replicas do not locally manage expiration of retention leases;
instead, that is done only on the primary.
2019-01-28 07:11:51 -05:00
Tanguy Leroux 758eb9d451 Track accurate total hits in CloseIndexIT
The test was not using the TRACK_TOTAL_HITS_ACCURATE and thus
encountered a different issue tracked in #37907. In the meanwhile
we can adapt the test to not fail anymore.

Closes #37897
2019-01-28 11:30:20 +01:00
Martijn van Groningen 4e1a779773
Prepare ShardFollowNodeTask to bootstrap when it fall behind leader shard (#37562)
* Changed `LuceneSnapshot` to throw an `OperationsMissingException` if the requested ops are missing.
* Changed the shard changes api to handle the `OperationsMissingException` and wrap the exception into `ResourceNotFound` exception and include metadata to indicate the requested range can no longer be retrieved.
* Changed `ShardFollowNodeTask` to handle this `ResourceNotFound` exception with the included metdata header.

Relates to #35975
2019-01-28 09:30:04 +01:00