Commit Graph

2400 Commits

Author SHA1 Message Date
Luca Cavanna 09a11a34ef
Remove clusterAlias instance member from QueryShardContext (#37923)
The clusterAlias member is only used in the copy constructor, to be able
to reconstruct the fully qualified index. It is also possible to remove
the instance member and add a private constructor that accepts the already built Index object which contains the cluster alias.
2019-01-29 15:31:49 +01:00
Boaz Leskes 65a9b61a91
Add Seq# based optimistic concurrency control to UpdateRequest (#37872)
The update request has a lesser known support for a one off update of a known document version. This PR adds an a seq# based alternative to power these operations.

Relates #36148 
Relates #10708
2019-01-29 09:18:05 -05:00
Tanguy Leroux 5d1964bcbf
Ignore shard started requests when primary term does not match (#37899)
This commit changes the StartedShardEntry so that it also contains the 
primary term of the shard to start. This way the master node can also 
checks that the primary term from the start request is equal to the current 
shard's primary term in the cluster state, and it can ignore any shard 
started request that would concerns a previous instance of the shard that 
would have been allocated to the same node.

Such situation are likely to happen with frozen (or restored) indices and 
the replication of closed indices, because with replicated closed indices 
the shards will be initialized again after the index is closed and can 
potentially be re initialized again if the index is reopened as a frozen 
index. In such cases the lifecycle of the shards would be something like:
* shard is STARTED
* index is closed
* shards is INITIALIZING (index state is CLOSED, primary term is X)
* index is reopened
* shards are INITIALIZING again (index state is OPENED, potentially frozen, 
primary term is X+1)

Adding the primary term to the shard started request will allow to discard 
potential StartedShardEntry requests received by the master node if the 
request concerns the shard with primary term X because it has been 
moved/reinitialized in the meanwhile under the primary term X+1.

Relates to #33888
2019-01-29 15:09:40 +01:00
Luca Cavanna 2325fb9cb3
Remove test only SearchShardTarget constructor (#37912)
Remove SearchShardTarget test only constructor and replace all the usages with calls to the other constructor that accepts a ShardId.
2019-01-29 14:58:11 +01:00
Luca Cavanna 42eec55837
Replace failure.get().addSuppressed with failure.accumulateAndGet() (#37649)
Also add a test for concurrent incoming failures
2019-01-29 14:57:33 +01:00
Luca Cavanna a6d4838a67
Clean up allowPartialSearchResults serialization (#37911)
When serializing allowPartialSearchResults to the shards through ShardSearchTransportRequest, we use an optional boolean field, though
the corresponding instance member is declared `boolean` which can never
be null. We also have an assert to verify that the incoming search
request provides a non-null value for the flag, and a comment explaining
that null should be considered a bug.

This commit makes the allowPartialSearchResults method in
ShardSearchRequest return a `boolean` rather than a `Boolean` and
changes the serialization from optional to non optional, in a bw comp manner.
2019-01-29 14:56:22 +01:00
Tanguy Leroux 460f10ce60
Close Index API should force a flush if a sync is needed (#37961)
This commit changes the TransportVerifyShardBeforeCloseAction so that it issues a 
forced flush, forcing the translog and the Lucene commit to contain the same max seq 
number and global checkpoint in the case the Translog contains operations that were 
not written in the IndexWriter (like a Delete that touches a non existing doc). This way 
the assertion added in #37426 won't trip.

Related to #33888
2019-01-29 13:15:58 +01:00
Yannick Welsch 504a89feaf
Step down as master when configured out of voting configuration (#37802)
Abdicates to another master-eligible node once the active master is reconfigured out of the voting
configuration, for example through the use of voting configuration exclusions.

Follow-up to #37712
2019-01-29 12:43:04 +01:00
Yannick Welsch 827c4f6567 Make Version.java aware of 6.x Lucene upgrade
Relates to #37913
2019-01-29 10:44:01 +01:00
Przemyslaw Gomulka 891320f5ac
Elasticsearch support to JSON logging (#36833)
In order to support JSON log format, a custom pattern layout was used and its configuration is enclosed in ESJsonLayout. Users are free to use their own patterns, but if smooth Beats integration is needed, they should use ESJsonLayout. EvilLoggerTests are left intact to make sure user's custom log patterns work fine.

To populate additional fields node.id and cluster.uuid which are not available at start time, 
a cluster state update will have to be received and the values passed to log4j pattern converter.
A ClusterStateObserver.Listener is used to receive only one ClusteStateUpdate. Once update is received the nodeId and clusterUUid are set in a static field in a NodeAndClusterIdConverter. 

Following fields are expected in JSON log lines: type, tiemstamp, level, component, cluster.name, node.name, node.id, cluster.uuid, message, stacktrace
see ESJsonLayout.java for more details and field descriptions

Docker log4j2 configuration is now almost the same as the one use for ES binary. 
The only difference is that docker is using console appenders, whereas ES is using file appenders.

relates: #32850
2019-01-29 07:20:09 +01:00
Nhat Nguyen 9ceb218d85 Adjust bwc version for put mapping requests
Relates #37675
2019-01-28 10:57:11 -05:00
Armin Braun 0d109396fa
Increase Timeout in #testSnapshotCanceled (#37890)
* The test failure reported in the issue looks like a mere timeout. Logging suggestst hat the snapshot completes/aborts correctly but the busy
loop polling the snapshot state times out too early.
* Closes #37888
2019-01-28 14:13:02 +01:00
Luca Cavanna a9adc16922 Mute failing SearchQueryIT test
Relates to #37814
2019-01-28 13:41:13 +01:00
Alpar Torok 64b98db973
Add an alias for :server:integTest so it runs as part of internalClusterTest (#37910) 2019-01-28 14:26:22 +02:00
Jason Tedor 194cdfe208
Sync retention leases on expiration (#37902)
This commit introduces a sync of retention leases when a retention lease
expires. As expiration of retention leases is lazy, their expiration is
managed only when getting the current retention leases from the
replication tracker. At this point, we callback to our full retention
lease sync to sync and flush these on all shard copies. With this
change, replicas do not locally manage expiration of retention leases;
instead, that is done only on the primary.
2019-01-28 07:11:51 -05:00
Tanguy Leroux 758eb9d451 Track accurate total hits in CloseIndexIT
The test was not using the TRACK_TOTAL_HITS_ACCURATE and thus
encountered a different issue tracked in #37907. In the meanwhile
we can adapt the test to not fail anymore.

Closes #37897
2019-01-28 11:30:20 +01:00
Martijn van Groningen 4e1a779773
Prepare ShardFollowNodeTask to bootstrap when it fall behind leader shard (#37562)
* Changed `LuceneSnapshot` to throw an `OperationsMissingException` if the requested ops are missing.
* Changed the shard changes api to handle the `OperationsMissingException` and wrap the exception into `ResourceNotFound` exception and include metadata to indicate the requested range can no longer be retrieved.
* Changed `ShardFollowNodeTask` to handle this `ResourceNotFound` exception with the included metdata header.

Relates to #35975
2019-01-28 09:30:04 +01:00
Jim Ferenczi a056804831 Track total hits in tests that index more than 10,000 docs
This change sets track_total_hits to true on a test that requires
to check the total hits of a query that can return more than 10,000 docs.

Closes #37895
2019-01-28 09:24:32 +01:00
Dimitrios Liappis 290c6637c2
Refactor into appropriate uses of scheduleUnlessShuttingDown (#37709)
Replace `threadPool().schedule()` / catch
`EsRejectedExecutionException` pattern with direct calls to
`ThreadPool#scheduleUnlessShuttingDown()`.

Closes #36318
2019-01-28 10:01:26 +02:00
Julie Tibshirani b1735aa93b
Support both typed and typeless 'get mapping' requests in the HLRC. (#37796)
From previous PRs, we've already added support for include_type_name to
the get mapping API. We had also taken an approach to the HLRC where the
server-side `GetMappingResponse#fromXContent` could only handle typeless
input.

This PR updates the HLRC for 'get mapping' to be in line with our new approach:

* Add a typeless 'get mappings' method to the Java HLRC, that accepts new
client-side request and response objects. This new response only handles
typeless mapping definitions.
* Switch the old version of `GetMappingResponse` back to expecting typed
mappings, and deprecate the corresponding method on the HLRC.

Finally, the PR also does some small, related clean-up around 'get field mappings'.
2019-01-27 16:02:22 -08:00
Jason Tedor f24dce1122
Fix newlines in retention lease sync action tests
There is a method invocation here spanning multiple lines. This commit
breaks it up into a line per parameter as this is friendlier to future
changes and diffs.
2019-01-27 08:16:14 -05:00
Jason Tedor 3801925cf0
Copy retention leases under lock
When adding a retention lease, we make a reference copy of the retention
leases under lock and then make a copy of that collection outside of the
lock. However, since we merely copied a reference to the retention
leases, after leaving a lock the underlying collection could change on
us. Rather, we want to copy these under lock. This commit adds a
dedicated method for doing this, asserts that we hold a lock when we use
this method, and changes adding a retention lease to use this method.

This commit was intended to be included with #37398 but was pushed to
the wrong branch.
2019-01-27 08:13:47 -05:00
Jason Tedor 5fddb631a2
Introduce retention lease syncing (#37398)
This commit introduces retention lease syncing from the primary to its
replicas when a new retention lease is added. A follow-up commit will
add a background sync of the retention leases as well so that renewed
retention leases are synced to replicas.
2019-01-27 07:49:56 -05:00
Nhat Nguyen 780b4c72fe
Make ChannelActionListener a top-level class (#37797)
We start using this class more often. Let's make it a top-level class.
2019-01-26 22:01:30 -05:00
Julie Tibshirani afc60bb0e5 Mute DynamicMappingIT#testConflictingDynamicMappings
Tracked in #37898.
2019-01-25 18:09:34 -08:00
Tal Levy eb973a4744 fix GeoHashGridTests precision parsing error
Previously, a hardcoded precision value of 4 was
used by these tests resulting in no approximation
errors. Now that the precision is between 1-12,
precision values of 1 and 2 result in potential
bucketing errors.

This commit adjusts the range to be 4-12.

Fixes #37892.
2019-01-25 17:29:04 -08:00
Julie Tibshirani 58301ead6d Mute IndexShardIT#testMaybeFlush
Tracked in #37896.
2019-01-25 17:12:16 -08:00
Julie Tibshirani 23b0d9b3ed Mute RecoveryWhileUnderLoadIT#testRecoverWhileUnderLoadAllocateReplicasRelocatePrimariesTest
Tracked in #37895.
2019-01-25 16:50:39 -08:00
Julie Tibshirani e41ccdc1a0 Mute GeoWKTShapeParserTests#testParseGeometryCollection
Tracked in #37894.
2019-01-25 16:15:16 -08:00
Julie Tibshirani 827ed12146 Mute TasksIT#testTransportBulkTasks
Tracked in #37893.
2019-01-25 15:29:24 -08:00
Julie Tibshirani a4020f4587 Mute SharedClusterSnapshotRestoreIT#testSnapshotCanceledOnRemovedShard
Tracked in #37888.
2019-01-25 13:40:29 -08:00
Like eb7bf16427 Migrate o.e.i.r.RecoveryState to Writeable (#37380)
Relates to #34389
2019-01-25 15:52:04 -05:00
Nhat Nguyen 5cd4dfb0e4
Relax cluster metadata version check (#37834)
If the in_sync_allocations of index-1 or index-2 is changed, the
metadata version will be increased. This leads to the failure in
the metadata version checks. We need to relax them.

Closes #37820
2019-01-25 14:54:13 -05:00
Yuri Astrakhan f1e71be8b2
Refactored GeoHashGrid unit tests (#37832)
* Refactored GeoHashGrid unit tests

This change allows other grid aggregations to reuse the same tests.

The change mostly just moves code to the base classes, trying to
keep changes to a bare minimum.

* rename createInternalGeoHashGridBucket to createInternalGeoGridBucket

* indentation
2019-01-25 13:37:24 -05:00
Zachary Tong afd4618851 Fixes for a few randomized agg tests that fail hasValue() checks
Closes #37743
Closes #37873
2019-01-25 12:39:42 -05:00
Igor Motov 68149b6058
Geo: replace intermediate geo objects with libs/geo (#37721)
Replaces intermediate geo objects built by ShapeBuilders with
objects from the libs/geo hierarchy. This should allow us to build
all geo functionality around a single hierarchy.

Follow up for #35320
2019-01-25 11:37:27 -05:00
Tanguy Leroux a644bc095c
Add unit tests for ShardStateAction's ShardStartedClusterStateTaskExecutor (#37756) 2019-01-25 16:51:53 +01:00
Vishnu Gt 27c3fb8e0d Do not allow negative variances (#37384)
Due to floating point error, it was possible for variances to become negative which should never happen.  This bugfix sets variance to zero if it becomes negative as a result of fp error.
2019-01-25 09:56:34 -05:00
Tanguy Leroux ef8dd12c6d Limit number of documents indexed in CloseIndexIT test
This test indexes an unlimited number of documents, this commit
reduces this number to 25K and also tracks exact number of hits
when counting the docs.
2019-01-25 15:09:27 +01:00
Christoph Büscher b4b4cd6ebd
Clean codebase from empty statements (#37822)
* Remove empty statements

There are a couple of instances of undocumented empty statements all across the
code base. While they are mostly harmless, they make the code hard to read and
are potentially error-prone. Removing most of these instances and marking blocks
that look empty by intention as such.

* Change test, slightly more verbose but less confusing
2019-01-25 14:23:02 +01:00
Henning Andersen 49073dd2f6
Fail start on invalid index metadata (#37748)
Node started with node.data=false and node.master=false can no longer
start if they have index metadata. This avoids resurrecting old indexes
into the cluster and ensures metadata is cleaned out before
re-purposing a node that was previously master or data node.

Issue #27073
2019-01-25 14:22:48 +01:00
Jim Ferenczi cb451edb01
Allow nested fields in the composite aggregation (#37178)
This changes adds the support to handle `nested` fields in the `composite`
aggregation. A `nested` aggregation can be used as parent of a `composite`
aggregation in order to target `nested` fields in the `sources`.

Closes #28611
2019-01-25 14:00:39 +01:00
Alexander Reelsen 9e350d027e
Add BWC compatible processing to ingest date processors (#37407)
The ingest date processor is currently only able to parse joda formats.
However it is not using the existing elasticsearch classes but access
joda directly. This means that our existing BWC layer does not notify
the user about deprecated formats. This commit switches to use the
exising Elasticsearch Joda methods to acquire a date format, that
includes the BWC check and the ability to parse java 8 dates.

The date parsing in ingest has also another extra feature, that the
fallback year, when a date format without a year is used, is the current
year, and not 1970 like usual. This is currently not properly supported
in the DateFormatter class. As this is the only case for this feature
and java time can take care of this using the toZonedDateTime() method,
a workaround just for the joda time parser has been created, that can be
removed soon again from 7.0.
2019-01-25 13:50:19 +01:00
Jim Ferenczi 787acb14b9
Track total hits up to 10,000 by default (#37466)
This commit changes the default for the `track_total_hits` option of the search request
to `10,000`. This means that by default search requests will accurately track the total hit count
up to `10,000` documents, requests that match more than this value will set the `"total.relation"`
to `"gte"` (e.g. greater than or equals) and the `"total.value"` to `10,000` in the search response.
Scroll queries are not impacted, they will continue to count the total hits accurately.
The default is set back to `true` (accurate hit count) if `rest_total_hits_as_int` is set in the search request.
I choose `10,000` as the default because that's also the number we use to limit pagination. This means that users will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate.

Closes #33028
2019-01-25 13:45:39 +01:00
Mayya Sharipova 70af3c7983
Correct deprec log in RestGetFieldMappingAction (#37843)
* Correct deprec log in RestGetFieldMappingAction

Correct a class used for deprecation logging in
RestGetFieldMappingAction

* Correct deprec log in RestCreateIndexAction

Correct a class used for deprecation logging in
RestCreateIndexAction
2019-01-25 07:13:46 -05:00
Andrey Ershov 9e7fd8caed
Migrate ZenDiscoveryIT to Zen2 (#37465)
ZenDiscoveryIT contained 5 tests. 3 run without changes, testNodeRejectsClusterStateWithWrongMasterNode removed, testHandleNodeJoin_incompatibleClusterState changed.
2019-01-25 11:17:09 +01:00
Armin Braun 7692b607b9
Fix ClusterDisruptionIT#testAckedIndexing (#37853)
* Stop threads before logging the list of exceptions
* For the broken case of concurrent iteration in the finally block and the threads not having shut down,
use `CopyOnWriteArrayList` to have concurrency safe iteration
* Closes #37810
2019-01-25 09:38:29 +01:00
Martijn van Groningen 5a9dadb3ff
changed versionAdded now that #37767 is backedported 2019-01-25 09:18:42 +01:00
Martijn van Groningen 1151f3b3ff
Fail with a dedicated exception if remote connection is missing or (#37767)
or connectivity to the remote connection is failing.

Relates to #37681
2019-01-25 08:53:18 +01:00
Ricardo Ferreira df8fa9781e Remove Abstract Component (#35898)
TransportAction and BaseRestHandler now no longer extends AbstractComponent. The AbstractComponent no longer has usages so it was deleted.

Closes #34488
2019-01-25 08:35:19 +01:00