Commit Graph

44801 Commits

Author SHA1 Message Date
Dimitris Athanasiou 8843832039 [ML] Shave off DeleteExpiredDataIT runtime (#39557)
This commit parallelizes some parts of the test
and removes an unnecessary refresh call.
On my local machine it shaves off about 15 seconds
for a test execution time of ~64s (down from ~80s).
This test is still slow but progress over perfection.

Relates #37339
2019-03-01 19:10:00 +02:00
Tanguy Leroux 0c6b7cfb77 Revert "Support concurrent refresh of refresh tokens (#39559)"
This reverts commit e2599214e0.
2019-03-01 17:59:45 +01:00
Jack Conradson 39a401b827 Remove non-existent variable from Painless context docs (#39523) 2019-03-01 08:38:56 -08:00
Jack Conradson 687a66b580 Add byte and Byte to Painless standard cast tests (#39415) 2019-03-01 08:35:20 -08:00
Adrien Grand 976f988358 Add guidance for writing tests. (#39318) 2019-03-01 15:14:21 +01:00
Ioannis Kakavas e2599214e0
Support concurrent refresh of refresh tokens (#39559)
This is a backport of #38382

This change adds support for the concurrent refresh of access
tokens as described in #36872
In short it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added in the token document. One
contains the instant (in epoch milliseconds) when a given refresh token is refreshed
and one that contains a pointer to the token document that stores the new
refresh token and access token that was created by the original refresh.
A side effect of this change, which was also an intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. (This was necessary because we
correctly used a new IV every time we encrypted a token in serialization, so
subsequent serializations of the same exact UserToken would produce
different access token strings.)

This change also handles the serialization/deserialization BWC logic:

- In mixed clusters we keep creating tokens in the old format and
consume only old format tokens
- In upgraded clusters, we start creating tokens in the new format but
still remain able to consume old format tokens (that could have been
created during the rolling upgrade and are still valid)

Resolves #36872

Co-authored-by: Jay Modi jaymode@users.noreply.github.com
2019-03-01 16:00:07 +02:00
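The duplicate-handling window described in the e2599214e0 message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Elasticsearch token service: the class `RefreshWindowSketch`, its in-memory map, and the UUID-based token strings are all hypothetical, and the sketch does not check that the presented refresh token was ever issued. It only shows the two pieces of state the message mentions (the refresh instant and a pointer to the superseding tokens) and how a repeat request inside 60 seconds receives the same response.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration only; names and storage are assumptions.
final class RefreshWindowSketch {
    // Window from the commit message: repeats within 60 seconds are duplicates.
    static final Duration WINDOW = Duration.ofSeconds(60);

    static final class TokenPair {
        final String accessToken;
        final String refreshToken;

        TokenPair(String accessToken, String refreshToken) {
            this.accessToken = accessToken;
            this.refreshToken = refreshToken;
        }
    }

    // The two extra fields: when the token was refreshed, and what superseded it.
    private static final class RefreshRecord {
        final Instant refreshTime;
        final TokenPair supersededBy;

        RefreshRecord(Instant refreshTime, TokenPair supersededBy) {
            this.refreshTime = refreshTime;
            this.supersededBy = supersededBy;
        }
    }

    private final Map<String, RefreshRecord> refreshed = new HashMap<>();

    synchronized TokenPair refresh(String refreshToken, Instant now) {
        RefreshRecord prior = refreshed.get(refreshToken);
        if (prior == null) {
            // First refresh: issue new tokens and remember when and what.
            TokenPair fresh = new TokenPair(UUID.randomUUID().toString(), UUID.randomUUID().toString());
            refreshed.put(refreshToken, new RefreshRecord(now, fresh));
            return fresh;
        }
        if (Duration.between(prior.refreshTime, now).compareTo(WINDOW) <= 0) {
            // Duplicate inside the window: same response as the original refresh.
            return prior.supersededBy;
        }
        throw new IllegalStateException("refresh token was already refreshed outside the window");
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real service would read a clock and persist this state in the token document rather than in a map.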
Luca Cavanna 29e3c18713 Mute failing IndexShardIT#testPendingRefreshWithIntervalChange
Relates to #39565
2019-03-01 14:55:19 +01:00
Tanguy Leroux e005eeb0b3
Backport support for replicating closed indices to 7.x (#39506)(#39499)
Backport support for replicating closed indices (#39499)
    
    Before this change, closed indexes were simply not replicated. It was therefore
    possible to close an index and then decommission a data node without knowing
    that this data node contained shards of the closed index, potentially leading to
    data loss. Shards of closed indices were not completely taken into account when
    balancing the shards within the cluster, or automatically replicated through shard
    copies, and they were not easily movable from node A to node B using APIs like
    Cluster Reroute without being fully reopened and closed again.
    
    This commit changes the logic executed when closing an index, so that its shards
    are not just removed and forgotten but are instead reinitialized and reallocated on
    data nodes using an engine implementation which does not allow searching or
    indexing, which has a low memory overhead (compared with searchable/indexable
    opened shards) and which allows shards to be recovered from peer or promoted
    as primaries when needed.
    
    This new closing logic is built on top of the new Close Index API introduced in
    6.7.0 (#37359). Some pre-closing sanity checks are executed on the shards before
    closing them, and closing an index on an 8.0 cluster will reinitialize the index shards
    and therefore impact the cluster health.
    
    Some APIs have been adapted to make them work with closed indices:
    - Cluster Health API
    - Cluster Reroute API
    - Cluster Allocation Explain API
    - Recovery API
    - Cat Indices
    - Cat Shards
    - Cat Health
    - Cat Recovery
    
    This commit contains all the following changes (most recent first):
    * c6c42a1 Adapt NoOpEngineTests after #39006
    * 3f9993d Wait for shards to be active after closing indices (#38854)
    * 5e7a428 Adapt the Cluster Health API to closed indices (#39364)
    * 3e61939 Adapt CloseFollowerIndexIT for replicated closed indices (#38767)
    * 71f5c34 Recover closed indices after a full cluster restart (#39249)
    * 4db7fd9 Adapt the Recovery API for closed indices (#38421)
    * 4fd1bb2 Adapt more tests suites to closed indices (#39186)
    * 0519016 Add replica to primary promotion test for closed indices (#39110)
    * b756f6c Test the Cluster Shard Allocation Explain API with closed indices (#38631)
    * c484c66 Remove index routing table of closed indices in mixed versions clusters (#38955)
    * 00f1828 Mute CloseFollowerIndexIT.testCloseAndReopenFollowerIndex()
    * e845b0a Do not schedule Refresh/Translog/GlobalCheckpoint tasks for closed indices (#38329)
    * cf9a015 Adapt testIndexCanChangeCustomDataPath for replicated closed indices (#38327)
    * b9becdd Adapt testPendingTasks() for replicated closed indices (#38326)
    * 02cc730 Allow shards of closed indices to be replicated as regular shards (#38024)
    * e53a9be Fix compilation error in IndexShardIT after merge with master
    * cae4155 Relax NoOpEngine constraints (#37413)
    * 54d110b [RCI] Adapt NoOpEngine to latest FrozenEngine changes
    * c63fd69 [RCI] Add NoOpEngine for closed indices (#33903)
    
    Relates to #33888
2019-03-01 14:48:26 +01:00
Andrei Stefan 06d0e0efad Removed custom naming for DISTINCT COUNT (#39537)
(cherry picked from commit 9412a2ee01a60dd6449bbced1273ec0b37b65589)
2019-03-01 15:26:32 +02:00
Andrei Stefan ba44f28340 SQL: ignore UNSUPPORTED fields for JDBC and ODBC modes in 'SYS COLUMNS' (#39518)
* SYS COLUMNS will skip UNSUPPORTED field types in ODBC and JDBC, as well.
NESTED and OBJECT types were already skipped in ODBC mode, now they are
skipped in JDBC mode, as well.

(cherry picked from commit 9e0df64b2d36c9069dfa506570468f0522c86417)
2019-03-01 15:26:31 +02:00
David Kyle 894ecb244d
[ML-Dataframe] Move dataframe actions into core (#39548) 2019-03-01 10:45:36 +00:00
Marios Trivyzas 9fb2f670dc SQL: Enhance checks for inexact fields (#39427)
For functions: move checks for `text` fields without underlying `keyword`
fields or with many of them (ambiguity) to the type resolution stage.

For Order By/Group By: move checks to the `Verifier` to catch early
before `QueryTranslator` or execution.

Closes: #38501
Fixes: #35203
2019-03-01 10:40:57 +01:00
Yannick Welsch 1a50af7dd4 Do not close bad indices on startup (#39500)
With #17187, we verified IndexService creation during initial state recovery on the master and if the
recovery failed the index was imported as closed, not allocating any shards. This was mainly done to
prevent endless allocation loops and full log files on data-nodes when the indexmetadata contained
broken settings / analyzers. Zen2 loads the cluster state eagerly, and this check currently runs on all
nodes (not only the elected master), which can significantly slow down startup on data nodes.
Furthermore, with replicated closed indices (#33888) on the horizon, importing the index as closed
will no longer prevent shards from being allocated. Fortunately, the original issue of endless allocation loops is
no longer a problem due to #18467, where we limit the retries of failed allocations. The solution here
is therefore to just undo #17187, as it's no longer necessary, and covered by #18467, which will solve
the issue for Zen2 and replicated closed indices as well.
2019-03-01 09:23:46 +01:00
Tal Levy b9b46fdec6
fix UpdateSettingsRequestStreamableTests.mutateInstance (#39386) (#39477)
Mutations of the timeout values were using string-representations.

This resulted in very rare cases where the original timeout value was
represented as something like "0ms" and the new random time-value generated
was "0s". Although their string representations differ, their underlying
TimeValue does not. This caused `-Dtests.seed=7F4C034C43C22B1B` to
fail.
2019-02-28 21:02:32 -08:00
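The failure mode in the b9b46fdec6 message above, where two different strings denote the same duration, can be reproduced in a small sketch. It uses `java.time.Duration` as a stand-in for Elasticsearch's `TimeValue`, and the `parse` and `mutate` helpers are hypothetical; the point is only that a mutation should be compared on the underlying value, not on its string form.

```java
import java.time.Duration;
import java.util.Random;

// Hypothetical illustration; Duration stands in for TimeValue.
final class TimeoutMutationSketch {
    // Parses the two suffixes used in the example: "ms" and "s".
    static Duration parse(String text) {
        if (text.endsWith("ms")) {
            return Duration.ofMillis(Long.parseLong(text.substring(0, text.length() - 2)));
        }
        if (text.endsWith("s")) {
            return Duration.ofSeconds(Long.parseLong(text.substring(0, text.length() - 1)));
        }
        throw new IllegalArgumentException("unsupported time value: " + text);
    }

    // Retries until the candidate differs in value, not merely in spelling.
    static Duration mutate(Duration original, Random random) {
        Duration candidate;
        do {
            candidate = Duration.ofMillis(random.nextInt(10_000));
        } while (candidate.equals(original));
        return candidate;
    }
}
```

Here `parse("0ms")` and `parse("0s")` are equal as durations even though the strings differ, which is why a string-based comparison could accept a "mutation" that changed nothing.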
Mark Tozzi 609118c229 Override and mute InternalAutoDateHistogramTests#testReduceRandom() (#39536)
pending resolution of #39497
2019-02-28 16:00:32 -05:00
Ryan Ernst 1124624e87
Obsolete pre 7.0 noarch package in rpm (#39472)
This commit makes the rpm metadata indicate the pre 7.0 noarch packages
are obsoleted by this package. This fixes an issue where upgrading with
yum would cause an error thinking there was nothing to upgrade.

closes #39414
2019-02-28 12:35:23 -08:00
Shajahan Palayil 8ced21db88
[DOCS] Corrected API path for /_security/api_key (#39521) 2019-02-28 20:08:39 +01:00
Lee Hinman dae48ba262 Add details about what acquired the shard lock last (#38807)
This adds a `details` parameter to shard locking in `NodeEnvironment`. This is
intended to be used for diagnosing issues such as

```
  1> [2019-02-11T14:34:19,262][INFO ][o.e.c.m.MetaDataDeleteIndexService] [node_s0] [.tasks/oSYOG0-9SHOx_pfAoiSExQ] deleting index
  1> [2019-02-11T14:34:19,279][WARN ][o.e.i.IndicesService     ] [node_s0] [.tasks/oSYOG0-9SHOx_pfAoiSExQ] failed to delete index
  1> org.elasticsearch.env.ShardLockObtainFailedException: [.tasks][0]: obtaining shard lock timed out after 0ms
  1> 	at org.elasticsearch.env.NodeEnvironment$InternalShardLock.acquire(NodeEnvironment.java:736) ~[main/:?]
  1> 	at org.elasticsearch.env.NodeEnvironment.shardLock(NodeEnvironment.java:655) ~[main/:?]
  1> 	at org.elasticsearch.env.NodeEnvironment.lockAllForIndex(NodeEnvironment.java:601) ~[main/:?]
  1> 	at org.elasticsearch.env.NodeEnvironment.deleteIndexDirectorySafe(NodeEnvironment.java:554) ~[main/:?]
```

In the hope that we will be able to determine why the shard is still locked.

Relates to #30290 as well as some other CI failures
2019-02-28 10:50:47 -07:00
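The idea in the dae48ba262 message above, recording who last took a lock so that a timeout can report it, might look like the following sketch. It is not the `NodeEnvironment` code: the class `DetailedLockSketch`, its method names, and the exact wording of the exception message are assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the NodeEnvironment implementation.
final class DetailedLockSketch {
    private final Semaphore mutex = new Semaphore(1);
    // Free-form description supplied by whoever acquired the lock last.
    private volatile String lockDetails = "no lock acquired yet";

    void acquire(String details, long timeoutMillis) {
        boolean acquired;
        try {
            acquired = mutex.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lock", e);
        }
        if (!acquired) {
            // Report the previous holder's details to help diagnose the timeout.
            throw new IllegalStateException("obtaining lock timed out after " + timeoutMillis
                + "ms, previous lock details: [" + lockDetails + "] trying to lock for [" + details + "]");
        }
        lockDetails = details;
    }

    void release() {
        mutex.release();
    }
}
```

With this shape, a failed attempt names the operation that still holds the lock, which is the diagnostic information the log excerpt above lacks.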
Armin Braun e564c4d8ad
Add Package Level JavaDoc on Snapshots (#38108) (#39514)
* Add Package Level JavaDoc on Snapshots
2019-02-28 18:23:01 +01:00
Albert Zaharovits 8a19d981db Integ test snapshot and restore for native realm (#39123)
This commit adds a simple integ test that exercises the flow:
* snapshot .security
* delete .security
* restore .security

The test then checks that the Native Realm works as expected.

Relates #34454
2019-02-28 14:41:47 +02:00
Hendrik Muhs 30e5c11cc2
[ML-DataFrame] Dataframe REST cleanups (#39451) (#39503)
Fix a couple of odd behaviors of the data frame transform REST APIs:

 -  check if id from body and id from URL match if both are specified
 -  do not allow a body for delete
 -  allow get and stats without specifying an id
2019-02-28 13:00:37 +01:00
Simon Willnauer 5c96b90ed5 Never block on scheduled refresh if a refresh is running (#39462)
Today we block on the ReferenceManager in the case of a scheduled refresh.
Yet if there is a refresh happening concurrently we might block and create
very small segments. Instead we should just move on to the next shard
and free up the refresh thread instead.
2019-02-28 11:57:45 +01:00
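The non-blocking behavior described in the 5c96b90ed5 message above can be sketched with a simple guard. This is an illustration under assumptions, not the engine code: `RefreshGateSketch` and its methods are hypothetical, and an `AtomicBoolean` stands in for whatever the reference manager uses internally. A scheduled refresh that finds another refresh in progress returns immediately instead of waiting.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; not the Elasticsearch engine implementation.
final class RefreshGateSketch {
    private final AtomicBoolean refreshing = new AtomicBoolean(false);
    private int completedRefreshes = 0;

    // Returns false without waiting if a refresh is already running,
    // so the caller can move on to the next shard.
    boolean scheduledRefresh(Runnable doRefresh) {
        if (!refreshing.compareAndSet(false, true)) {
            return false;
        }
        try {
            doRefresh.run();
            completedRefreshes++;
            return true;
        } finally {
            refreshing.set(false);
        }
    }

    int completedRefreshes() {
        return completedRefreshes;
    }
}
```

Skipping is safe in this sketch because the refresh already in progress makes recent changes visible; the skipped call would only have produced another, smaller segment shortly afterwards.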
Armin Braun d3d7d9bb9d
Remove Dead Code + Duplication in o.e.c.routing (#36678) (#39493)
* Removed obviously unused fields+methods
* Inlined public methods that only had one caller
* Simplified `Optional` chain
* Simplified some obviously redundant conditions
2019-02-28 10:33:05 +01:00
Dimitris Athanasiou 8122650a55 [ML] Add integration test for interim results after advancing bucket (#39447)
This is an integration test that captures the issue described in
elastic/ml-cpp#324
2019-02-28 11:12:08 +02:00
Armin Braun 90ab4a6f6e
Stabilize RareClusterState (#38671) (#39468)
* Use the actual master node, not just a master-eligible node, when trying to cancel publication. This only works on the master and for unlucky seeds we never try the master within the 10s that the busy assert runs.
* Closes #36813
2019-02-28 08:01:52 +01:00
Ioannis Kakavas 2ce9457c8f Mute Bulk indexing of monitoring data (#39448)
Relates: #30101
2019-02-28 07:40:36 +02:00
Tal Levy f538b30af9
ensure no initializing shards during cluster cleanup (#39283) (#39480)
There are testing situations where newly created indices
are being wiped before they are fully initialized. This results
in an edge case in the shard-locking strategy where an index
cannot be deleted.

This should fix that.
2019-02-27 15:56:33 -08:00
Lisa Cawley 8b26f59958 [DOCS] Removes problematic footer from Watcher docs (#39474) 2019-02-27 15:45:56 -08:00
Tanguy Leroux 4dd274b51d Unmute CoordinatorTests.testDiscoveryUsesNodesFromLastClusterState() (#39452)
This commit unmutes the test and comments out the
offending call to linearizabilityChecker.isLinearizable() as suggested
in #39437
2019-02-27 20:38:54 +01:00
Lee Hinman ad8228aec9
Use non-ILM template setting up watch history template & ILM disabled (#39420)
Backport of #39325

When ILM is disabled and Watcher is setting up the templates and policies for
the watch history indices, it will now use a template that does not have the
`index.lifecycle.name` setting, so that indices are not created with the
setting.

This also adds tests for the behavior, and changes the cluster state used in
these tests to be real instead of mocked.

Resolves #38805
2019-02-27 11:11:19 -07:00
Lisa Cawley 9c8c158f21 [DOCS] Fix inline callout in Watcher documentation (#39423) 2019-02-27 09:45:10 -08:00
Tanguy Leroux 983b5d1c0e Mute SpecificMasterNodesIT.testElectOnlyBetweenMasterNodes()
Tracked in #38331
2019-02-27 18:00:02 +01:00
Alan Woodward 54ced2949b
Re-enable BWC (#39460)
Follow up to #39444
2019-02-27 16:51:15 +00:00
Jason Tedor 6d72d45e33
Use https to obtain Lucene snapshots (#39458)
This commit changes the protocol used to download Lucene snapshots.
2019-02-27 11:41:06 -05:00
Lisa Cawley dedbe60e0a [DOCS] Fixes table and code block separators in Watcher documentation (#39426) 2019-02-27 08:21:19 -08:00
Jay Modi 995144b197
Fix SSLConfigurationReloaderTests failure tests (#39408)
This change fixes the tests that expect the reload of an
SSLConfiguration to fail. The tests relied on an incorrect assumption
that the reloader only called reload for an SSLConfiguration if the
key and trust managers were successfully reloaded, but that is not the
case. This change replaces the fail call with a wrapped call to the
original method that captures the exception and counts down a latch, so
that these tests behave consistently.

Closes #39260
2019-02-27 09:17:09 -07:00
Daniel Mitterdorfer 2ccba18809
Correct name of basic_date_time_no_millis (#39367) (#39454)
With this commit we correct the name of the Java time based formatter
for `basic_date_time_no_millis`.
2019-02-27 17:03:50 +01:00
Lisa Cawley e6c2dae250 [DOCS] Fix image warnings in CCR documentation (#39430) 2019-02-27 07:37:33 -08:00
Alan Woodward 71b8494181
Upgrade to lucene 8.0.0-snapshot-ff9509a8df (#39444)
Backport of #39350

Contains the following:

* LUCENE-8635: Move terms dictionary off-heap for non-primary-key fields in `MMapDirectory`
* LUCENE-8292: `TermsEnum` is fully abstract
* LUCENE-8679: Return WITHIN in `EdgeTree#relateTriangle` only when polygon and triangle share one edge
* LUCENE-8676: Nori tokenizer deals correctly with large buffers
* LUCENE-8697: `GraphTokenStreamFiniteStrings` better handles side paths with gaps
* LUCENE-8664: Add `equals` and `hashCode` to `TotalHits`
* LUCENE-8660: `TopDocsCollector` returns accurate hit counts if the total equals the threshold
* LUCENE-8654: `Polygon2D#relateTriangle` fix for when the polygon is inside the triangle
* LUCENE-8645: `Intervals#fixField` can merge intervals from different fields
* LUCENE-8585: Create jump-tables for DocValues at index time
2019-02-27 14:36:08 +00:00
Armin Braun f675b33d50
Increase Timeout in UnicastZenPingTests (#38893) (#39449)
* Just like #37268, this removes another 1s timeout; such timeouts are dangerous since they're easily exceeded by an untimely GC pause
* Closes #26701
2019-02-27 15:22:17 +01:00
Marios Trivyzas a2c07b5011
SQL: Use underlying exact field for LIKE/RLIKE (#39443)
Previously, if a text field had an underlying keyword field,
the latter was not used in place of the text field, leading to wrong
results returned by queries filtering with LIKE/RLIKE.

Fixes: #39442
2019-02-27 14:46:54 +01:00
Jason Tedor 55e98f08d8
Provide a clearer error message on keystore add (#39327)
When trying to add a setting to the keystore with an upper case name, we
reject with an unclear error message. This commit makes that error
message much clearer.
2019-02-27 08:10:23 -05:00
Jason Tedor 6c5bf3ac13
Remove outdated DNS caching docs from HTTP exporter (#39394)
These docs are out of date, now that we override the infinite DNS cache
within Elasticsearch. This commit completely removes this content, as
specific guidance is no longer needed here.
2019-02-27 08:08:44 -05:00
Jason Tedor 842940785a
Conditionally build BWC projects in parallel (#39396)
This commit sets the BWC projects to build in parallel if Gradle was
invoked with parallel project execution enabled. This substantially
speeds up the time of building the BWC projects since there are many
dependent projects needed to build a BWC version.
2019-02-27 08:05:00 -05:00
Armin Braun 27485871b8
Don't Ping on Handshake Connection (#39076) (#39446)
* Don't Ping on Handshake Connection

* It does not make sense to run pings on the handshake connection
   * Set the ping interval to `-1` to deactivate pings on it
2019-02-27 13:39:25 +01:00
Tanguy Leroux 6912e27ee0 Mute MinimumMasterNodesIT.testThreeNodesNoMasterBlock()
Tracked in #39172
2019-02-27 13:13:22 +01:00
Armin Braun da9190be0a
Add Checks for Closed Channel in Selector Loop (#39096) (#39439)
* A few warnings could be observed in test logs about `NoSuchElementException` being thrown in `InboundChannelBuffer#sliceBuffersTo`.
These were the result of calls to this method after the relevant channel and hence the buffer was closed already as a result of a failed IO operation.
  * Fixed by adding the necessary guard statements to break out in these cases. I don't think there is a need here to do any additional error handling since `eventHandler.postHandling(channelContext);` at the end of the `processKey`
call in the main selection loop handles closing channels and invoking callbacks for writes that failed to go through already.
2019-02-27 11:28:30 +01:00
Mehran Koushkebaghi 1d0097b5e8 [ML] Refactoring scheduled event to store instant instead of zoned time zone (#39380)
The ScheduledEvent class has never preserved the time
zone so it makes more sense for it to store the start and
end time using Instant rather than ZonedDateTime.

Closes #38620
2019-02-27 09:27:04 +00:00
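The reasoning in the 1d0097b5e8 message above can be illustrated with plain `java.time`. The class `ScheduledEventSketch` below is a hypothetical stand-in, not the ML `ScheduledEvent` class: it shows that two `ZonedDateTime` values for the same moment in different offsets are unequal, while the `Instant` each one maps to is identical, so storing `Instant` matches a class that never preserved the zone.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical illustration; not the ML ScheduledEvent class.
final class ScheduledEventSketch {
    private final Instant start;
    private final Instant end;

    ScheduledEventSketch(Instant start, Instant end) {
        this.start = start;
        this.end = end;
    }

    // Accepts zoned input but keeps only the point on the timeline.
    static ScheduledEventSketch of(ZonedDateTime start, ZonedDateTime end) {
        return new ScheduledEventSketch(start.toInstant(), end.toInstant());
    }

    Instant start() {
        return start;
    }

    Instant end() {
        return end;
    }

    // Same moment expressed at a different offset, for comparison.
    static ZonedDateTime atOffsetHours(ZonedDateTime time, int hours) {
        return time.withZoneSameInstant(ZoneOffset.ofHours(hours));
    }
}
```

Callers that need a local rendering can still convert the stored `Instant` back with a zone of their choosing at display time.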
Martijn van Groningen a427a28318
Unmuted testCannotFollowLeaderInUpgradedCluster test.
Relates to #39355
2019-02-27 09:45:43 +01:00
David Turner 41668f7723 Move PeerFinder's logger to the expected package (#39412)
Today the abstract `org.elasticsearch.discovery.PeerFinder` uses the logger of
its implementation, which in production is in `o.e.cluster.coordination`. This
turns out to be confusing and unhelpful, so with this change we move to using
the logger that belongs to `PeerFinder`.
2019-02-27 08:44:05 +00:00