Commit Graph

46042 Commits

Author SHA1 Message Date
David Kyle aea600fe7d [Ml Data Frame] Return bad_request on preview when config is invalid (#42447) 2019-05-28 15:36:50 +01:00
David Turner c21745c8ab Avoid loading retention leases while writing them (#42620)
Resolves #41430.
2019-05-28 15:27:06 +01:00
James Rodewig 31d2bdca37
[DOCS] Fix Moving Avg Aggregation `deprecated` macro for Asciidoctor (#42405) 2019-05-28 08:56:50 -04:00
James Rodewig b30ca8da28 [DOCS] Fix API Quick Reference rollup attribute for Asciidoctor (#42403) 2019-05-28 08:53:20 -04:00
James Rodewig 3079d2d295 [DOCS] Escape cross-ref link comma for Asciidoctor (#42402) 2019-05-28 08:47:51 -04:00
Yannick Welsch 1e0b0f640b Fix compilation
Follow-up to 5598647922
2019-05-28 13:56:36 +02:00
Yannick Welsch 5598647922 Reset state recovery after successful recovery (#42576)
The problem this commit addresses is that state recovery is not reset on a node that then becomes
master with a cluster state that has a state not recovered flag in it. The situation that was observed
in a failed test run of MinimumMasterNodesIT.testThreeNodesNoMasterBlock (see below) is that we
have 3 master nodes (node_t0, node_t1, node_t2), two of them are shut down (node_t2 remains),
when the first one comes back (renamed to node_t4) it becomes leader in term 2 and sends state
(with state_not_recovered_block) to node_t2, which accepts. node_t2 becomes leader in term 3, and
as it was previously leader in term1 and successfully completed state recovery, does never retry
state recovery in term 3.

Closes #39172
2019-05-28 13:46:10 +02:00
David Turner 746a2f41fd
Remove PRE_60_NODE_CHECKPOINT (#42531)
This commit removes the obsolete `PRE_60_NODE_CHECKPOINT` constant for dealing
with 5.x nodes' lack of sequence number support.

Backport of #42527
2019-05-28 12:25:53 +01:00
Armin Braun 00d665540a
Make unwrapCorrupt Check Suppressed Ex. (#41889) (#42605)
* Make unwrapCorrupt Check Suppressed Ex. (#41889)
* As discussed in #24800 we want to check for suppressed corruption
indicating exceptions here as well to more reliably categorize
corruption related exceptions
* Closes #24800, 41201
2019-05-28 12:44:40 +02:00
Daniel Mitterdorfer adb3574af8
Mute NodeTests (#42615)
Relates #42577
Relates #42614
2019-05-28 12:25:18 +02:00
Armin Braun 116b050cc6
Cleanup Bulk Delete Exception Logging (#41693) (#42606)
* Cleanup Bulk Delete Exception Logging

* Follow up to #41368
* Collect all failed blob deletes and add them to the exception message
* Remove logging of blob name list from caller exception logging
2019-05-28 11:00:28 +02:00
Armin Braun 44bf784fe1
Add Infrastructure to Run 3rd Party Repository Tests (#42586) (#42604)
* Add Infrastructure to Run 3rd Party Repository Tests

* Add infrastructure to run third party repository tests using our standard JUnit infrastructure
* This is a prerequisite of #42189
2019-05-28 10:46:22 +02:00
Daniel Mitterdorfer 635ce0ca6d
Mute AsyncTwoPhaseIndexerTests#testStateMachine() (#42610)
Relates #42084
Relates #42609
2019-05-28 10:20:22 +02:00
Gürkan Kaymak 1d09367a82 Fixed ignoring name parameter for percolator queries (#42598)
Closes #40405
2019-05-28 09:38:00 +02:00
Armin Braun c079fb61bf
Remove Dead Code from Azure Repo Plugin (#42178) (#42569)
* None of this stuff is used
2019-05-28 08:00:02 +02:00
Nhat Nguyen 2077f9ffbc Reset mock transport service in CcrRetentionLeaseIT (#42600)
testRetentionLeaseIsAddedIfItDisappearsWhileFollowing does not reset the
mock transport service after test. Surviving transport interceptors from
that test can sneaky remove retention leases and make other tests fail.

Closes #39331
Closes #39509
Closes #41428
Closes #41679
Closes #41737
Closes #41756
2019-05-27 21:51:25 -04:00
Nhat Nguyen ab832c4f17 Use doc instead of _doc in FullClusterRestartIT
ES does not accept doc type starting with underscore until 6.2.0. We
have to use "doc" instead of "_doc" in FullClusterRestartIT if we are
upgrading from a 6.2.0- cluster.

Closes #42581
2019-05-27 21:35:56 -04:00
Nhat Nguyen 4123ade2b6 Add test ensure we can execute update requests in mixed cluster
Relates #42596
2019-05-27 18:27:50 -04:00
Nhat Nguyen de6be819d6 Allocate to data-only nodes in ReopenWhileClosingIT (#42560)
If all primary shards are allocated on the master node, then the
verifying before close step will never interact with mock transport
service. This change prefers to allocate shards on data-only nodes.

Closes #39757
2019-05-27 17:32:06 -04:00
Armin Braun a94d24ae5a
Fix RareClusterStateIT (#42430) (#42580)
* It looks like we might be cancelling a previous publication instead of
the one triggered by the given request with a very low likelihood.
   * Fixed by adding a wait for no in-progress publications
   * Also added debug logging that would've identified this problem
* Closes #36813
2019-05-27 13:57:17 +02:00
Armin Braun c4f44024af
Remove Delete Method from BlobStore (#41619) (#42574)
* Remove Delete Method from BlobStore (#41619)
* The delete method on the blob store was used almost nowhere and just duplicates the delete method on the blob containers
  * The fact that it provided for some recursive delete logic (that did not behave the same way on all implementations) was not used and not properly tested either
2019-05-27 12:24:20 +02:00
Armin Braun bb7e8eb2fd
Introduce ShardState Enum + Slight Cleanup SnapshotsInProgress (#41940) (#42573)
* Added separate enum for the state of each shard, it was really
confusing that we used the same enum for the state of the snapshot
overall and the state of each individual shard
   * relates https://github.com/elastic/elasticsearch/pull/40943#issuecomment-488664150
* Shortened some obvious spots in equals method and saved a few lines
via `computeIfAbsent` to make up for adding 50 new lines to this class
2019-05-27 12:08:45 +02:00
Armin Braun a96606d962
Safer Wait for Snapshot Success in ClusterPrivilegeTests (#40943) (#42575)
* Safer Wait for Snapshot Success in ClusterPrivilegeTests

* The snapshot state returned by the API might become SUCCESS before it's fully removed from the cluster state.
  * We should fix this race in the transport API but it's not trivial and will be part of the incoming big round of refactoring the repository interaction, this added check fixes the test for now
* closes #38030
2019-05-27 12:08:20 +02:00
Ignacio Vera 5d3e381648
mute test testClosedIndices (#42582) 2019-05-27 12:05:17 +02:00
Travis Steel 381e100217 Fixed typo in docker.asciidoc (#42455) 2019-05-27 11:54:56 +02:00
bellengao 380f296631 Update script-fields.asciidoc (#42490) 2019-05-27 11:48:37 +02:00
Armin Braun 7b4d1ac352
Remove Obsolete BwC Logic from BlobStoreRepository (#42193) (#42571)
* Remove Obsolete BwC Logic from BlobStoreRepository

* We can't restore 1.3.3 files anyway -> no point in doing the dance of computing a hash here
* Some other minor+obvious cleanups
2019-05-27 11:47:04 +02:00
Armin Braun b68358945f
Dump Stacktrace on Slow IO-Thread Operations (#42000) (#42572)
* Dump Stacktrace on Slow IO-Thread Operations

* Follow up to #39729 extending the functionality to actually dump the
stack when the thread is blocked not afterwards
   * Logging the stacktrace after the thread became unblocked is only of
limited use because we don't know what happened in the slow callback
from that (only whether we were  blocked on a read,write,connect etc.)
* Relates #41745
2019-05-27 11:44:36 +02:00
Armin Braun 489616da62
Fix testTracerLog Network Tests (#42286) (#42565)
* Fix testTracerLog Network Tests

* Start appender before using it like we do for e.g. the Netty leak detection appender to avoid interference from actions on the network threads that might still be dangling from previous tests in the same suite
* Closes #41890
2019-05-27 11:39:59 +02:00
Armin Braun c7448b12e1
Cleanup Redundant BlobStoreFormat Class (#42195) (#42570)
* No need to have an abstract class here when there's only a single impl.
2019-05-27 11:28:50 +02:00
Armin Braun 49767fc1e9
Some Cleanup in o.e.gateway Package (#42108) (#42568)
* Removing obvious dead code
* Removing redundant listener interface
2019-05-27 11:28:12 +02:00
Armin Braun a5ca20a250
Some Cleanup in o.e.i.engine (#42278) (#42566)
* Some Cleanup in o.e.i.engine

* Remove dead code and parameters
* Reduce visibility in some obvious spots
* Add missing `assert`s (not that important here since the methods
themselves will probably be dead-code eliminated) but still
2019-05-27 11:04:54 +02:00
Armin Braun d2cd36bd9f
Upgrade to Netty 4.1.36 (#42543) (#42564) 2019-05-27 10:38:03 +02:00
Martijn van Groningen e591d30918
fixed test compile issue 2019-05-27 10:17:00 +02:00
Martijn van Groningen 48a71459c0
Improve how internal representation of pipelines are updated (#42257)
If a single pipeline is updated then the internal representation of
all pipelines was updated. With this change, only the internal representation
of the pipelines that have been modified will be updated.

Prior to this change the IngestMetadata of the previous and current cluster
was used to determine whether the internal representation of pipelines
should be updated. If applying the previous cluster state change failed then
subsequent cluster state changes that have no changes to IngestMetadata
will not attempt to update the internal representation of the pipelines.

This commit, changes how the IngestService updates the internal representation
by keeping track of the underlying configuration and use that to detect
against the new IngestMetadata whether a pipeline configuration has been
changed and if so, then the internal pipeline representation will be updated.
2019-05-27 10:01:15 +02:00
Alpar Torok 4dbf6c0df9 Make packer cache branches explicit (#41990)
Before this change we would recurse to cache bwc versions.
This proved to be problematic due to  the number of steps it was
generating taking too long.
Also this required tricky maintenance to break the recursion for old
branches we don't really care about.

With this change we now cache specific branches only.
2019-05-27 09:46:43 +03:00
Nhat Nguyen 85e60850af Add debug log for retention leases (#42557)
We need more information to understand why CcrRetentionLeaseIT is
failing. This commit adds some debug log to retention leases and enables
them in CcrRetentionLeaseIT.
2019-05-26 16:04:47 -04:00
Nhat Nguyen 5d2fcc53e4 Unmute FullClusterRestartIT#testClosedIndices
Fixed in #39566
Closes #39576
2019-05-26 11:20:04 -04:00
Nhat Nguyen d6e2f4a43e Enable recoveries trace log in CcrRetentionLeaseIT
Tracked #41679
2019-05-24 22:16:14 -04:00
Tanguy Leroux 6bec876682 Improve Close Index Response (#39687)
This changes the `CloseIndexResponse` so that it reports closing result
for each index. Shard failures or exception are also reported per index,
and the global acknowledgment flag is computed from the index results
only.

The response looks like:
```
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "indices" : {
    "docs" : {
      "closed" : true
    }
  }
}
```

The response reports shard failures like:
```
{
  "acknowledged" : false,
  "shards_acknowledged" : false,
  "indices" : {
    "docs-1" : {
      "closed" : true
    },
    "docs-2" : {
      "closed" : false,
      "shards" : {
        "1" : {
          "failures" : [
            {
              "shard" : 1,
              "index" : "docs-2",
              "status" : "BAD_REQUEST",
              "reason" : {
                "type" : "index_closed_exception",
                "reason" : "closed",
                "index_uuid" : "JFmQwr_aSPiZbkAH_KEF7A",
                "index" : "docs-2"
              }
            }
          ]
        }
      }
    },
    "docs-3" : {
      "closed" : true
    }
  }
}
```

Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
2019-05-24 21:57:55 -04:00
Mark Vieira 63eccb16de
Make LoggerUsageTask cacheable (#42550)
(cherry picked from commit 0bb46d73cb89016ab9d96e76693bb0d7cee267a1)
2019-05-24 18:30:46 -07:00
Mark Vieira 24cf86a013
Ignore JAR manifests when snapshotting runtime classpaths (#42548)
(cherry picked from commit d5281fc96f6fb2f022c87699bdad64d88614e04c)
2019-05-24 18:28:47 -07:00
Julie Tibshirani 3a6c2525ca
Deprecate support for chained multi-fields. (#42330)
This PR contains a straight backport of #41926, and also updates the
migration documentation and deprecation info API for 7.x.
2019-05-24 15:55:06 -07:00
Jason Tedor f2cfd09289
Remove renewal in retention lease recovery test (#42536)
This commit removes the act of renewing some retention leases during a
retention lease recovery test. Having renewal does not add anything
extra to this test, but does allow for some situations where the test
can fail spuriously (i.e., in a way that does not indicate that
production code is broken).
2019-05-24 17:40:59 -05:00
Mark Vieira 5695992094
Use reproducible method of generating properties file for better caching (#42539)
(cherry picked from commit 9772574f9d0b942a1ee8dba5ff503b4cd286e36c)
2019-05-24 15:07:29 -07:00
Nhat Nguyen 74d771d8f6 Adjust load SplitIndexIT#testSplitIndexPrimaryTerm (#42477)
SplitIndexIT#testSplitIndexPrimaryTerm sometimes timeout due to
relocating many shards. This change adjusts loads and increases
the timeout.
2019-05-24 15:47:29 -04:00
Nhat Nguyen 02739d038c Mute accounting circuit breaker check after test (#42448)
If we close an engine while a refresh is happening, then we might leak
refCount of some SegmentReaders. We need to skip the ram accounting
circuit breaker check until we have a new Lucene snapshot which includes
the fix for LUCENE-8809.

This also adds a test to the engine but left it muted so we won't forget
to reenable this check.

Closes #30290
2019-05-24 15:42:12 -04:00
David Roberts 48dc0dca57 [ML] Use map and filter instead of flatMap in find_file_structure (#42534)
Using map and filter avoids the garbage from all the
Stream.of calls that flatMap necessitated. Performance
is better when there are masses of fields.
2019-05-24 20:12:06 +01:00
David Roberts 34de68b007 [ML] Fix possible race condition when closing an opening job (#42506)
This change fixes a race condition that would result in an
in-memory data structure becoming out-of-sync with persistent
tasks in cluster state.

If repeated often enough this could result in it being
impossible to open any ML jobs on the affected node, as the
master node would think the node had capacity to open another
job but the chosen node would error during the open sequence
due to its in-memory data structure being full.

The race could be triggered by opening a job and then closing
it a tiny fraction of a second later.  It is unlikely a user
of the UI could open and close the job that fast, but a script
or program calling the REST API could.

The nasty thing is, from the externally observable states and
stats everything would appear to be fine - the fast open then
close sequence would appear to leave the job in the closed
state.  It's only later that the leftovers in the in-memory
data structure might build up and cause a problem.
2019-05-24 20:11:58 +01:00
James Rodewig d521a88e19 [DOCS] Move callouts to end of line for Asciidoctor migration (#42356) 2019-05-24 15:03:46 -04:00