Commit Graph

48119 Commits

Author SHA1 Message Date
David Turner ecb20ebc6c More bootstrap docs tweaks (#47809)
Clarifies not to set `cluster.initial_master_nodes` on nodes that are joining
an existing cluster.

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2019-10-10 09:55:30 +01:00
Jim Ferenczi dc39196ea4 Fix tag in the search request timeout option docs (#47776)
and add missing parentheses `search_timeout` param
2019-10-10 10:35:44 +02:00
Jim Ferenczi bd6e2592a7 Remove the SearchContext from the highlighter context (#47733)
Today built-in highlighter and plugins have access to the SearchContext through the
highlighter context. However most of the information exposed in the SearchContext are not needed and a QueryShardContext
would be enough to perform highlighting. This change replaces the SearchContext by the informations that are absolutely
required by highlighter: a QueryShardContext and the SearchContextHighlight. This change allows to reduce the exposure of the
complex SearchContext and remove the needs to clone it in the percolator sub phase.

Relates #47198
Relates #46523
2019-10-10 10:34:10 +02:00
Hendrik Muhs 0e7869128a
[7.5][Transform] introduce new roles and deprecate old ones (#47780) (#47819)
deprecate data_frame_transforms_{user,admin} roles and introduce transform_{user,admin} roles as replacement
2019-10-10 10:31:24 +02:00
Alpar Torok fbbe04b801 Add a verifyVersions to the test FW (#47192)
The test FW has a method to check that it's implementation of getting
index and wire compatible versions as well as reasoning about which
version is released or not produces the same rezults as the simillar
implementation in the build.

This PR adds the `verifyVersions` task to the test FW so we have one
task to check everything related to versions.
2019-10-10 11:23:56 +03:00
Jim Ferenczi 3d334a262b Ensure that we don't call listener twice when detecting a partial failure in _search (#47694)
This change fixes a bug that can occur when a shard failure is detected while we build
the search response and accept partial failures in set to false. In this case we currently
call onFailure on the provided listener but also continue the search as if the failure didn't
occur. This can lead to a listener called twice, once with onFailure and once with onSuccess
which is forbidden by design.
2019-10-10 09:59:49 +02:00
Mark Vieira 5825d2df83
Resolve more Gradle task validation warnings (#47825)
(cherry picked from commit 82e3d5ea31101f4c27e69162684c3aa4fb2193a4)

# Conflicts:
#	buildSrc/src/main/groovy/org/elasticsearch/gradle/doc/RestTestsFromSnippetsTask.groovy
2019-10-09 15:42:49 -07:00
Mark Vieira 0360a18f61
Mute test SLMSnapshotBlockingIntegTests.testRetentionWhileSnapshotInProgress
Signed-off-by: Mark Vieira <portugee@gmail.com>
(cherry picked from commit a8a7477c396554926f260d210364f009d85ae5f2)
2019-10-09 15:38:24 -07:00
dengweisysu dc4224fbdf Sync translog without lock before trim unreferenced readers (#47790)
This commit is similar to the optimization made in #45765. With this
change, we fsync most of the data of the current generation without
holding writeLock when trimming unreferenced readers.

Relates #45765
2019-10-09 17:56:30 -04:00
Armin Braun 302e09decf
Simplify some Common ActionRunnable Uses (#47799) (#47828)
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convinience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.
2019-10-09 23:29:50 +02:00
Ryan Ernst 14b979a7bc Make internal project flag overrideable by tests (#47757)
In order to work with external elasticsearch plugins, some parts of
build-tools need to know when the current build is part of the
elasticsearch project or an external build. We identify these "internal"
builds by a marker in our buildSrc classpath. However, some build-tools
integ tests need to override this flag to test both the external and
internal behavior.

This commit moves the storage of the flag indicating whether we are
running in an internal build to global build info. This will allow
testkit projects to override the flag.
2019-10-09 12:01:30 -07:00
Gordon Brown 9b3790d4f2 Mute "Test All Indexes Lifecycle Explain" (#47317) 2019-10-09 21:32:58 +04:00
Jack Conradson 076d3073b5 Move binding member field generation to Painless semantic pass (#47739)
This adds an SField node that operates similarly to SFunction as a top level 
node meant only for use in an SClass node. Member fields are generated 
for both class bindings and instance bindings using the new SField node 
during the semantic pass, and information is no longer passed through 
Globals for this during the write pass.
2019-10-09 10:24:53 -07:00
Tanguy Leroux 8f86469d3f
Do not auto-follow closed indices (#47721) (#47800)
Backport of (#47721) for 7.x.

Similarly to #47582, Auto-follow patterns creates following
indices as long as the remote index matches the pattern and
the remote primary shards are all started. But since 7.2 closed
indices are also replicated, and it does not play well with CCR
auto-follow patterns as they create following indices for closed
leader indices too.

This commit changes the getLeaderIndicesToFollow() so that closed
indices are excluded from auto-follow patterns.
2019-10-09 19:16:23 +02:00
Ryan Ernst 813f44cec8 Move test seed setup to global build info (#47763)
The global build info plugin prints high level information about the
build, including the test seed. However, BuildPlugin sets up the test
seed, which creates an odd implicit dependency on it. This commit moves
the intialization of the testSeed property into the global build info.
2019-10-09 09:59:19 -07:00
Igor Motov 12e4e7ef54 Geo: implement proper handling of out of bounds geo points (#47734)
This is the first iteration in improving of handling of out of
bounds geopoints with a latitude outside of the -90 - +90 range
and a longitude outside of the -180 - +180 range.

Relates to #43916
2019-10-09 20:30:59 +04:00
Igor Motov f8b8afdc70 Geo: Fixes indexing of linestrings that go around the globe (#47471)
LINESTRING (0 0, 720 20) is now decomposed into 3 strings:
multilinestring (
  (0.0 0.0, 180.0 5.0),
  (-180.0 5.0, 180 15),
  (-180.0 15.0, 0 20)
)

It also fixes issues with linestrings that intersect antimeridian
more than 5 times.

Fixes #43837
Fixes #43826
2019-10-09 20:30:59 +04:00
Tim Brooks d18ff24dbe
Fix BulkByScrollResponseTests exception assertions (#45519)
Currently in the x content serialization tests we compare the exception
messages that are serialized. These exceptions messages are not
equivalent because the exception often changes when serialized to x
content. This commit removes this assertion.
2019-10-09 10:15:58 -06:00
István Zoltán Szabó 6f4b7e9a7f [DOCS] Extends the analyzed_fields description in the PUT DFA API docs (#47791) 2019-10-09 18:14:58 +02:00
Tim Brooks 02622c1ef9
Fix issues with serializing BulkByScrollResponse (#45357)
Currently there are two issues with serializing BulkByScrollResponse.
First, when deserializing from XContent, indexing exceptions and search
exceptions are switched. Additionally, search exceptions do no retain
the appropriate RestStatus code, so you must evaluate the status code
from the exception. However, the exception class is not always correctly
retained when serialized.

This commit adds tests in the failure case. Additionally, fixes the
swapping of failure types and adds the rest status code to the search
failure.
2019-10-09 10:12:14 -06:00
Jim Ferenczi d96977202d Disable SLMSnapshotBlockingIntegTests#testSnapshotInProgress (#47775)
This test fails constantly in master and prs.

Relates #47689
2019-10-09 17:49:13 +02:00
Jake Landis 43dc72f1a5
Fix cluster alert for watcher/monitoring IndexOutOfBoundsExcep… (#47756)
If a cluster sending monitoring data is unhealthy and triggers an
alert, then stops sending data the following exception [1] can occur.

This exception stops the current Watch and the behavior is actually
correct in part due to the exception. Simply fixing the exception
introduces some incorrect behavior. Now that the Watch does not
error in the this case, it will result in an incorrectly "resolved"
alert.  The fix here is two parts a) fix the exception b) fix the
following incorrect behavior.

a) fixing the exception is as easy as checking the size of the
array before accessing it.

b) fixing the following incorrect behavior is a bit more intrusive

- Note - the UI depends on the success/met state for each condition
to determine an "OK" or "FIRING"

In this scenario, where an unhealthy cluster triggers an alert and
then goes silent, it should keep "FIRING" until it hears back that
the cluster is green. To keep the Watch "FIRING" either the index
action or the email action needs to fire. Since the Watch is neither
a "new" alert or a "resolved" alert, we do not want to keep sending
an email (that would be non-passive too). Without completely changing
the logic of how an alert is resolved allowing the index action to
take place would result in the alert being resolved. Since we can
not keep "FIRING" either the email or index action (since we don't
want to resolve the alert nor re-write the logic for alert resolution),
we will introduce a 3rd action. A logging action that WILL fire when
the cluster is unhealthy. Specifically will fire when there is an
unresolved alert and it can not find the cluster state.
This logging action is logged at debug, so it should be noticed much.
This logging action serves as an 'anchor' for the UI to keep the state
in an a "FIRING" status until the alert is resolved.

This presents a possible scenario where a cluster starts firing,
then goes completely silent forever, the Watch will be "FIRING"
forever. This is an edge case that already exists in some scenarios
and requires manual intervention to remove that Watch.

This changes changes to use a template-like method to populate the 
version_created for the default monitoring watches. The version is 
set to 7.5 since that is where this is first introduced.

Fixes #43184
2019-10-09 10:47:21 -05:00
Florian Kelbert 2abd9d53b6 [DOCS] Correct split API request format (#47774) 2019-10-09 08:28:41 -04:00
Alpar Torok 57671f6077 Enable 7.x to run with --parallel 2019-10-09 14:40:47 +03:00
Armin Braun edc3e9f0ab
Fix --debug-jvm Gradle Arg (#47773) (#47783)
This fixes the `--debug-jvm` arg to work again for
the `run` task.
Seems a recent refactoring of `RunTask` introduced
this obvious type.
2019-10-09 13:25:16 +02:00
Yogesh Gaikwad 1139cce9a3
[DOCS] Add docs for `create_doc` index privilege (#47584) (#47778)
This commit adds documentation for new index privilege
create_doc which only allows indexing of new documents
but no updates to existing documents via Index or Bulk APIs.

Relates: #45806
2019-10-09 21:22:36 +11:00
Andrei Stefan 75a7daae73 SQL: use calendar interval of 1y instead of fixed interval for grouping by YEAR and HISTOGRAMs (#47558)
(cherry picked from commit 55f5463eee4ecea3537df4b34645f1d87472a802)
2019-10-09 11:51:35 +03:00
Ryan Ernst 54c2aec38a Filter out special gradle threads from leak control (#47713)
This commit adds a thread filter for gradle unit tests which omits
threads gradle may create that we have no control over shutting down.
The current example of this is for project.exec which gradle pools.

closes #47417
2019-10-08 17:00:28 -07:00
Lee Hinman fb7abe9fa4 Separate SLM stop/start/status API from ILM (#47710)
* Separate SLM stop/start/status API from ILM

This separates a start/stop/status API for SLM from being tied to ILM's
operation mode. These APIs look like:

```
POST /_slm/stop
POST /_slm/start
GET /_slm/status
```

This allows administrators to have fine-grained control over preventing
periodic snapshots and deletions while performing cluster maintenance.

Relates to #43663

* Allow going from RUNNING to STOPPED

* Align with the OperationMode rules

* Fix slmStopping method

* Make OperationModeUpdateTask constructor private

* Wipe snapshots better in test
2019-10-08 17:21:38 -06:00
Gordon Brown a492864a9d
Manage retention of failed snapshots in SLM (#47617)
Failed snapshots will eventually build up unless they are deleted. While
failures may not take up much space, they add noise to the list of
snapshots and it's desirable to remove them when they are no longer
useful.

With this change, failed snapshots are deleted using the following
strategy: `FAILED` snapshots will be kept until the configured
`expire_after` period has passed, if present, and then be deleted. If
there is no configured `expire_after` in the retention policy, then they
will be deleted if there is at least one more recent successful snapshot
from this policy (as they may otherwise be useful for troubleshooting
purposes). Failed snapshots are not counted towards either `min_count`
or `max_count`.
2019-10-08 17:07:08 -06:00
Jake Landis a6b0ae7f69
Fix bug in ingest node documentation (#45589) (#47750)
The "Conditionals with the Pipeline Processor" incorrectly documents
how to create a pipeline of pipelines with a failure condition. The 
example as-is will always execute the fail processor. The change here
updates the documentation to correct guard the fail processor with an
if condition.
2019-10-08 17:23:38 -05:00
Armin Braun 96b36b5a8c
Make loadShardSnapshot Exceptions Consistent (#47728) (#47735)
Similar to #47507. We are throwing `SnapshotException` when
you (and SLM tests) would expect a `SnapshotMissingException`
for concurrent snapshot status and snapshot delete operations
with a very low probability.
Fixed the exception type and added a test for this scenario.
2019-10-08 21:04:51 +02:00
Ioannis Kakavas f4c884450f
Remove unmerged changes from release notes (#47732) 2019-10-08 18:14:41 +03:00
Armin Braun 5cef4752f7
Fix Ex. Handling in SnapshotsService#snapshots (#47507) (#47727)
We're needlessly wrapping a `SnapshotMissingException` which
itself is a `SnapshotException` when trying to load a missing
snapshot. This leads to failure #47442 which expects a
`SnapshotMissingException` in this case.

Closes #47442
2019-10-08 17:01:54 +02:00
Lee Hinman ea4069ca63 Add Snapshot Lifecycle Retention documentation (#47545)
* Add Snapshot Lifecycle Retention documentation

This commits adds API and general purpose documentation for SLM
retention.

Relates to #43663

* Fix docs tests

* Update default now that #47604 has been merged

* Update docs/reference/ilm/apis/slm-api.asciidoc

Co-Authored-By: Gordon Brown <gordon.brown@elastic.co>

* Update docs/reference/ilm/apis/slm-api.asciidoc

Co-Authored-By: Gordon Brown <gordon.brown@elastic.co>

* Update docs with feedback
2019-10-08 08:47:08 -06:00
Jake Landis b578059c90
Re-enable Watcher rest test (#47699) (#47705)
This test is believed to be fixed by #43939

closes #43988
2019-10-08 09:45:27 -05:00
David Turner 11093197f1
Fix deprecation docs formatting (#47725)
Relates #47443
2019-10-08 15:41:34 +02:00
Dimitris Athanasiou c1b0bfd74a
[7.x][ML] Unwrap exception causes before calling instanceof (#47676) (#47724)
When exceptions could be returned from another node, the exception
might be wrapped in a `RemoteTransportException`. In places where
we handled specific exceptions using `instanceof` we ought to unwrap
the cause first.

This commit attempts to fix this issue after searching code in the ML
plugin.

Backport of #47676
2019-10-08 16:02:47 +03:00
James Rodewig 9b4eec9887 [DOCS] Fix errors in rollover index API docs (#47702) 2019-10-08 08:37:26 -04:00
Alpar Torok 36d018c909 Convert RunTask to use testclusers, remove ClusterFormationTasks (#47572)
* Convert RunTask to use testclusers, remove ClusterFormationTasks

This PR adds a new RunTask and a way for it to start a
testclusters cluster out of band and block on it to replace
the old RunTask that used ClusterFormationTasks.

With this we can now remove ClusterFormationTasks.
2019-10-08 14:43:29 +03:00
Benjamin Trent d33dbf82d4
[7.x] [ML][Inference] adjusting definition object schema and validation (#47447) (#47673)
* [ML][Inference] adjusting definition object schema and validation (#47447)

* [ML][Inference] adjusting definition object schema and validation

* finalizing schema and fixing inference npe

* addressing PR comments

* fixing for backport
2019-10-08 07:11:05 -04:00
Henning Andersen ce91ba7c25 Dangling indices strip aliases (#47581)
Importing dangling indices with aliases risks breaking functionalities
using those aliases. For instance, writing to an alias may break if
there is no is_write_index indication on the existing alias and the
dangling index import adds a second index to the alias. Or an
application could have an assumption about the alias only ever pointing
to one index and suddenly seeing the alias also linked to an old index
could break it. With this change we strip aliases of the index meta data
found before importing a dangling index.
2019-10-08 12:09:30 +02:00
David Turner bb5f750ab4 Deprecate include_relocations setting (#47443)
Setting `cluster.routing.allocation.disk.include_relocations` to `false` is a
bad idea since it will lead to the kinds of overshoot that were otherwise fixed
in #46079. This commit deprecates this setting so it can be removed in the next
major release.
2019-10-08 08:19:04 +01:00
Hendrik Muhs 5e0e54f455
[Transform] move root endpoint to _transform with BWC layer (#47127) (#47682)
move the main endpoint to /_transform/ from /_data_frame/transforms/ with providing backwards compatibility and deprecation warnings
2019-10-08 08:59:01 +02:00
Lee Hinman 91988c7c26 Throw error retrieving non-existent SLM policy (#47679)
Previously when retrieving an SLM policy it would always return a 200
with `{}` in the body, even if the policy did not exist. This changes
that behavior to throw an error (similar to our other APIs) if a
policy doesn't exist.

This also adds a basic CRUD yml test for the behavior.

Resolves #47664
2019-10-07 19:54:04 -06:00
Lee Hinman 906be45209 Add a test for SLM retention with security enabled (#47608)
This enhances the existing SLM test using users/roles/etc to also test
that SLM retention works when security is enabled.

Relates to #43663
2019-10-07 19:52:09 -06:00
Ryan Ernst 8c6d1e0a08 Switch stored script example to script_score query (#47691)
The example use of a scoring script was incorrectly using a filter
script query, which has no scoring, and thus no _score variable
avialable. This commit converts the example doc to using the newer
script_score query.
2019-10-07 17:08:57 -07:00
Lisa Cawley 39ef795085
[DOCS] Cleans up links to security content (#47610) (#47703) 2019-10-07 15:23:19 -07:00
Jake Landis 74876811c2
Watcher - catch uncaught exception. (#47680) (#47695)
If a thread pool rejection exception happens, an alternative code
path is chosen to write history and delete the trigger. If an exception
happens during deletion of the trigger an exception may be thrown and not
caught.

This commit catches the exception and provides a meaning error message.

fixes #47008
2019-10-07 15:45:45 -05:00
Jake Landis a49a1b6994
Watcher remove assertion that is susceptible to a race conditi… (#47667)
When deactivating a watch, there is a chance that it is fully deactivated
and reporting as not running but the history is not fully written yet.
There is not a tight coupling between the associated watcher history
index and the deactivation. This test assumes that once a watch is
deactivated that all history is fully written in a very short time period.
If the Watch is deactivated, but the history is slow to write it can result
in a failing test.

This change removes an assertion that assumes that the deactivation of a watch
ensured the all of the watch history was written. There is still a minor race
condition with respect to the remaining history assertions. However, if the
history is slow to be written, it will allow the test to still passing.

fixes #47503
2019-10-07 12:07:10 -05:00