Commit Graph

5227 Commits

Author SHA1 Message Date
David Kyle 643ecf68b5
Remove InferenceConfigUpdate generic parameter (#55249) (#55301)
Simplify the code by removing the generic type from InferenceConfigUpdate which 
meant wildcard types were used in many places. Instead check the class type is
appropriate where used.
2020-04-16 13:44:53 +01:00
Ioannis Kakavas ac87c10039
[7.x] Fix responses for the token APIs (#54532) (#55278)
This commit fixes our behavior regarding the responses we
return in various cases for the use of token related APIs.
More concretely:

- In the Get Token API with the `refresh` grant, when an invalid
(already deleted, malformed, unknown) refresh token is used in the
body of the request, we respond with `400` HTTP status code
 and an `error_description` header with the message "could not
refresh the requested token".
Previously we would return erroneously return a  `401` with "token
malformed" message.

- In the Invalidate Token API, when using an invalid (already
deleted, malformed, unknown) access or refresh token, we respond
with `404` and a body that shows that no tokens were invalidated:
   ```
   {
     "invalidated_tokens":0,
     "previously_invalidated_tokens":0,
      "error_count":0
   }
   ```
   The previous behavior would be to erroneously return
a `400` or `401` ( depending on the case ).

- In the Invalidate Token API, when the tokens index doesn't
exist or is closed, we return `400` because we assume this is
a user issue either because they tried to invalidate a token
when there is no tokens index yet ( i.e. no tokens have
been created yet or the tokens index has been deleted ) or the
index is closed.

- In the Invalidate Token API, when the tokens index is
unavailable, we return a `503` status code because
we want to signal to the caller of the API that the token they
tried to invalidate was not invalidated and we can't be sure
if it is still valid or not, and that they should try the request
again.

Resolves: #53323
2020-04-16 14:05:55 +03:00
David Roberts 8489f8c121
[ML] Add test to prove categorization state written after lookback (#55297)
When a datafeed transitions from lookback to real-time we request
that state is persisted from the autodetect process in the
background.

This PR adds a test to prove that for a categorization job the
state that is persisted includes the categorization state.
Without the fix from elastic/ml-cpp#1137 this test fails.  After
that C++ fix is merged this test should pass.

Backport of #55243
2020-04-16 11:55:18 +01:00
Dimitris Athanasiou 6c9e1fecc5
[7.x][ML] Mark task as completed when DFA job is stopped while reindexing (#55286) (#55290)
After #54650 we catch `TaskCancelledException` when we wait for
reindexing to complete as it may be thrown. However, when that happens
we do not mark the task as completed. This results in the stop request
never returning and the failures we saw in #55068.

Closes #55068

Backport of #55286
2020-04-16 13:08:54 +03:00
David Roberts ac11dd619c
Only ship Linux binaries for the correct architecture (#55280)
Following elastic/ml-cpp#1135 there are now Linux binaries
for both x86_64 and aarch64.  The code that finds the
correct binaries to ship with each distribution was
including both on every Linux distribution.  This change
alters that logic to consider the architecture as well
as the operating system.

Also, there is no need to disable ML on aarch64 now that
we have the native binaries available.  ML is still not
supported on aarch64, but the processes at least run up
and work at a superficial level.

Backport of #55256
2020-04-16 09:45:52 +01:00
David Roberts 5de6ddfef2 Mute ClassificationIT.testSetUpgradeMode_ExistingTaskGetsUnassigned
Due to https://github.com/elastic/elasticsearch/issues/55221
2020-04-16 09:03:46 +01:00
Jay Modi 2d9e3c7794
Start resource watcher service early (#55275)
The ResourceWatcherService enables watching of files for modifications
and deletions. During startup various consumers register the files that
should be watched by this service. There is behavior that might be
unexpected in that the service may not start polling until later in the
startup process due to the use of lifecycle states to control when the
service actually starts the jobs to monitor resources. This change
removes this unexpected behavior so that upon construction the service
has already registered its tasks to poll resources for changes. In
making this modification, the service no longer extends
AbstractLifecycleComponent and instead implements the Closeable
interface so that the polling jobs can be terminated when the service
is no longer required.

Relates #54867
Backport of #54993
2020-04-15 20:45:39 -06:00
Jason Tedor cad1a3b0ad
Fix imports in CCRFeatureSet
This commit fixes some imports that were mixed up during a
backport. Because, backports.
2020-04-15 19:37:25 -04:00
Jason Tedor a18faacf1b
Make feature usage version aware (#55246)
Today we indiscriminately serialize these independent of the version on
the stream, even though the other side might not understand a new
feature set usage that we have added. For example, if we add feature set
usage in 7.7 for EQL, in a mixed cluster context if a request is sent to
an old coordinating node, but the master is a new version, then it would
attempt to serialize the usage information for the new feature back to
the old coordinating node, who will blow up on the unrecognized named
writeable. This commit addresses this by making feature usage version
aware, and only serializing those that the other side would understand.
2020-04-15 19:24:47 -04:00
William Brafford 2ba3be9db6
Remove deprecated third-party methods from tests (#55255) (#55269)
I've noticed that a lot of our tests are using deprecated static methods
from the Hamcrest matchers. While this is not a big deal in any
objective sense, it seems like a small good thing to reduce compilation
warnings and be ready for a new release of the matcher library if we
need to upgrade. I've also switched a few other methods in tests that
have drop-in replacements.
2020-04-15 17:54:47 -04:00
Ryan Ernst 29b70733ae
Use task avoidance with forbidden apis (#55034)
Currently forbidden apis accounts for 800+ tasks in the build. These
tasks are aggressively created by the plugin. In forbidden apis 3.0, we
will get task avoidance
(https://github.com/policeman-tools/forbidden-apis/pull/162), but we
need to ourselves use the same task avoidance mechanisms to not trigger
these task creations. This commit does that for our foribdden apis
usages, in preparation for upgrading to 3.0 when it is released.
2020-04-15 13:27:53 -07:00
Henning Andersen b3eb57a094 CCR: Test follow on top of closed index (#54956)
Added testing of following on top of a closed index.
This could for instance be the old leader index in
cases where leader and follower clusters have been
swapped.
2020-04-15 20:13:32 +02:00
Mark Vieira 5d4bc8aea6
Mute ModelLoadingServiceTests.testMaxCachedLimitReached 2020-04-15 10:25:51 -07:00
Ignacio Vera a677b63daa
Upgrade to lucene 8.5.1 release (#55229) (#55235)
Upgrade to lucene 8.5.1 release that contains a bug fix for a bug that might introduce index corruption when deleting data from an index that was previously shrunk.
2020-04-15 17:35:42 +02:00
Benjamin Trent 8ff2cbf1a3
[7.x] [ML] adding prediction_field_type to inference config (#55128) (#55230)
* [ML] adding prediction_field_type to inference config (#55128)

Data frame analytics dynamically determines the classification field type. This field type then dictates the encoded JSON that is written to Elasticsearch. 

Inference needs to know about this field type so that it may provide the EXACT SAME predicted values as analytics. 

Here is added a new field `prediction_field_type` which indicates the desired type. Options are: `string` (DEFAULT), `number`, `boolean` (where close_to(1.0) == true, false otherwise). 

Analytics provides the default `prediction_field_type` when the model is created from the process.
2020-04-15 09:45:22 -04:00
Armin Braun 2f91e2aab7
Fix Race in Snapshot Abort (#54873) (#55233)
We can be a little more efficient when aborting a snapshot. Since we know the new repository
data after finalizing the aborted snapshot when can pass it down to the snapshot completion listeners.
This way, we don't have to fork off to the snapshot threadpool to get the repository data when the listener completes and can directly submit the delete task with high priority straight from the cluster state thread.
2020-04-15 15:42:15 +02:00
Przemysław Witek b5fe565c89
Add log line that will help debug item failures during multi search request (#55220) (#55227) 2020-04-15 15:01:17 +02:00
Hendrik Muhs 9ec9866acb [Transform] simplify TransformConfigUpdate (#55224)
removes the unnecessary ToXContent method in TransformConfigUpdate
2020-04-15 13:22:50 +02:00
Yannick Welsch d1123281b1 Use unlimited cache size by default (#55218)
Sets the default cache size for searchable snapshots to unlimited, which, for testing purposes,
is a better default than the 1GB that we currently have.
2020-04-15 12:20:51 +02:00
David Kyle bdf0eab78d
[7.x] Fix non-deterministic behaviour in ModelLoadingServiceTests (#55008) (#55213) 2020-04-15 11:09:12 +01:00
Ioannis Kakavas 0f51934bcf
[7.x] Add support for more named curves (#55179) (#55211)
We implicitly only supported the prime256v1 ( aka secp256r1 )
curve for the EC keys we read as PEM files to be used in any
SSL Context. We would not fail when trying to read a key
pair using a different curve but we would silently assume
that it was using `secp256r1` which would lead to strange
TLS handshake issues if the curve was actually another one.

This commit fixes that behavior in that it
supports parsing EC keys that use any of the named curves
defined in rfc5915 and rfc5480 making no assumptions about
whether the security provider in use supports them (JDK8 and
higher support all the curves defined in rfc5480).
2020-04-15 12:33:40 +03:00
Dimitris Athanasiou 4000138105
[7.x][ML] Add debug logging for outlier detection stop and restart integ test (#55169) (#55202)
To understand the failures in #55068

Backport of #55169
2020-04-15 10:40:38 +03:00
Lee Hinman 36f6e542a2 Ignore ILM indices in the TerminalPolicyStep (#55184)
Prior to the change in #51631 indices were moved to the `TerminalPolicyStep` when their ILM actions
had completed. Once we switched ILM to stop in the last policy configured, these steps because
inaccessible from the policy's perspective. This meant that indices upgraded from ES prior to 7.7.0
could see the following error spammed in their logs every 10 minutes (by default) for every index in
this state:

```
[2020-04-14T15:52:23,764][ERROR][o.e.x.i.IndexLifecycleRunner] [midgar] current step [{"phase":"completed","action":"completed","name":"completed"}] for index [foo] with policy [full] is not recognized
```

This changes the runner to ignore these steps, which is what is desired anyway since the index is
already in the terminal phase.
2020-04-14 16:45:03 -06:00
Igor Motov 1754e50cbd
[7.x] Add analytics plugin usage stats to _xpack/usage (#54911) (#55162)
Adds analytics plugin usage stats to _xpack/usage.

Closes #54847
2020-04-14 17:03:14 -04:00
Mark Vieira ce85063653
[7.x] Re-add origin url information to publish POM files (#55173) 2020-04-14 13:24:15 -07:00
Albert Zaharovits 7f35b927d1
Preserve parent task id for ml transform (#55124)
This change ensures that internal client requests spawned by the
transform persistent task executor and that use the end user security
credentials, have the parent task id assigned. The objective here is
to permit auditing (as well as tracking for debugging purposes) of all
the end-user requests executed on its behalf by persistent tasks.
Because transform tasks already implements graceful shutdown of the
child tasks, this change does not interfere with that by opting out of
the persistent task cancellation of child tasks.

Relates #55046 #54943 #52314
Closes #54957
2020-04-14 18:43:47 +03:00
Andrei Dan d918ef0da9
[Tests] Enable searchable_snapshots for non-snapshot builds (#55151) (#55157)
Fixes https://github.com/elastic/elasticsearch/issues/55050

(cherry picked from commit 13391ceff1cbf6db69706c5f46127b6ff8850a1f)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-14 16:13:39 +01:00
Albert Zaharovits 5998486ce8
Refactor AuditTrail for TransportRequests instead of TransportMessage (#55141)
This commit refactors the `AuditTrail` to use the `TransportRequest` as a parameter
for all its audit methods, instead of the current `TransportMessage` super class.

The goal is to gain access to the `TransportRequest#parentTaskId` member,
so that it can be audited. The `parentTaskId` is used internally when spawning tasks
that handle transport requests; in this way tasks across nodes are related by the
same parent task.

Relates #52314
2020-04-14 16:53:59 +03:00
Yannick Welsch a610513ec7 Provide repository-level stats for searchable snapshots (#55051)
Provides basic repository-level stats that will allow us to get some insight into how many
requests are actually being made by the underlying SDK. Currently only tracks GET and LIST
calls for S3 repositories. Most of the code is unfortunately boiler plate to add a new endpoint
that will help us better understand some of the low-level dynamics of searchable snapshots.
2020-04-14 14:34:08 +02:00
Przemysław Witek d5bb574e1e
[7.x] Unassign DFA tasks in SetUpgradeModeAction (#54523) (#55143) 2020-04-14 14:09:02 +02:00
Igor Motov 8a669dc9b7
EQL: Add cascading search cancellation (#54843)
EQL search cancellation now propagates cancellation to underlying search
operations.

Relates to #49638
2020-04-14 08:06:02 -04:00
David Turner 87e8367ece Fix testCreateAndRestoreSearchableSnapshot (#55147)
Fixes a couple of related failures in SearchableSnapshotsIntegTests.

Firstly, we were not correctly accounting for the case where the cache was so
small that some/all files were read directly; fixed this by only asserting that
the cache is definitely used if the corresponding node has a cache that's large
enough to hold the whole index.

Secondly, we were not permitting shards to be completely empty, which might be
the case (rarely) if there were not many documents indexed and the distribution
of IDs was a bit unlucky; fixed this by asserting that we get stats for at
least one file for the whole index, rather than for each shard separately.

Closes #55126
2020-04-14 11:54:46 +01:00
Ioannis Kakavas 70cc1d57fb Mute failing test (#54734) 2020-04-14 10:18:33 +01:00
Mark Vieira cb58725164 Mute InferenceIngestIT.testPipelineIngest 2020-04-14 09:27:56 +01:00
debadair e8fa539bea
[DOCS] Removed obsolete warning about no way to securely store passwords (#55133) (#55140)
* [DOCS] Removed obsolete warning about no way to securely store passwords.

* Update x-pack/docs/en/watcher/actions/email.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>
2020-04-13 21:38:32 -07:00
William Brafford 52bebec51f
NodeInfo response should use a collection rather than fields (#54460) (#55132)
This is a first cut at giving NodeInfo the ability to carry a flexible
list of heterogeneous info responses. The trick is to be able to
serialize and deserialize an arbitrary list of blocks of information. It
is convenient to be able to deserialize into usable Java objects so that
we can aggregate nodes stats for the cluster stats endpoint.

In order to provide a little bit of clarity about which objects can and
can't be used as info blocks, I've introduced a new interface called
"ReportingService."

I have removed the hard-coded getters (e.g., getOs()) in favor of a
flexible method that can return heterogeneous kinds of info blocks
(e.g., getInfo(OsInfo.class)). Taking a class as an argument removes the
need to cast in the client code.
2020-04-13 17:18:39 -04:00
Ryan Ernst ae14d1661e
Replace license check isAuthAllowed with isSecurityEnabled (#54547) (#55082)
The isAuthAllowed() method for license checking is used by code that
wants to ensure security is both enabled and available. The enabled
state is dynamic and provided by isSecurityEnabled(). But since security
is available with all license types, an check on the license level is
not necessary. Thus, this change replaces isAuthAllowed() with calling
isSecurityEnabled().
2020-04-13 12:26:39 -07:00
Benjamin Trent d32f6fed1d
[ML] inference only persist if there are stats (#54752) (#55121)
We needlessly send documents to be persisted. If there are no stats added, then we should not attempt to persist them.

Also, this PR fixes the race condition that caused issue:  https://github.com/elastic/elasticsearch/issues/54786
2020-04-13 14:03:05 -04:00
Igor Motov 51c6f69e02
[7.x] Add support for filters to T-Test aggregation (#54980) (#55066)
Adds support for filters to T-Test aggregation. The filters can be used to
select populations based on some criteria and use values from the same or
different fields.

Closes #53692
2020-04-13 12:28:58 -04:00
Jake Landis a2fafa6af4
[7.x] Lazy test cluster module and plugins (#54852) (#55087)
This change converts the module and plugin parameters
for testClusters to be lazy. Meaning that the values
are not resolved until they are actually used. This
removes the requirement to use project.afterEvaluate to
be able to resolve the bundle artifact.

Note - this does not completely remove the need for afterEvaluate
since it is still needed for the custom resource extension.
2020-04-13 10:53:35 -05:00
Igor Motov 6861295706
Further improve InternalTTestTests (#55081)
A small follow-up to #54910. Now that we can generated consistent set of
internal aggs to reduce, we no longer need to keep agg parameters as class
variables.

Related to #54910
2020-04-13 10:26:23 -04:00
Benjamin Trent c5c7ee9d73
[7.x] [ML] Start gathering and storing inference stats (#53429) (#54738)
* [ML] Start gathering and storing inference stats (#53429)

This PR enables stats on inference to be gathered and stored in the `.ml-stats-*` indices.

Each node + model_id will have its own running stats document and these will later be summed together when returning _stats to the user.

`.ml-stats-*` is ILM managed (when possible). So, at any point the underlying index could change. This means that a stats document that is read in and then later updated will actually be a new doc in a new index. This complicates matters as this means that having a running knowledge of seq_no and primary_term is complicated and almost impossible. This is because we don't know the latest index name.

We should also strive for throughput, as this code sits in the middle of an ingest pipeline (or even a query).
2020-04-13 08:15:46 -04:00
Ioannis Kakavas 7a8a66d9ae
[7.x] Fix ReloadSecureSettings API to consume password (#54771) (#55059)
The secure_settings_password was never taken into consideration in
the ReloadSecureSettings API. This commit fixes that and adds
necessary REST layer testing. Doing so, it also:

- Allows TestClusters to have a password protected keystore
so that it can be set for tests.
- Adds a parameter to the run task so that elastisearch can
be run with a password protected keystore from source.
2020-04-13 09:50:55 +03:00
Andrei Dan c0406f78b7
ILM add cluster update timeout on step retry (#54878) (#55022)
This commits adds a timeout when moving ILM back on to a failed step. In
case the master is struggling with processing the cluster update requests
these ones will expire (as we'll send them again anyway on the next ILM
loop run)

ILM more descriptive source messages for cluster updates

Use the configured ILM step master timeout setting

(cherry picked from commit ff6c5ed16616eadfcddd9c95317d370f0d126583)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-11 10:13:31 +01:00
Andrei Dan b8df265b42
[7.x] ILM use Priority.IMMEDIATE for stop ILM cluster update (#54909) (#55018)
* ILM use Priority.IMMEDIATE for stop ILM cluster update (#54909)

This changes the priority of the cluster state update that stops ILM
altogether to `IMMEDIATE`. We've chosen to change this as it can be useful to
temporarily stop ILM if a cluster is overwhelmed, but a `NORMAL`
priority can see the "stop ILM update" not make it up the tasks queue.

On the same note, we're keeping the `start ILM` cluster update priority
to `NORMAL` on purpose such that we only start `ILM` if the cluster can
handle it.

(cherry picked from commit d67df3a7cd2a8619c2c9efac4dde3ba83271f2fa)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-11 10:12:35 +01:00
Albert Zaharovits f22004a262
Preserve parent task id for data frame analytics (#55046)
This change makes sure that all internal client requests spawned by the
data frame analytics persistent task executor and that use the end user
security credentials, have the parent task id assigned. The objective here
is to permit auditing (as well as tracking for debugging purposes) of all
the end-user requests executed on its behalf by persistent tasks.
Because data frame analytics taks already implements graceful shutdown
of child tasks, this change does not interfere with it by opting out of
the persistent task cancellation of child tasks.

Relates #54943 #52314
2020-04-10 22:27:21 +03:00
Mark Vieira 5d4ddf9146
Fixes for IntelliJ IDEA 2020.1 support (#55077) 2020-04-10 11:57:48 -07:00
Nik Everett c00811f3a3
Make some agg tests easier to read (#54954) (#55079)
We added a fancy method to provide random realistic test data to the
reduction tests in #54910. This uses that to remove some of the more
esoteric machinations in the agg tests. This will marginally increase
the coverage of the serialiation tests and, more importantly, remove
some mysterious value generation code that only really made sense for
random reduction tests but was used all over the place. It doesn't, on
the other hand, make the tests shorter. Just *hopefully* more clear.

I only cleaned up a few tests this way. If we like this it'd probably be
worth grabbing others.
2020-04-10 14:15:30 -04:00
Luca Cavanna 93c39ad4e7 Async search: create internal index only before storing initial response (#54619)
We currently create the .async-search index if necessary before performing any action (index, update or delete). Truth is that this is needed only before storing the initial response. The other operations are either update or delete, which will anyways not find the document to update/delete even if the index gets created when missing. This also caused `testCancellation` failures as we were trying to delete the document twice from the .async-search index, once from `TransportDeleteAsyncSearchAction` and once as a consequence of the search task being completed. The latter may be called after the test is completed, but before the cluster is shut down and causing problems to the after test checks, for instance if it happens after all the indices have been cleaned up. It is totally fine to try to delete a response that is no longer found, but not quite so if such call will also trigger an index creation.

With this commit we remove all the calls to createIndexIfNecessary from the update/delete operation, and we leave one call only from storeInitialResponse which is where the index is expected to be created.

Closes #54180
2020-04-10 18:24:05 +02:00
Ross Wolf 96a903b17f
EQL: Add string function (#54470)
* EQL: Add string() function
* EQL: Reorder queryfolder_tests
* EQL: Add test queries
* EQL: Fix InternalEqlScriptUtils.string and test case
* EQL: Fix testStringFunctionWithText error message
* EQL: Flatten ToStringFunctionPipe.equals
* EQL: Reorder painless whitelist
* EQL: Address feedback and remove string(null) handling
* EQL: Move string(pid) test over
* EQL: Rename source -> value
2020-04-10 09:48:29 -06:00