3763 Commits

Author SHA1 Message Date
Tim Vernum
2e5f2dd1e1
Deprecate misconfigured SSL server config (#49280)
This commit adds a deprecation warning when starting
a node where either of the server contexts
(xpack.security.transport.ssl and xpack.security.http.ssl)
meet either of these conditions:

1. The server lacks a certificate/key pair (i.e. neither
   ssl.keystore.path not ssl.certificate are configured)
2. The server has some ssl configuration, but ssl.enabled is not
   specified. This new validation does not care whether ssl.enabled is
   true or false (though other validation might), it simply makes it
   an error to configure server SSL without being explicit about
   whether to enable that configuration.

Backport of: #45892
2019-11-22 12:14:55 +11:00
Benjamin Trent
a7477ad7c3
[7.x] [ML][Inference] compressing model definition and lazy parsing (#49269) (#49446)
* [ML][Inference] compressing model definition and lazy parsing (#49269)

* [ML][Inference] compressing model definition and lazy parsing

* addressing PR comments

* adding commons io

* implementing simplified bounded stream

* adjusting for type inclusion
2019-11-21 15:32:32 -05:00
Benjamin Trent
d9835f7fb4
[ML] Fix r_squared eval when variance is 0 (#49439) (#49445) 2019-11-21 11:22:16 -05:00
Benjamin Trent
d41b2e3f38
[ML][Inference] allowing per-model licensing (#49398) (#49435)
* [ML][Inference] allowing per-model licensing

* changing to internal action + removing pre-mature opt
2019-11-21 09:46:34 -05:00
Przemysław Witek
c7ac2011eb
[7.x] Implement accuracy metric for multiclass classification (#47772) (#49430) 2019-11-21 15:01:18 +01:00
Martijn van Groningen
d59ea64ccd
Monitoring should wait with collecting data when cluster service is started. (#49426)
Backport of #48277

Otherwise integration tests may fail if the monitoring interval is low:
```
[2019-10-21T09:57:25,527][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [integTest-0] fatal error in thread [elasticsearch[integTest-0][generic][T#4]], exiting
java.lang.AssertionError: initial cluster state not set yet
        at org.elasticsearch.cluster.service.ClusterApplierService.state(ClusterApplierService.java:208) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at org.elasticsearch.cluster.service.ClusterService.state(ClusterService.java:125) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) ~[?:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:835) [?:?]
```

I ran into this when lowering the monitoring interval when investigating
enrich monitoring test: #48258
2019-11-21 14:22:41 +01:00
Hendrik Muhs
c3e4405ddf
[7.x][Transform] Transform fix force stop race condition (#49249) (#49420)
fix force stopping transform if indexer state hasn't been written and/or is set to STOPPED. In certain situations the transform could not be stopped, which means the task could not be removed. Introduces improved abstraction in order to better test state handling in future.
2019-11-21 13:52:14 +01:00
Andrei Dan
010c3de47e
Slm set operation mode to RUNNING on first run (#49236) (#49425)
* SLM set the operation mode to RUNNING on first run

Set the SLM operation mode to RUNNING when setting the first SLM lifecycle
policy. Historically, SLM was not decoupled from ILM but now they are
independent components. Setting the SLM operation mode to what the ILM running
mode was when we set the first SLM lifecycle policy was a remain from those
times.

* SLM update package info

* SLM suppress unusued warning

* SLM use logger for the correct class

* SLM Add integration test for operation mode

* Use ESSingleNodeTestCase instead of ESIntegTestCase

(cherry picked from commit 4ad3d93f89d03bf9a25685a990d1a439f33ce0e6)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-11-21 11:41:32 +00:00
Nhat Nguyen
fec22130c2 Improve error message when pausing index (#48915)
Throw an appropriate error message when the follower index is not found
or is a regular index.
2019-11-20 15:58:44 -05:00
Bogdan Pintea
8c2ab8bb72 SQL:Docs: add the PIVOT clause to SELECT section (#49129)
The PR adds the documentation on the PIVOT clause.

(cherry picked from commit a55b36065e6496c44b6e3191296931d477a8e5f5)
2019-11-20 18:21:06 +01:00
David Roberts
20558cf61c [ML] Fix simultaneous stop and force stop datafeed (#49367)
If a datafeed is stopped normally and force stopped at the same
time then it is possible that the force stop removes the
persistent task while the normal stop is performing actions.
Currently this causes the normal stop to error, but since
stopping a stopped datafeed is not an error this doesn't make
sense. Instead the force stop should just take precedence.

This is a followup to #49191 and should really have been
included in the changes in that PR.
2019-11-20 12:52:47 +00:00
Mayya Sharipova
e3da60c23d Increase the number of vector dims to 2048 (#46895) 2019-11-20 07:47:33 -05:00
Przemysław Witek
9c0ec7ce23
[7.x] Make AnalyticsProcessManager class more robust (#49282) (#49356) 2019-11-20 10:08:16 +01:00
Dimitris Athanasiou
4d6e037e90
[7.x][ML] Extract creation of DFA field extractor into a factory (#49315) (#49329)
This commit moves the async calls required to retrieve the components
that make up `ExtractedFieldsExtractor` out of `DataFrameDataExtractorFactory`
and into a dedicated `ExtractorFieldsExtractorFactory` class.

A few more refactorings are performed:

  - The detector no longer needs the results field. Instead, it knows
  whether to use it or not based on whether the task is restarting.
  - We pass more accurately whether the task is restarting or not.
  - The validation of whether fields that have a cardinality limit
  are valid is now performed in the detector after retrieving the
  respective cardinalities.

Backport of #49315
2019-11-20 10:02:42 +02:00
Przemysław Witek
42bb8ae525
[7.x] Extract indexData method out of RegressionIT tests (#49306) (#49313) 2019-11-19 22:47:12 +01:00
Mark Tozzi
17358b5af7
(refactor) Extract Empty/Script/Missing ValuesSource behavior to an interface (#48320) (#49330)
This is a pure code rearrangement refactor.  Logic for what specific ValuesSource instance to use for a given type (e.g. script or field) moved out of ValuesSourceConfig and into CoreValuesSourceType (previously just ValueSourceType; we extract an interface for future extensibility).  ValueSourceConfig still selects which case to use, and then the ValuesSourceType instance knows how to construct the ValuesSource for that case.
2019-11-19 16:44:29 -05:00
Jay Modi
eed4cd25eb
ThreadPool and ThreadContext are not closeable (#43249) (#49273)
This commit changes the ThreadContext to just use a regular ThreadLocal
over the lucene CloseableThreadLocal. The CloseableThreadLocal solves
issues with ThreadLocals that are no longer needed during runtime but
in the case of the ThreadContext, we need it for the runtime of the
node and it is typically not closed until the node closes, so we miss
out on the benefits that this class provides.

Additionally by removing the close logic, we simplify code in other
places that deal with exceptions and tracking to see if it happens when
the node is closing.

Closes #42577
2019-11-19 13:15:16 -07:00
Andrei Dan
19780e20ba
Handle failure to retrieve ILM policy step better (#49193) (#49316)
This commit wraps the calls to retrieve the current step in a try/catch
so that the exception does not bubble up. Instead, step info is added
containing the exception to the existing step.

Semi-related to #49128

(cherry picked from commit 72530f8a7f40ae1fca3704effb38cf92daf29057)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-11-19 17:14:46 +00:00
Armin Braun
0acba44a2e
Make Repository.getRepositoryData an Async API (#49299) (#49312)
This API call in most implementations is fairly IO heavy and slow
so it is more natural to be async in the first place.
Concretely though, this change is a prerequisite of #49060 since
determining the repository generation from the cluster state
introduces situations where this call would have to wait for other
operations to finish. Doing so in a blocking manner would break
`SnapshotResiliencyTests` and waste a thread.
Also, this sets up the possibility to in the future make use of async IO
where provided by the underlying Repository implementation.

In a follow-up `SnapshotsService#getRepositoryData` will be made async
as well (did not do it here, since it's another huge change to do so).
Note: This change for now does not alter the threading behaviour in any way (since `Repository#getRepositoryData` isn't forking) and is purely mechanical.
2019-11-19 16:49:12 +01:00
Marios Trivyzas
fd1bb4a33a SQL: Fix issue with mins & hours for DATEDIFF (#49252)
Previously, DATEDIFF for minutes and hours was doing a
rounding calculation using all the time fields (secs, msecs/micros/nanos).
Instead it should first truncate the 2 dates to the respective field (mins or hours)
zeroing out all the more detailed time fields and then make the subtraction.

(cherry picked from commit 124cd18e20429e19d52fd8dc383827ea5132d428)
2019-11-19 14:25:28 +01:00
Benjamin Trent
19602fd573
[ML][Inference] changing setting to be memorySizeSettting (#49259) (#49302) 2019-11-19 07:56:40 -05:00
Przemysław Witek
38aec2e298
Relax assertions related to datafeed timing stats in .yml test (#49285) (#49291) 2019-11-19 12:50:14 +01:00
David Roberts
a5204c1c80
[ML] Fixes for stop datafeed edge cases (#49284)
The following edge cases were fixed:

1. A request to force-stop a stopping datafeed is no longer
   ignored.  Force-stop is an important recovery mechanism
   if normal stop doesn't work for some reason, and needs
   to operate on a datafeed in any state other than stopped.
2. If the node that a datafeed is running on is removed from
   the cluster during a normal stop then the stop request is
   retried (and will likely succeed on this retry by simply
   cancelling the persistent task for the affected datafeed).
3. If there are multiple simultaneous force-stop requests for
   the same datafeed we no longer fail the one that is
   processed second.  The previous behaviour was wrong as
   stopping a stopped datafeed is not an error, so stopping
   a datafeed twice simultaneously should not be either.

Backport of #49191
2019-11-19 10:51:46 +00:00
Julie Tibshirani
a0ee6c8f7e
Add telemetry for flattened fields. (#48972) (#49125)
Currently we just record the number of flattened fields defined in the mappings.
2019-11-18 12:29:42 -08:00
Benjamin Trent
eefe7688ce
[7.x][ML] ML Model Inference Ingest Processor (#49052) (#49257)
* [ML] ML Model Inference Ingest Processor (#49052)

* [ML][Inference] adds lazy model loader and inference (#47410)

This adds a couple of things:

- A model loader service that is accessible via transport calls. This service will load in models and cache them. They will stay loaded until a processor no longer references them
- A Model class and its first sub-class LocalModel. Used to cache model information and run inference.
- Transport action and handler for requests to infer against a local model
Related Feature PRs:

* [ML][Inference] Adjust inference configuration option API (#47812)

* [ML][Inference] adds logistic_regression output aggregator (#48075)

* [ML][Inference] Adding read/del trained models (#47882)

* [ML][Inference] Adding inference ingest processor (#47859)

* [ML][Inference] fixing classification inference for ensemble (#48463)

* [ML][Inference] Adding model memory estimations (#48323)

* [ML][Inference] adding more options to inference processor (#48545)

* [ML][Inference] handle string values better in feature extraction (#48584)

* [ML][Inference] Adding _stats endpoint for inference (#48492)

* [ML][Inference] add inference processors and trained models to usage (#47869)

* [ML][Inference] add new flag for optionally including model definition (#48718)

* [ML][Inference] adding license checks (#49056)

* [ML][Inference] Adding memory and compute estimates to inference (#48955)

* fixing version of indexed docs for model inference
2019-11-18 13:19:17 -05:00
Armin Braun
25cc8e3663
Fix RepoCleanup not Removed on Master-Failover (#49217) (#49239)
The logic for `cleanupInProgress()` was backwards everywhere (method itself and
all but one user). Also, we weren't checking it when removing a repository.

This lead to a bug (in the one spot that didn't use the method backwards) that prevented
the cleanup cluster state entry from ever being removed from the cluster state if master
failed over during the cleanup process.

This change corrects the backwards logic, adds a test that makes sure the cleanup
is always removed and adds a check that prevents repository removal during cleanup
to the repositories service.

Also, the failure handling logic in the cleanup action was broken. Repeated invocation would lead to the cleanup being removed from the cluster state even if it was in progress. Fixed by adding a flag that indicates whether or not any removal of the cleanup task from the cluster state must be executed. Sorry for mixing this in here, but I had to fix it in the same PR, as the first test (for master-failover) otherwise would often just delete the blocked cleanup action as a result of a transport master action retry.
2019-11-18 16:44:09 +01:00
Przemysław Witek
5f9965e4b8
Lower minimum model memory limit value from 1MB to 1kB. (#49227) (#49242) 2019-11-18 14:58:20 +01:00
Hendrik Muhs
ca912624ec [Transform] improve error handling of script errors (#48887)
improve error handling for script errors, treating it as irrecoverable errors which puts the task
immediately into failed state, also improves the error extraction to properly report the script 
error.

fixes #48467
2019-11-18 10:24:39 +01:00
Tanguy Leroux
fcac3fbfd9 AutoFollowIT should not rely on assertBusy but should use latches instead (#49141)
AutoFollowIT relies on assertBusy() calls to wait for a given number of 
leader indices to be created but this is prone to failures on CI. Instead, 
we should use latches to indicate when auto-follow patterns must be 
paused and resumed.
2019-11-18 09:40:56 +01:00
Dimitris Athanasiou
805c31e19e
[7.x][ML] Avoid NPE when node load is calculated on job assignment (#49186) (#49214)
This commit fixes a NPE problem as reported in #49150.
But this problem uncovered that we never added proper handling
of state for data frame analytics tasks.

In this commit we improve the `MlTasks.getDataFrameAnalyticsState`
method to handle null tasks and state tasks properly.

Closes #49150

Backport of #49186
2019-11-18 10:33:07 +02:00
Przemysław Witek
150db2b544
Throw an exception when memory usage estimation endpoint encounters empty data frame. (#49143) (#49164) 2019-11-18 07:52:57 +01:00
Jason Tedor
60d1d67aac
CCR should auto-retry rejected execution exceptions (#49213)
If CCR encounters a rejected execution exception, today we treat this as
fatal. This is not though, as the stuffed queue could drain. Requiring
an administrator to manually restart the follow tasks that faced such an
exception is a burden. This commit addresses this by making CCR
auto-retry on rejected execution exceptions.
2019-11-17 12:48:46 -05:00
Mayya Sharipova
0e933a093d Add index name to search requests (#49175)
We can't guarantee expected request failures if search request is across
many indexes, as if expected shards fail, some indexes may return 200.

closes #47743
2019-11-15 16:39:18 -05:00
Jay Modi
57f57227ac
Clean up static web server in sql-client tests (#49187) (#49197)
The JdbcHttpClientRequestTests and HttpClientRequestTests classes both
hold a static reference to a mock web server that internally uses the
JDKs built-in HttpServer, which resides in a sun package that the
RamUsageEstimator does not have access to. This causes builds that use
a runtime of Java 8 to fail since the StaticFieldsInvariantRule is run
when Java 8 is used.

Relates #41526
Relates #49105
2019-11-15 13:02:21 -07:00
Lee Hinman
680436dd0d
[7.x] Don't halt policy execution on policy trigger exception… (#49171)
When triggered either by becoming master, a new cluster state, or a
periodic schedule, an ILM policy execution through
`maybeRunAsyncAction`, `runPolicyAfterStateChange`, or
`runPeriodicStep` throwing an exception will cause the loop the
terminate. This means that any indices that would have been processed
after the index where the exception was thrown will not be processed by
ILM.

For most execution this is not a problem because the actual running of
steps is protected by a try/catch that moves the index to the ERROR step
in the event of a problem. If an exception occurs prior to step
execution (for example, in fetching and parsing the current
policy/step) however, it causes the loop termination previously
mentioned.

This commit wraps the invocation of the methods specified above in a
try/catch block that provides better logging and does not bubble the
exception up.
2019-11-15 09:22:37 -07:00
Albert Zaharovits
89b3c32b40
Audit log filter and marker (#49145)
This adds a log marker and a marker filter for the audit log.

Closes #47251
2019-11-15 08:44:09 -05:00
Christos Soulios
d9f0245b10
[7.x] Implement stats aggregation for string terms (#49097)
Backport of #47468 to 7.x

This PR adds a new metric aggregation called string_stats that operates on string terms of a document and returns the following:

min_length: The length of the shortest term
max_length: The length of the longest term
avg_length: The average length of all terms
distribution: The probability distribution of all characters appearing in all terms
entropy: The total Shannon entropy value calculated for all terms

This aggregation has been implemented as an analytics plugin.
2019-11-15 14:36:21 +02:00
Andrei Dan
085d08cfd1
ILM Remove obsolete testRolloverAlreadyExists (#49104) (#49144)
The rollover action is now a retryable step (see #48256)
so ILM will keep retrying until it succeeds as opposed to stopping and
moving the execution in the ERROR step.

Fixes #49073

(cherry picked from commit 3ae90898121b43032ec8f3b50514d93a86e14d0f)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>

# Conflicts:
#	x-pack/plugin/ilm/qa/multi-node/src/test/java/org/elasticsearch/xpack/ilm/TimeSeriesLifecycleActionsIT.java
2019-11-15 12:06:22 +00:00
Ioannis Kakavas
f5f0e1366a
Handle unexpected/unchecked exceptions correctly (#49080) (#49137)
Ensures that methods that are called from different threads ( i.e.
from the callbacks of org.apache.http.concurrent.FutureCallback )
catch `Exception` instead of only the expected checked exceptions.

This resolves a bug where OpenIdConnectAuthenticator#mergeObjects
would throw an IllegalStateException that was never caught causing
the thread to hang and the listener to never be called. This would
in turn cause Kibana requests to authenticate with OpenID Connect
to timeout and fail without even logging anything relevant.

This also guards against unexpected Exceptions that might be thrown
by invoked library methods while performing the necessary operations
in these callbacks.
2019-11-15 11:54:08 +02:00
James Baiera
6bb6adb8d3
Reuse collected cluster state in EnrichPolicyRunner (#48488) (#49100)
The cluster state is obtained twice in the EnrichPolicyRunner when updating 
the final alias. There is a possibility for the state to be slightly different 
between those two calls. This PR just has the function get the cluster state 
once and reuse it for the life of the function call.
2019-11-14 14:14:39 -05:00
Dan Hermann
cac9fe4d86
[7.x] Validate monitoring password at parse time (#49083) 2019-11-14 09:39:28 -06:00
Dimitris Athanasiou
be5894ed9c
[7.x][SQL] Mute JdbcConfigurationTests.testDriverConfigurationWithSSLInURL (#49085) (#49086)
Relates #41557
2019-11-14 15:15:55 +02:00
Rory Hunter
c46a0e8708
Apply 2-space indent to all gradle scripts (#49071)
Backport of #48849. Update `.editorconfig` to make the Java settings the
default for all files, and then apply a 2-space indent to all `*.gradle`
files. Then reformat all the files.
2019-11-14 11:01:23 +00:00
Marios Trivyzas
7c3198ba44
SQL: [Tests] Mute testReplaceChildren for Pivot (#49045)
Temporarily "mute" the testReplaceChildren for Pivot since it leads to
failing tests for some seeds, since the new child doesn't respond to a
valid data type.

Relates to #48900

(cherry picked from commit 6200a2207b9a4264d2f3fc976577323c7e084317)
2019-11-14 11:30:33 +01:00
Armin Braun
25e05b0013
Fix X-Pack SchedulerEngine Shutdown (#48951) (#49054)
We can have a race here where `scheduleNextRun` executes concurrently to `stop`
and so we run into a `RejectedExecutionException` that we don't catch and thus it
fails tests.
=> Fixed by ignoring these so long as they coincide with a scheduler shutdown
2019-11-13 22:06:55 +01:00
Przemysław Witek
e6ad3c29fd
Do not throw exceptions resulting from persisting datafeed timing stats. (#49044) (#49050) 2019-11-13 20:23:13 +01:00
Henning Andersen
66f0c8900f
Fix Transport Stopped Exception (#48930) (#49035)
When a node shuts down, `TransportService` moves to stopped state and
then closes connections. If a request is done in between, an exception
was thrown that was not retried in replication actions. Now throw a
wrapped `NodeClosedException` exception instead, which is correctly
handled in replication action. Fixed other usages too.

Relates #42612
2019-11-13 18:48:05 +01:00
Tanguy Leroux
e86b598813 Fix AutoFollowIT (#49025)
This commit fixes an off-by-one bug in the AutoFollowIT test that causes
failures because the leaderIndices counter is incremented during the evaluation
of the leaderIndices.incrementAndGet() < 20 condition but the 20th index is
not created, making the final assertion not verified.

It also gives a bit more time for cluster state updates to be processed on the
follower cluster.

Closes #48982
2019-11-13 13:20:57 +01:00
Ioannis Kakavas
4405042900
Remove unnecessary details logged for OIDC (#48746) (#49031)
This commit removes unnecessary details logged for
OIDC.

Co-Authored-By: Ioannis Kakavas <ikakavas@protonmail.com>
2019-11-13 13:43:56 +02:00
Yannick Welsch
2dfa0133d5 Always use primary term from primary to index docs on replica (#47583)
Ensures that we always use the primary term established by the primary to index docs on the
replica. Makes the logic around replication less brittle by always using the operation primary
term on the replica that is coming from the primary.
2019-11-13 12:13:45 +01:00