Commit Graph

4390 Commits

Author SHA1 Message Date
Przemysław Witek 9b116c8fef
A few improvements to AnalyticsProcessManager class that make the code more readable. (#50026) (#50069) 2019-12-11 09:35:05 +01:00
Ioannis Kakavas 3b613c36f4
Always return 401 for not valid tokens (#49736) (#50042)
Return a 401 in all cases when a request is submitted with an
access token that we can't consume. Before this change, we would
throw a 500 when a request came in with an access token that we
had generated but was then invalidated/expired and deleted from
the tokens index.

Resolves: #38866
Backport of #49736
2019-12-11 09:14:50 +02:00
Stuart Cam 44cd2f444c Add the REST API specifications for SLM Status / Start / Stop endpoints. (#49759)
Was originally missed in PR #47710

(cherry picked from commit 133b34c8355639ae0f699a86ffd9f37d19f73bca)
2019-12-11 13:34:13 +11:00
Adrien Grand 87e72156ce
Upgrade to lucene 8.4.0-snapshot-662c455. (#50016) (#50039)
Lucene 8.4 is about to be released so we should check it doesn't cause problems
with Elasticsearch.
2019-12-10 18:04:58 +01:00
Dimitris Athanasiou 8891f4db88
[7.x][ML] Introduce randomize_seed setting for regression and classification (#49990) (#50023)
This adds a new `randomize_seed` for regression and classification.
When not explicitly set, the seed is randomly generated. One can
reuse the seed in a similar job in order to ensure the same docs
are picked for training.

Backport of #49990
2019-12-10 15:29:19 +02:00
Yannick Welsch a16abf921f Make elasticsearch-node tools custom metadata-aware (#48390)
The elasticsearch-node tools allow manipulating the on-disk cluster state. The tool is currently
unaware of plugins and will therefore drop custom metadata from the cluster state once the
state is written out again (as it skips over the custom metadata that it can't read). This commit
preserves unknown customs when editing on-disk metadata through the elasticsearch-node
command-line tools.
2019-12-10 09:58:11 +01:00
Jason Tedor bfb2dc1353
Enable dependent settings values to be validated (#49942)
Today settings can declare dependencies on another setting. This
declaration is implemented so that if the declared setting is not set
when the declaring setting is, settings validation fails. Yet, in some
cases we want not only that the setting is set, but that it also has a
specific value. For example, with the monitoring exporter settings, if
xpack.monitoring.exporters.my_exporter.host is set, we not only want
that xpack.monitoring.exporters.my_exporter.type is set, but that it is
also set to local. This commit extends the settings infrastructure so
that this declaration is possible. The use of this in the monitoring
exporter settings will be implemented in a follow-up.
2019-12-09 12:45:50 -05:00
Marios Trivyzas 48e7420307 SQL: [Tests] Unmute Pivot from NodeSublassTests (#49925)
The `testReplaceChildren()` has been fixed for Pivot as part
of #49693.

Reverting: #49045
(cherry picked from commit 4b9b9edbcf2041a8619b65580bbe192bf424cebc)
2019-12-09 17:20:20 +01:00
Benjamin Trent 0b6ce9683c
[ML] Use query in cardinality check (#49939) (#49984)
When checking the cardinality of a field, the query should be take into account. The user might know about some bad data in their index and want to filter down to the target_field values they care about.
2019-12-09 10:14:41 -05:00
Przemysław Witek 0965a10468
[7.x] Pass `prediction_field_type` to C++ analytics process (#49861) (#49981) 2019-12-09 14:43:01 +01:00
Benjamin Trent 049d854360
[ML][Inference] adjust so target_field always has inference result and optionally allow new top classes field in the classification config (#49923) (#49982) 2019-12-09 08:29:45 -05:00
Dimitris Athanasiou e4f838e764
[7.x][ML] Update expected mem estimate in explain API integ test (#49924) (#49979)
Work in progress in the c++ side is increasing memory estimates
a bit and this test fails. At the time of this commit the mem
estimate when there is no source query is a about 2Mb. So I
am relaxing the test to assert memory estimate is less than 1Mb
instead of 500Kb.

Backport of #49924
2019-12-09 11:52:06 +02:00
cachedout 549b103458
[7.x] APM system_user (#47668) (#49912)
* Add test for APM beats index perms

* Grant monitoring index privs to apm_system user

* Review feedback

* Fix compilation problem
2019-12-09 08:25:03 +00:00
Armin Braun ac2774c9fa
Use Cluster State to Track Repository Generation (#49729) (#49976)
Step on the road to #49060.

This commit adds the logic to keep track of a repository's generation
across repository operations. See changes to package level Javadoc for the concrete changes in the distributed state machine.

It updates the write side of new repository generations to be fully consistent via the cluster state. With this change, no `index-N` will be overwritten for the same repository ever. So eventual consistency issues around conflicting updates to the same `index-N` are not a possibility any longer.

With this change the read side will still use listing of repository contents instead of relying solely on the cluster state contents.
The logic for that will be introduced in #49060. This retains the ability to externally delete the contents of a repository and continue using it afterwards for the time being. In #49060 the use of listing to determine the repository generation will be removed in all cases (except for full-cluster restart) as the last step in this effort.
2019-12-09 09:02:57 +01:00
Yannick Welsch 01d36afa4b Randomly run CCR tests with _source disabled (#49922)
Makes sure that CCR also properly works with _source disabled.

Changes one exception in LuceneChangesSnapshot as the case of missing _recovery_source
because of a missing lease was not properly properly bubbled up to CCR (testIndexFallBehind
was failing).
2019-12-09 08:33:40 +01:00
Costin Leau 5b896c5bb5
SQL: Refactor usage of NamedExpression (#49693)
To recap, Attributes form the properties of a derived table.
Each LogicalPlan has Attributes as output since each one can be part of
a query and as such its result are sent to its consumer.
This change essentially removes the name id comparison so any changes
applied to existing expressions should work as long as the said
expressions are semantically equivalent.
This change enforces the hashCode and equals which has the side-effect
of using hashCode as identifiers for each expression.
By removing any property from an Attribute, the various components need
to look the original source for comparison which, while annoying, should
prevent a reference from getting out of sync with its source due to
optimizations.

Essentially going forward there are only 3 types of NamedExpressions:

Alias - user define (implicit or explicit) name
FieldAttribute - field backed by Elasticsearch
ReferenceAttribute - a reference to another source acting as an
Attribute. Typically the Attribute of an Alias.

* Remove the usage of NamedExpression as basis for all Expressions.
Instead, restrict their use only for named context, such as projections
by using Aliasing instead.
* Remove different types of Attributes and allow only FieldAttribute,
UnresolvedAttribute and ReferenceAttribute. To avoid issues with
rewrites, resolve the references inside the QueryContainer so the
information always stays on the source.
* Side-effect, simplify the rules as the state for InnerAggs doesn't
have to be contained anymore.
* Improve ResolveMissingRef rule to handle references to named
non-singular expression tree against the same expression used up the
tree.

#49693 backport to 7.x

(cherry picked from commit 5d095e2173bcbf120f534a6f2a584185a7879b57)
2019-12-07 11:02:14 +02:00
Stuart Tettemer 17cda5b2c0
Scripting: Groundwork for caching script results (#49895) (#49944)
In order to cache script results in the query shard cache, we need to
check if scripts are deterministic.  This change adds a default method
to the script factories, `isResultDeterministic() -> false` which is
used by the `QueryShardContext`.

Script results were never cached and that does not change here.  Future
changes will implement this method based on whether the results of the
scripts are deterministic or not and therefore cacheable.

Refs: #49466

**Backport**
2019-12-06 15:08:05 -07:00
Lee Hinman 8205cdd423
[7.x] Refactor IndexLifecycleRunner to split state modificatio… (#49936)
This commit refactors the `IndexLifecycleRunner` to split out and
consolidate the number of methods that change state from within ILM. It
adds a new class `IndexLifecycleTransition` that contains a number of
static methods used to modify ILM's state. These methods all return new
cluster states rather than making changes themselves (they can be
thought of as helpers for modifying ILM state).

Rather than having multiple ways to move an index to a particular step
(like `moveClusterStateToStep`, `moveClusterStateToNextStep`,
`moveClusterStateToPreviouslyFailedStep`, etc (there are others)) this
now consolidates those into three with (hopefully) useful names:

- `moveClusterStateToStep`
- `moveClusterStateToErrorStep`
- `moveClusterStateToPreviouslyFailedStep`

In the move, I was also able to consolidate duplicate or redundant
arguments to these functions. Prior to this commit there were many calls
that provided duplicate information (both `IndexMetaData` and
`LifecycleExecutionState` for example) where the duplicate argument
could be derived from a previous argument with no problems.

With this split, `IndexLifecycleRunner` now contains the methods used to
actually run steps as well as the methods that kick off cluster state
updates for state transitions. `IndexLifecycleTransition` contains only
the helpers for constructing new states from given scenarios.

This also adds Javadocs to all methods in both `IndexLifecycleRunner`
and `IndexLifecycleTransition` (this accounts for almost all of the
increase in code lines for this commit). It also makes all methods be as
restrictive in visibility, to limit the scope of where they are used.

This refactoring is part of work towards capturing actions and
transitions that ILM makes, by consolidating and simplifying the places
we make state changes, it will make adding operation auditing easier.
2019-12-06 12:55:16 -07:00
Jake Landis 1c5a139968
Update jackson-databind to 2.8.11.4 (#49347) (#49937) 2019-12-06 13:39:33 -06:00
Przemysław Witek e60837aa3b
[7.x] Log whole analytics stats when the state assertion fails (#49906) (#49911) 2019-12-06 14:31:17 +01:00
Zachary Tong fec882a457 Decouple pipeline reductions from final agg reduction (#45796)
Historically only two things happened in the final reduction:
empty buckets were filled, and pipeline aggs were reduced (since it
was the final reduction, this was safe).  Usage of the final reduction
is growing however.  Auto-date-histo might need to perform
many reductions on final-reduce to merge down buckets, CCS
may need to side-step the final reduction if sending to a
different cluster, etc

Having pipelines generate their output in the final reduce was
convenient, but is becoming increasingly difficult to manage
as the rest of the agg framework advances.

This commit decouples pipeline aggs from the final reduction by
introducing a new "top level" reduce, which should be called
at the beginning of the reduce cycle (e.g. from the SearchPhaseController).
This will only reduce pipeline aggs on the final reduce after
the non-pipeline agg tree has been fully reduced.

By separating pipeline reduction into their own set of methods,
aggregations are free to use the final reduction for whatever
purpose without worrying about generating pipeline results
which are non-reducible
2019-12-05 16:11:54 -05:00
Ignacio Vera 44e94555ee
Add reusable HistogramValue object (#49799) (#49823)
Adds a reusable implementation of HistogramValue so we do not create
an object per document.
2019-12-04 11:51:53 +01:00
Armin Braun 996cddd98b
Stop Copying Every Http Request in Message Handler (#44564) (#49809)
* Copying the request is not necessary here. We can simply release it once the response has been generated and a lot of `Unpooled` allocations that way
* Relates #32228
   * I think the issue that preventet that PR  that PR from being merged was solved by #39634 that moved the bulk index marker search to ByteBuf bulk access so the composite buffer shouldn't require many additional bounds checks  (I'd argue the bounds checks we add, we save when copying the composite buffer)
* I couldn't neccessarily reproduce much of a speedup from this change, but I could reproduce a very measureable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
2019-12-04 08:41:42 +01:00
Hendrik Muhs c33be29dc7 [Transform] automatic deletion of old checkpoints (#49496)
add automatic deletion of old checkpoints based on count and time
2019-12-04 07:55:57 +01:00
Hendrik Muhs d5eb9379c9 remove flaky test: might fail due to async execution 2019-12-03 18:28:41 +01:00
Hendrik Muhs 7aae212287
[Transform] Fix possible audit logging disappearance after rolling upgrade (#49731) (#49767)
ensure audit index template is available during a rolling upgrade before a
transform task can write to it.

fixes #49730
2019-12-03 18:05:06 +01:00
Przemysław Witek a3f88595d7
A few cleanups in evaluation tests (#49791) (#49794) 2019-12-03 15:48:39 +01:00
Yannick Welsch fbb92f527a Replicate write actions before fsyncing them (#49746)
This commit fixes a number of issues with data replication:

- Local and global checkpoints are not updated after the new operations have been fsynced, but
might capture a state before the fsync. The reason why this probably went undetected for so
long is that AsyncIOProcessor is synchronous if you index one item at a time, and hence working
as intended unless you have a high enough level of concurrent indexing. As we rely in other
places on the assumption that we have an up-to-date local checkpoint in case of synchronous
translog durability, there's a risk for the local and global checkpoints not to be up-to-date after
replication completes, and that this won't be corrected by the periodic global checkpoint sync.
- AsyncIOProcessor also has another "bad" side effect here: if you index one bulk at a time, the
bulk is always first fsynced on the primary before being sent to the replica. Further, if one thread
is tasked by AsyncIOProcessor to drain the processing queue and fsync, other threads can
easily pile more bulk requests on top of that thread. Things are not very fair here, and the thread
might continue doing a lot more fsyncs before returning (as the other threads pile more and
more on top), which blocks it from returning as a replication request (e.g. if this thread is on the
primary, it blocks the replication requests to the replicas from going out, and delaying
checkpoint advancement).

This commit fixes all these issues, and also simplifies the code that coordinates all the after
write actions.
2019-12-03 12:22:46 +01:00
Przemysław Witek 1d8e3d69d7
Make only a part of `stop()` method a critical section. (#49756) (#49788) 2019-12-03 09:54:16 +01:00
Andrei Stefan e2982b2110 SQL: handle NULL arithmetic operations with INTERVALs (#49633)
(cherry picked from commit ce727615c08cf5ae422feb77f69ea24fb53cd9d1)
2019-12-02 17:31:05 +02:00
Andrei Stefan 34311dd818 Fix NULL handling for FLOOR and CEIL math functions (#49644)
(cherry picked from commit 034f4cf7b4bd062c157d40f1e7a8760de31de568)
2019-12-02 17:31:04 +02:00
Andrei Stefan 4dc83a7db9 Fix Locate function optional parameter handling (#49666)
(cherry picked from commit dd3aeb8f5497bec4b050beaaf9d628a179b5454f)
2019-12-02 17:31:03 +02:00
Marios Trivyzas 901a8d1dcc
SQL: Fix issues with WEEK/ISO_WEEK/DATEDIFF (#49405)
Some extended testing with MS-SQL server and H2 (which agree on
results) revealed bugs in the implementation of WEEK related extraction
and diff functions.

Non-iso WEEK seems to be broken since #48209 because
of the replacement of Calendar and the change in the ISO rules.

ISO_WEEK failed for some edge cases around the January 1st.

DATE_DIFF was previously based on non-iso WEEK extraction which seems
not to be the case.

Fixes: #49376

(cherry picked from commit 54fe7f57289c46bb0905b1418f51a00e8c581560)
2019-11-29 17:07:30 +01:00
Ignacio Vera d9162c1243
Replace usages of XPackPlugin with the LocalStateCompositeXPackPlugin (#49714) (#49722) 2019-11-29 15:47:23 +01:00
Yannick Welsch c2d316a22f Remove obsolete resolving logic from TRA (#49685)
This stems from a time where index requests were directly forwarded to
TransportReplicationAction. Nowadays they are wrapped in a BulkShardRequest, and this logic is
obsolete.

In contrast to prior PR (#49647), this PR also fixes (see b3697cc) a situation where the previous
index expression logic had an interesting side effect. For bulk requests (which had resolveIndex
= false), the reroute phase was waiting for the index to appear in case where it was not present,
and for all other replication requests (resolveIndex = true) it would right away throw an
IndexNotFoundException while resolving the name and exit. With #49647, every replication
request was now waiting for the index to appear, which was problematic when the given index
had just been deleted (e.g. deleting a follower index while it's still receiving requests from the
leader, where these requests would now wait up to a minute for the index to appear). This PR
now adds b3697cc on top of that prior PR to make sure to reestablish some of the prior behavior
where the reroute phase waits for the bulk request for the index to appear. That logic was in
place to ensure that when an index was created and not all nodes had learned about it yet, that
the bulk would not fail somewhere in the reroute phase. This is now only restricted to the
situation where the current node has an older cluster state than the one that coordinated the
bulk request (which checks that the index is present). This also means that when an index is
deleted, we will no longer unnecessarily wait up to the timeout for the index o appear, and
instead fail the request.

Closes #20279
2019-11-29 15:24:07 +01:00
Dimitris Athanasiou 4edb2e7bb6
[7.x][ML] Add optional source filtering during data frame reindexing (#49690) (#49718)
This adds a `_source` setting under the `source` setting of a data
frame analytics config. The new `_source` is reusing the structure
of a `FetchSourceContext` like `analyzed_fields` does. Specifying
includes and excludes for source allows selecting which fields
will get reindexed and will be available in the destination index.

Closes #49531

Backport of #49690
2019-11-29 16:10:44 +02:00
Armin Braun 813b49adb4
Make BlobStoreRepository Aware of ClusterState (#49639) (#49711)
* Make BlobStoreRepository Aware of ClusterState (#49639)

This is a preliminary to #49060.

It does not introduce any substantial behavior change to how the blob store repository
operates. What it does is to add all the infrastructure changes around passing the cluster service to the blob store, associated test changes and a best effort approach to tracking the latest repository generation on all nodes from cluster state updates. This brings a slight improvement to the consistency
by which non-master nodes (or master directly after a failover) will be able to determine the latest repository generation. It does not however do any tricky checks for the situation after a repository operation
(create, delete or cleanup) that could theoretically be used to get even greater accuracy to keep this change simple.
This change does not in any way alter the behavior of the blobstore repository other than adding a better "guess" for the value of the latest repo generation and is mainly intended to isolate the actual logical change to how the
repository operates in #49060
2019-11-29 14:57:47 +01:00
Ioannis Kakavas a59b7e07f1
Use PEM files instead of a JKS for key material (#49625) (#49701)
So that the tests can also run in a FIPS 140 JVM, where using a
JKS keystore is not allowed.

Resolves: #49261
2019-11-29 09:43:55 +02:00
Tim Vernum e6f530c167
Improved diagnostics for TLS trust failures (#49669)
- Improves HTTP client hostname verification failure messages
- Adds "DiagnosticTrustManager" which logs certificate information
  when trust cannot be established (hostname failure, CA path failure,
  etc)

These diagnostic messages are designed so that many common TLS
problems can be diagnosed based solely (or primarily) on the
elasticsearch logs.

These diagnostics can be disabled by setting

     xpack.security.ssl.diagnose.trust: false

Backport of: #48911
2019-11-29 15:01:20 +11:00
Tim Vernum 31f13e839c
Correct the documentation for create_doc privilege (#49354)
The documentation was added in #47584 but those docs did not reflect the up-to-date behavior of the feature.

Backport of: #47784
2019-11-29 12:59:16 +11:00
Ioannis Kakavas ba0c848027
[7.x] Update opensaml dependency (#44972) (#49512)
Add a mirror of the maven repository of the shibboleth project
and upgrade opensaml and related dependencies to the latest
version available version

Resolves: #44947
2019-11-29 00:17:16 +02:00
Przemysław Witek 1425e30b1e
[7.x] Remove ClassInfo interface and BinaryClassInfo class. (#49649) (#49681) 2019-11-28 21:46:46 +01:00
Jim Ferenczi 496bb9e2ee
Add a listener to track the progress of a search request locally (#49471) (#49691)
This commit adds a function in NodeClient that allows to track the progress
of a search request locally. Progress is tracked through a SearchProgressListener
that exposes query and fetch responses as well as partial and final reduces.
This new method can be used by modules/plugins inside a node in order to track the
progress of a local search request.

Relates #49091
2019-11-28 18:23:09 +01:00
Mayya Sharipova 2dafecc398
Upgrade lucene to 8.4.0-snapshot-e648d601efb (#49641) 2019-11-28 11:59:58 -05:00
Przemyslaw Gomulka e528b41cf2
Enable LicenceServiceTests for all jdks (#49440) backport(#49682)
This test no longer relies on jdk version, so the assume should be removed
relates #48209
2019-11-28 15:26:54 +01:00
Ignacio Vera 326fe7566e
New Histogram field mapper that supports percentiles aggregations. (#48580) (#49683)
This commit adds  a new histogram field mapper that consists in a pre-aggregated format of numerical data to be used in percentiles aggregations.
2019-11-28 15:06:26 +01:00
Yannick Welsch 04e9cbd6eb Revert "Remove obsolete resolving logic from TRA (#49647)"
This reverts commit 0827ea2175.
2019-11-28 13:12:07 +01:00
Yannick Welsch 0827ea2175 Remove obsolete resolving logic from TRA (#49647)
This stems from a time where index requests were directly forwarded to
TransportReplicationAction. Nowadays they are wrapped in a BulkShardRequest, and this logic is
obsolete.

Closes #20279
2019-11-28 12:11:27 +01:00
Jim Ferenczi d6445fae4b Add a cluster setting to disallow loading fielddata on _id field (#49166)
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.

Closes #43599
2019-11-28 09:35:28 +01:00
Martijn van Groningen 09c4269097
Add templating support to enrich processor (#49093)
Adds support for templating to `field` and `target_field` options.
2019-11-27 08:53:11 +01:00
Tim Vernum 901c64ebbf
Add Debug/Trace logging for authentication (#49619)
Authentication has grown more complex with the addition of new realm
types and authentication methods. When user authentication does not
behave as expected it can be difficult to determine where and why it
failed.

This commit adds DEBUG and TRACE logging at key points in the
authentication flow so that it is possible to gain addition insight
into the operation of the system.

Backport of: #49575
2019-11-27 16:39:07 +11:00
Tim Vernum e9ad1a7fcd
Fix iterate-from-1 bug in smart realm order (#49614)
The AuthenticationService has a feature to "smart order" the realm
chain so that whicherver realm was the last one to successfully
authenticate a given user will be tried first when that user tries to
authenticate again.

There was a bug where the building of this realm order would
incorrectly drop the first realm from the default chain unless that
realm was the "last successful" realm.

In most cases this didn't cause problems because the first realm is
the reserved realm and so it is unusual for a user that authenticated
against a different realm to later need to authenticate against the
resevered realm.

This commit fixes that bug and adds relevant asserts and tests.

Backport of: #49473
2019-11-27 13:46:52 +11:00
Armin Braun 3862400270
Remove Redundant EsBlobStoreTestCase (#49603) (#49605)
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase` which is now automatically resolved by moving the tests over
since there is a HDFS implementation for the container tests.
2019-11-26 20:57:19 +01:00
Marios Trivyzas b0cb7bf229 SQL: Fix issue with GROUP BY YEAR() (#49559)
Grouping By YEAR() is translated to a histogram aggregation, but
previously if there was a scalar function invloved (e.g.:
`YEAR(date + INTERVAL 2 YEARS)`), there was no proper script created
and the histogram was applied on a field with name: `date + INTERVAL 2 YEARS`
which doesn't make sense, and resulted in null result.

Check the underlying field of YEAR() and if it's a function call
`asScript()` to properly get the painless script on which the histogram
is applied.

Fixes: #49386
(cherry picked from commit 93c37abc943d00d3a14ba08435d118a6d48874c7)
2019-11-26 14:11:11 +01:00
Marios Trivyzas 3c69d4d0bd
SQL: Add TRUNC alias for TRUNCATE (#49571)
Add TRUNC as alias to already implemented TRUNCATE
numeric function which is the flavour supported by
Oracle and PostgreSQL.

Relates to: #41195

(cherry picked from commit f2aa7f0779bc5cce40cc0c1f5e5cf1a5bb7d84f0)
2019-11-26 12:32:54 +01:00
j-bean 048b9dbb14 Fix expired job results deletion audit message (#49560)
The PR fixes #49549
2019-11-26 10:48:12 +00:00
Dimitris Athanasiou c23a2187da
[7.x][ML] Only report complete writing_results progress after completion (#49551) (#49577)
We depend on the number of data frame rows in order to report progress
for the writing of results, the last phase of a job run. However, results
include other objects than just the data frame rows (e.g, progress, inference model, etc.).

The problem this commit fixes is that if we receive the last data frame row results
we'll report that progress is complete even though we still have more results to process
potentially. If the job gets stopped for any reason at this point, we will not be able
to restart the job properly as we'll think that the job was completed.

This commit addresses this by limiting the max progress we can report for the
writing_results phase before the results processor completes to 98.
At the end, when the process is done we set the progress to 100.

The commit also improves failure capturing and reporting in the results processor.

Backport of #49551
2019-11-26 12:20:37 +02:00
Marios Trivyzas 5d306ae3b2
SQL: Fix issue with CASE/IIF pre-calculating results (#49553)
Previously, CaseProcessor was pre-calculating (called `process()`)
on all the building elements of a CASE/IIF expression, not only the
conditions involved but also the results, as well as the final else result.
In case one of those results had an erroneous calculation
(e.g.: division by zero) this was executed and resulted in
an Exception to be thrown, even if this result was not used because of
the condition guarding it. e.g.:

```
SELECT CASE myField1 = 0 THEN NULL ELSE myField2 / myField1 END
FROM test;
```

Fixes: #49388
(cherry picked from commit dbd169afc98686cae1bc72024fad0ca32b272efd)
2019-11-26 10:48:07 +01:00
Tim Brooks 416178c7c8
Enable simple remote connection strategy (#49561)
This commit back ports three commits related to enabling the simple
connection strategy.

Allow simple connection strategy to be configured (#49066)

Currently the simple connection strategy only exists in the code. It
cannot be configured. This commit moves in the direction of allowing it
to be configured. It introduces settings for the addresses and socket
count. Additionally it introduces new settings for the sniff strategy
so that the more generic number of connections and seed node settings
can be deprecated.

The simple settings are not yet registered as the registration is
dependent on follow-up work to validate the settings.

Ensure at least 1 seed configured in remote test (#49389)

This fixes #49384. Currently when we select a random subset of seed
nodes from a list, it is possible for 0 seeds to be selected. This test
depends on at least 1 seed being selected.

Add the simple strategy to cluster settings (#49414)

This is related to #49067. This commit adds the simple connection
strategy settings and strategy mode setting to the cluster settings
registry. With these changes, the simple connection mode can be used.
Additionally, it adds validation to ensure that settings cannot be
misconfigured.
2019-11-25 16:53:07 -07:00
Benjamin Trent 688c78c589
[ML] Stop timing stats failure propagation (#49495) (#49501) 2019-11-25 10:09:30 -05:00
David Roberts 62811c2272 [ML] Add default categorization analyzer definition to ML info (#49545)
The categorization job wizard in the ML UI will use this
information when showing the effect of the chosen categorization
analyzer on a sample of input.
2019-11-25 13:39:16 +00:00
Dimitris Athanasiou aca38f6882
[7.x][ML] DFA jobs should accept excluding an unsupported field (#49535) (#49544)
Before this change excluding an unsupported field resulted in
an error message that explained the excluded field could not be
detected as if it doesn't exist. This error message is confusing.

This commit commit changes this so that there is no error in this
scenario. When excluding a field that does exist but has been
automatically been excluded from the analysis there is no harm
(unlike excluding a missing field which could be a typo).

Backport of #49535
2019-11-25 15:13:00 +02:00
Armin Braun af0f97d50a
Fix SLMSnapshotBlockingIntegTests.testSnapshotInProgress (#49533) (#49542)
This test must check for state `SUCCESS` as well. `SUCESS` in
`SnapshotsInProgress` means "all data nodes finished snapshotting sucessfully but master must still finalize the snapshot in the repo".
`SUCESS` does not mean that the snapshot is actually fully finished in this object.

You can easily reporduce the scenario in #49303 that has an in-progress snapshot in `SUCCESS` state
by waiting 20s before running the busy assert loop on the snapshot status so that all steps but the blocked
finalization can finish.

Closes #49303
2019-11-25 13:31:45 +01:00
Dimitris Athanasiou c149c64dc4
[7.x][ML] Apply source query on data frame analytics memory estimation (#49517) (#49532)
Closes #49454

Backport of #49517
2019-11-25 12:51:57 +02:00
Hendrik Muhs 5256756879 [Transform] add debug log for configuration index (#49484)
add debug log for transform creation and disallow partial results for retrieval
2019-11-25 09:49:17 +01:00
debadair 2ec047db04 [DOCS] Rename auditing topic. Closes #49012 (#49013)
* [DOCS] Rename auditing topic. Closes #49012

* Fixed file name, fixed settings link.

* Add link to settings
2019-11-22 14:16:58 -08:00
Dimitris Athanasiou 8eaee7cbdc
[7.x][ML] Explain data frame analytics API (#49455) (#49504)
This commit replaces the _estimate_memory_usage API with
a new API, the _explain API.

The API consolidates information that is useful before
creating a data frame analytics job.

It includes:

- memory estimation
- field selection explanation

Memory estimation is moved here from what was previously
calculated in the _estimate_memory_usage API.

Field selection is a new feature that explains to the user
whether each available field was selected to be included or
not in the analysis. In the case it was not included, it also
explains the reason why.

Backport of #49455
2019-11-22 22:06:10 +02:00
Jason Tedor 71bcfbf1e3
Replace required pipeline with final pipeline (#49470)
This commit enhances the required pipeline functionality by changing it
so that default/request pipelines can also be executed, but the required
pipeline is always executed last. This gives users the flexibility to
execute their own indexing pipelines, but also ensure that any required
pipelines are also executed. Since such pipelines are executed last, we
change the name of required pipelines to final pipelines.
2019-11-22 14:37:36 -05:00
Marios Trivyzas 0c4491964b SQL: Fix issue with folding of CASE/IIF (#49449)
Add extra checks to prevent ConstantFolding rule to try to fold
the CASE/IIF functions early before the SimplifyCase rule gets applied.

Fixes: #49387

(cherry picked from commit f35c9725350e35985d8dd3001870084e1784a5ca)
2019-11-22 18:29:49 +01:00
Benjamin Trent 276b6c67f4
[ML][Inference] Fixing pre-processor value handling and size estimate (#49270) (#49489)
* [ML][Inference] Fixing pre-processor value handling and size estimate

* fixing npe
2019-11-22 08:14:33 -05:00
Jim Ferenczi ed4eecc00e Pre-sort shards based on the max/min value of the primary sort field (#49092)
This change automatically pre-sort search shards on search requests that use a primary sort based on the value of a field. When possible, the can_match phase will extract the min/max (depending on the provided sort order) values of each shard and use it to pre-sort the shards prior to running the subsequent phases. This feature can be useful to ensure that shards that contain recent data are executed first so that intermediate merge have more chance to contain contiguous data (think of date_histogram for instance) but it could also be used in a follow up to early terminate sorted top-hits queries that don't require the total hit count. The latter could significantly speed up the retrieval of the most/least
recent documents from time-based indices.

Relates #49091
2019-11-22 11:02:12 +01:00
Hendrik Muhs 1fbb248cb7 reenable warning checks in pivot tests (#49436) 2019-11-22 08:50:10 +01:00
Tim Vernum 2e5f2dd1e1
Deprecate misconfigured SSL server config (#49280)
This commit adds a deprecation warning when starting
a node where either of the server contexts
(xpack.security.transport.ssl and xpack.security.http.ssl)
meet either of these conditions:

1. The server lacks a certificate/key pair (i.e. neither
   ssl.keystore.path not ssl.certificate are configured)
2. The server has some ssl configuration, but ssl.enabled is not
   specified. This new validation does not care whether ssl.enabled is
   true or false (though other validation might), it simply makes it
   an error to configure server SSL without being explicit about
   whether to enable that configuration.

Backport of: #45892
2019-11-22 12:14:55 +11:00
Benjamin Trent a7477ad7c3
[7.x] [ML][Inference] compressing model definition and lazy parsing (#49269) (#49446)
* [ML][Inference] compressing model definition and lazy parsing (#49269)

* [ML][Inference] compressing model definition and lazy parsing

* addressing PR comments

* adding commons io

* implementing simplified bounded stream

* adjusting for type inclusion
2019-11-21 15:32:32 -05:00
Benjamin Trent d9835f7fb4
[ML] Fix r_squared eval when variance is 0 (#49439) (#49445) 2019-11-21 11:22:16 -05:00
Benjamin Trent d41b2e3f38
[ML][Inference] allowing per-model licensing (#49398) (#49435)
* [ML][Inference] allowing per-model licensing

* changing to internal action + removing pre-mature opt
2019-11-21 09:46:34 -05:00
Przemysław Witek c7ac2011eb
[7.x] Implement accuracy metric for multiclass classification (#47772) (#49430) 2019-11-21 15:01:18 +01:00
Martijn van Groningen d59ea64ccd
Monitoring should wait with collecting data when cluster service is started. (#49426)
Backport of #48277

Otherwise integration tests may fail if the monitoring interval is low:
```
[2019-10-21T09:57:25,527][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [integTest-0] fatal error in thread [elasticsearch[integTest-0][generic][T#4]], exiting
java.lang.AssertionError: initial cluster state not set yet
        at org.elasticsearch.cluster.service.ClusterApplierService.state(ClusterApplierService.java:208) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at org.elasticsearch.cluster.service.ClusterService.state(ClusterService.java:125) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) ~[?:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
        at java.lang.Thread.run(Thread.java:835) [?:?]
```

I ran into this when lowering the monitoring interval when investigating
enrich monitoring test: #48258
2019-11-21 14:22:41 +01:00
Hendrik Muhs c3e4405ddf
[7.x][Transform] Transform fix force stop race condition (#49249) (#49420)
fix force stopping transform if indexer state hasn't been written and/or is set to STOPPED. In certain situations the transform could not be stopped, which means the task could not be removed. Introduces improved abstraction in order to better test state handling in future.
2019-11-21 13:52:14 +01:00
Andrei Dan 010c3de47e
Slm set operation mode to RUNNING on first run (#49236) (#49425)
* SLM set the operation mode to RUNNING on first run

Set the SLM operation mode to RUNNING when setting the first SLM lifecycle
policy. Historically, SLM was not decoupled from ILM but now they are
independent components. Setting the SLM operation mode to what the ILM running
mode was when we set the first SLM lifecycle policy was a remain from those
times.

* SLM update package info

* SLM suppress unusued warning

* SLM use logger for the correct class

* SLM Add integration test for operation mode

* Use ESSingleNodeTestCase instead of ESIntegTestCase

(cherry picked from commit 4ad3d93f89d03bf9a25685a990d1a439f33ce0e6)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-11-21 11:41:32 +00:00
István Zoltán Szabó 5b10fd301e [DOCS] Fixes endpoint schema in PUT app privileges API docs. (#49390) 2019-11-21 09:52:44 +01:00
Lisa Cawley 61c54fd617 [DOCS] Qualifies Watcher transforms (#47482) 2019-11-20 16:44:18 -08:00
Nhat Nguyen fec22130c2 Improve error message when pausing index (#48915)
Throw an appropriate error message when the follower index is not found
or is a regular index.
2019-11-20 15:58:44 -05:00
Hendrik Muhs 06c2689802
rename data frame tests to transform tests (#49361)
rename files and tests in rolling upgrade tests to transform
2019-11-20 18:51:11 +01:00
Bogdan Pintea 8c2ab8bb72 SQL:Docs: add the PIVOT clause to SELECT section (#49129)
The PR adds the documentation on the PIVOT clause.

(cherry picked from commit a55b36065e6496c44b6e3191296931d477a8e5f5)
2019-11-20 18:21:06 +01:00
David Roberts 20558cf61c [ML] Fix simultaneous stop and force stop datafeed (#49367)
If a datafeed is stopped normally and force stopped at the same
time then it is possible that the force stop removes the
persistent task while the normal stop is performing actions.
Currently this causes the normal stop to error, but since
stopping a stopped datafeed is not an error this doesn't make
sense. Instead the force stop should just take precedence.

This is a followup to #49191 and should really have been
included in the changes in that PR.
2019-11-20 12:52:47 +00:00
Mayya Sharipova e3da60c23d Increase the number of vector dims to 2048 (#46895) 2019-11-20 07:47:33 -05:00
Przemysław Witek 9c0ec7ce23
[7.x] Make AnalyticsProcessManager class more robust (#49282) (#49356) 2019-11-20 10:08:16 +01:00
Dimitris Athanasiou 4d6e037e90
[7.x][ML] Extract creation of DFA field extractor into a factory (#49315) (#49329)
This commit moves the async calls required to retrieve the components
that make up `ExtractedFieldsExtractor` out of `DataFrameDataExtractorFactory`
and into a dedicated `ExtractorFieldsExtractorFactory` class.

A few more refactorings are performed:

  - The detector no longer needs the results field. Instead, it knows
  whether to use it or not based on whether the task is restarting.
  - We pass more accurately whether the task is restarting or not.
  - The validation of whether fields that have a cardinality limit
  are valid is now performed in the detector after retrieving the
  respective cardinalities.

Backport of #49315
2019-11-20 10:02:42 +02:00
Lisa Cawley 2b9fb7ebe2 [DOCS] Merges security overview pages (#49342) 2019-11-19 16:19:02 -08:00
Przemysław Witek 42bb8ae525
[7.x] Extract indexData method out of RegressionIT tests (#49306) (#49313) 2019-11-19 22:47:12 +01:00
Mark Tozzi 17358b5af7
(refactor) Extract Empty/Script/Missing ValuesSource behavior to an interface (#48320) (#49330)
This is a pure code rearrangement refactor.  Logic for what specific ValuesSource instance to use for a given type (e.g. script or field) moved out of ValuesSourceConfig and into CoreValuesSourceType (previously just ValueSourceType; we extract an interface for future extensibility).  ValueSourceConfig still selects which case to use, and then the ValuesSourceType instance knows how to construct the ValuesSource for that case.
2019-11-19 16:44:29 -05:00
Lisa Cawley 75f1f612c2 [DOCS] Merges duplicate pages for Active Directory realms (#49205) 2019-11-19 13:18:01 -08:00
Jay Modi eed4cd25eb
ThreadPool and ThreadContext are not closeable (#43249) (#49273)
This commit changes the ThreadContext to just use a regular ThreadLocal
over the lucene CloseableThreadLocal. The CloseableThreadLocal solves
issues with ThreadLocals that are no longer needed during runtime but
in the case of the ThreadContext, we need it for the runtime of the
node and it is typically not closed until the node closes, so we miss
out on the benefits that this class provides.

Additionally by removing the close logic, we simplify code in other
places that deal with exceptions and tracking to see if it happens when
the node is closing.

Closes #42577
2019-11-19 13:15:16 -07:00
Lisa Cawley c4c8a7a43c [DOCS] Merges duplicate pages for PKI realms (#49206) 2019-11-19 10:51:09 -08:00
Lisa Cawley 2f5acae4a9 [DOCS] Groups pages related to encrypting communications (#49324) 2019-11-19 10:10:39 -08:00
Lisa Cawley 62bbe419d3 [DOCS] Removes Beats security page (#49276) 2019-11-19 09:15:30 -08:00
Andrei Dan 19780e20ba
Handle failure to retrieve ILM policy step better (#49193) (#49316)
This commit wraps the calls to retrieve the current step in a try/catch
so that the exception does not bubble up. Instead, step info is added
containing the exception to the existing step.

Semi-related to #49128

(cherry picked from commit 72530f8a7f40ae1fca3704effb38cf92daf29057)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-11-19 17:14:46 +00:00
Armin Braun 0acba44a2e
Make Repository.getRepositoryData an Async API (#49299) (#49312)
This API call in most implementations is fairly IO heavy and slow
so it is more natural to be async in the first place.
Concretely though, this change is a prerequisite of #49060 since
determining the repository generation from the cluster state
introduces situations where this call would have to wait for other
operations to finish. Doing so in a blocking manner would break
`SnapshotResiliencyTests` and waste a thread.
Also, this sets up the possibility to in the future make use of async IO
where provided by the underlying Repository implementation.

In a follow-up `SnapshotsService#getRepositoryData` will be made async
as well (did not do it here, since it's another huge change to do so).
Note: This change for now does not alter the threading behaviour in any way (since `Repository#getRepositoryData` isn't forking) and is purely mechanical.
2019-11-19 16:49:12 +01:00
Marios Trivyzas fd1bb4a33a SQL: Fix issue with mins & hours for DATEDIFF (#49252)
Previously, DATEDIFF for minutes and hours was doing a
rounding calculation using all the time fields (secs, msecs/micros/nanos).
Instead it should first truncate the 2 dates to the respective field (mins or hours)
zeroing out all the more detailed time fields and then make the subtraction.

(cherry picked from commit 124cd18e20429e19d52fd8dc383827ea5132d428)
2019-11-19 14:25:28 +01:00