48944 Commits

Author SHA1 Message Date
Martijn van Groningen 90850f4ea0
Backport: Introduce on_failure_pipeline ingest metadata inside on_failure block (#49596)
Backport of #49076

In case an exception occurs inside a pipeline processor,
the pipeline stack is kept around as header in the exception.
Then in the on_failure processor the id of the pipeline in which the
exception occurred is made accessible via the `on_failure_pipeline`
ingest metadata.

Closes #44920
2019-11-27 07:52:08 +01:00
Tim Vernum 901c64ebbf
Add Debug/Trace logging for authentication (#49619)
Authentication has grown more complex with the addition of new realm
types and authentication methods. When user authentication does not
behave as expected it can be difficult to determine where and why it
failed.

This commit adds DEBUG and TRACE logging at key points in the
authentication flow so that it is possible to gain additional insight
into the operation of the system.

Backport of: #49575
2019-11-27 16:39:07 +11:00
Tim Vernum e9ad1a7fcd
Fix iterate-from-1 bug in smart realm order (#49614)
The AuthenticationService has a feature to "smart order" the realm
chain so that whichever realm was the last one to successfully
authenticate a given user will be tried first when that user tries to
authenticate again.

There was a bug where the building of this realm order would
incorrectly drop the first realm from the default chain unless that
realm was the "last successful" realm.

In most cases this didn't cause problems because the first realm is
the reserved realm and so it is unusual for a user that authenticated
against a different realm to later need to authenticate against the
reserved realm.

This commit fixes that bug and adds relevant asserts and tests.
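
The ordering described above can be sketched as follows. This is a minimal, language-neutral illustration and not the actual `AuthenticationService` code; the function name `smart_realm_order` and the realm names are hypothetical.

```python
def smart_realm_order(default_chain, last_successful=None):
    """Return the realm chain with the last successful realm moved to the front."""
    if last_successful is None or last_successful not in default_chain:
        return list(default_chain)
    ordered = [last_successful]
    # Iterate over the whole chain, starting at index 0. Starting at index 1
    # would silently drop the first realm whenever it is not the last
    # successful one, which is the kind of bug the commit describes.
    for realm in default_chain:
        if realm != last_successful:
            ordered.append(realm)
    return ordered
```

For example, `smart_realm_order(["reserved", "file", "ldap"], "ldap")` yields `["ldap", "reserved", "file"]`, so the first realm of the default chain is still tried.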

Backport of: #49473
2019-11-27 13:46:52 +11:00
Tim Brooks e965a6f2df
Fix remote settings upgrade test (#49609)
This commit fixes #49587. Due to a settings change, the broken test was
asserting on the incorrect setting. This commit fixes that issue and
adds additional assertions to ensure that all settings are working
properly.
2019-11-26 16:37:27 -07:00
Armin Braun 996cdebfb4
Make BlobStoreRepository#writeIndexGen API Async (#49584) (#49610)
Preliminary to shorten the diff of #49060. In #49060
we execute cluster state updates during the writing of a new
index gen and thus it must be an async API.
2019-11-26 22:37:31 +01:00
Armin Braun 3862400270
Remove Redundant EsBlobStoreTestCase (#49603) (#49605)
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase` which is now automatically resolved by moving the tests over
since there is a HDFS implementation for the container tests.
2019-11-26 20:57:19 +01:00
lcawl 777431265b [DOCS] Fixes typo in ML resources 2019-11-26 10:28:59 -08:00
Alan Woodward fe2c65185e Annotated text type should extend TextFieldType (#49555)
The annotated text mapper has a field type that currently extends StringFieldType,
which means that all the positional-related query factory methods need to be copied
over from TextFieldType. In addition, MappedFieldType.intervals() hasn't been
overridden, so you can't use intervals queries with annotated text - a major drawback,
since one of the purposes of annotated text is to be able to run positional queries against
annotations.

This commit changes the annotated text field type to extend TextFieldType instead,
adding tests to ensure that position queries work correctly.

Closes #49289
2019-11-26 16:52:21 +00:00
Benjamin Trent b5d7c939f8
[7.x] [ML][Inference][HLRC] add GET _stats (#49562) (#49600)
* [ML][Inference][HLRC] add GET _stats (#49562)

* fixing for backport
2019-11-26 11:28:26 -05:00
lcawl a42003b95b [DOCS] Fixes data type formatting 2019-11-26 08:22:50 -08:00
Armin Braun 495b543e63
Improve Stability of GCS Mock API (#49592) (#49597)
Same as #49518 pretty much but for GCS.
Fixing a few more spots where input stream can get closed
without being fully drained and adding assertions to make sure
it's always drained.
Moved the no-close stream wrapper to production code utilities since
there's a number of spots in production code where it's also useful
(will reuse it there in a follow-up).
2019-11-26 16:53:51 +01:00
Benjamin Trent 26a8ca00db
[7.x] [ML][Inference][HLRC] Delete trained model API (#49567) (#49585)
* [ML][Inference][HLRC] Delete trained model API (#49567)

* fixing for backport
2019-11-26 08:27:08 -05:00
Marios Trivyzas b0cb7bf229 SQL: Fix issue with GROUP BY YEAR() (#49559)
Grouping By YEAR() is translated to a histogram aggregation, but
previously if there was a scalar function involved (e.g.:
`YEAR(date + INTERVAL 2 YEARS)`), there was no proper script created
and the histogram was applied on a field with name: `date + INTERVAL 2 YEARS`
which doesn't make sense, and resulted in null result.

Check the underlying field of YEAR() and, if it's a function, call
`asScript()` to properly get the painless script on which the histogram
is applied.

Fixes: #49386
(cherry picked from commit 93c37abc943d00d3a14ba08435d118a6d48874c7)
2019-11-26 14:11:11 +01:00
Rory Hunter cf5f013033
Return 400 when handling invalid JSON (#49558)
Backport of #49552.

Closes #49428. The code that works out an HTTP code for an exception didn't
consider the JsonParseException case, which meant that an invalid JSON request
could result in a 500 Internal Server Error. Now it returns 400 Bad Request.
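
The idea can be sketched as below, using Python's standard `json` module as a stand-in for the Java parser. The helpers `status_for` and `handle` are hypothetical names for illustration and do not mirror the real Elasticsearch classes.

```python
import json


def status_for(exc):
    """Map an exception to an HTTP status code."""
    # A parse failure is the client's fault, so report 400 rather than
    # falling through to the generic 500.
    if isinstance(exc, json.JSONDecodeError):
        return 400
    return 500


def handle(body):
    """Parse a request body and return the status code a server might send."""
    try:
        json.loads(body)
    except Exception as exc:
        return status_for(exc)
    return 200
```

With this mapping, `handle('{"a": ')` returns 400 while a well-formed body returns 200.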
2019-11-26 12:36:56 +00:00
Hendrik Muhs 41daf284f5 mute FullClusterRestartSettingsUpgradeIT 2019-11-26 13:28:35 +01:00
Marios Trivyzas 3c69d4d0bd
SQL: Add TRUNC alias for TRUNCATE (#49571)
Add TRUNC as alias to already implemented TRUNCATE
numeric function which is the flavour supported by
Oracle and PostgreSQL.

Relates to: #41195

(cherry picked from commit f2aa7f0779bc5cce40cc0c1f5e5cf1a5bb7d84f0)
2019-11-26 12:32:54 +01:00
j-bean 048b9dbb14 Fix expired job results deletion audit message (#49560)
The PR fixes #49549
2019-11-26 10:48:12 +00:00
Dimitris Athanasiou c23a2187da
[7.x][ML] Only report complete writing_results progress after completion (#49551) (#49577)
We depend on the number of data frame rows in order to report progress
for the writing of results, the last phase of a job run. However, results
include other objects than just the data frame rows (e.g., progress, inference model, etc.).

The problem this commit fixes is that if we receive the last data frame row results
we'll report that progress is complete even though we still have more results to process
potentially. If the job gets stopped for any reason at this point, we will not be able
to restart the job properly as we'll think that the job was completed.

This commit addresses this by limiting the max progress we can report for the
writing_results phase before the results processor completes to 98.
At the end, when the process is done we set the progress to 100.
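
A minimal sketch of the capping logic, assuming a tracker that is told about row results and about processor completion separately. The class and method names are hypothetical; only the values 98 and 100 come from the message above.

```python
MAX_PROGRESS_BEFORE_COMPLETION = 98  # cap named in the commit message


class WritingResultsProgress:
    """Tracks progress of the writing_results phase."""

    def __init__(self, total_rows):
        self.total_rows = total_rows
        self.rows_seen = 0
        self.percent = 0

    def on_row_results(self, row_count=1):
        self.rows_seen += row_count
        if self.total_rows > 0:
            raw = self.rows_seen * 100 // self.total_rows
        else:
            raw = 100
        # Never report completion based on rows alone: other result
        # objects may still be waiting to be processed.
        self.percent = min(raw, MAX_PROGRESS_BEFORE_COMPLETION)

    def on_processor_complete(self):
        # Only the end of the results processor marks the phase as done.
        self.percent = 100
```

If the job is stopped after the last row but before `on_processor_complete()`, the reported progress stays at 98, so the job is not mistaken for a completed one.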

The commit also improves failure capturing and reporting in the results processor.

Backport of #49551
2019-11-26 12:20:37 +02:00
Marios Trivyzas 5d306ae3b2
SQL: Fix issue with CASE/IIF pre-calculating results (#49553)
Previously, CaseProcessor was pre-calculating (called `process()`)
on all the building elements of a CASE/IIF expression, not only the
conditions involved but also the results, as well as the final else result.
In case one of those results had an erroneous calculation
(e.g.: division by zero) this was executed and resulted in
an Exception to be thrown, even if this result was not used because of
the condition guarding it. e.g.:

```
SELECT CASE WHEN myField1 = 0 THEN NULL ELSE myField2 / myField1 END
FROM test;
```
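
The difference between eager and lazy evaluation can be sketched as follows. This is an illustrative model only, not the `CaseProcessor` implementation; `evaluate_case` and `safe_ratio` are hypothetical names, and each condition and result is modelled as a function of the row so that nothing is computed until it is needed.

```python
def evaluate_case(branches, else_result, row):
    """Evaluate CASE lazily: a result is computed only if its condition holds."""
    for condition, result in branches:
        if condition(row):
            return result(row)
    return else_result(row)


def safe_ratio(row):
    # Models: CASE WHEN myField1 = 0 THEN NULL ELSE myField2 / myField1 END
    return evaluate_case(
        branches=[(lambda r: r["myField1"] == 0, lambda r: None)],
        else_result=lambda r: r["myField2"] / r["myField1"],
        row=row,
    )
```

Because the division is wrapped in a function that runs only when the guard fails, a row with `myField1 = 0` returns NULL (`None`) instead of raising a division-by-zero error.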

Fixes: #49388
(cherry picked from commit dbd169afc98686cae1bc72024fad0ca32b272efd)
2019-11-26 10:48:07 +01:00
Christoph Büscher a4208e44f7 [Docs] Correct `max_doc_freq` default value (#49536)
The default is set to Integer.MAX_VALUE but is reported to be `0` in the docs.
With the current implementation a value of 0 would mean all terms are filtered
out, which is the opposite of "unbounded".
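
A small sketch of why `0` cannot mean "unbounded", assuming the filter keeps a term only when its document frequency does not exceed the limit. The function `filter_terms` and the sample frequencies are hypothetical.

```python
INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def filter_terms(doc_freqs, max_doc_freq=INT_MAX):
    """Keep terms that appear in at most max_doc_freq documents."""
    return [term for term, freq in doc_freqs.items() if freq <= max_doc_freq]
```

With the default, every term passes; with `max_doc_freq=0`, any term that occurs in at least one document is dropped.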

Closes #49520
2019-11-26 10:47:05 +01:00
Tim Vernum 9cb1ace1c2
Expand docs on TLSv1 breaking change (#49352)
The breaking changes cover the removal of TLSv1 from the default
protocols, but assume that users who need to retain TLSv1 support will
understand all the places where they may have used it.

This has proven not to be true, as it is easy to be unaware that (for
example) an LDAP server is using TLSv1.

This change explicitly lists all the places where TLS protocols may
need to be configured.

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>
Co-Authored-By: Pius <pius@elastic.co>
2019-11-26 16:34:55 +11:00
Nhat Nguyen d2e92a1791 EngineTestCase#getDocIds should use internal reader (#49564)
We do not guarantee that EngineTestCase#getDocIds is called after the 
engine has been externally refreshed. Hence, we trip an assertion
assertSearcherIsWarmedUp.

CI: https://gradle-enterprise.elastic.co/s/pm2at5qmfm2iu

Relates #48605
2019-11-25 21:07:30 -05:00
Tim Brooks 416178c7c8
Enable simple remote connection strategy (#49561)
This commit back ports three commits related to enabling the simple
connection strategy.

Allow simple connection strategy to be configured (#49066)

Currently the simple connection strategy only exists in the code. It
cannot be configured. This commit moves in the direction of allowing it
to be configured. It introduces settings for the addresses and socket
count. Additionally it introduces new settings for the sniff strategy
so that the more generic number of connections and seed node settings
can be deprecated.

The simple settings are not yet registered as the registration is
dependent on follow-up work to validate the settings.

Ensure at least 1 seed configured in remote test (#49389)

This fixes #49384. Currently when we select a random subset of seed
nodes from a list, it is possible for 0 seeds to be selected. This test
depends on at least 1 seed being selected.

Add the simple strategy to cluster settings (#49414)

This is related to #49067. This commit adds the simple connection
strategy settings and strategy mode setting to the cluster settings
registry. With these changes, the simple connection mode can be used.
Additionally, it adds validation to ensure that settings cannot be
misconfigured.
2019-11-25 16:53:07 -07:00
Zachary Tong 99e313695f Reuse CompensatedSum object in agg collect loops (#49548)
The new CompensatedSum is a nice DRY refactor, but had the unanticipated 
side effect of creating a lot of object allocation in the aggregation hot collection 
loop: one object per visited document, per aggregator. In some places it 
created two per-doc-per-agg (weighted avg, geo centroids, etc) since there 
were multiple compensations being maintained.

This PR moves the object creation out of the hot loop so that it is now 
created once per segment, and resets the internal state each time through 
the loop
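
The pattern can be sketched as below. This is a generic Kahan summation written for illustration, not the Elasticsearch `CompensatedSum` class; `collect_segment` and its per-bucket lists are assumptions made for the example.

```python
class CompensatedSum:
    """Kahan summation state: a running value plus a compensation term."""

    def __init__(self, value=0.0, delta=0.0):
        self.value = value
        self.delta = delta

    def reset(self, value, delta):
        self.value = value
        self.delta = delta
        return self

    def add(self, x):
        corrected = x - self.delta
        updated = self.value + corrected
        # Recover the low-order bits lost when adding `corrected`.
        self.delta = (updated - self.value) - corrected
        self.value = updated
        return self


def collect_segment(docs, sums, compensations):
    """Accumulate (bucket, value) pairs into per-bucket sums."""
    kahan = CompensatedSum()  # allocated once per segment, not once per doc
    for bucket, value in docs:
        kahan.reset(sums[bucket], compensations[bucket])
        kahan.add(value)
        sums[bucket] = kahan.value
        compensations[bucket] = kahan.delta
```

The single `kahan` object is reused for every document; only its two fields are overwritten on each pass through the loop.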
2019-11-25 16:46:48 -05:00
James Rodewig 2fd58bb845 [DOCS] Add missing "_type" to delimited payload token filter docs 2019-11-25 16:16:05 -05:00
Lisa Cawley 26beb486c7 [DOCS] Fixes security links (#49563) 2019-11-25 13:02:26 -08:00
James Rodewig c40449ac22 [DOCS] Reformat delimited payload token filter docs (#49380)
* Adds a title abbreviation
* Relocates the older name deprecation warning
* Updates the description and adds a Lucene link
* Adds a note to explain payloads and how to store them
* Adds analyze and custom analyzer snippets
* Adds a 'Return stored payloads' example
2019-11-25 15:40:05 -05:00
James Rodewig 99476db2d0 [DOCS] Remove individual task retrieval from cat/tasks API (#49550) 2019-11-25 10:32:39 -05:00
Benjamin Trent 688c78c589
[ML] Stop timing stats failure propagation (#49495) (#49501) 2019-11-25 10:09:30 -05:00
Kelly Campbell df5afa797e [DOCS] Correct GET path in cat tasks API docs (#49494)
Previously, the request example included `GET _cat/_tasks`. However, the resource should be `tasks`, not `_tasks`.
2019-11-25 09:37:59 -05:00
David Roberts 62811c2272 [ML] Add default categorization analyzer definition to ML info (#49545)
The categorization job wizard in the ML UI will use this
information when showing the effect of the chosen categorization
analyzer on a sample of input.
2019-11-25 13:39:16 +00:00
Dimitris Athanasiou d21df9eba9 [ML][DOCS] Anomaly detection job retention days settings do not require restart (#49546) 2019-11-25 14:19:10 +01:00
Dimitris Athanasiou aca38f6882
[7.x][ML] DFA jobs should accept excluding an unsupported field (#49535) (#49544)
Before this change excluding an unsupported field resulted in
an error message that explained the excluded field could not be
detected as if it doesn't exist. This error message is confusing.

This commit changes this so that there is no error in this
scenario. When excluding a field that does exist but has
automatically been excluded from the analysis there is no harm
(unlike excluding a missing field which could be a typo).

Backport of #49535
2019-11-25 15:13:00 +02:00
Daniel Mitterdorfer 8c374014ae Add note about Gradle wrapper on Windows (#49528)
With this commit we add a clarifying note in the contribution guidelines
that our examples show the usage on Unix and also explain how to invoke
the Gradle wrapper script on Windows.

Closes #49521
2019-11-25 13:41:26 +01:00
Armin Braun af0f97d50a
Fix SLMSnapshotBlockingIntegTests.testSnapshotInProgress (#49533) (#49542)
This test must check for state `SUCCESS` as well. `SUCCESS` in
`SnapshotsInProgress` means "all data nodes finished snapshotting successfully but master must still finalize the snapshot in the repo".
`SUCCESS` does not mean that the snapshot is actually fully finished in this object.

You can easily reproduce the scenario in #49303 that has an in-progress snapshot in `SUCCESS` state
by waiting 20s before running the busy assert loop on the snapshot status so that all steps but the blocked
finalization can finish.

Closes #49303
2019-11-25 13:31:45 +01:00
Armin Braun 2502ff39a0
Enhance SnapshotResiliencyTests (#49514) (#49541)
A few enhancements to `SnapshotResiliencyTests`:
1. Test running requests from random nodes in more spots to enhance coverage (this is particularly motivated by #49060 where the additional number of cluster state updates makes it more interesting to fully cover all kinds of network failures)
2. Fix issue with restarting only master node in one test (doing so breaks the test at an incredibly low frequency, that becomes not so low in #49060 with the additional cluster state updates between request and response)
3. Improved cluster formation checks (now properly checks the term as well when forming cluster) + makes sure all nodes are connected to all other nodes (previously the data nodes would at times not be connected to other data nodes, which was shaken out now by adding the `client()` method)
4. Make sure the cluster left behind by the test makes sense by running the repo cleanup action on it (this also increases coverage of the repository cleanup action obviously and adds the basis of making it part of more resiliency tests)
2019-11-25 13:31:28 +01:00
Dimitris Athanasiou c149c64dc4
[7.x][ML] Apply source query on data frame analytics memory estimation (#49517) (#49532)
Closes #49454

Backport of #49517
2019-11-25 12:51:57 +02:00
Armin Braun a5fa86ed97
Improve Stability of Mock APIs (#49518) (#49524)
This commit ensures that even for requests that are known to be empty body
we at least attempt to read one byte from the request body input stream.
This is done to work around the behavior in `sun.net.httpserver.ServerImpl.Dispatcher#handleEvent`
that will close a TCP/HTTP connection that does not have the `eof` flag (see `sun.net.httpserver.LeftOverInputStream#isEOF`)
set on its input stream. As far as I can tell the only way to set this flag is to do a read when there's no more bytes buffered.
This fixes the numerous connection closing issues because the `ServerImpl` stops closing connections that it thinks
weren't fully drained.

Also, I removed a now redundant drain loop in the Azure handler as well as removed the connection closing in the error handler's
drain action (this shouldn't have an effect but makes things more predictable/easier to reason about IMO).

I would suggest merging this and closing related issue after verifying that this fixes things on CI.

The way to locally reproduce the issues we're seeing in tests is to make the retry timings more aggressive in e.g. the azure tests
and move them to single digit values. This makes the retries happen quickly enough that they run into the async connection closing
of allegedly non-eof connections by `ServerImpl` and produces the exact kinds of failures we're seeing currently.

Relates #49401, #49429
2019-11-25 10:28:55 +01:00
Hendrik Muhs 5256756879 [Transform] add debug log for configuration index (#49484)
add debug log for transform creation and disallow partial results for retrieval
2019-11-25 09:49:17 +01:00
Nhat Nguyen 8260cba629 Increase timeout while checking for no snapshotted commit (#49461)
If some replica is performing a file-based recovery, then the check 
assertNoSnapshottedIndexCommit would fail. We should increase the
timeout for this check so that we can wait until all recoveries are done
or aborted.

Closes #49403
2019-11-24 15:12:34 -05:00
Jared Tan 1d2bfd1af6 Include id to the error msg when it's too long (#49433) 2019-11-24 13:08:26 -05:00
Mark Vieira 777f6d5da6
Fix extraction of notarized Elasticsearch release distribution (#49511)
This commit introduces a workaround for an issue related to our recent
notarization of distributions starting with the 6.8.5 release. An
unintended side effect of notarization was that the file entries of the
release tar all have a `./` prefix in the path. This causes a number of
issues, not least of which is that our Gradle extract tasks end up
copying an empty fileset to the destination directory. The workaround
here is imply to remove the leading `./` path segment from each file
when performing the extraction. For more details see this issue:
https://github.com/elastic/elasticsearch/issues/49417
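
The path rewrite itself is small and can be sketched as follows. The actual workaround lives in the Gradle extract tasks; the function name and the sample entry path here are hypothetical.

```python
def strip_leading_dot_segment(entry_path):
    """Remove a single leading './' segment from a tar entry path."""
    prefix = "./"
    if entry_path.startswith(prefix):
        return entry_path[len(prefix):]
    return entry_path
```

Entries without the prefix pass through unchanged, so the same rewrite is safe for both notarized and older archives.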
2019-11-22 17:19:47 -08:00
jesinity c9eba17517 Fix HLRC parsing of CancelTasks response (#47017)
Adds support for proper cancel tasks parsing.

Closes #45414
2019-11-22 16:56:27 -06:00
debadair 2ec047db04 [DOCS] Rename auditing topic. Closes #49012 (#49013)
* [DOCS] Rename auditing topic. Closes #49012

* Fixed file name, fixed settings link.

* Add link to settings
2019-11-22 14:16:58 -08:00
James Rodewig d06c71eb82 [DOCS] Fix edge n-gram tokenizer nav
Adds a missing float tag to the edge n-gram tokenizer docs. This tag
ensures the edge n-gram tokenizer docs display on the same page.
2019-11-22 15:54:07 -05:00
Dimitris Athanasiou 8eaee7cbdc
[7.x][ML] Explain data frame analytics API (#49455) (#49504)
This commit replaces the _estimate_memory_usage API with
a new API, the _explain API.

The API consolidates information that is useful before
creating a data frame analytics job.

It includes:

- memory estimation
- field selection explanation

Memory estimation is moved here from what was previously
calculated in the _estimate_memory_usage API.

Field selection is a new feature that explains to the user
whether each available field was selected to be included or
not in the analysis. In the case it was not included, it also
explains the reason why.

Backport of #49455
2019-11-22 22:06:10 +02:00
Jason Tedor 69f570ea5f
Adjust version on final pipeline serialization
This commit adjusts the version on final pipeline serialization after it
was backported to the 7.5 branch.
2019-11-22 14:56:56 -05:00
Jay Modi 4fd5fb5297
Stop NodeTests from timing out in certain cases (#49202) (#49503)
The NodeTests class contains tests that check behavior when shutting
down a node. This involves starting a node, performing some operation,
stopping the node, and then awaiting the close of the node. Part of
closing a node is the termination of the node's ThreadPool. ThreadPool
termination semantics can be deceiving. The ThreadPool#terminate method
takes a timeout value and the first oddity is that the terminate method
can take two times the timeout value before returning. Internally this
method acts on the ExecutorService instances that are held by the
ThreadPool. First, an orderly shutdown is attempted and pending tasks
are allowed to execute while waiting for the timeout value. If any of
the ExecutorService instances have not terminated, a call is made to
attempt to stop all active tasks (usually using interrupts) and then
waits for up to the timeout value a second time for the termination of
the ExecutorService instances. This means that if we use a large value
when waiting for a node to close, we may not attempt to interrupt any
threads that are in a blocking call before the test times out.

In order to avoid causing these tests to time out, this change reduces
the timeout passed to Node#awaitClose to 10 seconds from 1 day. This
will allow blocked threads to be interrupted before the test suite
fails due to the timeout.
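
The two-phase termination described above can be modelled roughly as follows. This is a simulation with a fake executor and no real threads or clocks; `FakeExecutor` and `terminate` are hypothetical and only mirror the shutdown, wait, interrupt, wait sequence from the message.

```python
class FakeExecutor:
    """Stand-in for an executor whose task blocks until it is interrupted."""

    def __init__(self):
        self.interrupted = False
        self.waited = 0.0  # simulated seconds spent waiting

    def shutdown(self):
        pass  # orderly shutdown: pending tasks may still run

    def shutdown_now(self):
        self.interrupted = True  # attempt to stop active tasks

    def await_termination(self, timeout):
        if self.interrupted:
            return True  # the blocked task reacts to the interrupt
        self.waited += timeout  # otherwise the full timeout elapses
        return False


def terminate(executor, timeout):
    """Orderly shutdown first; interrupt and wait again if that times out."""
    executor.shutdown()
    if executor.await_termination(timeout):
        return True
    executor.shutdown_now()
    return executor.await_termination(timeout)
```

In this model the interrupt is sent only after the first full `timeout` has elapsed. With a timeout of one day the test suite would time out long before that point, whereas with 10 seconds the blocked task is interrupted after 10 simulated seconds.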

Closes #44256
Closes #42350
Closes #44435
2019-11-22 12:41:52 -07:00
Jason Tedor 71bcfbf1e3
Replace required pipeline with final pipeline (#49470)
This commit enhances the required pipeline functionality by changing it
so that default/request pipelines can also be executed, but the required
pipeline is always executed last. This gives users the flexibility to
execute their own indexing pipelines, but also ensure that any required
pipelines are also executed. Since such pipelines are executed last, we
change the name of required pipelines to final pipelines.
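
The resulting execution order can be sketched as below, assuming that a pipeline named on the request takes precedence over the index default. The function `resolve_pipelines` and the pipeline names are hypothetical.

```python
def resolve_pipelines(request_pipeline, default_pipeline, final_pipeline):
    """Return the pipelines to run, in order, for one index request."""
    # A pipeline on the request takes precedence over the index default.
    if request_pipeline is not None:
        first = request_pipeline
    else:
        first = default_pipeline
    # The final pipeline, when configured, always runs last.
    return [p for p in (first, final_pipeline) if p is not None]
```

For instance, `resolve_pipelines(None, "default", "final")` gives `["default", "final"]`.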
2019-11-22 14:37:36 -05:00
Jay Modi 1431c2b408
Run build-tools test with Gradle jdk (#49459) (#49497)
The test task is configured to use the runtime java version, but there
are issues with the version of groovy used by gradle pre 6.0. In order
to workaround this, we use the Gradle JDK to execute the build-tools
tests.

Closes #49404
Closes #49253
2019-11-22 11:59:46 -07:00