Commit Graph

4168 Commits

Author SHA1 Message Date
Albert Zaharovits f25b6cc2eb
Add new 'maintenance' index privilege #50643
This commit creates a new index privilege named `maintenance`.
The privilege grants the following actions: `refresh`, `flush` (also synced-`flush`),
and `force-merge`. Previously the actions were only under the `manage` privilege
which in some situations was too permissive.

Co-authored-by: Amir H Movahed <arhd83@gmail.com>
2020-01-30 11:59:11 +02:00
Henning Andersen 149b68d850 [ML] Fix possible race condition starting datafeed (#51646)
Datafeeds being closed while starting could result in and NPE. This was
handled as any other failure, masking out the NPE. However, this
conflicts with the changes in #50886.

Related to #50886 and #51302
2020-01-30 08:23:45 +01:00
Aleksandr Maus 0d21d9e2c5
EQL: Enable QA/rest integration tests for snapshot builds only (#51624) (#51645)
* Related to #51541: [CI] unknown setting [xpack.eql.enabled] in release-tests
2020-01-29 16:38:52 -05:00
Julie Tibshirani 9dcc3ef7e6
Always use one shard in vector REST tests. (#51643)
This PR tries to address the intermittent vector test failures on 7.x by making
sure we create indices with one shard.

The fix is based on this theory as to what's happening:
* On 7.x, the default number of shards is 1, but in REST tests we randomly use
2 in order to cover the multiple shards case. In the failing test run, we use 2
shards and all documents end up on only one shard.
* During a search, the response from the empty shard doesn't produce
deprecation warnings because  we never try to execute the script. If not all
shard responses contain the warning headers, then certain deprecation warnings
can be lost (due to the bug described in #33936).

Addresses #50716.
Relates to #50061.
2020-01-29 12:24:41 -08:00
Przemysław Witek 683170b007
Increase the number of indexed documents to increase a chance that there are at least 2 training rows. (#51607) (#51615) 2020-01-29 17:17:19 +01:00
David Roberts e0e35b7feb [TEST] Mute TimeSeriesLifecycleActionsIT.testWaitForSnapshotSlmExecutedBefore
Due to https://github.com/elastic/elasticsearch/issues/50781
2020-01-29 13:08:55 +01:00
Martijn van Groningen b253af36f3
The watcher indexing listener didn't handle document level exceptions. (#51466)
Prior to the change the watcher index listener didn't implement the
`postIndex(ShardId, Engine.Index, Engine.IndexResult)` method. This
caused document level exceptions like VersionConflictEngineException
to be ignored. This commit fixes this.

The watcher indexing listener did implement the `postIndex(ShardId, Engine.Index, Exception)`
method, but that only handles engine level exceptions.

This change also unmutes the SmokeTestWatcherTestSuiteIT#testMonitorClusterHealth test again.

Relates to #32299
2020-01-29 12:55:02 +01:00
Rory Hunter d8bd736f8a
Formatting: keep simple if / else on the same line (#51544)
Backport of #51526.

Previous the formatter was breaking simple if/else statements (i.e.
without braces) onto separate lines, which could be fragile because the
formatter cannot also introduce braces. Instead, keep such expressions
on the same line.
2020-01-29 10:42:04 +00:00
Albert Zaharovits 90285ee907
Deprecate timeout.tcp_read AD/LDAP realm setting (#47305)
The timeout.tcp_read AD/LDAP realm setting, despite the low-level
allusion, controls the time interval the realms wait for a response for
a query (search or bind). If the connection to the server is synchronous
(un-pooled) the response timeout is analogous to the tcp read timeout.
But the tcp read timeout is irrelevant in the common case of a pooled
connection (when a Bind DN is specified).

The timeout.tcp_read qualifier is hereby deprecated in favor of
timeout.response.

In addition, the default value for both timeout.tcp_read and
timeout.response is that of timeout.ldap_search, instead of the 5s (but
the default for timeout.ldap_search is still 5s). The
timeout.ldap_search defines the server-controlled timeout of a search
request. There is no practical use case to have a smaller tcp_read
timeout compared to ldap_search (in this case the request would time-out
on the client but continue to be processed on the server). The proposed
change aims to simplify configuration so that the more common
configuration change, adjusting timeout.ldap_search up, has the expected
result (no timeout during searches) without any additional
modifications.

Closes #46028
2020-01-29 10:48:26 +02:00
Jason Tedor 3a7192966a
Check if interface is up for loopback devices only (#51583)
In the SQL with SSL tests, we need to find the interfaces that are up,
are loopback devices, or have a loopback address. If we check if the
device is up first, we can run into situations where the device is a
virtual ethernet device that might have disappeared between us seeing
the device, and checking if it is up. By first checking if the device is
a loopback device or it has a loopback address, then we can avoid
checking if the device is up except for loopback devices and therefore
we can avoid the disappearing virtual ethernet device problem.
2020-01-28 18:38:46 -05:00
Armin Braun aae93a7578
Allow Repository Plugins to Filter Metadata on Create (#51472) (#51542)
* Allow Repository Plugins to Filter Metadata on Create

Add a hook that allows repository plugins to filter the repository metadata
before it gets written to the cluster state.
2020-01-28 18:33:26 +01:00
Gordon Brown 89c2834b24
Deprecate creation of dot-prefixed index names except for hidden and system indices (#49959)
This commit deprecates the creation of dot-prefixed index names (e.g.
.watches) unless they are either 1) a hidden index, or 2) registered by
a plugin that extends SystemIndexPlugin. This is the first step
towards more thorough protections for system indices.

This commit also modifies several plugins which use dot-prefixed indices
to register indices they own as system indices, and adds a plugin to
register .tasks as a system index.
2020-01-28 10:01:16 -07:00
Yannick Welsch f6686345c9 Avoid unnecessary setup and teardown in docs tests (#51430)
The docs tests have recently been running much slower than before (see #49753).

The gist here is that with ILM/SLM we do a lot of unnecessary setup / teardown work on each
test. Compounded with the slightly slower cluster state storage mechanism, this causes the
tests to run much slower.

In particular, on RAMDisk, docs:check is taking

ES 7.4: 6:55 minutes
ES master: 16:09 minutes
ES with this commit: 6:52 minutes

on SSD, docs:check is taking

ES 7.4: ??? minutes
ES master: 32:20 minutes
ES with this commit: 11:21 minutes
2020-01-28 16:52:23 +01:00
David Roberts 550254ec7f [ML] Use CSV ingest processor in find_file_structure ingest pipeline (#51492)
Changes the find_file_structure response to include a CSV
ingest processor in the ingest pipeline it suggests.

Previously the Kibana file upload functionality parsed CSV
in the browser, but by parsing CSV in the ingest pipeline
it makes the Kibana file upload functionality more easily
interchangable with Filebeat such that the configurations
it creates can more easily be used to import data with the
same structure repeatedly in production.
2020-01-28 14:38:43 +00:00
Aleksandr Maus a8bd4d08e3 Merge branch 'feature/eql_backport' into 7.x 2020-01-28 09:19:39 -05:00
Hendrik Muhs 53e4d1ef07 [Transform] fix TransformRobustnessIT intermittent test failures part 2 (#51523)
add wait for completion in transform robustness test to avoid occasional test failures during cleanup

fixes #51347
2020-01-28 13:37:01 +01:00
William Brafford 9efa5be60e
Password-protected Keystore Feature Branch PR (#51123) (#51510)
* Reload secure settings with password (#43197)

If a password is not set, we assume an empty string to be
compatible with previous behavior.
Only allow the reload to be broadcast to other nodes if TLS is
enabled for the transport layer.

* Add passphrase support to elasticsearch-keystore (#38498)

This change adds support for keystore passphrases to all subcommands
of the elasticsearch-keystore cli tool and adds a subcommand for
changing the passphrase of an existing keystore.
The work to read the passphrase in Elasticsearch when
loading, which will be addressed in a different PR.

Subcommands of elasticsearch-keystore can handle (open and create)
passphrase protected keystores

When reading a keystore, a user is only prompted for a passphrase
only if the keystore is passphrase protected.

When creating a keystore, a user is allowed (default behavior) to create one with an
empty passphrase

Passphrase can be set to be empty when changing/setting it for an
existing keystore

Relates to: #32691
Supersedes: #37472

* Restore behavior for force parameter (#44847)

Turns out that the behavior of `-f` for the add and add-file sub
commands where it would also forcibly create the keystore if it
didn't exist, was by design - although undocumented.
This change restores that behavior auto-creating a keystore that
is not password protected if the force flag is used. The force
OptionSpec is moved to the BaseKeyStoreCommand as we will presumably
want to maintain the same behavior in any other command that takes
a force option.

*  Handle pwd protected keystores in all CLI tools  (#45289)

This change ensures that `elasticsearch-setup-passwords` and
`elasticsearch-saml-metadata` can handle a password protected
elasticsearch.keystore.
For setup passwords the user would be prompted to add the
elasticsearch keystore password upon running the tool. There is no
option to pass the password as a parameter as we assume the user is
present in order to enter the desired passwords for the built-in
users.
For saml-metadata, we prompt for the keystore password at all times
even though we'd only need to read something from the keystore when
there is a signing or encryption configuration.

* Modify docs for setup passwords and saml metadata cli (#45797)

Adds a sentence in the documentation of `elasticsearch-setup-passwords`
and `elasticsearch-saml-metadata` to describe that users would be
prompted for the keystore's password when running these CLI tools,
when the keystore is password protected.

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* Elasticsearch keystore passphrase for startup scripts (#44775)

This commit allows a user to provide a keystore password on Elasticsearch
startup, but only prompts when the keystore exists and is encrypted.

The entrypoint in Java code is standard input. When the Bootstrap class is
checking for secure keystore settings, it checks whether or not the keystore
is encrypted. If so, we read one line from standard input and use this as the
password. For simplicity's sake, we allow a maximum passphrase length of 128
characters. (This is an arbitrary limit and could be increased or eliminated.
It is also enforced in the keystore tools, so that a user can't create a
password that's too long to enter at startup.)

In order to provide a password on standard input, we have to account for four
different ways of starting Elasticsearch: the bash startup script, the Windows
batch startup script, systemd startup, and docker startup. We use wrapper
scripts to reduce systemd and docker to the bash case: in both cases, a
wrapper script can read a passphrase from the filesystem and pass it to the
bash script.

In order to simplify testing the need for a passphrase, I have added a
has-passwd command to the keystore tool. This command can run silently, and
exit with status 0 when the keystore has a password. It exits with status 1 if
the keystore doesn't exist or exists and is unencrypted.

A good deal of the code-change in this commit has to do with refactoring
packaging tests to cleanly use the same tests for both the "archive" and the
"package" cases. This required not only moving tests around, but also adding
some convenience methods for an abstraction layer over distribution-specific
commands.

* Adjust docs for password protected keystore (#45054)

This commit adds relevant parts in the elasticsearch-keystore
sub-commands reference docs and in the reload secure settings API
doc.

* Fix failing Keystore Passphrase test for feature branch (#50154)

One problem with the passphrase-from-file tests, as written, is that
they would leave a SystemD environment variable set when they failed,
and this setting would cause elasticsearch startup to fail for other
tests as well. By using a try-finally, I hope that these tests will fail
more gracefully.

It appears that our Fedora and Ubuntu environments may be configured to
store journald information under /var rather than under /run, so that it
will persist between boots. Our destructive tests that read from the
journal need to account for this in order to avoid trying to limit the
output we check in tests.

* Run keystore management tests on docker distros (#50610)

* Add Docker handling to PackagingTestCase

Keystore tests need to be able to run in the Docker case. We can do this
by using a DockerShell instead of a plain Shell when Docker is running.

* Improve ES startup check for docker

Previously we were checking truncated output for the packaged JDK as
an indication that Elasticsearch had started. With new preliminary
password checks, we might get a false positive from ES keystore
commands, so we have to check specifically that the Elasticsearch
class from the Bootstrap package is what's running.

* Test password-protected keystore with Docker (#50803)

This commit adds two tests for the case where we mount a
password-protected keystore into a Docker container and provide a
password via a Docker environment variable.

We also fix a logging bug where we were logging the identifier for an
array of strings rather than the contents of that array.

* Add documentation for keystore startup prompting (#50821)

When a keystore is password-protected, Elasticsearch will prompt at
startup. This commit adds documentation for this prompt for the archive,
systemd, and Docker cases.

Co-authored-by: Lisa Cawley <lcawley@elastic.co>

* Warn when unable to upgrade keystore on debian (#51011)

For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This
commit brings the same logic to the code for Debian packages. See the
posttrans file for gets executed for RPMs.

* Restore handling of string input

Adds tests that were mistakenly removed. One of these tests proved
we were not handling the the stdin (-x) option correctly when no
input was added. This commit restores the original approach of
reading stdin one char at a time until there is no more (-1, \r, \n)
instead of using readline() that might return null

* Apply spotless reformatting

* Use '--since' flag to get recent journal messages

When we get Elasticsearch logs from journald, we want to fetch only log
messages from the last run. There are two reasons for this. First, if
there are many logs, we might get a string that's too large for our
utility methods. Second, when we're looking for a specific message or
error, we almost certainly want to look only at messages from the last
execution.

Previously, we've been trying to do this by clearing out the physical
files under the journald process. But there seems to be some contention
over these directories: if journald writes a log file in between when
our deletion command deletes the file and when it deletes the log
directory, the deletion will fail.

It seems to me that we might be able to use journald's "--since" flag to
retrieve only log messages from the last run, and that this might be
less likely to fail due to race conditions in file deletion.

Unfortunately, it looks as if the "--since" flag has a granularity of
one-second. I've added a two-second sleep to make sure that there's a
sufficient gap between the test that will read from journald and the
test before it.

* Use new journald wrapper pattern

* Update version added in secure settings request

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
2020-01-28 05:32:32 -05:00
Hendrik Muhs 2239ba8c6e
[Transform] avoid mapping problems with index templates (#51368) (#51519)
insert explict mappings for objects in nested output to avoid clashes with index templates

fixes #51321
2020-01-28 11:31:07 +01:00
Hendrik Muhs 61663b495e add an integration test using date_nanos as timestamp (#51477)
add a test for using date_nanos as timestamp field in a continuous transform
2020-01-28 10:10:23 +01:00
Hendrik Muhs bebce4b190 audit index creation after it the index has been created (#51479)
moves audit message for index creation after the index has been successfully created. This has
been confusing for a user where index creation failed but audit reported index creation.
2020-01-28 10:06:46 +01:00
Ioannis Kakavas 4f3548fbd7
Disable diagnostic trust manager in tests (#51501)
This commit sets `xpack.security.ssl.diagnose.trust` to false in all
of our tests when running in FIPS 140 mode and when settings objects
are used to create an instance of the SSLService. This is needed
in 7.x because setting xpack.security.ssl.diagnose.trust to true
wraps SunJSSE TrustManager with our own DiagnosticTrustManager and
this is not allowed when SunJSSE is in FIPS mode.
An alternative would be to set xpack.security.fips.enabled to
true which would also implicitly disable
xpack.security.ssl.diagnose.trust but would have additional effects
(would require that we set PBKDF2 for password hashing algorithm in
all test clusters, would prohibit using JKS keystores in nodes even
if relevant tests have been muted in FIPS mode etc.)

Relates: #49900
Resolves: #51268
2020-01-28 10:17:35 +02:00
Przemko Robakowski 919083decd
Don't overwrite target field with SetSecurityUserProcessor (#51454) (#51506)
* Don't overwrite target field with SetSecurityUserProcessor

This change fix problem with `SetSecurityUserProcessor` which was overwriting
whole target field and not only fields really filled by the processor.

Closes #51428

* Unused imports removed
2020-01-28 02:12:09 +01:00
Jason Tedor 92b611ece1
Formalize build snapshot (#51484)
Today we are repeatedly checking if the current build is a snapshot
build or not by reading the system property build.snapshot. This commit
formalizes this by adding a build parameter to indicate whether or not
the current build is a snapshot build.
2020-01-27 16:56:31 -05:00
Aleksandr Maus eb1ed2a35f Compilation fixes for 7.x 2020-01-27 16:23:36 -05:00
Aleksandr Maus d8f1735e39 Add xpack.eql.enabled feature flag, disabled by default. Enabled only for integration tests. (#51370)
Related to https://github.com/elastic/elasticsearch/issues/49581
2020-01-27 15:15:22 -05:00
Costin Leau d049de5b72 EQL: import QL into EQL (#50904)
Link QL into the new build file
Remove duplicate classes and use the new ql package
Update Exception hierarchy on top of QlException
2020-01-27 15:13:22 -05:00
Igor Motov c184411456 EQL: Replace EqlSearchResponse.Hits parser with ObjectParser (#50925)
Replaces the existing hand-build Hits parser with a
ConstructingObjectParser version.

Relates to #49581
2020-01-27 15:13:09 -05:00
Igor Motov 88cc30c0d8 EQL: Remove list classes from EqlSearchResponse (#50870)
Removes unnecessary classes from EqlSearchResponse that just represent
lists of other elements.

Relates to #49581
2020-01-27 15:13:00 -05:00
Aleksandr Maus d715176c00 Add more Eql REST API validation integration tests, clean up request implementation (#50822) 2020-01-27 15:12:48 -05:00
Igor Motov 628083183f EQL: Make EqlSearchResponse immutable (#50810)
Refactors EqlSearchResponse to make it immutable

Relates to #49581
2020-01-27 15:12:07 -05:00
Aleksandr Maus 31d2d01e25 Correct search_after handling (#50629) 2020-01-27 15:11:51 -05:00
Aleksandr Maus 79875ce4d9 Initial EQL rest API implementation (#49768) 2020-01-27 15:11:41 -05:00
Costin Leau 10a16d15d1 Add draft EQL grammar and expression tree 2020-01-27 15:11:18 -05:00
Costin Leau e22f501018
QL: Backport project to 7.x (#51497)
* Introduce reusable QL plugin for SQL and EQL (#50815)

Extract reusable functionality from SQL into its own dedicated project QL.
Implemented as a plugin, it provides common components across SQL and the upcoming EQL.

While this commit is fairly large, for the most part it's just a big file move from sql package to the newly introduced ql.

(cherry picked from commit ec1ac0d463bfa12a02c8174afbcdd6984345e8b4)

* SQL: Fix incomplete registration of geo NamedWritables

(cherry picked from commit e295763686f9592976e551e504fdad1d2a3a566d)

* QL: Extend NodeSubclass to read classes from jars (#50866)

As the test classes are spread across more than one project, the Gradle
classpath contains not just folders but also jars.
This commit allows the test class to explore the archive content and
load matching classes from said source.

(cherry picked from commit 25ad74928afcbf286dc58f7d430491b0af662f04)

* QL: Remove implicit conversion inside Literal (#50962)

Literal constructor makes an implicit conversion for each value given
which turns out has some subtle side-effects.
Improve MathProcessors to preserve numeric type where possible
Fix bug on issue compatibility between date and intervals
Preserve the source when folding inside the Optimizer

(cherry picked from commit 9b73e225b0aa07a23859550fb117bae571a2b672)

* QL: Refactor DataType for pluggability (#51328)

Change DataType from enum to class
Break DataType enums into QL (default) and SQL types
Make data type conversion pluggable so that new types can be introduced

As part of the process:
- static type conversion in QL package (such as Literal) has been
removed
- several utility classes have been broken into base (QL) and extended
(SQL) parts based on type awareness
- operators (+,-,/,*) are
- due to extensibility, serialization of arithmetic operation has been
slightly changed and pushed down to the operator executor itself

(cherry picked from commit aebda81b30e1563b877a8896309fd50633e0b663)

* Compilation fixes for 7.x
2020-01-27 22:03:58 +02:00
Ryan Ernst 6ee1baf2ed
Migrate cron eval bats test to java (#50940) (#51007)
This commit migrates the simple test of the cron eval tool from bats to
java packaging tests.

relates #46005
2020-01-27 10:49:01 -08:00
Nik Everett 4ff314a9d5
Begin moving date_histogram to offset rounding (take two) (#51271) (#51495)
We added a new rounding in #50609 that handles offsets to the start and
end of the rounding so that we could support `offset` in the `composite`
aggregation. This starts moving `date_histogram` to that new offset.

This is a redo of #50873 with more integration tests.

This reverts commit d114c9db3e1d1a766f9f48f846eed0466125ce83.
2020-01-27 13:40:54 -05:00
David Roberts 3c223ceea1 [ML] Fix 2 digit year regex in find_file_structure (#51469)
The DATE and DATESTAMP Grok patterns match 2 digit years
as well as 4 digit years.  The pattern determination in
find_file_structure worked correctly in this case, but
the regex used to create a multi-line start pattern was
assuming a 4 digit year.  Also, the quick rule-out
patterns did not always correctly consider 2 digit years,
meaning that detection was inconsistent.

This change fixes both problems, and also extends the
tests for DATE and DATESTAMP to check both 2 and 4 digit
years.
2020-01-27 17:23:18 +00:00
Benjamin Trent 8559ff7cee
[ML][Inference] fixing pattern compilation + unnecessary string copy (#51483) (#51487) 2020-01-27 12:12:34 -05:00
Hendrik Muhs b233e93014
[Transform] refactor naming leftovers and apply code formating (#51465) (#51470)
refactor renaming leftovers: "data frame transform" to "transforms", touch only internals (variable
names, non-public API's, doc strings, ...) and apply code-formatting (spotless). No logical changes.
2020-01-27 14:04:57 +01:00
Andrei Dan 977cce002e
Preserve slm-history-ilm-policy between test runs (#51442) (#51468)
(cherry picked from commit 4e95c8a94fa700d44ac31ef17547512748ab1885)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-27 10:40:40 +00:00
Andrei Dan d872db278a
Fix TimeSeriesLifecycleActionsIT.testShrinkAction (#51431) (#51467)
* Fix TimeSeriesLifecycleActionsIT.testShrinkAction

Shrinking a 6 shard index to 3 shards can be quite time consuming and
assertBusy probes the conditions at exponentially growing intervals.

This separates the one assertion that was used for all the conditions
into multiple assertBusy statements and increases the timeout for waiting
for the shrink to complete.

* Allow more time for shrink to complete

This commit allows more time for the shrink operation to complete in
testRetryFailedShrinkAction (separating the assertBusy calls too) and
testMoveToRolloverStep.

* Shrink to no more than 2 shards in tests

(cherry picked from commit 5fe780148fa3536915d61475b087896a5b9ace82)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-27 10:40:29 +00:00
Ioannis Kakavas ee202a642f
Enable tests in FIPS 140 in JDK 11 (#49485)
This change changes the way to run our test suites in 
JVMs configured in FIPS 140 approved mode. It does so by:

- Configuring any given runtime Java in FIPS mode with the bundled
policy and security properties files, setting the system
properties java.security.properties and java.security.policy
with the == operator that overrides the default JVM properties
and policy.

- When runtime java is 11 and higher, using BouncyCastle FIPS 
Cryptographic provider and BCJSSE in FIPS mode. These are 
used as testRuntime dependencies for unit
tests and internal clusters, and copied (relevant jars)
explicitly to the lib directory for testclusters used in REST tests

- When runtime java is 8, using BouncyCastle FIPS 
Cryptographic provider and SunJSSE in FIPS mode. 

Running the tests in FIPS 140 approved mode doesn't require an
additional configuration either in CI workers or locally and is
controlled by specifying -Dtests.fips.enabled=true
2020-01-27 11:14:52 +02:00
Przemysław Witek dd3e2f1e18
[7.x] Update quantiles document in the index the document belongs to (#51135) (#51415) 2020-01-27 10:13:02 +01:00
Przemko Robakowski fbec19c022
Centralize mocks initialization in ILM steps tests (#51384) (#51453)
* Centralize mocks initialization in ILM steps tests

This change centralizes initialization of `Client`, `AdminClient`
and `IndicesAdminClient` for all classes extending `AbstractStepTestCase`.
This removes a lot of code duplication and make it easier to write tests.
This also removes need for `AsyncActionStep#setClient`

* Unused imports removed

* Added missed tests

* Fix OpenFollowerIndexStepTests
2020-01-25 01:19:55 +01:00
Lee Hinman 8560847dd9
[7.x] Check all snapshots in SnapshotLifecycleRestIT.testFullP… (#51448)
* Check all snapshots in SnapshotLifecycleRestIT.testFullPolicy

Rather than check the first returned snapshot for a snapshot starting with `snap-` in
SnapshotLifecycleRestIT.testFullPolicy, this commit changes the test to find any snapshots starting
with `snap-`.

In the event that there are no snapshots (the failure case), this also exposes the full results map
so we can diagnose why a failure occurred.

Relates to #50358

* Use a more imperative style for checking
2020-01-24 14:30:42 -07:00
Lee Hinman bdb8b6aa0d
[7.x] Separate aliases used for tests in TimeSeriesLifecycleAc… (#51432)
* Separate aliases used for tests in TimeSeriesLifecycleActionsIT

This is related to #51375 and hopes to help illuminate why some of those tests are failing. This
commit switches the aliases used in the test to use a random alias name every time (since there were
some complaints in the tests about aliases having more than one write index). With this we hope to
determine the actual cause of the failure in the test.

This also adds additional information to the exception returned when calling move-to-step with the
incorrect current step.

* Fix rest test
2020-01-24 11:05:19 -07:00
Benjamin Trent bf53ca3380
[7.x] [ML] Add _cat/ml/anomaly_detectors API (#51364) (#51408)
[ML] Add _cat/ml/anomaly_detectors API (#51364)
2020-01-24 11:54:22 -05:00
Benjamin Trent fc994d9ce1
[ML][Inference] Adds validations for model PUT (#51376) (#51409)
Adds validations making sure that

* `input.field_names` is not empty
* `ensemble.trained_models` is not empty
* `tree.feature_names` is not empty

closes https://github.com/elastic/elasticsearch/issues/51354
2020-01-24 09:29:12 -05:00
Hendrik Muhs d177747f66 fix TransformRobustnessIT intermittent test failures
ensure the cluster is not in some intermediate state when cleaning up.

fixes #51347
2020-01-24 15:19:11 +01:00
Benjamin Trent 76660a5a4f
[7.x] [ML][Inference] add tags url param to GET (#51330) (#51404)
* [ML][Inference] add tags url param to GET (#51330)

Adds a new URL parameter, `tags` to the GET _ml/inference/<model_id> endpoint.

This parameter allows the list of models to be further reduced to those who contain all the provided tags.
2020-01-24 08:26:58 -05:00
Martijn van Groningen 53ac28e398
Update smoke test watcher test suite with the changes in master branch.
Relates to #32299
2020-01-24 14:02:55 +01:00
Hendrik Muhs ded7407b4d
[Transform] Adapt tests for error message to 7.x format
adapt messages to 7.x format (#51398)

fixes #51360
2020-01-24 12:17:32 +01:00
Andrei Dan 2f7c240184
[7.x] Use ESSingleNodeTestCase instead of ESIntegTestCase (#51345) (#51346)
* Use ESSingleNodeTestCase instead of ESIntegTestCase (#51345)

(cherry picked from commit abcf1c41faf05a0b0196fb06e57c3de8c3d67688)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-24 10:53:37 +00:00
Przemysław Witek 8703b885c2
Move TupleMatchers class to org.elasticsearch.test.hamcrest package (#51359) (#51395) 2020-01-24 11:10:54 +01:00
Tim Vernum 0981a469ae
Preserve ApiKey credentials for async verification (#51389)
The ApiKeyService would aggressively "close" ApiKeyCredentials objects
during processing. However, under rare circumstances, the verfication
of the secret key would be performed asychronously and may need access
to the SecureString after it had been closed by the caller.

The trigger for this would be if the cache already held a Future for
that ApiKey, but the future was not yet complete. In this case the
verification of the secret key would take place asynchronously on the
generic thread pool.

This commit moves the "close" of the credentials to the body of the
listener so that it only occurs after key verification is complete.

Backport of: #51244
2020-01-24 19:35:07 +11:00
Hendrik Muhs d46e8c3f7f [Transform] disallow dotted fieldnames (#51369)
adds field validation to disallow output field names starting and/or ending with a '.'. Avoids
indexing/mapping problems when starting the transform.
2020-01-24 09:05:44 +01:00
Dimitris Athanasiou 3443d69883
[7.x][ML] Rename DataFrameAnalyticsIndex to DestinationIndex (#51353) (#51356)
As we prepare to introduce a new index for storing additional
information about data frame analytics jobs (e.g. intrumentation),
renaming this class to `DestinationIndex` better captures what it does
and leaves its prior name available for a more suitable use.

Backport of #51353

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-01-24 09:51:48 +02:00
Lee Hinman 31747de2a2 Fix testHashcodeAndEquals mutation for WaitForSnapshotStep (#51379)
This fixes the test failure, it was randomness returning the same policy
rather than a new one. Switched to use `randomValueOtherThan`.

Resolves #51377
2020-01-23 16:19:50 -07:00
Nhat Nguyen 072203cba8
Clean soft-deletes setting in ccr tests (#51113) (#51372)
We no longer need to explicitly enable soft-deletes in CCR tests.

Relates #50775
Backport of #51113
2020-01-23 16:31:47 -05:00
Zachary Tong 2e314a133f Mute TransformTaskFailedStateIT#testStartFailedTransform
Tracking issue: https://github.com/elastic/elasticsearch/issues/51360
2020-01-23 12:53:02 -05:00
Zachary Tong feb2a25761 Mute TransformTaskFailedStateIT#testForceStopFailedTransform
Tracking issue: https://github.com/elastic/elasticsearch/issues/51360
2020-01-23 11:58:36 -05:00
Ioannis Kakavas d279b462f7
Fix flaky testCreateApiKey test (#51223) (#51349)
API Key expiration value has millisecond precision as we use
{@link Instant#toEpoqueMilli()} when creating the API key
document.
It could often happen that `Instant.now()` Instant in the testCreateApiKey
was close enough to the ApiKeyService's `clock.instant()` Instant,
when the nanos were removed from the latter ( due to the call
to `toEpoqueMilli()` ) the result of comparing these two Instants
was a few nanos short of a 7 days.

Resolves: #47958
2020-01-23 17:45:55 +02:00
Hendrik Muhs 3553f68f5a [Transform] Handle permanent bulk indexing errors (#51307)
check bulk indexing error for permanent problems and ensure the state goes into failed instead of
retry. Corrects the stats API to show the real error and avoids excessive audit logging.

fixes #50122
2020-01-23 16:17:26 +01:00
Przemko Robakowski 84664e8d60
Expose master timeout for ILM actions (#51130) (#51348)
This change exposes master timeout to ILM steps through global dynamic setting.
All currently implemented steps make use of this setting as well.

Closes #44136
2020-01-23 15:28:13 +01:00
Nhat Nguyen acf84b68cb Do not wrap soft-deletes reader for segment stats (#51331)
IndexWriter might not filter out fully deleted segments if retention
leases exist or the number of the retaining operations is non-zero.
SoftDeletesDirectoryReaderWrapper, however, always filters out fully
deleted segments.

This change uses the original directory reader when calculating segment
stats instead.

Relates #51192
Closes #51303
2020-01-23 08:43:06 -05:00
David Kyle 0ac03ac5e7
[ML] Add parsers for inference configuration classes (#51300) 2020-01-22 17:03:01 +00:00
David Kyle ca4b90a001
[ML] Calculate results and snapshot retention using latest bucket timestamps (#51061) (#51301)
The retention period is calculated relative to the last bucket result or snapshot
time rather than wall clock
2020-01-22 14:52:33 +00:00
Dimitris Athanasiou 59687a9384
[7.x][ML] Validate classification dependent_variable cardinality is at lea… (#51232) (#51309)
Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.

Backport of #51232
2020-01-22 16:51:16 +02:00
Benjamin Trent 2a73e849d6
[ML][Inference] fixing ingest IT tests (#51267) (#51311)
Converts InferenceIngestIT into a `ESRestTestCase`.

closes #51201
2020-01-22 09:50:17 -05:00
David Roberts 932c63297f [ML] Fix possible race condition when starting datafeed (#51302)
The ID of the datafeed's associated job was being obtained
frequently by looking up the datafeed task in a map that
was being modified in other threads.  This could lead to
NPEs if the datafeed stopped running at an unexpected time.

This change reduces the number of places where a datafeed's
associated job ID is looked up to avoid the possibility of
failures when the datafeed's task is removed from the map
of running tasks during multi-step operations in other
threads.

Fixes #51285
2020-01-22 11:40:39 +00:00
Przemysław Witek bfcfcdee33
[7.x] Do not copy mapping from dependent variable to prediction field in regression analysis (#51227) (#51288) 2020-01-22 12:36:24 +01:00
Andrei Dan 421aa14972
ILM: Make UpdateSettingsStep retryable (#51235) (#51298)
This makes the UpdateSettingsStep retryable. This step updates settings needed
during the execution of ILM actions (mark indexes as read-only, change
allocation configurations, mark indexing complete, etc)

As the index updates are idempotent in nature (PUT requests and are applied only
if the values have changed) and the settings values are seldom user-configurable
(aside from the allocate action) the testing for this change goes along the
lines of artificially simulating a setting update failure on a particular value
update, which is followed by a successful step execution (a retry) in an
environment outside of ILM (the step executions are triggered manually).

(cherry picked from commit 8391b0aba469f39532bfc2796b76148167dc0289)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-22 11:02:26 +00:00
Andrei Dan 123266714b
ILM wait for active shards on rolled index in a separate step (#50718) (#51296)
After we rollover the index we wait for the configured number of shards for the
rolled index to become active (based on the index.write.wait_for_active_shards
setting which might be present in a template, or otherwise in the default case,
for the primaries to become active).
This wait might be long due to disk watermarks being tripped, replicas not
being able to spring to life due to cluster nodes reconfiguration and others
and, the RolloverStep might not complete successfully due to this inherent
transient situation, albeit the rolled index having been created.

(cherry picked from commit 457a92fb4c68c55976cc3c3e2f00a053dd2eac70)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-22 11:01:52 +00:00
Ioannis Kakavas a76321437c
Truncate SAML Response in trace log (#51237) (#51283)
When not truncated, a long SAML response XML document can fill max
line length and mask the actual exception message that the trace
statement is meant to inform about.
The same XML Document is also printed in full on trace level in
SamlRequestHandler#parseSamlMessage() so there is no loss of
information
2020-01-22 09:56:39 +02:00
Nik Everett ca15a3f5a8
Add "did you mean" to unknown queries (#51177) (#51254)
This replaces the message we return for unknown queries with the standard
one that we use for unknown fields from `ObjectParser`. This is nice
because it includes "did you mean". One day we might convert parsing
queries to using object parser, but that looks complex. This change is
much smaller and seems useful.
2020-01-21 12:45:52 -05:00
Benjamin Trent a9b2bc525e
[ML] address two edge cases for categorization.GrokPatternCreator#findBestGrokMatchFromExamples (#51168) (#51255)
There are two edge cases that can be ran into when example input is matched in a weird way.

1. Recursion depth could continue many many times, resulting in a HUGE runtime cost. I put a limit of 10 recursions (could be adjusted I suppose). 
2. If there are no "fixed regex bits", exploring the grok space would result in a fence-post error during runtime (with assertions turned off)
2020-01-21 10:29:29 -05:00
Martijn van Groningen 6b5b26a595
Protects against NPE:
2> REPRODUCE WITH: ./gradlew ':x-pack:plugin:watcher:test' --tests "org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.testTransformFields" -Dtests.seed=26754396AB9C1A30 -Dtests.security.manager=true -Dtests.locale=lv-LV -Dtests.timezone=America/Dominica -Dcompiler.java=13 -Druntime.java=8
  2> java.lang.NullPointerException
        at __randomizedtesting.SeedInfo.seed([26754396AB9C1A30:B2A3CA27E260803B]:0)
        at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.lambda$testTransformFields$1(HistoryTemplateTransformMappingsTests.java:85)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1628)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
        at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.lambda$testTransformFields$2(HistoryTemplateTransformMappingsTests.java:88)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:892)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:877)
        at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.testTransformFields(HistoryTemplateTransformMappingsTests.java:74)
2020-01-21 15:42:22 +01:00
Nik Everett 788836ea3f
Revert "Begin moving date_histogram to offset rounding (backport of #50873) (#50978)" (#51239)
This reverts commit 9a3d4db840. It was
subtly broken in ways we didn't have tests for.
2020-01-21 08:50:02 -05:00
David Roberts 0fa7db9a95 [ML] Make datafeeds work with nanosecond time fields (#51180)
Allows ML datafeeds to work with time fields that have
the "date_nanos" type _and make use of the extra precision_.
(Previously datafeeds only worked with time fields that were
exact multiples of milliseconds.  So datafeeds would work
with "date_nanos" only if the extra precision over "date" was
not used.)

Relates #49889
2020-01-21 09:59:50 +00:00
Nhat Nguyen 43ed244a04
Account soft-deletes in FrozenEngine (#51192) (#51229)
Currently, we do not exclude soft-deleted documents when opening index
reader in the FrozenEngine.

Backport of #51192
2020-01-20 17:07:29 -05:00
Adrien Grand 1a73d8329c Disable xpack/15_basic/Usage stats for mappings.
Relates #51127
2020-01-20 18:05:26 +01:00
Andrei Stefan 2908b7e5fc
SQL: add support for passing query parameters in REST API calls (#51029) (#51222)
* REST PreparedStatement-like query parameters are now supported in the form of an array of non-object, non-array values where ES SQL parser will try to infer the data type of the value being passed as parameter.

(cherry picked from commit 45b8bf619aecb1c03d7bc0cf06928dcc36005a66)
2020-01-20 16:40:19 +02:00
Andrei Stefan 543cc85b78
Add trace logging for responses coming from server (#50530) (#51221)
(cherry picked from commit 38eb485deffa175c7eb0b55a42a3e309f8a9802d)
2020-01-20 16:39:46 +02:00
Andrei Stefan df36169220
SQL: change the way unsupported data types fields are handled (#50823) (#51220)
The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286c4c485b9e781f88b0a2f98cab9ec5b7e2)
2020-01-20 16:23:43 +02:00
Hendrik Muhs 51134d9738 check custom meta data to avoid NPE (#51163)
check custom meta data to avoid NPE, fixes a problem introduced in #51072

fixes #51153
2020-01-20 13:53:42 +01:00
Tim Vernum a0ca82422c
Mute TimeSeriesLifecycleActionsIT.waitForSnapshot (#51208)
This test was recently un-muted, but is still failing

Relates: #50781
Backport of: #51203
2020-01-20 20:19:29 +11:00
Nik Everett 977b53ab91
Fix flaky usage tracking test (#51169) (#51179)
We added tracking of index feature usage in #51031 but due to some copy
and paste errors the test fails on some seeds. This fixes those errors.
2020-01-17 16:53:13 -05:00
Jason Tedor 9ce4d2b901
Initial autoscaling commit (#51161)
This commit merely adds the skeleton for the autoscaling project, adding
the basics to include the autoscaling module in the default
distribution, opt-in to code formatting, and a placeholder for the docs.
2020-01-17 15:31:12 -05:00
Lee Hinman 731c96b507
[7.x] Use separate policies for tests in SnapshotLifecycleRest… (#51181)
These policies store statistics, but since stats updating is asynchronous, it's
possible for the update from one test to bleed into a separate one. This change
switches the tests to use separate policy ids so that their stats are tracked
independently. It also relaxes the checking constraint in one of the tests.

Hopefully this:
Resolves #48531
Resolves #48017
2020-01-17 13:26:40 -07:00
Jay Modi 107989df3e
Introduce hidden indices (#51164)
This change introduces a new feature for indices so that they can be
hidden from wildcard expansion. The feature is referred to as hidden
indices. An index can be marked hidden through the use of an index
setting, `index.hidden`, at creation time. One primary use case for
this feature is to have a construct that fits indices that are created
by the stack that contain data used for display to the user and/or
intended for querying by the user. The desire to keep them hidden is
to avoid confusing users when searching all of the data they have
indexed and getting results returned from indices created by the
system.

Hidden indices have the following properties:
* API calls for all indices (empty indices array, _all, or *) will not
  return hidden indices by default.
* Wildcard expansion will not return hidden indices by default unless
  the wildcard pattern begins with a `.`. This behavior is similar to
  shell expansion of wildcards.
* REST API calls can enable the expansion of wildcards to hidden
  indices with the `expand_wildcards` parameter. To expand wildcards
  to hidden indices, use the value `hidden` in conjunction with `open`
  and/or `closed`.
* Creation of a hidden index will ignore global index templates. A
  global index template is one with a match-all pattern.
* Index templates can make an index hidden, with the exception of a
  global index template.
* Accessing a hidden index directly requires no additional parameters.

Backport of #50452
2020-01-17 10:09:01 -07:00
Jay Modi 96e8f67425
Upgrade to the latest OWASP HTML sanitizer (#50765) (#51166)
This commit upgrades the OWASP HTML sanitizer used by watcher to the
latest version and also upgrades guava, which it depends on. The guava
upgrade also requires the addition of a new dependency that guava
itself requires as of version 27.0. The sanitizer's behavior has changed to
re-write these templated values with a comment that results in this output
`{<!-- -->{ctx.metadata.name}}`. This would be an issue if we attempted to
sanitize the template, but the code that uses the sanitizer runs the rendered
string through the sanitizer, which means that the templated values have
been replaced already.

Relates #50395
2020-01-17 10:00:33 -07:00
Ioannis Kakavas 4fc865e579
Don't fallback to anonymous for tokens/apikeys (#51042) (#51159)
This commit changes our behavior so that when we receive a
request with an invalid/expired/wrong access token or API Key
we do not fallback to authenticating as the anonymous user even if
anonymous access is enabled for Elasticsearch.
2020-01-17 18:56:02 +02:00
David Roberts 295665b1ea [ML] Add audit warning for 1000 categories found early in job (#51146)
If 1000 different category definitions are created for a job in
the first 100 buckets it processes then an audit warning will now
be created.  (This will cause a yellow warning triangle in the
ML UI's jobs list.)

Such a large number of categories suggests that the field that
categorization is working on is not well suited to the ML
categorization functionality.
2020-01-17 16:28:45 +00:00
Przemysław Witek da73c9104e
[ML] Fix tests randomly failing on CI (#51142) (#51150) 2020-01-17 14:58:58 +01:00
Dimitris Athanasiou b70ebdeb96
[7.x][ML] DF Analytics _explain API should skip object fields (#51115) (#51147)
Object fields cannot be used as features. At the moment _explain
API includes them and even worse it allows it does not error when
an object field is excluded. This creates the expectation to the
user that all children fields will also be excluded while it's not
the case.

This commit omits object fields from the _explain API and also
adds an error if an object field is included or excluded.

Backport of #51115
2020-01-17 14:02:59 +02:00
Przemysław Witek b1a526d5e9
[7.x] [ML] Update DFA progress document in the index the document belongs to (#51111) (#51117) 2020-01-17 08:12:54 +01:00
Hendrik Muhs 13343b15c9 [Transform] Improve force stop robustness in case of an error (#51072)
If a transform config got lost (e.g. because the internal index disappeared) tasks could not be
stopped using transform API. This change makes it possible to stop transforms without a config,
meaning to remove the background task. In order to do so force must be set to true.
2020-01-17 07:42:21 +01:00
Ioannis Kakavas d0554fd317
Fail gracefully on invalid token strings (#51014) (#51096)
When we receive a request with an Authorization header that contains
a Bearer token that is not generated by us or that is malformed in
some way, attempting to decode it as one of our own might cause a
number of exceptions that are not IOExceptions. This commit ensures
that we catch and log these too and call onResponse with `null, so
that we can return 401 instead of 500.

Resolves: #50497
2020-01-16 17:00:17 +02:00
Bogdan Pintea fb65ef3f2d
SQL: Extend the optimisations for equalities (#50792) (#51098)
* Extend the optimizations for equalities

This commit supplements the optimisations of equalities in conjunctions
and disjunctions:
* for conjunctions, the existing optimizations with ranges are extended
with not-equalities and inequalities; these lead to a fast resolution,
the conjunction either being evaluate to a FALSE, or the non-equality
conditions being dropped as superfluous;
* optimisations for disjunctions are added to be applied against ranges,
inequalities and not-equalities; these lead to disjunction either
becoming TRUE or the equality being dropped, either as superfluous or
merged into a range/inequality.

* Adress review notes

* Fix the bug around wrongly optimizing 'a=2 OR a!=?', which only yields
TRUE for same values in equality and inequality.
* Var renamings, code style adjustments, comments corrections.

* Address further review comments. Extend optim.

- fix a few code comments;
- extend the Equals OR NotEquals optimitsation (a=2 OR a!=5 -> a!=5);
- extend the Equals OR Range optimisation on limits equality (a=2 OR
  2<=a<5 -> 2<=a<5);
- in case an equality is being removed in a conjunction, the rest of
  possible optimisations to test is now skipped.

* rename one var for better legiblity

- s/rmEqual/removeEquals

(cherry picked from commit 62e7c6a010f10cd7893ee5c99bad8b8d2a693436)
2020-01-16 14:32:34 +01:00
Tom Veasey 32ec934b15
[7.x][ML] Assert top classes are ordered by score (#51028)
Backport #51003.
2020-01-16 12:23:15 +00:00
markharwood ff0a45f882
Fix NPE in PinnedQuery call to DisjunctionMaxScorer. (#51047) (#51064)
Fix NPE in PinnedQuery call to DisjunctionMaxScorer. (#51047)
Added test and fix that tests for score type.
Closes #51034
2020-01-16 10:41:43 +00:00
Rory Hunter 80d925e225
Auto-format buildSrc (#51043)
Backport / reimplementation of #50786 on 7.x.

Opt-in `buildSrc` for automatic formatting. This required a config tweak
in order to pick up all the Java sources, and as a result more files are
now found in the Enrich plugin, that were previously missed.

I also moved the 2 Java files in `buildSrc/src/main/groovy` into the Java
directory, which required some follow-up changes.
2020-01-16 10:26:27 +00:00
Adrien Grand 45d7bdcfd7
Add analysis components and mapping types to the usage API. (#51062)
Knowing about used analysis components and mapping types would be incredibly
useful in order to know which ones may be deprecated or should get more love.

Some field types also act as a proxy to know about feature usage of some APIs
like the `percolator` or `completion` fields types for percolation and the
completion suggester, respectively.
2020-01-16 09:56:41 +01:00
Tim Vernum ac6602a156
Fix windows newline issue in test (#51082)
Fixes HttpCertificateCommandTests.testTextFileSubstitutions on Windows

Backport of: #51030
2020-01-16 17:01:58 +11:00
Yang Wang c1a6d5d9ff
Encrypt generated key with AES (#51019) (#51076)
Replace DES with AES to align with modern encryption standards
Backport also fixs Files.readString API that is not available in Java 8

Resolves: #50843
2020-01-16 14:47:21 +11:00
Lee Hinman 2d1c28a45d
[7.x] Fix AllocateRoutedStepTests reusing keys for random valu… (#51058)
In these tests there was a very small chance that keys could collide,
which causes test failures.

Resolves #49307
2020-01-15 11:36:34 -07:00
Lee Hinman e395cf3419
Guard against null settings in CCRIndexLifecycleIT (#51008) (#51054)
It's possible that the index could return no settings and thus throw a
`NullPointerException`.

I wasn't able to reproduce the original issue, but this should guard
against in the future.

Resolves #50646

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-01-15 11:21:18 -07:00
Lee Hinman ad60f0015e
Address failures in SnapshotLifecycleRestIT.testFullPolicySnapshot (#51013)
This test failed a couple of different ways, related to timing, as well
as concurrent snapshots, and also naming.

This commit splits the giant `assertBusy` into separate parts so that we don't
perform ~5 different requests and tests in the same loop. It also gives each
test a unique repository so that no other test can accidentally re-use
snapshots.

Resolves #50358 (hopefully!)
2020-01-15 09:47:41 -07:00
Rory Hunter 2f069d8f3f
Tweak formatter config for long generic lines (#51027)
Backport of #50909. The current formatting config allows some long
generic declarations to break the 140 character limit. Tweak the config
to wrap such lines.
2020-01-15 13:17:37 +00:00
David Roberts 1536c3e622 [TEST] Increase ML distributed test job open timeout (#50998)
There have been occasional failures, presumably due to
too many tests running in parallel, caused by jobs taking
around 15 seconds to open.  (You can see the job open
successfully during the cleanup phase shortly after the
failure of the test in these cases.)  This change increases
the wait time from 10 seconds to 20 seconds to reduce the
risk of this happening.
2020-01-15 08:58:55 +00:00
Martijn van Groningen e76c3d4d32
Tidy up enrich processors: (#50957)
* Fix generics usages.
* Sealed match processor class.
2020-01-15 08:51:22 +01:00
Tomas Della Vedova 5b6fa79fd8
[ML] Removed key value from the catch regex test (#50977) (#51021) 2020-01-15 08:50:59 +01:00
Tim Vernum e41c0b1224
Deprecating kibana_user and kibana_dashboard_only_user roles (#50963)
This change adds a new `kibana_admin` role, and deprecates
the old `kibana_user` and`kibana_dashboard_only_user`roles.

The deprecation is implemented via a new reserved metadata
attribute, which can be consumed from the API and also triggers
deprecation logging when used (by a user authenticating to
Elasticsearch).

Some docs have been updated to avoid references to these
deprecated roles.

Backport of: #46456

Co-authored-by: Larry Gregory <lgregorydev@gmail.com>
2020-01-15 11:07:19 +11:00
Nik Everett fc5fde7950
Add "did you mean" to ObjectParser (#50938) (#50985)
Check it out:
```
$ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{
  "dac": {}
}'

{
  "error" : {
    "root_cause" : [
      {
        "type" : "x_content_parse_exception",
        "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
  },
  "status" : 400
}
```

The tricky thing about implementing this is that x-content doesn't
depend on Lucene. So this works by creating an extension point for the
error message using SPI. Elasticsearch's server module provides the
"spell checking" implementation.
s
2020-01-14 17:53:41 -05:00
Nik Everett 9a3d4db840
Begin moving date_histogram to offset rounding (backport of #50873) (#50978)
We added a new rounding in #50609 that handles offsets to the start and
end of the rounding so that we could support `offset` in the `composite`
aggregation. This starts moving `date_histogram` to that new offset.
2020-01-14 16:50:27 -05:00
Benjamin Trent 72c270946f
[ML][Inference] Adding classification_weights to ensemble models (#50874) (#50994)
* [ML][Inference] Adding classification_weights to ensemble models

classification_weights are a way to allow models to
prefer specific classification results over others
this might be advantageous if classification value
probabilities are a known quantity and can improve
model error rates.
2020-01-14 12:40:25 -05:00
Tom Veasey de5713fa4b
[ML] Disable invalid assertion (#50988)
Backport #50986.
2020-01-14 17:35:00 +00:00
Armin Braun 16c07472e5
Track Snapshot Version in RepositoryData (#50930) (#50989)
* Track Snapshot Version in RepositoryData (#50930)

Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient.
Follow up to #50853
2020-01-14 18:15:07 +01:00
David Kyle 7f309a18f1
[7.x][ML] Explicitly require a OriginSettingClient in ML results iterators (#50981)
In classes where the client is used directly rather than through a call to 
executeAsyncWithOrigin explicitly require the client to be OriginSettingClient 
rather than using the Client interface. 

Also remove calls to deprecated ClientHelper.clientWithOrigin() method.
2020-01-14 17:14:39 +00:00
Dimitris Athanasiou 1d8cb3c741
[7.x][ML] Add num_top_feature_importance_values param to regression and classi… (#50914) (#50976)
Adds a new parameter to regression and classification that enables computation
of importance for the top most important features. The computation of the importance
is based on SHAP (SHapley Additive exPlanations) method.

Backport of #50914
2020-01-14 16:46:09 +02:00
Hendrik Muhs 0178c7c5d0
[7.x][Transform] correctly retrieve checkpoints from remote indices (#50903) (#50969)
uses remote client(s) to correctly retrieve index checkpoints from remote clusters
2020-01-14 15:09:14 +01:00
Przemysław Witek 9c6ffdc2be
[7.x] Handle nested and aliased fields correctly when copying mapping. (#50918) (#50968) 2020-01-14 14:43:39 +01:00
David Kyle 69a3626ee1 Mute SnapshotLifecycleRestIT testFullPolicySnapshot
Relates to #50358
2020-01-14 13:46:37 +01:00
Daniel Mitterdorfer 263083b882
Mute HttpCertificateCommandTests.testTextFileSubstitutions (#50965) (#50966)
Relates #50964
2020-01-14 12:40:34 +01:00
Tim Vernum 2bb7b53e41
Add certutil http command (#50952)
This adds a new "http" sub-command to the certutil CLI tool.

The http command generates certificates/CSRs for use on the http
interface of an elasticsearch node/cluster.
It is designed to be a guided tool that provides explanations and
sugestions for each of the configuration options. The generated zip
file output includes extensive "readme" documentation and sample
configuration files for core Elastic products.

Backport of: #49827
2020-01-14 21:24:21 +11:00
Tim Vernum b02b073a57
Increase Size and lower TTL on DLS BitSet Cache (#50953)
The Document Level Security BitSet Cache (see #43669) had a default
configuration of "small size, long lifetime". However, this is not
a very useful default as the cache is most valuable for BitSets that
take a long time to construct, which is (generally speaking) the same
ones that operate over a large number of documents and contain many
bytes.

This commit changes the cache to be "large size, short lifetime" so
that it can hold bitsets representing billions of documents, but
releases memory quickly.

The new defaults are 10% of heap, and 2 hours.

This also adds some logging when a single BitSet exceeds the size of
the cache and when the cache is full.

Backport of: #50535
2020-01-14 18:04:02 +11:00
Tim Vernum 33c29fb5a3
Support Client and RoleMapping in custom Realms (#50950)
Previously custom realms were limited in what services and components
they had easy access to. It was possible to work around this because a
security extension is packaged within a Plugin, so there were ways to
store this components in static/SetOnce variables and access them from
the realm, but those techniques were fragile, undocumented and
difficult to discover.

This change includes key services as an argument to most of the methods
on SecurityExtension so that custom realm / role provider authors can
have easy access to them.

Backport of: #50534
2020-01-14 15:26:41 +11:00
Tim Vernum 90ba77951a
Fix memory leak in DLS bitset cache (#50946)
The Document Level Security BitSet cache stores a secondary "lookup
map" so that it can determine which cache entries to invalidate when
a Lucene index is closed (merged, etc).

There was a memory leak because this secondary map was not cleared
when entries were naturally evicted from the cache (due to size/ttl
limits).

This has been solved by adding a cache removal listener and processing
those removal events asyncronously.

Backport of: #50635
2020-01-14 13:19:05 +11:00
Tim Vernum 1577a0e617
Validate field permissions when creating a role (#50917)
When creating a role, we do not check if the exceptions for
the field permissions are a subset of granted fields. If such
a role is assigned to a user then that user's authentication fails
for this reason.

We added a check to validate role query in #46275 and on the same lines,
this commit adds check if the exceptions for the field
permissions is a subset of granted fields when parsing the
index privileges from the role descriptor.

Backport of: #50212

Co-authored-by: Yogesh Gaikwad <bizybot@users.noreply.github.com>
2020-01-14 12:37:45 +11:00
Tim Vernum c2acb8830a
Add max_resource_units to enterprise license (#50910)
The enterprise license type must have "max_resource_units" and may not
have "max_nodes".

This change adds support for this new field, validation that the field
is present if-and-only-if the license is enterprise and bumps the
license version number to reflect the new field.

Includes a BWC layer to return "max_nodes: ${max_resource_units}" in
the GET license API.

Backport of: #50735
2020-01-14 12:37:05 +11:00
Przemko Robakowski a18736b46d
[7.x] ILM action to wait for SLM policy execution (#50454) (#50943)
* ILM action to wait for SLM policy execution (#50454)

This change add new ILM action to wait for SLM policy execution to ensure that index has snapshot before deletion.

Closes #45067

* Fix flaky TimeSeriesLifecycleActionsIT#testWaitForSnapshot test

This change adds some randomness and cleanup step to TimeSeriesLifecycleActionsIT#testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore tests in attempt to make them stable.

Reletes to #50781

* Formatting changes

* Longer timeout

* Fix Map.of in Java8

* Unused import removed
2020-01-14 01:34:33 +01:00
Lee Hinman 91689e793d
[7.x] Refresh cached phase policy definition if possible on ne… (#50941)
* Refresh cached phase policy definition if possible on new policy

There are some cases when updating a policy does not change the
structure in a significant way. In these cases, we can reread the
policy definition for any indices using the updated policy.

This commit adds this refreshing to the `TransportPutLifecycleAction`
to allow this. It allows us to do things like change the configuration
values for a particular step, even when on that step (for example,
changing the rollover criteria while on the `check-rollover-ready` step).

There are more cases where the phase definition can be reread that just
the ones checked here (for example, removing an action that has already
been passed), and those will be added in subsequent work.

Relates to #48431
2020-01-13 14:31:41 -07:00
Bogdan Pintea f04b4cbee8
SQL: Optimisation fixes for conjunction merges (#50703) (#50933)
* SQL: Optimisation fixes for conjunction merges

This commit fixes the following issues around the way comparisions are
merged with ranges in conjunctions:
* the decision to include the equality of the lower limit is corrected;
* the selection of the upper limit is corrected to use the upper bound
of the range;
* the list of terms in the conjunction is sorted to have the ranges at
the bottom; this allows subsequent binary comarisions to find compatible
ranges and potentially be merged away. The end guarantee being that the
optimisation takes place irrespective of the order of the conjunction
terms in the statement.

Some comments are also corrected.

* adress review observation on anon. comparator

Replace anonymous comparator of split AND Expressions with a lambda.

(cherry picked from commit 9828cb143a41f1bda1219541f3a8fdc03bf6dd14)
2020-01-13 21:51:29 +01:00
Ioannis Kakavas ba37e3c4a0
Disable DiagnosticTrustManager in FIPS 140 (#49888)
This commit changes the default behavior for
xpack.security.ssl.diagnose.trust when running in a FIPS 140 JVM.

More specifically, when xpack.security.fips_mode.enabled is true:

- If xpack.security.ssl.diagnose.trust is not explicitly set, the
    default value of it becomes false and a log message is printed
    on info level, notifying of the fact that the TLS/SSL diagnostic
    messages are not enabled when in a FIPS 140 JVM.
- If xpack.security.ssl.diagnose.trust is explicitly set, the value of
    it is honored, even in FIPS mode.

This is relevant only for 7.x where we support Java 8 in which
SunJSSE can still be used as a FIPS 140 provider for TLS. SunJSSE
in FIPS mode, disallows the use of other TrustManager implementations
than the one shipped with SunJSSE.
2020-01-13 17:04:23 +02:00
Larry Gregory cc8aafcfc2
[7.x] - Adding GET/PUT ILM cluster privileges to `kibana_syste… (#50878)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-01-13 08:36:48 -05:00
Benjamin Trent eb8fd44836
[ML][Inference] minor fixes for created_by, and action permission (#50890) (#50911)
The system created and models we provide now use the `_xpack` user for uniformity with our other features

The `PUT` action is now an admin cluster action

And XPackClient class now references the action instance.
2020-01-13 07:59:31 -05:00
Albert Zaharovits 4e837599b3 Nit fix test randomInt bound
Relates 2b789fa3e6
2020-01-13 13:28:20 +02:00
Albert Zaharovits 2b789fa3e6
Make .async-search-* a restricted namespace (#50294)
Hide the `.async-search-*` in Security by making it a restricted index namespace.
The namespace is hard-coded.
To grant privileges on restricted indices, one must explicitly toggle the
`allow_restricted_indices` flag in the indices permission in the role definition.
As is the case with any other index, if a certain user lacks all permissions for an
index, that index is effectively nonexistent for that user.
2020-01-13 12:20:54 +02:00
Tim Vernum 985c95dcca
Populate OpenIDConnect metadata collections (#50893)
The OpenIdConnectRealm had a bug which would cause it not to populate
User metadata for collections contained in the user JWT claims.

This commit fixes that bug.

Backport of: #50521
2020-01-13 18:02:22 +11:00
Benjamin Trent fa116a6d26
[7.x] [ML][Inference] PUT API (#50852) (#50887)
* [ML][Inference] PUT API (#50852)

This adds the `PUT` API for creating trained models that support our format.

This includes

* HLRC change for the API
* API creation
* Validations of model format and call

* fixing backport
2020-01-12 10:59:11 -05:00
Lee Hinman 63472d30c7
[7.x] Fix SLM check for restore in progress (#50868) (#50876)
* Fix SLM check for restore in progress (#50868)

* Fix SLM check for restore in progress

This commit fixes the check in SLM where the `RestoreInProgress`
metadata was checked for existence. Rather than check existence we
should instead check the `isEmpty` method. Prior to this, a successful
restore for a repository that used SLM retention would prevent SLM
retention from running in subsequent invocations, due to SLM thinking
that a restore was still running.

* Fix 7.x-isms
2020-01-10 14:27:55 -07:00
Julie Tibshirani 3bac1dc414 Adjust the skip version in flattened field telemetry tests.
We forgot to adjust the version when backporting the commit to 7.x.
2020-01-10 10:36:41 -08:00
Benjamin Trent 5afa0b71e9
[ML][Inference] Unify top_classes object field names with analytics (#50858) (#50861) 2020-01-10 12:00:37 -05:00
Dimitris Athanasiou 422422a2bc
[7.x][ML] Reuse SourceDestValidator for data frame analytics (#50841) (#50850)
This commit removes validation logic of source and dest indices
for data frame analytics and replaces it with using the common
`SourceDestValidator` class which is already used by transforms.
This way the validations and their messages become consistent
while we reduce code.

This means that where these validations fail the error messages
will be slightly different for data frame analytics.

Backport of #50841
2020-01-10 14:24:13 +02:00
Nik Everett ae40e22452
Drop "funny" functions building parsers (#50715) (#50814)
Replaces the "funny"
`Function<String, ConstructingObjectParser<T, Void>>` with a much
simpler `ConstructingObjectParser<T, String>`. This makes pretty much
all of our object parsers static.
2020-01-09 15:53:03 -05:00
Jake Landis de6f132887
[7.x] Foreach processor - fork recursive call (#50514) (#50773)
A very large number of recursive calls can cause a stack overflow
exception. This commit forks the recursive calls for non-async
processors. Once forked, each thread will handle at most 10
recursive calls to help keep the stack size and thread count
down to a reasonable size.
2020-01-09 13:21:18 -06:00
Benjamin Trent cc0e64572a
[ML][Inference][HLRC] Add necessary lang ident classes (#50705) (#50794)
This adds the necessary named XContent classes to the HLRC for the lang ident model. This is so the HLRC can call `GET _ml/inference/lang_ident_model_1?include_definition=true` without XContent parsing errors.

The constructors are package private as since this classes are used exclusively within the pre-packaged model (and require the specific weights, etc. to be of any use).
2020-01-09 10:33:38 -05:00
Benjamin Trent 3e014d39c2
[Transform] fail to start/put on missing pipeline (#50701) (#50795)
If a pipeline referenced by a transform does not exist, we should not allow the transform to be created. 

We do allow the pipeline existence check to be skipped with defer_validations, but if the pipeline still does not exist on `_start`, the pipeline will fail to start.

relates:  #50135
2020-01-09 10:33:22 -05:00
Martijn van Groningen f75d99149b
Wrap triggering of a watch inside an assertBusy(...) invocation
This test replaces the watch index after watcher got started.
This triggers watches being reloaded and while this happens the
trigger engine is paused, which disallows watches from being
triggered. At this time there are no watches in the .watches
index and I think this is just unlucky timing.

Reloading of watches happens in the background and
the watch state can be started when that happens.
For normal schedule trigger engines this is not an issue,
because watches that are meant to be triggered are triggered
when the engine triggers the next time. However for the
mock scheduled trigger engine this is different,
because watches are triggered programatically and
there is no retry in this test.

I think just adding `timeWarp().trigger("mywatch");` inside
a `assertBusy(...)`` is the right fix here.  If it fails
because the mock schedule trigger engine is paused then
the test will try again. In the mean time the the watches
can be reloaded, which then resumes the mock scheduled trigger engine.

Closes #50658
2020-01-09 09:05:20 +01:00
Ioannis Kakavas d2189b9d80
Mute SamlAuthenticatorTests in Azulu Zulu (#50779)
See #49742
2020-01-09 09:41:04 +02:00
Christoph Büscher b1b4282273 Make Multiplexer inherit filter chains analysis mode (#50662)
Currently, if an updateable synonym filter is included in a multiplexer filter,
it is not reloaded via the _reload_search_analyzers because the multiplexer
itself doesn't pass on the analysis mode of the filters it contains, so its not
recognized as "updateable" in itself. Instead we can check and merge the
AnalysisMode settings of all filters in the multiplexer and use the resulting
mode (e.g. search-time only) for the multiplexer itself, thus making any synonym
filters contained in it reloadable.  This, of course, will also make the
analyzers using the multiplexer be usable at search-time only.

Closes #50554
2020-01-08 22:12:01 +01:00
Lee Hinman 8dc6e98819
[7.x] Make InitializePolicyContextStep retryable (#50685) (#50760)
This commits makes the "init" ILM step retryable. It also adds a test
where an index is created with a non-parsable index name and then fails.

Related to #48183
2020-01-08 13:13:57 -07:00
Nhat Nguyen 90e66a7b97 Mute testPolicyCRUD
Tracked at #44997
2020-01-08 13:25:40 -05:00
Adrien Grand 4f2299c714
Upgrade to Lucene 8.4.0. (#50518) (#50750) 2020-01-08 18:53:59 +01:00
Lee Hinman 615532b4f8 Mute TimeSeriesLifecycleActionsIT.testHistoryIsWritten* (#50755)
Related to #50353
2020-01-08 10:35:44 -07:00
Adrien Grand 31158ab3d5
Add per-field metadata. (#50333)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices, the union of
all field metadatas is returned.

Here is how the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```

Closes #33267
2020-01-08 16:21:18 +01:00
Andrei Dan 3915d4c055
Make the UpdateRolloverLifecycleDateStep retryable (#50702) (#50730)
This makes the "update-rollover-lifecycle-date" step, which is part of the
rollover action, retryable. It also adds an integration test to check the
step is retried and it eventually succeeds.

(cherry picked from commit 5bf068522deb2b6cd2563bcf80f34fdbf459c9f2)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-08 11:45:26 +01:00
Christoph Büscher d8c907d648 Remove _reload_search_analyzer experimental status (#50696)
Removing the experimental status in the docs and the rest specs.
2020-01-08 10:35:19 +01:00
Tim Vernum 293661d62c
Security should not reload files that haven't changed (#50724)
In security we currently monitor a set of files for changes:

- config/role_mapping.yml (or alternative configured path)
- config/roles.yml
- config/users
- config/users_roles

This commit prevents unnecessary reloading when the file change actually doesn't change the internal structure.

Backport of: #50207

Co-authored-by: Anton Shuvaev <anton.shuvaev91@gmail.com>
2020-01-08 15:13:47 +11:00
Mayya Sharipova c1c0b47d5e
Specify the indexname in searches (#50717)
vector REST tests occasionally fail on 7.x because
we don't receive the expected response headers with deprecation warnings.
This happens as searchers were executed against all indices including
internal indices, whose shards did not produce expected warnings.

This PR ensures that searchers are executed only against expected
indices.

Closes #50716
2020-01-07 17:06:52 -05:00
Benjamin Trent 060e0a6277
[ML][Inference] Add support for models shipped as resources (#50680) (#50700)
This adds support for models that are shipped as resources in the ML plugin. The first of which is the `lang_ident` model.
2020-01-07 09:21:59 -05:00
Hendrik Muhs 98ca9500e8
implement a workaround for remote cluster validation (#50460)
In 7.x an internal API used for validating remote cluster does not throw, see #50420 for the 
details. This change implements a workaround for remote cluster validation, only for 7.x branches.

fixes #50420
2020-01-07 13:51:51 +01:00
Przemysław Witek 4116452d90
Implement testStopAndRestart for ClassificationIT (#50585) (#50698) 2020-01-07 13:41:37 +01:00
David Roberts 35453e2b0e [ML] Improve uniqueness of result document IDs (#50644)
Switch from a 32 bit Java hash to a 128 bit Murmur hash for
creating document IDs from by/over/partition field values.
The 32 bit Java hash was not sufficiently unique, and could
produce identical numbers for relatively common combinations
of by/partition field values such as L018/128 and L017/228.

Fixes #50613
2020-01-07 10:24:45 +00:00
David Roberts 46d600c446 [ML] Fix off-by-one error in ml_classic tokenizer end offset (#50655)
The end offset of a tokenizer is supposed to point one past the
end of the input, not to the end character of the input.  The
ml_classic tokenizer was erroneously doing the latter.
2020-01-07 10:14:59 +00:00
Lee Hinman 552edd862e
[7.x] Add aditional logging for ILM history store tests (#5062… (#50678)
* Add aditional logging for ILM history store tests (#50624)

These tests use the same index name, making it hard to read logs when
diagnosing the failures. Additionally more information about the current
state of the index could be retrieved when failing.

This changes these two things in the hope of capturing more data about
why this fails on some CI nodes but not others.

Relates to #50353
2020-01-06 15:24:24 -07:00
Nik Everett 7fd84a03a0
Drop references to deprecated logger (#50474) (#50681)
This drops all remaining references to `BaseRestHandler.logger` which
has been deprecated for something like a year now. I replaced all of the
references with locally declared loggers which is so much less spooky
action at a distance to me.
2020-01-06 16:34:07 -05:00
Benjamin Trent 06cea5136e
[ML] construct new random generator on each persistence call (#50657) (#50684)
Sharing a random generator may cause test failures as non-threadsafe random generators are periodically utilized in tests (see: https://github.com/elastic/elasticsearch/issues/50651)

This change constructs a calls `Randomness.get()` within the  `bulkIndexWithRetry` method so that the returned `Random` object is only used in a single thread. Before, the member variable could have been used between threads, which caused test failures.
2020-01-06 16:26:29 -05:00
Benjamin Trent 5ab9e75e28
[7.x] [ML][Inference] lang_ident model (#50292) (#50675)
* [ML][Inference] lang_ident model (#50292)

This PR contains a java port of Google's CLD3 compact NN model https://github.com/google/cld3

The ported model is formatted to fit within our inference model formatting and stored as a resource in the `:xpack:ml:` plugin and is under basic license.

The model is broken up into two major parts:
- Preprocessing through the custom embedding (based on CLD3's embedding layer)
- Pushing the embedded text through the two layers of fully connected shallow NN. 

Main differences between this port and CLD3:
- We take advantage of Java's internal Unicode handling where possible (i.e. codepoints, characters, decoders, etc.)
- We do not trim down input text by removing duplicated tokens
- We do not encode doubles/floats as longs/integers.
2020-01-06 16:24:03 -05:00
Benjamin Trent f52af7977d
[ML][Inference] minor cleanup for inference (#50444) (#50676) 2020-01-06 14:05:04 -05:00
Nik Everett 1b28af489f
Fix bare warnings on RollupJobTests (#50633) (#50677)
Silences some ugly warnings.
2020-01-06 14:03:30 -05:00
Albert Zaharovits 9ae3cd2a78
Add 'monitor_snapshot' cluster privilege (#50489) (#50647)
This adds a new cluster privilege `monitor_snapshot` which is a restricted
version of `create_snapshot`, granting the same privileges to view
snapshot and repository info and status but not granting the actual
privilege to create a snapshot.

Co-authored-by: j-bean <anton.shuvaev91@gmail.com>
2020-01-06 13:15:55 +02:00
Nik Everett 2362c430cd
Clean up wire test case a bit (#50627) (#50632)
* Adds JavaDoc to `AbstractWireTestCase` and
`AbstractWireSerializingTestCase` so it is more obvious you should prefer
the latter if you have a choice
* Moves the `instanceReader` method out of `AbstractWireTestCase` becaue
it is no longer used.
* Marks a bunch of methods final so it is more obvious which classes are
for what.
* Cleans up the side effects of the above.
2020-01-05 16:20:38 -05:00
Nik Everett 45663ac1a8
Use Void context on parsers where possible (#50573) (#50617)
*Most* of our parsing can be done without passing any extra context into
the parser that isn't already part of the xcontent stream. While I was
looking around at the places that *do* need a context I found a few
places that were declared to need a context but don't actually need it.
2020-01-03 13:28:55 -05:00
Christoph Büscher 6c8868e955 Mute TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithSuccess
Also muting TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithFailure.

Tracked in #50353
2020-01-03 18:32:03 +01:00
Andrei Dan 3c971f2911
ILM retryable async action steps (#50522) (#50591)
This adds support for retrying AsyncActionSteps by triggering the async
step after ILM was moved back on the failed step (the async step we'll
be attempting to run after the cluster state reflects ILM being moved
back on the failed step).

This also marks the RolloverStep as retryable and adds an integration
test where the RolloverStep is failing to execute as the rolled over
index already exists to test that the async action RolloverStep is
retried until the rolled over index is deleted.

(cherry picked from commit 8bee5f4cb58a1242cc2ef4bc0317dae6c8be49d3)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-03 16:19:58 +02:00
Dimitris Athanasiou ca0828ba07
[7.x][ML] Implement force deleting a data frame analytics job (#50553) (#50589)
Adds a `force` parameter to the delete data frame analytics
request. When `force` is `true`, the action force-stops the
jobs and then proceeds to the deletion. This can be used in
order to delete a non-stopped job with a single request.

Closes #48124

Backport of #50553
2020-01-03 13:46:02 +02:00
Przemysław Witek 8917c05df8
[7.x] Synchronize processInStream.close() call (#50581) 2020-01-03 10:23:51 +01:00
Lee Hinman 0d78aa2708
Don't dump a stacktrace for invalid patterns when executing elasticsearch-croneval (#49744) (#50578)
Co-authored-by: bellengao <gbl_long@163.com>
2020-01-02 16:57:51 -07:00
Nik Everett b36a8ab141
Make some ObjectParsers final (#50471) (#50556)
We have about 800 `ObjectParsers` in Elasticsearch, about 700 of which
are final. This is *probably* the right way to declare them because in
practice we never mutate them after they are built. And we certainly
don't change the static reference. Anyway, this adds `final` to a bunch
of these parsers, mostly the ones in xpack and their "paired" parsers in
the high level rest client. I picked these just to have somewhere to
break the up the change so it wouldn't be huge.

I found the non-final parsers with this:
```
diff \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*PARSER\s*=' {} \+ | sort) \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*final.*PARSER\s*=' {} \+ | sort) \
  2>&1 | grep '^<'
```
2020-01-02 10:47:38 -05:00
Przemysław Witek 4ecabe496f
Mute testStopAndRestart test case (#50551) 2020-01-02 15:28:20 +01:00
Christoph Büscher 1599af8428 Fix type conversion problem in Eclipse (#50549)
Eclipse 4.13 shows a type mismatch error in the affected line because it cannot
correctly infer the boolean return type for the method call. Assigning return
value to a local variable resolves this problem.
2020-01-02 14:29:20 +01:00
Tim Vernum cad0f6bf28
Do not load SSLService in plugin contructor (#50519)
XPackPlugin created an SSLService within the plugin contructor.
This has 2 negative consequences:

1. The service may be constructed based on a partial view of settings.
   Other plugins are free to add setting values via the
   additionalSettings() method, but this (necessarily) happens after
   plugins have been constructed.

2. Any exceptions thrown during the plugin construction are handled
   differently than exceptions thrown during "createComponents".
   Since SSL configurations exceptions are relatively common, it is
   far preferable for them to be thrown and handled as part of the
   createComponents flow.

This commit moves the creation of the SSLService to
XPackPlugin.createComponents, and alters the sequence of some other
steps to accommodate this change.

Backport of: #49667
2019-12-30 14:42:32 +11:00
Armin Braun cec02da0ac
Fix Source Only Snapshot REST Test Failure (#50456) (#50459)
We are matching on the exact number of shards in this test, but may run into
snapshotting more than the single index created in it due to auto-created indices like
`.watcher`.
Fixed by making the test only take a snapshot of the single index used by this test.

Closes #50450
2019-12-23 12:24:08 +01:00
Igor Motov 339d10c16f Geo: Switch generated GeoJson type names to camel case (#50400)
Switches generated GeoJson type names to camel case
to conform to the standard.

Closes #49568
2019-12-20 15:37:22 -05:00
Lee Hinman c3c9ccf61f
[7.x] Add ILM histore store index (#50287) (#50345)
* Add ILM histore store index (#50287)

* Add ILM histore store index

This commit adds an ILM history store that tracks the lifecycle
execution state as an index progresses through its ILM policy. ILM
history documents store output similar to what the ILM explain API
returns.

An example document with ALL fields (not all documents will have all
fields) would look like:

```json
{
  "@timestamp": 1203012389,
  "policy": "my-ilm-policy",
  "index": "index-2019.1.1-000023",
  "index_age":123120,
  "success": true,
  "state": {
    "phase": "warm",
    "action": "allocate",
    "step": "ERROR",
    "failed_step": "update-settings",
    "is_auto-retryable_error": true,
    "creation_date": 12389012039,
    "phase_time": 12908389120,
    "action_time": 1283901209,
    "step_time": 123904107140,
    "phase_definition": "{\"policy\":\"ilm-history-ilm-policy\",\"phase_definition\":{\"min_age\":\"0ms\",\"actions\":{\"rollover\":{\"max_size\":\"50gb\",\"max_age\":\"30d\"}}},\"version\":1,\"modified_date_in_millis\":1576517253463}",
    "step_info": "{... etc step info here as json ...}"
  },
  "error_details": "java.lang.RuntimeException: etc\n\tcaused by:etc etc etc full stacktrace"
}
```

These documents go into the `ilm-history-1-00000N` index to provide an
audit trail of the operations ILM has performed.

This history storage is enabled by default but can be disabled by setting
`index.lifecycle.history_index_enabled` to `false.`

Resolves #49180

* Make ILMHistoryStore.putAsync truly async (#50403)

This moves the `putAsync` method in `ILMHistoryStore` never to block.
Previously due to the way that the `BulkProcessor` works, it was possible
for `BulkProcessor#add` to block executing a bulk request. This was bad
as we may be adding things to the history store in cluster state update
threads.

This also moves the index creation to be done prior to the bulk request
execution, rather than being checked every time an operation was added
to the queue. This lessens the chance of the index being created, then
deleted (by some external force), and then recreated via a bulk indexing
request.

Resolves #50353
2019-12-20 12:33:36 -07:00
Benjamin Trent 71ff330c4e
[ML][Inference] updates specs with new params + docs (#50373) (#50441) 2019-12-20 12:13:45 -05:00
Martijn van Groningen 9646f3abad
Disable slm in AbstractWatcherIntegrationTestCase (#50422)
SLM isn't required tests extending from this base class and
only add noise during test suite teardown.

Closes #50302
2019-12-20 15:51:46 +01:00
Przemysław Witek 3e3a93002f
[7.x] Fix accuracy metric (#50310) (#50433) 2019-12-20 15:34:38 +01:00
Przemysław Witek 14d95aae46
[7.x] Make each analysis report desired field mappings to be copied (#50219) (#50428) 2019-12-20 15:10:33 +01:00
Przemysław Witek 5bb668b866
[7.x] Get rid of maxClassesCardinality internal parameter (#50418) (#50423) 2019-12-20 14:24:23 +01:00
Hendrik Muhs 40bce49a7f mute SourceDestValidatorTests.testRemoteSourceDoesNotExist 2019-12-20 11:25:43 +01:00
Hendrik Muhs 7c10e9b0e7 [Transform] improve checkpoint reporting (#50369)
fixes empty checkpoints, re-factors checkpoint info creation (moves builder) and always reports
last change detection

relates #43201
relates #50018
2019-12-20 10:49:53 +01:00
Hendrik Muhs de14092ad2 [Transform] refactor source and dest validation to support CCS (#50018)
refactors source and dest validation, adds support for CCS, makes resolve work like reindex/search, allow aliased dest index with a single write index.

fixes #49988
fixes #49851
relates #43201
2019-12-20 10:49:53 +01:00
Marios Trivyzas f1a6b675f7 SQL: Fix issue with CAST and NULL checking. (#50371)
Previously, during expression optimisation, CAST would be considered
nullable if the casted expression resulted to a NULL literal, and would
be always non-nullable otherwise. As a result if CASE was wrapped by a
null check function like IS NULL or IS NOT NULL it was simplified to
TRUE/FALSE, eliminating the actual casting operation. So in case of an
expression with an erroneous casting like CAST('foo' AS DATETIME) IS NULL
it would be simplified to FALSE instead of throwing an Exception signifying
the attempt to cast 'foo' to a DATETIME type.

CAST now always returns Nullability.UKNOWN except from the case that
its result evaluated to a constant NULL, where it returns Nullability.TRUE.
This way the IS NULL/IS NOT NULL don't get simplified to FALSE/TRUE
and the CAST actually gets evaluated resulting to a thrown Exception.

Fixes: #50191
(cherry picked from commit 671e07a931cd828661e226cba22a5d38804a17a5)
2019-12-20 10:24:35 +02:00
Tim Brooks cb73fb0f9b
Backport remote proxy mode stats and naming (#50402)
* Update remote cluster stats to support simple mode (#49961)

Remote cluster stats API currently only returns useful information if
the strategy in use is the SNIFF mode. This PR modifies the API to
provide relevant information if the user is in the SIMPLE mode. This
information is the configured addresses, max socket connections, and
open socket connections.

* Send hostname in SNI header in simple remote mode (#50247)

Currently an intermediate proxy must route conncctions to the
appropriate remote cluster when using simple mode. This commit offers
a additional mechanism for the proxy to route the connections by
including the hostname in the TLS SNI header.

* Rename the remote connection mode simple to proxy (#50291)

This commit renames the simple connection mode to the proxy connection
mode for remote cluster connections. In order to do this, the mode specific
settings which we namespaced by their mode (ex: sniff.seed and
proxy.addresses) have been reverted.

* Modify proxy mode to support a single address (#50391)

Currently, the remote proxy connection mode uses a list setting for the
proxy address. This commit modifies this so that the setting is
proxy_address and only supports a single remote proxy address.
2019-12-19 18:02:48 -07:00
Stuart Tettemer 689df1f28f
Scripting: ScriptFactory not required by compile (#50344) (#50392)
Avoid backwards incompatible changes for 8.x and 7.6 by removing type
restriction on compile and Factory.  Factories may optionally implement
ScriptFactory.  If so, then they can indicate determinism and thus
cacheability.

**Backport**

Relates: #49466
2019-12-19 12:50:25 -07:00
Przemysław Witek cc4bc797f9
[7.x] Implement `precision` and `recall` metrics for classification evaluation (#49671) (#50378) 2019-12-19 18:55:05 +01:00
Igor Motov c77ca98928 Geo: Switch generated WKT to upper case (#50285)
Switches generated WKT to upper case to
conform to the standard recommendation.

Relates #49568
2019-12-18 17:29:08 -05:00
Dimitris Athanasiou d3c83cd55a
[7.x][ML] Refresh state index before completing data frame analytics job (#50322) (#50324)
In order to ensure any persisted model state is searchable by the moment
the job reports itself as `stopped`, we need to refresh the state index
before completing.

This should fix the occasional failures we see in #50168 and #50313 where
the model state appears missing.

Closes #50168
Closes #50313

Backport of #50322
2019-12-18 22:19:59 +00:00